Design of a Reactive DNS Measurement System
Total Page:16
File Type:pdf, Size:1020Kb
Design of a Reactive DNS Measurement System Jochem Braakhuis 13-07-2021, University of Twente [email protected] Abstract—For speed, robustness and safety of the Internet, it is will then delegate the query to the appropriate top-level domain important that information on name servers in every level of the name server dependent on the entered domain (.com, .org, .nl). hierarchical DNS structure is accurate and consistent. OpenIN- Then, the top-level domain name server will in turn delegate TEL, designed in 2016, measures about half of the global names- the query to the authoritative name server for the entered pace every day, but lacks flexibility in reactive measurement. To domain, which finally returns the IP address of the domain’s provide an easily scalable framework for performing research server to the user, or delegate it to authoritative name servers in DNS inconsistencies, we designed and tested a reactive DNS measurement system compatible with OpenINTEL, supported by for further sub-domains. Section II further elaborates on this Apache Kafka message broker functionality. The software can process. query multiple levels in the DNS hierarchy and supports 10 query In addition to storing the IP addresses of name servers types. Initial high-volume testing of the resolver authoritative level shows the software is able to handle large volumes of and domains, name servers also contain other records about queries, but only gives a lower bound on the performance of the domains in their zone. For example, MX records contain the software. Testing of more authoritative levels, testing on purpose- domain names of the servers that handle email traffic for the built hardware and implementation of extended decoder logic and domain, and their priority. A lot of records are related to performance optimisation is required to successfully deploy the security. DNSSEC is a system that was implemented to verify software. the integrity of DNS traffic and records, and the DNSKEY record for example contains the public key that can be used I. INTRODUCTION to verify signatures. [1] In total, there are over 45 different records that can be stored on a name server [2]. The Domain Name Service (DNS) is a very important part of the modern Internet, providing a service that can translate Figure 2 shows the response of a query for the IPv4 address human readable domain names into the IP address of their of utwente.nl. We provide a more detailed description of the server. DNS uses a hierarchical structure, starting at the Root. form of a DNS resource record and the DNS response in This hierarchical structure of DNS can be seen in figure 1. section II. Fig. 2. DNS Response for an A Query of utwente.nl To make sure that the Internet remains fast and secure, it is important to monitor DNS name servers. Since DNS is such an important link between users of the Internet and the domains they want to visit, impact of failure of this system can be huge. If a name server contains inaccurate information, it is possible the IP address of a domain becomes unresolvable, or hackers Fig. 1. The Hierarchy of DNS can take control of a misconfigured name server and lead users to the wrong domain. Research into the effects and prevalence of inconsistency between name servers in different levels of the All DNS queries, provided the resource record for that hierarchy has shown that this is a relevant problem, and while query is not cached, go to the Root servers first. The Root this research was in part performed by retrieving information from different levels of the DNS hierarchy, and retrieving and VII describes the setup used to perform high-volume testing comparing information form multiple name servers of the same of the software, the results of which are listed in section domain, the measurements were performed in a non-scalable VIII. Lastly, section IX contains the general conclusion, and way. To perform these types of research more easily, a scalable recommendations for future work. and flexible reactive DNS measurement system would be ideal. Reactive measurements in this sense are DNS measurements II. BACKGROUND performed on demand as a reaction to a trigger. Triggers can be for example the registration of a new domain, the issuing DNS Hierarchy Levels of a new certificate of a domain, etc. In this thesis, we will frequently discuss the different levels Multiple large scale active DNS measurement systems have of DNS hierarchy. In this subsection we will elaborate on the already been developed. For example in 2016, researchers at distinction between a so-called ”parent” name server and a Georgia University of Technology developed such a system ”child” name server. We define the ”child” name server as an [3], just like researchers at the University of Vienna in 2015 authoritative name server of a domain. The plural form of this [4], the latter of which focused on the security of email term, i.e. ”children”, refers to all of the authoritative name systems. Another one of these systems, and the system that we servers of a domain. Conversely, ”parent” name server refers will be focusing on most in this paper, is the system developed to the name server of the authoritative level above the domain. by researchers at the University of Twente in 2016 [6]. This For example, the parent of ewi.utwente.nl is utwente.nl, so system is called OpenINTEL, and it measures around 50% of the parent name server is the authoritative name server of the Internet’s Second Level Domain DNS records every day. utwente.nl. When the domain is a second-level domain, such OpenINTEL provides the ability and scalability to perform this as utwente.nl, its parent is the top level domain, and the parent task, however it has certain limitations. Mainly, the system name server is this TLD’s name server. lacks flexibility in being able to perform diverse reactive mea- surements. OpenINTEL takes a bulk of measurement requests The DNS Resolver for second level domains every day, and it requests the same The process by which a domain name is converted to set of predefined queries every day. It lacks any possibility of an IP address is called resolution of a domain name, and reacting to triggers or performing on-demand measurements. this function, as well as the lookup of other DNS records, The main objective of this thesis is to design a system is provided by a recursive resolver. A resolver takes a DNS that can be used in addition to OpenINTEL, sharing the same query and returns the answer, going through the hierarchical data formats and general setup, but that is more flexible in the levels as shown in figure 1. Web browsers mostly use resolver types of queries it can perform and is able to query multiple functionality provided by the ISP, but there are also public levels in the DNS hierarchy. The system should be dynamically DNS resolvers, such as the Google Public Resolver and the scalable and have a performance speed in the same order as Cloudflare DNS Resolver. It is also possible to install a OpenINTEL. recursive DNS resolver on local hardware, we call this a ”local resolver”. In order for our system to maintain compatibility with OpenINTEL, it will use the same data structure. The reasoning DNS Response Format behind maintaining compatibility with this data structure is to allow for the reuse of data analysis code developed for previous The format of a DNS resource record is described in works on OpenINTEL. Section II contains more information RFC1035 [5]. This format is shared between the answer, about the OpenINTEL data format. authority and additional sections. Table I shows the fields in a resource records, with a short description of what the field To accomplish this design, several research questions can contains. be formulated. The following questions will be discussed in this report: TABLE I. THE FORMAT OF A DNS RESOURCE RECORD Field Description 1) How can the system query different hierarchical lev- NAME The domain name the record belongs to els? TYPE The record type, e.g. A, NS, MX etc. 2) How can a DNS response be converted to the desired CLASS The class of data in the record TTL The time that the record may be cached data format? RDLENGTH The length in octets of the RDATA field 3) How can dynamic scalability be achieved? This contains the actual information contained 4) Can the performance in terms of measurements per RDATA in the record, e.g. the domain’s IP address, second be compared to OpenINTEL? its name servers, or its mail servers Structure of the Thesis When a name server is queried, it will return the appropri- ate resource record. It is important to note here that the data Section II lists important background concepts for the contained within the record is stored in just 1 field, RDATA. design of our software, in section III we list existing work This means we have to implement some sort of decoder logic related to the objective of this thesis, and start to form in our software to convert this single field into information that design objectives. Section V list the design objectives in a can be stored in the fields of our output data format. concise manner, so they can be referred back to. In section V we list and motivate design decisions, and in section VI Figure 3 shows the DNS response for an MX query of the actual implementation of the design is described. Section utwente.nl. This response contains in order the NAME, TTL, CLASS, TYPE and RDATA fields from the resource record.