DNS Consistency Model

Thesis submitted in partial fulfillment of the requirements for the degree of

Master of Science (by Research) in Computer Science

by

Manish Kumar Sharma 200502011

Center for Security, Theory & Algorithmic Research (CSTAR)
International Institute of Information Technology
Hyderabad - 500 032, INDIA
April 2012

Copyright © Manish Kumar Sharma, April 2012
All Rights Reserved

International Institute of Information Technology
Hyderabad, India

CERTIFICATE

It is certified that the work contained in this thesis, titled “DNS Consistency Model” by Manish Kumar Sharma, has been carried out under my supervision and is not submitted elsewhere for a degree.

Date

Adviser: Dr. Bruhadeshwar Bezawada

The only place where success comes before work is in the dictionary.
- Donald Kendall

To my loving parents and grandparents.

Acknowledgments

First of all, I would like to thank Dr. Bruhadeshwar Bezawada for his constant support and able guidance for the past three years. I gratefully acknowledge Dr. Bruhadeshwar for introducing me to the field of DNS security. I would like to thank my parents for constantly supporting me throughout my highs and lows. As a child, I was like a lump of clay. My parents and teachers moulded and shaped me into a beautiful pot and Dr. Bruhadeshwar crafted designs on the pot to enhance its beauty. In other words, my parents with their initial guidance helped me to reach a stage from where I could easily comprehend the teachings of my teachers and finally Dr. Bruhadeshwar guided me through the path to success. I would also like to thank my siblings and friends without whose motivation I could not have traveled such a long distance. I would like to thank Basant Sharma who helped me in solving my doubts. I would like to thank my sister late Vedika Saraswat who through her success inspired me and helped me in my studies. At the end, I would like to thank my college-mates and the dearest friends - ”Mohit Goyal, Hemant Dhingra, Nitin Jain, Subroto Sen, Yogesh Nautiyal, Abhishek Sainani and many others” who were there with me throughout my college life and are like a family to me.

Abstract

The Domain Name System (DNS) is an indispensable component of the critical infrastructure of the Internet. It is a hierarchical distributed database system that provides a crucial service for the Internet: the mapping of human-friendly domain names to their respective machine-friendly IP addresses and vice versa. Almost all Internet-based applications, including HTTP, FTP and email, need to resolve a given domain name to its respective IP address prior to establishing connections. DNS provides the mapping service, which is fundamental not only to the health of the Internet but also to the protection and integrity of the data. If the mapping of a domain name to an IP address in the system gets corrupted, the system can no longer be relied upon. Since DNS is probably the most valuable infrastructure in the Internet, its security is of utmost priority. The domain names in a DNS database are stored in the form of a hierarchical tree structure known as the domain name space. Each node in the tree contains zero or more resource records, which hold information associated with a domain name. DNS servers cache these resource records for a specific time period, the TTL (Time-To-Live). The TTL plays an important role in maintaining the consistency of the cached resource records. A short TTL reduces the likelihood of getting old information but increases DNS utilization, whereas a long TTL decreases DNS utilization but increases the chance of retrieving outdated information. DNS was not originally designed to protect itself against attacks such as cache poisoning and rebinding. This is why DNS servers have been manipulated by attackers to launch attacks, to commit click fraud and to drive traffic to malicious websites. Among the different kinds of attacks on DNS, cache poisoning is the most prominent.
DNS cache poisoning refers to the cases where the cache of a DNS server gets corrupted by the injection of a false mapping, which affects the accuracy of DNS lookups. Consequently, when queries arrive at the DNS server, inaccurate and probably malicious replies are sent as the response. A false mapping can be injected into a DNS server in many ways, for example by DNS spoofing or DNS forgery. After one poisoned record is injected into the cache, it can spread to other parts of the cache or to other servers through queries and responses between servers. An attacker could use DNS cache poisoning to redirect the querier to a non-existent IP address, thus causing Denial of Service; to a malicious website that drops malware or spyware; or to the attacker's own website, enabling a phishing attack. To date, many different solutions have been proposed to overcome the problem of cache poisoning, but none has been deployed successfully. Certain proposed solutions, such as DNSSEC and DNSCurve, were found to be effective against cache poisoning but have not been successfully deployed, primarily because of the complexity involved in key management. Certain other solutions were neglected because they either required changes in the DNS protocol or introduced considerable latency in the system, making them undesirable. Hence, in spite of all the solutions proposed to date to mitigate cache poisoning attacks, the problem still persists. To mitigate cache poisoning attacks, we propose an approach, the Domain Consistency Management System (DCMS), which makes use of response delays for a specific resource-record type between a DNS client and a DNS server. Our approach is similar to the stimulus-response model, in which a response to a specific stimulus is expected within a specific period of time. If the response is received outside that period, it could be the result of some flaw in the system. Similarly, in our approach we expect the response to a DNS query for a specific resource-record type to be received within a certain period of time. If the DNS response is received outside that period, we suspect an attack on the system and perform a check to ensure the consistency of the response received. The significant feature of our approach lies in its self-learning: in parallel, it updates its database of response delays for a specific resource-record type between a specific DNS client and DNS server, which not only keeps the system functioning properly but also enhances its performance. DCMS makes cache poisoning attacks almost infeasible, even for motivated and powerful attackers. The biggest advantage of our approach is that it does not require any change in the DNS protocol; hence it could be deployed on a large scale within a short period of time. It does not even require any changes at the server side, as in WSEC-DNS, to ensure the consistency of the responses.
Since it does not involve any cryptographic technique, no key management is required. Our approach also does not introduce any significant latency to the system. After the introduction of IPv6, other approaches may require certain changes, but DCMS need not be reconfigured. Hence, our approach, the Domain Consistency Management System (DCMS), proves to be effective and efficient against cache poisoning attacks.

Contents

Chapter

1 Introduction
  1.1 Domain Name System
  1.2 Background
  1.3 Overview
  1.4 Domain Name Resolution
  1.5 DNS Message Packet
    1.5.1 Identification
    1.5.2 Flags
    1.5.3 Number of Questions
    1.5.4 Number of Answer RRs
    1.5.5 Number of Authority RRs
    1.5.6 Number of Additional RRs
    1.5.7 Questions
    1.5.8 Answer Resource Records
    1.5.9 Authority Resource Records
    1.5.10 Additional Resource Records
  1.6 Resource Record Types
  1.7 Negative Caching
  1.8 Reverse DNS lookup

2 Problem Statement
  2.1 DNS Cache Poisoning
    2.1.1 DNS Spoofing Attack
    2.1.2 DNS Forgery Attack
    2.1.3 Kaminsky-class attack
  2.2 DNS Redirection Attack
  2.3 DNS Rebinding Attack

3 Related Work

4 Our Approach & Results
  4.1 Our Approach
    4.1.1 Query Type and Response Type
    4.1.2 Query
    4.1.3 Transaction ID
    4.1.4 Response Delay
  4.2 Probabilistic proof of our approach
  4.3 Optimum Value of Median Response Deviation
  4.4 Algorithmic Complexity
  4.5 Results

5 Analysis of TTL-based Caching in DNS
  5.1 Analysis of DNS response packets & associated TTL values

6 Conclusion & Future Work

Bibliography

List of Figures

Figure

1.1 A DNS model
1.2 DNS Architecture

2.1 DNS Cache Poisoning
2.2 DNS Spoofing Attack
2.3 DNS Forgery Attack
2.4 Kaminsky-class Attack
2.5 DNS Redirection Attack
2.6 DNS Rebinding Attack

4.1 DNS Consistency Model
4.2 Response Delays for Server I (192.26.92.30)
4.3 Response Delays for Server II (192.31.80.30)
4.4 Response Delays for Server III (192.33.14.30)
4.5 Response Delays for Server IV (192.43.172.30)
4.6 Response Delays for Server V (192.55.83.30)

5.1 Response Types and TTL

List of Tables

Table

1.1 DNS Message Packet
1.2 Flags in DNS Message Packet
1.3 Opcodes and their operation
1.4 RCodes and their meaning
1.5 Question Section
1.6 Answer Section
1.7 Summary of Resource Record Types

4.1 Probable Response RR types (in percentage) for a Query RR type
4.2 Median Response Delay for Server I (192.26.92.30)
4.3 Median Response Delay for Server II (192.31.80.30)
4.4 Median Response Delay for Server III (192.33.14.30)
4.5 Median Response Delay for Server IV (192.43.172.30)
4.6 Median Response Delay for Server V (192.55.83.30)
4.7 Response Delays (in secs) for Query Types
4.8 Response Delays for given queries
4.9 Algorithm for DNS Consistency Management System
4.10 Summary of packets for Five different Servers

5.1 Zero TTL Responses (in percentage) for different Response Types
5.2 Minimum TTL (in seconds) for Response Types
5.3 Median TTL (in minutes) for Response Types
5.4 Maximum TTL (in minutes) for Response Types

Chapter 1

Introduction

1.1 Domain Name System

DNS (Domain Name System/Service/Server) is a service that translates a domain name into an IP address. Since the Internet is based on IP addresses and humans find it difficult to remember long, complicated sequences of numbers, DNS was invented. Humans find it easier to remember domain names since they are alphabetic. A model depicting the translation of a domain name into its corresponding IP address is shown in Fig. 1.1.

Figure 1.1 A DNS model (the client sends a query for google.com to the DNS, receives the response 74.125.67.100, and then establishes a connection with the Google server at 74.125.67.100)

1.2 Background

With the growth of the Internet, the number of IP addresses increased at a rapid rate. To connect to a computer on a network, you needed to know its IP address. Since the number of IP addresses was no longer small, humans found it difficult to remember the numerical addresses. To deal with the problem, computers on the network were supplied with hosts.txt [1,2] from a computer at SRI International, which mapped names to numerical addresses, but the continued rapid growth of the Internet made it difficult to centrally maintain the hand-crafted hosts.txt file. Hence, it became necessary to implement a more scalable system that could easily disseminate the mapping between names and IP addresses. The idea of creating a structured topology in which names would be organized into domains was first introduced by D. L. Mills in 1981 [3]. In 1983, the Domain Name System was invented by Paul Mockapetris, who wrote its first implementation. The original specifications related to DNS were first published in 1983 [4,6] and were replaced by the specifications published in 1987 [7,8].

1.3 Overview

The Domain Name System (DNS) [7,8] plays an essential role in the operation of the Internet by translating human-friendly domain names to machine-friendly IP addresses and vice versa. DNS is a hierarchical distributed database that uses intensive replication and caching to achieve high scalability and resiliency to server failures. The basic functions performed by the Domain Name System are as follows:

• Domain Name Space: It is defined by DNS for the networking system upon which it runs. It defines the rules for the structuring of names and their usage and the relation between the name of one device and the names of other devices in the system, and it ensures that no invalid names are given, as these would cause problems with the system as a whole. The domain name space consists of a tree of domain names. Each node in the tree contains zero or more resource records, which hold information associated with a domain name. The tree, subdivided into zones, begins at the root zone. A DNS zone may consist of one or more domains and sub-domains, depending on the administrative authority delegated to the manager. Administrative responsibility over a zone is divided by creating additional zones. Authority is said to be delegated in the form of sub-domains to another nameserver and administrative entity.

• Domain Name Registration: To have a domain name assigned to a specific IP address, it is first registered with a domain name registrar. The installation of a domain name at the domain registry of a top-level domain requires the assignment of a primary nameserver and at least one secondary nameserver. The secondary nameserver is required in order to keep the domain functional even if one nameserver becomes inoperable or inaccessible. While registering a domain name, it must be ensured that the domain name being assigned to an IP address is unique; otherwise DNS would malfunction.

• Domain Name Resolution: This is the most important function of DNS and the one on which we focus throughout this work. Whenever a client queries for a domain name, the DNS server resolves the domain name, that is, translates the domain name into its IP address, and responds to the client with the address.

1.4 Domain Name Resolution

DNS plays an important role in virtually every Internet transaction. Almost all Internet-based applications, including HTTP, FTP and email, need to resolve a given domain name to its respective IP address prior to establishing connections. When a user enters a URL in the browser's address bar, the browser first queries the resolver residing in the operating system on the client's machine. If the resolver does not have the answer, the query is forwarded to the local DNS server for the IP address corresponding to the URL in order to make an HTTP connection to the desired machine. The DNS server stores the mapping (domain to IP address) as Resource Records. A Resource Record (RR) is the basic data element in the domain name system. Each record has a type (A, MX, etc.), an expiration time limit, a class, and some type-specific data. If the local DNS server does not have the resource record for the specific query, it iteratively or recursively queries other DNS servers to fetch the desired resource record, as shown in Fig. 1.2. The resource record thus fetched is stored in the local DNS server as well as in the resolver's cache, so that if a client issues the same query in the future, it can receive the answer without the query being forwarded to other DNS servers.

Figure 1.2 DNS Architecture (client applications such as HTTP and FTP query the resolver on the client, which queries the DNS servers: the root DNS, the top-level DNS, and the authoritative DNS)

To simplify, the process of name resolution can be seen as being performed between two entities, the name resolver and the name server, which can also be termed the DNS client and the DNS server respectively.

• Name Servers: Name servers contain resource records that describe names, addresses, and other characteristics of their portion of the namespace. Name servers may be categorized into three types [5]:

– Caching-Only Name Servers
– Authoritative-Only Name Servers
– Name Servers that perform caching and are authoritative for a zone

Since the authoritative-only name servers do not support recursive name resolution and are designed to respond to queries based on the DNS resource records for which they are authoritative, they are immune to cache poisoning attacks. The other types of name servers, which allow caching, are the ones prone to cache poisoning attacks.

• Name Resolvers: Name resolvers are the clients in the name resolution process. A name resolver initiates and sequences the queries that ultimately lead to the full resolution of a domain name. A DNS query can be of two types:

– Non-Recursive Query - In a non-recursive query, the name server provides the resource record for a domain for which it is itself authoritative, or it provides a partial result without querying other name servers.
– Recursive Query - In a recursive query, the name server fully resolves the query on behalf of the resolver by querying other name servers as needed.

Name Resolvers cache the responses from the nameserver for future use.
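The caching behaviour described above can be illustrated with a small sketch. This is not the thesis implementation: the class name `TTLCache`, its methods, and the injectable `clock` parameter are hypothetical names chosen for illustration. Entries are keyed by domain name and record type and expire once their TTL has elapsed; a zero TTL is treated as "do not cache".

```python
import time


class TTLCache:
    """Toy resolver cache: entries expire after their TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        # `clock` is injectable so that expiry can be demonstrated without sleeping.
        self._clock = clock
        self._store = {}  # (name, rtype) -> (expiry time, data)

    def put(self, name, rtype, data, ttl):
        # A zero TTL means the record is valid only for the current transaction.
        if ttl > 0:
            self._store[(name, rtype)] = (self._clock() + ttl, data)

    def get(self, name, rtype):
        entry = self._store.get((name, rtype))
        if entry is None:
            return None  # cache miss: the query must be forwarded
        expiry, data = entry
        if self._clock() >= expiry:
            del self._store[(name, rtype)]  # stale entry, discard it
            return None
        return data
```

A short TTL makes `get` miss more often, which sends more queries upstream; a long TTL answers more queries locally at the risk of returning outdated data.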

1.5 DNS Message Packet

The information exchange between a DNS client and a DNS server is facilitated by the DNS message packet, which encompasses all the information that needs to be communicated between them. The information in the packet is divided into individual sections, as shown in Table 1.1. Both queries and responses have the same message format.

Identification
Flags
Number of Questions
Number of Answer RRs
Number of Authority RRs
Number of Additional RRs
Questions
Answer Resource Records
Authority Resource Records
Additional Resource Records

Table 1.1 DNS Message Packet

All sections of the DNS message packet are found in both queries and responses except the answer section, which is found only in response message packets. In a query message packet, the answer section is empty. The details of the different sections of the DNS message packet are provided below.

1.5.1 Identification

Whenever a query is sent by a client, the query message packet contains a Transaction ID, or Identification. Similarly, the response message packet received by the client also contains a transaction ID. Since a response to an earlier query will not necessarily arrive earlier, the transaction ID is required to map each query to its corresponding response. The Transaction ID is a 16-bit field and hence can take 2^16 = 65,536 values. It is created by the query initiator and is copied by the responder into its response message. When a DNS server does not have the response to a query requested by the client, it forwards the query to another DNS server with a different transaction ID. Similarly, when the DNS server sends the response (fetched from the other DNS server) back to the client, it creates a new response packet with the same transaction ID as in the query packet sent by the client.
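The role of the transaction ID in pairing responses with queries can be sketched as follows. The class `PendingQueries` and its method names are hypothetical and serve only as an illustration; the sketch assumes the initiator draws a random 16-bit ID for each outstanding query.

```python
import secrets


class PendingQueries:
    """Track outstanding queries by their 16-bit transaction ID."""

    def __init__(self):
        self._pending = {}  # transaction ID -> queried name

    def register(self, qname):
        # Draw a random ID in [0, 65535] that is not already in use.
        while True:
            txid = secrets.randbelow(1 << 16)
            if txid not in self._pending:
                self._pending[txid] = qname
                return txid

    def match(self, txid):
        # Return the queried name for this ID, or None if no such query is outstanding.
        return self._pending.pop(txid, None)


pending = PendingQueries()
id_a = pending.register("example.com")
id_b = pending.register("example.org")
# Responses may arrive out of order; the ID still pairs each one with its query.
print(pending.match(id_b))  # example.org
print(pending.match(id_a))  # example.com
print(pending.match(id_a))  # None, because that query was already answered
```

A response whose ID matches no outstanding query is simply dropped in this sketch, which is why an off-path attacker has to guess the ID to have a forged response accepted.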

1.5.2 Flags

DNS Flags is a 16-bit field containing various service flags that are communicated between the DNS client and the DNS server. The service flags are shown in Table 1.2 and explained below.

QR Opcode AA TC RD RA Z AD CD RCode

Table 1.2 Flags in DNS Message Packet

• QR - It stands for Query/Response. It is a 1-bit field. If the value is 0, it means it is a query packet else it is a response packet.

• Opcode - It stands for Operation Code. It is a command given to the DNS server to perform some actions. It is a 4-bit field where each different value specifies different operation. Currently Opcodes [58] are assigned as given in Table 1.3.

Opcode  Operation   Description
0       Query       Standard query.
1       IQuery      Inverse query (now obsolete).
2       Status      A server status request.
3       Unassigned  Reserved for future use.
4       Notify      Used by the primary (authoritative) server to notify secondary servers that data for a zone has changed and to prompt them to request a zone transfer [63].
5       Update      Used to implement "Dynamic DNS" [64].
6-15    Unassigned  Reserved for future use.

Table 1.3 Opcodes and their operation

• AA - It is a 1-bit field which stands for Authoritative Answer. If a response has the bit set to 1 then the response has come from authoritative server else from a non-authoritative one.

• TC - It is a 1-bit field which stands for Truncation. If set to 1, it indicates that the message has been truncated because its length is larger than the maximum permitted for the type of transport mechanism used. TCP [60] does not impose this limit, while UDP [59] messages are limited to 512 bytes, so this bit being set is usually an indication that the message was sent using UDP and was too long to fit. The client may need to establish a new TCP session to get the complete message. On the other hand, if the truncated portion belonged to the Additional section, the client may choose not to bother.

• RD - It is a 1-bit field which stands for Recursion Desired. If a query packet has the bit set to 1, it implies that the server receiving the query has been asked to answer the query recursively. The server must support recursive resolution to answer the query recursively. And in case the bit is set to zero, the server will respond with the resource record only if it has it in its cache otherwise the server might respond by providing a referral, that is, a list of NS and A resource records for other DNS servers which are closer to the name queried by the client. The value of this bit is not changed in the response packet.

• RA - It is a 1-bit field which stands for Recursion Available. The bit set to 1 or 0 in a response packet indicates whether the server creating response packet supports recursive resolution or not. This can be noted by the resolver for future use.

• Z - It is a 1-bit field which is reserved for future use and set to zero for all queries and responses. Initially it used to be a 3-bit field out of which last two bits (AD & CD) have been put to use.

• AD - It is a 1-bit field which stands for Authentic Data. If this bit is on in a response packet, it indicates that the data included in the packet has been verified by the DNS server; otherwise it has not. A DNS client must not trust the AD bit unless it trusts the DNS server it is transacting with and has a secure path to it.

• CD - It is a 1-bit field which stands for Checking Disabled. If this bit is on in a query packet, it indicates that non-verified data is acceptable by the DNS client otherwise not.

• RCode - It is a 4-bit field and stands for Response Code. It is set to zero in a query packet. In a response packet, it is changed by the DNS server to convey the result of the processed query, that is, whether the query has been successfully answered or some error occurred. RCode can appear not only at the top level of a DNS response but also inside OPT RRs [61], TSIG RRs [12], and TKEY RRs [62]. Different values of RCode convey different meaning as given in the Table 1.4.

RCode  Name             Description
0      No error         Query has been answered successfully.
1      Format error     The server was unable to interpret the request due to an error in its format.
2      Server failure   The server encountered an internal failure while processing the request, e.g., an operating system error.
3      No such domain   The domain name specified in the query does not exist.
4      Not implemented  The server does not support the type of the query received.
5      Refused          The server refuses to perform the specified operation for policy or security reasons.
6      YX Domain        A domain name that should not exist, does exist.
7      YX RR Set        A resource record that should not exist, does exist.
8      NX RR Set        A resource record that should exist, does not exist.
9      Not auth         The server is not authoritative for the zone specified.
10     Not zone         A domain name specified in the message is not within the zone specified in the message.
11-15  Unassigned       Reserved for future use.

Table 1.4 RCodes and their meaning
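As an illustration of the field widths listed above, the following sketch splits a 16-bit Flags value into its sub-fields. It assumes the fields are packed from the most significant bit in the order shown in Table 1.2 (QR, Opcode, AA, TC, RD, RA, Z, AD, CD, RCode); the function name `decode_flags` is hypothetical.

```python
def decode_flags(flags):
    """Split the 16-bit DNS Flags field into its sub-fields."""
    return {
        "QR": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "Opcode": (flags >> 11) & 0xF,  # 4-bit operation code
        "AA": (flags >> 10) & 0x1,      # authoritative answer
        "TC": (flags >> 9) & 0x1,       # truncation
        "RD": (flags >> 8) & 0x1,       # recursion desired
        "RA": (flags >> 7) & 0x1,       # recursion available
        "Z": (flags >> 6) & 0x1,        # reserved
        "AD": (flags >> 5) & 0x1,       # authentic data
        "CD": (flags >> 4) & 0x1,       # checking disabled
        "RCode": flags & 0xF,           # 4-bit response code
    }


# 0x8183 is a response (QR=1) with RD=1, RA=1 and RCode=3 (no such domain).
print(decode_flags(0x8183))
```

For example, the value 0x0100 decodes to a standard query (QR=0, Opcode=0) with only the RD bit set.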

1.5.3 Number of Questions

It is a 16-bit field representing the number of queries in the question section of the DNS message.

1.5.4 Number of Answer RRs

It is a 16-bit field representing the number of responses or resource records in the answer section of the DNS message.

1.5.5 Number of Authority RRs

It is a 16-bit field representing the number of authority resource records in the authority section of the DNS message.

1.5.6 Number of Additional RRs

It is a 16-bit field representing the number of additional resource records in the additional section of the DNS message.

1.5.7 Questions

A DNS query packet contains at least one entry in the Question section, which specifies what the client is requesting. The Question section is copied to the response message without any changes. The Question section has three fields, shown in Table 1.5 and explained below.

QName QType QClass

Table 1.5 Question Section

• QName - It stands for Query Name and is a variable-size field. The Query Name is the domain name queried by a client. A domain name is represented as a sequence of labels separated by single dots.

• QType - It stands for Query Type and is a 2-byte field. It specifies the type of the query being asked by the client. This field contains the value corresponding to the particular type of resource record being requested. The resource record types with their values are explained in Section 1.6.

• QClass - It stands for Query Class and is a 2-byte field. It specifies the class of the resource record being requested. In general, its value is 1, which stands for the Internet (IN) class. In addition, classes such as Chaos (CH) and Hesiod (HS) exist.
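The three fields of the Question section can be assembled as in the sketch below. It relies on the standard DNS wire format, in which each label of the name is preceded by a one-byte length and the name ends with a zero byte; this detail is not spelled out in the text above. The function names are hypothetical.

```python
import struct


def encode_qname(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def encode_question(name, qtype=1, qclass=1):
    # QType and QClass are 2-byte big-endian integers.
    # The defaults 1 and 1 stand for an A record in the Internet (IN) class.
    return encode_qname(name) + struct.pack("!HH", qtype, qclass)


print(encode_question("google.com"))
```

Passing `qtype=15` would request MX records for the same name, using the type values listed in Section 1.6.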

1.5.8 Answer Resource Records

A DNS response packet contains an Answer section, which specifies the resource record that the client requested. The Answer section has six fields, shown in Table 1.6 and explained below.

Name Type Class TTL RDLength RData

Table 1.6 Answer Section

• Name - It stands for Response Name and is a variable-size field. It contains the domain name to which the resource record in the response pertains.

• Type - It stands for Response Type and is a 2-byte field. It contains the value for the resource record type of the response. The resource record types with their values are explained in Section 1.6.

• Class - It stands for Response Class and is a 2-byte field. It specifies the class of the data present in the response. Normally, its value is 1, which stands for Internet (IN). In addition, classes such as Chaos (CH) and Hesiod (HS) exist.

• TTL - It stands for Time-To-Live and is a 4-byte unsigned integer that specifies the time interval (in seconds) for which the response may be cached before it must be discarded. A TTL of zero means that the response can be used only for the transaction in progress and should not be cached.

• RDLength - It stands for Response Data Length and specifies the length of the response data. It is a 2-byte field.

• RData - It stands for Response Data and is a variable-size field. It contains the data corresponding to the resource record type mentioned earlier. For example, if the response type is A, RData contains the 4-byte IP address of the queried domain; if the response type is CNAME, RData contains an alias name of the queried domain.
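The fixed-size fields that follow the Name field can be unpacked as in the sketch below. It is a simplification: the function `parse_rr_body` is hypothetical, it expects the bytes that come after the Name field, and it decodes RData only for type A records, returning raw bytes for every other type.

```python
import ipaddress
import struct


def parse_rr_body(data):
    """Parse Type, Class, TTL, RDLength and RData from the bytes after the Name field."""
    # 2-byte Type, 2-byte Class, 4-byte TTL, 2-byte RDLength, all big-endian.
    rtype, rclass, ttl, rdlength = struct.unpack("!HHIH", data[:10])
    rdata = data[10:10 + rdlength]
    if rtype == 1 and rdlength == 4:
        # Type A: RData is a 4-byte IPv4 address.
        rdata = str(ipaddress.IPv4Address(rdata))
    return {"type": rtype, "class": rclass, "ttl": ttl, "rdata": rdata}


body = struct.pack("!HHIH", 1, 1, 300, 4) + bytes([74, 125, 67, 100])
print(parse_rr_body(body))
```

Here the record is an A record in the IN class that may be cached for 300 seconds and carries the address 74.125.67.100.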

1.5.9 Authority Resource Records

Authority Resource Records have exactly the same format as Answer RRs. When a DNS server does not have the information requested by a DNS client, it replies to the client that it does not have the desired resource record and that the client can direct its query to the authority name server mentioned in the authority section.

1.5.10 Additional Resource Records

Additional Resource Records also have exactly the same format as Answer RRs. This section contains RRs that are related to the query but may not be the answer to the query in the question section. For example, when the DNS server does not have the queried resource record, it replies to the client with the authority name server(s) in the authority section, and it also provides one or more IP addresses for the authority name server(s). This information about the authority name server(s) is placed in the additional section of the response and is known as a glue record.

1.6 Resource Record Types

In DNS servers, information specific to a domain name is stored in the form of resource records. Each resource record has a certain type which specifies the nature of the record. A number of resource record types have been defined in several DNS standards and their list is maintained in a file at IANA (Internet Assigned Numbers Authority). During our research, we captured DNS packets of specific types which are mentioned in Table 1.7.

1.7 Negative Caching

Negative caching is the caching of negative responses in DNS. By negative responses, we mean responses indicating that specific resource records do not exist in DNS. Negative caching is useful because it reduces the response time for negative responses. It also reduces the number of messages that need to be sent between the DNS client and the DNS server, and hence reduces overall network traffic. A large portion of DNS traffic on the Internet could be eliminated if all DNS implementations performed negative caching. A DNS server must limit the period for which it caches a negative response, since DNS supports caching for up to 68 years. Negative responses must be cached for a TTL that is the minimum of the MINIMUM field and the TTL field of the SOA record included in the negative response. If a negative response does not contain an SOA record, it must not be cached, as there is no way to prevent the negative response from looping forever between a pair of servers, even with a short TTL. Caching negative responses with high TTL values may even lead to Denial of Service (DoS); without negative caching such an attack would be much harder to achieve, but the absence of negative caching would hamper the performance of DNS. Hence, determining a proper TTL value for caching negative responses plays a crucial role in improving the performance of DNS [28].
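The rule for the negative-cache TTL can be written down directly. In the sketch below, the function name `negative_cache_ttl` is hypothetical, and the `cap` parameter with its default of 10800 seconds (three hours) is an assumption added to illustrate the requirement that the caching period be limited; it is not a value taken from this thesis.

```python
def negative_cache_ttl(soa_ttl, soa_minimum, cap=10800):
    """Return how long (in seconds) to cache a negative response, or None.

    soa_ttl and soa_minimum are the TTL and the MINIMUM field of the SOA
    record carried in the negative response; pass None if it has no SOA.
    cap is an assumed upper bound on the caching period.
    """
    if soa_ttl is None or soa_minimum is None:
        return None  # no SOA record: the response must not be cached
    return min(soa_ttl, soa_minimum, cap)


print(negative_cache_ttl(3600, 900))      # 900
print(negative_cache_ttl(86400, 86400))   # 10800, limited by the cap
print(negative_cache_ttl(None, None))     # None
```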

1.8 Reverse DNS lookup

Reverse DNS lookup (also known as rDNS) is a process by which one can determine the hostname associated with a given IP address. Reverse DNS lookup is sometimes also called reverse resolving. It can be used to check the consistency of the resource records stored in the DNS server. This technique is known as Forward-Confirmed reverse DNS (FCrDNS), and it can be used to avoid spammers and phishers. In our work, by authoritative reverse DNS lookup we mean that the response to the reverse DNS lookup is fetched from the authoritative server. Several tools, such as dig and nslookup, can be used to perform an authoritative reverse DNS lookup. Certain websites, such as http://www.ipchecking.com/ and http://remote.12dt.com/, also provide authoritative reverse DNS lookup.
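A forward-confirmed reverse lookup can be sketched as follows. The name queried in a reverse lookup is built with Python's standard `ipaddress` module; the two lookup callables `ptr_lookup` and `a_lookup` are hypothetical stand-ins for real PTR and A queries, so the sketch runs without network access.

```python
import ipaddress


def reverse_name(ip):
    """Return the name queried in a reverse lookup, e.g. under in-addr.arpa for IPv4."""
    return ipaddress.ip_address(ip).reverse_pointer


def forward_confirmed(ip, ptr_lookup, a_lookup):
    """FCrDNS check: some PTR hostname for `ip` must resolve forward to `ip` again.

    ptr_lookup(name) returns the hostnames of the PTR records for `name`;
    a_lookup(host) returns the IP addresses of the A records for `host`.
    Both are assumed to be supplied by the caller.
    """
    for host in ptr_lookup(reverse_name(ip)):
        if ip in a_lookup(host):
            return True
    return False


# Toy zone data standing in for real DNS answers.
ptr_records = {"100.67.125.74.in-addr.arpa": ["host.example"]}
a_records = {"host.example": ["74.125.67.100"]}
print(forward_confirmed("74.125.67.100",
                        lambda name: ptr_records.get(name, []),
                        lambda host: a_records.get(host, [])))
```

If the forward lookup of the PTR hostname does not return the original address, the check fails, which is the inconsistency that FCrDNS is meant to expose.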

Resource Record Type  RR Type Value  Meaning                               Description
A                     1              IPv4 Address Record                   It returns a 32-bit IPv4 address.
AAAA                  28             IPv6 Address Record                   It returns a 128-bit IPv6 address.
A6                    38             IPv6 Address Record (Experimental)    It returns a 128-bit IPv6 address.
ANY                   255            All cached records                    It returns all records of all types of a domain name known to the name server.
CNAME                 5              Canonical Name                        It returns an alias of a domain name.
MX                    15             Mail Exchange Record                  It returns the mail servers of a domain name.
NS                    2              Name Server Record                    It returns the authoritative name servers for a domain name.
PTR                   12             Domain Name Pointer Record            It returns the pointer to the domain name. These records are used for reverse DNS lookups (explained in Section 1.8).
SOA                   6              Start of Authority Record             It returns the authoritative information about a DNS zone as defined by the zone administrator. It includes the primary name server, the mail address of the domain administrator, the domain serial number and several timers relating to refreshing the zone.
SPF                   99             Sender Policy Framework Record        It returns the servers which are authorized to send mail for a domain. It is primarily used to prevent identity theft by spammers.
SRV                   33             Service Selection                     It returns the service available in the zone, for example, ldap.
TXT                   16             Text Record                           It returns the textual description of a domain name.
Unknown(99)           Not defined    Unrecognised resource record          The RDATA format of such records is not known to the DNS implementation.

Table 1.7 Summary of Resource Record Types

Chapter 2

Problem Statement

The DNS is regarded as a distributed, hierarchical, and redundant database for information associated with Internet domain names and addresses. For a long time, DNS servers have been manipulated by attackers to launch phishing attacks. Currently, DNS servers are being manipulated not only for phishing attacks, but also to commit click fraud and to drive traffic to malicious websites. In this thesis, we deal with the following problem: is the response received by a DNS client coherent with the query it sent? The response to a DNS query can be corrupted by many different kinds of attacks against DNS, such as a DNS redirection attack or DNS forgery. These attacks differ in the way they are launched, but they share the same aim: to return a wrong IP address in response to a DNS query.

2.1 DNS Cache Poisoning

DNS cache poisoning is a maliciously created or unintended situation in which a Domain Name Server is supplied with data that did not originate from authoritative DNS sources. Cache poisoning has been a known vulnerability in DNS since at least 1993 [45], and it is still prevalent. It can happen through improper software design, misconfiguration of name servers, and maliciously designed scenarios that exploit the traditionally open architecture of the DNS. Once a DNS server has received such non-authentic data and cached it to improve future performance, it is considered poisoned, and it supplies the non-authentic data to its clients, as shown in Fig. 2.1. The response to a DNS query can be corrupted by any of the following attacks, each of which leads to DNS cache poisoning:

• DNS Spoofing - It is a very lethal form of Man-in-the-middle (MITM) attack which could lead to theft of credentials, installation of malware through a drive-by exploit, or even a Denial-of-Service (DoS) condition.

[Figure: the client sends a query for xyz.com to the local DNS; the xyz.com server is at 21.10.19.13 and the attacker's server is at 17.0.9.22; the poisoned local DNS responds with 17.0.9.22.]

Figure 2.1 DNS Cache Poisoning

• DNS Forgery - This form of attack is a packet race as it involves beating the real answer to a DNS query back to the DNS server. The DNS server accepts whichever response arrives first as long as the arriving response matches a few transactional requirements.

• Kaminsky Class Attack - This attack is similar to the DNS forgery attack but requires less effort to launch. Its main strength lies in its short attack time, which is only a few minutes.

2.1.1 DNS Spoofing Attack

DNS Spoofing Attack is a MITM (Man In The Middle) technique used to supply false DNS resource records to a client. A client can be a DNS server or a host machine. Every query packet sent over the network contains a unique Transaction ID which is used for mapping queries to their corresponding responses. If an attacker can intercept the query packets being sent to a DNS server, he obtains the Transaction ID carried in the packet and the source port from which the query was generated. To intercept the query packets of a client in a local area network, the attacker could poison the ARP cache of the target client so that its traffic is re-routed through the attacker's server [37]. Interception of packets sent from a local DNS server to a global DNS server is not generally observed, as it is quite difficult to achieve, but it could be achieved with the help of attacks similar to the Coremelt Attack [87]. Using the information from the intercepted packet, the attacker creates a fake response packet with the same Transaction ID and sends it to the client on the same port from which the query was generated. The packet is accepted by the client and leads to DNS cache poisoning. The DNS spoofing attack differs from the DNS redirection attack (covered in Section 2.2): in a redirection attack the client's DNS settings are changed, whereas DNS spoofing is a MITM attack. To depict the difference between the two attacks, a DNS spoofing attack on a DNS server is shown in Fig. 2.2.

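[Figure: nodes Client, Local DNS, Global DNS and Attacker's DNS; the normal path runs between the local DNS and the global DNS, while the diverted query reaches the attacker's DNS, which returns the response.]

Figure 2.2 DNS Spoofing Attack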

2.1.2 DNS Forgery Attack

A DNS query sent by a DNS server contains a 16-bit nonce, the Transaction ID, which is used to identify the response associated with a given query. To launch a DNS forgery attack, an attacker has to guess the 16-bit nonce and the 16-bit port number from which the query was generated. If the attacker successfully guesses the transaction ID and source port and returns a response to the DNS server before the authoritative (genuine) server does, the server will accept the attacker's response as valid and will cache the malicious response, as shown in Fig. 2.3.

[Figure: the client sends a query with ID=0x90bd (A? xyz.com) to the local DNS, which sends a query with ID=0x83ac (A? xyz.com) to the global DNS. The global DNS responds with ID=0x83ac, xyz.com IN A 103.19.10.53. The attacker sends forged responses with IDs 0x65ad, 0x83ac and 0x1db9, each giving xyz.com IN A 16.92.100.23. The client receives the response with ID=0x90bd, xyz.com IN A 16.92.100.23.]

Figure 2.3 DNS Forgery Attack

An attacker can be very effective if he can easily guess the transaction ID and the source port. Early versions of DNS servers were more prone to this attack because they incremented the ID field deterministically. Even the source port of the queries used to be constant for a complete session, i.e. from startup to shutdown. It was D. J. Bernstein who first suggested the use of a random source port to make it difficult for an attacker to forge DNS responses, since the attacker then has to guess the random 16-bit source port along with the transaction ID [52]. Since the first 1024 ports, i.e. 0 to 1023, are well-known ports [53] reserved for specific protocols, the attacker needs to guess the source port out of 2^16 - 1024 = 64512 values. In [46][47], Klein demonstrated that if the ID field and source port are not securely randomized, a DNS server can be attacked successfully after a few interactions with the server. Several techniques, such as birthday attacks, have been published to exploit weak random number generation [46][47][48][49][50][51].
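The arithmetic above can be made concrete with a short Python sketch. It computes the number of (transaction ID, source port) combinations an attacker must search, with and without source port randomization. The function names are illustrative, and the probability assumes that every forged response carries a distinct guess and that the genuine values are uniformly random.

```python
WELL_KNOWN_PORTS = 1024                    # ports 0-1023 are reserved
TXID_VALUES = 2 ** 16                      # 16-bit Transaction ID
PORT_VALUES = 2 ** 16 - WELL_KNOWN_PORTS   # 64512 usable source ports

def search_space(random_port):
    """Number of combinations the attacker has to guess from."""
    return TXID_VALUES * (PORT_VALUES if random_port else 1)

def hit_probability(distinct_guesses, random_port):
    """Chance that one of `distinct_guesses` distinct forged responses matches."""
    return min(1.0, distinct_guesses / search_space(random_port))

print(search_space(random_port=False))   # fixed source port
print(search_space(random_port=True))    # randomized source port
print(hit_probability(1000, random_port=True))
```

Under these assumptions, randomizing the source port multiplies the attacker's search space by 64512.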

2.1.3 Kaminsky-class attack

Recently, Kaminsky published a novel technique to replace cached NS records in a DNS server by performing a series of nonce queries [54]. The Kaminsky-class attack (shown in Fig. 2.4) is similar to the DNS forgery attack, but it allows an attacker to poison a cache with less effort than the forgery attack requires.

[Figure: the client (under the control of the attacker) sends a query with ID=0x90bd (A? 001.xyz.com) to the local DNS, which sends a query with ID=0x83ac (A? 001.xyz.com) to the global DNS. The attacker sends forged responses with IDs 0x65ad, 0x1db9 and 0x83ac, each carrying NXDOMAIN together with IN NS xyz.com and IN A 6.92.100.23.]

Figure 2.4 Kaminsky-class Attack

A Kaminsky-class attack is launched as follows:

• Assume the attacker has control over a stub resolver (client) behind the local DNS. The client initiates a query for the IP address of a non-existing domain, say 001.xyz.com.

• The local DNS contacts the global DNS servers to get the resource record for 001.xyz.com.

• In the meantime, the attacker sends a large number of spoofed responses to the local DNS, which pretend to come from the IP address of the contacted DNS server and try to guess the correct transaction ID and UDP port. The attacker's response says: "No such name exists, but you could get the desired resource record from the name server xyz.com whose IP address is 6.92.100.23." The name server information is provided in the authority section, whereas the IP address is provided in the additional section of the response.

• Eventually, the attacker generates a matching transaction ID and UDP port, and the authority section of the matched packet evicts the previously cached NS record (if any). The local DNS updates its cache with the information "IP address of xyz.com is 6.92.100.23", and the attack therefore succeeds.

This novel approach reduces the attack time from weeks to a few minutes. If the spoofed responses fail to match the transaction ID and UDP port the first time, the attacker can quickly retry by querying for another non-existing domain, say 002.xyz.com, and so on, without waiting for the TTL of the domain the attacker wants to poison to expire. The greatest advantage of Kaminsky-class attacks is therefore that they can replace genuine cache entries in a DNS server without waiting for their TTL to expire. DNS vendors have modified their glue handling policies to better validate or reject rogue NS updates, but Kaminsky-class attacks still pose a threat, and DNS servers continue to be a target for persistent, ongoing, low-grade attacks [55].
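The effect of retrying with fresh non-existing names can be sketched in Python as follows. The helper names, the number of forged responses per round and the search space are illustrative assumptions. The model treats each round as independent, with the resolver choosing a fresh random transaction ID (and port) for every new name.

```python
def nonce_names(domain, count):
    """Yield non-existing names such as 001.xyz.com, 002.xyz.com, ..."""
    for i in range(1, count + 1):
        yield f"{i:03d}.{domain}"

def success_after(rounds, guesses_per_round, space):
    """Probability that at least one round contains a matching forged response."""
    p_round = min(1.0, guesses_per_round / space)
    return 1.0 - (1.0 - p_round) ** rounds

print(list(nonce_names("xyz.com", 3)))
# Assumed figures: 100 forged responses per round, 16-bit ID only (fixed port).
for rounds in (1, 100, 1000):
    print(rounds, success_after(rounds, 100, 2 ** 16))
```

Because each new name triggers a new query immediately, the number of rounds is limited by the attacker's sending rate rather than by the TTL of the cached record.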

2.2 DNS Redirection Attack

In a DNS redirection or DNS hijacking attack, a client is directed to use a rogue DNS server, which provides incorrect answers to queries or selectively manipulates answers, as shown in Fig. 2.5. The DNS redirection attack aims at commercial gain, phishing or other abuses. A DNS redirection attack is primarily targeted at a host computer rather than a DNS server.

[Figure: nodes Client, Genuine DNS and Rogue DNS; the client's query is redirected to the rogue DNS, which returns the response.]

Figure 2.5 DNS Redirection Attack

Even if a host caches the wrong mapping for a specific amount of time, no one other than the host is affected, as no other clients contact the host for domain name resolution. Hence, cache poisoning is restricted to the host for a limited time and does not spread. The study in [78] confirms the presence of a class of infections that force victims to use remote, rogue DNS services. Many different malware families change the host DNS settings. Broadly, there are three ways in which the DNS settings of a client come to be altered.

• Opt-in Model: Clients are informed of the benefits of altering DNS settings and they self-select for the service.

• Opt-out Model: Certain technologies adopted by ISPs force adversely affected clients to opt for alternative DNS services.

• No-option Model: Without notifying a client, the DNS settings of the client are forcibly changed for malicious purposes.

In 2003, the qhost trojan [38] altered host DNS settings and browser settings. After this initial DNS-altering virus, DNSChanger [41], a family of trojans that did the same in a few lines of code, appeared in 2005. The Zlob trojan (or Trojan.zlob) made similar DNS alterations [39]. The most famous example of malware changing the host DNS settings was the zcodec trojan, which enticed users to install a "video player" or codec. In fact, the trojan altered the host "Name Server" registry key, which takes precedence over all DHCP-assigned DNS settings, and the host was redirected to a rogue DNS server [40]. In many cases, users do not have the slightest indication that the DNS answers are coming from a rogue DNS and not a genuine one. Since, in most networks, users are free to select a DNS server of their choice and network administrators lack the means to validate or monitor the answers from a resolver, DNS redirection attacks are difficult to detect.

2.3 DNS Rebinding Attack

Web browsers enforce a same-origin policy, which prevents a document or script loaded from one origin from getting or setting properties of a document from another origin [44]. DNS rebinding attacks subvert the browser's same-origin policy by confusing the browser into aggregating network resources under the control of distinct entities into one origin, effectively converting browsers into open proxies. Using DNS rebinding, an attacker can circumvent firewalls to spider corporate intranets, exfiltrate sensitive documents, and compromise unpatched machines. An attacker can also hijack the IP address of innocent clients to send spam e-mails, commit click fraud and frame clients for misdeeds. DNS rebinding vulnerabilities permit the attacker to read and write directly on network sockets, subsuming the attacks possible with existing Javascript-based botnets, which can send HTTP requests but cannot read back the responses [25]. Attacks using DNS rebinding are varied, but they can be broadly classified into two categories:

• Firewall Circumvention: DNS rebinding can be used by an attacker to access machines behind firewalls which he or she cannot access directly. The attacker, with direct socket access, can even interact with a number of internal services besides HTTP.

• IP Hijacking: The attacker can also make use of DNS rebinding to access publicly available servers from a client's IP address. In this way the attacker can take advantage of the target's implicit or explicit trust in the client's IP address.

The DNS rebinding attack is very different from cache poisoning attacks and the DNS redirection attack because:

• There is no attack on any genuine DNS, hence no cache poisoning occurs.

• Client’s DNS settings are not at all altered, hence no redirection occurs.

In fact, to launch a DNS rebinding attack, an attacker needs to register a domain name, such as attacker.com, to which he can attract traffic by running advertisements. In the basic DNS rebinding attack, the attacker answers DNS queries for attacker.com with the IP address of his or her own server and a short time-to-live (TTL), and serves visiting clients malicious Javascript. Once attacker.com has loaded in the client's browser, the Javascript issues a second request to attacker.com after a gap of a few minutes. Since the time-to-live for the domain was very short, the IP address for attacker.com is no longer present in any cache. The DNS query has to travel all the way down to the attacker's name server again to fetch the IP address of attacker.com. This time the attacker binds the domain name to the IP address of a target server which is inaccessible from the public Internet. The browser believes the two servers belong to the same origin because they share the same domain name, and it allows the script to read back the response. The script can easily exfiltrate the response, enabling the attacker to read arbitrary documents present on the internal server, as shown in Fig. 2.6.

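[Figure: at time t1 the client resolves attacker.com to the attacker's web server; at time t2 attacker.com resolves to the target server behind the firewall, and information from the target server flows back through the client.]

Figure 2.6 DNS Rebinding Attack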
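The rebinding sequence described above can be simulated with a small Python sketch. The IP addresses, the TTL of one second and the 60-second switch-over in the hypothetical `attacker_nameserver` are assumptions chosen purely for illustration, and `TtlCache` is a minimal stand-in for a resolver cache. The point is that the origin seen by the browser does not change even though the address behind the name does.

```python
from urllib.parse import urlsplit

ATTACKER_IP = "203.0.113.10"   # hypothetical address of the attacker's web server
TARGET_IP = "10.0.0.5"         # hypothetical address of the internal target server

def attacker_nameserver(name, now):
    """Return (ip, ttl): the attacker's own IP at first, the target's IP later."""
    return (ATTACKER_IP, 1) if now < 60 else (TARGET_IP, 1)

class TtlCache:
    """Minimal resolver cache that honours the TTL of each answer."""
    def __init__(self, authoritative):
        self.authoritative = authoritative
        self.entries = {}                      # name -> (ip, expiry time)

    def resolve(self, name, now):
        cached = self.entries.get(name)
        if cached and now < cached[1]:
            return cached[0]                   # answer is still cached
        ip, ttl = self.authoritative(name, now)
        self.entries[name] = (ip, now + ttl)
        return ip

def origin(url):
    """The same-origin policy compares scheme, host name and port, not the IP."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)

cache = TtlCache(attacker_nameserver)
first_ip = cache.resolve("attacker.com", now=0)      # time t1
second_ip = cache.resolve("attacker.com", now=120)   # time t2, TTL long expired
same_origin = origin("http://attacker.com/page") == origin("http://attacker.com/secret")
print(first_ip, second_ip, same_origin)
```

Both requests share one origin, so the script loaded at time t1 is allowed to read the response served by the target server at time t2.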

Chapter 3

Related Work

Numerous solutions have been proposed in the past to prevent DNS cache poisoning. Most of the proposed defense mechanisms have focused on applying cryptographic techniques to authenticate DNS records. Secret Key Transaction Authentication for DNS (TSIG) [12][13] uses symmetric key cryptography to authenticate client-server communications. However, the key distribution is not automatic, and it is impractical to establish trust relationships with every DNS server.

Defense mechanisms that aim to ensure the accuracy of DNS lookups, such as DNSSEC [14,15], have a few potential problems and cannot be relied upon. Firstly, DNSSEC requires a major overhaul of the current DNS system and global cooperation from all parties involved. Although it has been in development for more than 15 years, its wide deployment still remains to be seen. Secondly, the success of DNSSEC requires modifications to the DNS servers at every level, the availability of a security-aware resolver for every operating system, and complex key management. Considering that many DNS servers are currently running with known vulnerabilities, one probably cannot expect the complete deployment of DNSSEC in a short time frame. Thirdly, DNSSEC incurs a significant performance penalty because PKI (Public Key Infrastructure) operations are known to be computationally intensive and the signed DNS packets can be substantially larger than the originals [16]. Sometimes the size of the DNSSEC key even exceeds the Path Maximum Transmission Unit (PMTU), and the whole DNS packet is dropped [26].

As an alternative to DNSSEC, DNSCurve [76] was recently proposed by Bernstein. DNSCurve uses high-speed elliptic-curve cryptography and simplifies the key management problem that affects DNSSEC. The main criticism against DNSCurve is that no detailed specifications have yet been written (in the form of RFCs or Internet drafts) and no implementation is publicly available.
From the limited documentation currently available, DNSCurve seems superior to DNSSEC, specifically in terms of key management; however, the latter has been specified in great detail in several RFCs [69][70][71][72]. Like DNSSEC, DNSCurve also requires significant changes at the root and top-level domain name servers.

DNS Cookies [56], proposed by Bernstein, provide weak authentication of queries and responses and add a new option (COOKIE OPT) to the OPT RR. The DNS cookie is basically an HMAC [57] of the requester's IP address and transaction. Compared to other DNS transaction protection systems, e.g. TSIG, DNS cookies require substantially more implementation effort. Specifically, DNS query initiators and responders must change their code to handle DNS cookies.

The Peer-to-Peer domain name cross-referencing (DoX) [16] technique requires a number of peers to check the consistency of a DNS response, and it could be dangerous to rely on peers if they are subverted.

Proactive caching of DNS records was proposed by Cohen and Kaplan, who used renewal policies to refresh selected cache entries by issuing unsolicited queries. In addition, they proposed simultaneous validation (SV), in which the end host uses an expired DNS entry to connect to the web server but the content is served only if the DNS entry is validated by the SV queries. They focused primarily on improving the cache hit rate [19].

Several approaches were studied by Cao and Liu to achieve strong cache consistency for the web, including adaptive TTL, polling-every-time, and proactive server invalidation. A server invalidation scheme can also be used for improving DNS cache consistency; however, the main difficulty is that the DNS server would have to maintain state for known clients. Invalidation, of course, does not protect against cache poisoning [20].

The common aim of Overlook (a name service) [21], DDNS (a P2P lookup service) [22], CoDNS [23] and CoDoNS [24] is to provide an alternative to the current aged DNS system. Such DNS protocols aim to achieve faster lookups and fewer failures, but none of them addresses the accuracy of DNS lookups.

The approach named SCOLD (Secure Collective Defense system), proposed by David Wilkinson et al., aims at defending DNS against DDoS attacks. The key idea of SCOLD is to follow the intrusion tolerance and network reconfiguration paradigm by providing alternate routes via a set of proxy servers and alternate gateways when the normal route for DNS updates is unavailable or unstable due to network failure, congestion or a DDoS attack.
The SCOLD system defends against DDoS attacks by setting up indirect routes between clients and the target server via a collection of geographically separated proxy servers and alternate gateways [35]. The impact of DNS cache poisoning attacks and DNS performance with respect to QoDNS (Quality of Domain Name Service) metrics have been studied in [36]. These QoDNS metrics are accuracy, availability, latency and overhead of the DNS service.

UDP source port randomization [17] was first proposed and implemented by Bernstein in djbdns [10]. Randomizing the UDP port significantly increased the difficulty of launching poisoning attacks, as the attacker has to guess the correct source port along with the correct transaction ID. Unfortunately, UDP source port randomization might not be effective in certain scenarios, such as where network address and port translation (NAT/PAT) devices are placed in front of DNS resolvers, because such translation reduces the randomness of the UDP source ports generated by DNS resolvers to a guessable port [68].

Dagon et al. proposed the 0x20-bit encoding technique [55], which uses a random combination of lower and upper case letters to generate domain name queries and works independently of the presence of NAT/PAT in front of DNS resolvers. Unfortunately, the amount of additional entropy introduced by the technique is a function of the length of the queried domain name: if the domain name is short, the entropy is small too. For short popular domain names like hp.com, hi5.com and cnn.com, the 0x20-bit encoding technique adds only 5 or 6 bits of entropy. This technique makes poisoning attacks a little harder but certainly not infeasible. For several other popular domain names (according to Alexa) which contain even fewer than 6 alphabetic characters, e.g. 56.com, 163.com and 126.com, the 0x20-bit encoding technique offers only 3 additional bits of entropy [77].

WSEC DNS, an approach proposed by Robert et al., is based on wild-card domain names. Those who want to protect their domain names against possible cache poisoning using WSEC DNS must make configuration changes to their name servers by editing their zone files [77]. As this solution requires changes to the DNS protocol, it is not widely deployed. WSEC DNS also adds latency to the system, as it requires a handshake to find out whether a zone is wsecdns-enabled or not.

Recently, several fast poisoning techniques (such as the Kaminsky-class attack) which allow the trivial corruption of DNS records have been identified [67]. Certain existing DNS protection systems, such as bailiwick checking [66] and IDS-style filtration, are not completely successful against such DNS cache poisoning attacks. Nessus [75] and several other DNS-related tools, such as DNSStuff [73] and Porkbind [74], detect whether DNS servers are vulnerable to specific kinds of cache poisoning attacks, but they do not detect actual cache poisoning.

Anax, a DNS protection system, detects poisoned records in a cache by collecting IP information about a set of "domain names of interest" and analysing whether the collected information is poisoned or not. The set contains 131 unique domains which its authors consider likely to be attacked. The amount of financial transactions conducted through these sites also makes them very tempting targets for phishing attacks [86], which could potentially be launched via DNS poisoning.
Anax initially determines whether a resource record (RR) is poisoned on the basis of CIDR-based whitelists and blacklists, which are generated after consulting numerous IP blacklists [79][80][81], do-not-route lists [82], dynamic IP space [83] and passive DNS databases [84]. If Anax is not able to decide on the nature of a resource record, the record is passed to the Anax 2-class classifier for fine-grained analysis. Passive DNS data traces are used to produce six statistical features for Anax's 2-class classifier, which help in labeling the records as poisonous or benign. All these benign and poisonous classes of RRs are then stored in the Anax DB to classify new unknown RRs. Anax is largely an IP-based classification technique, relying on limited whitelists (consisting of benign IPs) and blacklists. Anax relies on the assumption that benign DNS records generally direct users to a known, usually stable set of NS-type and A-type records, whereas poisoning generally points victims to new IP addresses. Anax uses passive DNS data for computing its statistical features, hence it is sensitive to how the DNS data are aggregated and how long they are retained. The use of past-CDN IP address space for poisoning could be a significant evasion threat for Anax if the passive DNS data are retained for more than a few weeks. If the data are retained on the order of several months, any past-CDN IP address space will still carry the past-CDN signal, i.e. it will be considered benign by Anax. This increases the difficulty of identifying poisoning attempts with IPs originating from such addresses [85].

TTL-based consistency is used in the Domain Name System, where each resource record in the DNS database has a lifetime assigned to it by its authoritative server. An authoritative server indicates its status by setting a field called the Authoritative Answer (1 bit) to one in its responses. Sometimes DNS resolution leads to non-meaningful (negative) responses. To reduce the cost of repeated failures, negative responses must also be cached along with positive responses. Initially, negative caching was an optional part of the DNS specification. Since negative caching is useful in reducing the response time for negative answers, negative caching for DNS queries was re-described in [28].

J. Jung, Emil Sit and Hari [29] conducted an extensive trace-driven study of DNS cache performance driven by large client-side TCP and DNS traces. They found the surprising result that the cache hit rate was over 80% for a TTL of 15 minutes and that the hit rate improved by only 17% (to 97%) when the TTL was 24 hours as opposed to 15 minutes. The small difference in hit rates for the large increase in TTL is counterintuitive, since it seems reasonable to expect many more accesses to any given origin site in 24 hours than in 15 minutes, and since only the first access after the TTL expires actually causes a cache miss. Other researchers have similarly observed dramatically diminishing marginal returns from increasing TTLs in a separate but related domain: Web caching [30][31][32][33][34].
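The entropy limits of the 0x20-bit encoding technique discussed earlier in this chapter can be checked with a short Python sketch. It assumes that every alphabetic character of the queried name contributes exactly one bit (lower or upper case) and that digits and dots contribute none. The function names are illustrative and are not taken from [55].

```python
import random

def entropy_bits(name):
    """One bit per alphabetic character; digits and dots add nothing."""
    return sum(ch.isalpha() for ch in name)

def encode_0x20(name, rng=random):
    """Randomly mix the case of the letters in a query name."""
    return "".join(ch.upper() if rng.random() < 0.5 else ch.lower() for ch in name)

for name in ("hp.com", "hi5.com", "cnn.com", "56.com", "163.com", "126.com"):
    print(name, entropy_bits(name), encode_0x20(name))
```

Note that for names such as 56.com the only letters are those of the top-level domain, which is why the added entropy is so small.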

Chapter 4

Our Approach & Results

Our approach is similar to a stimulus-response model, in which a response to a specific stimulus is expected within a specific period of time. If the response is received outside that period, it could be the result of some flaw in the system. Similarly, in our approach we expect the response to a DNS query to be received within a certain period of time. If the DNS response is received outside that period, we suspect an attack on the system and perform a check to ensure the consistency of the response received.

4.1 Our Approach

To overcome the problem of cache poisoning caused by DNS spoofing, DNS forgery or Kaminsky-class attacks, we have devised a consistency management system which prevents such attacks from occurring or raises an alert on detecting one. The idea of a DNS Consistency Model, which regulates the consistency of a DNS server, is shown in Fig. 4.1.

[Figure: the client queries the DNS, which responds; the DNS interacts with the global DNS; the DNS Consistency Management System manages the DNS.]

Figure 4.1 DNS Consistency Model

The DNS Consistency Management System (DCMS) is an offline system that maintains the consistency of the responses obtained for the sent queries. DCMS is convenient because it does not demand any changes to the current architecture of the DNS, nor does it hamper the performance of the DNS. The DNS sends queries and gets their responses, and the consistency of the responses is evaluated in the background by DCMS. DCMS measures the consistency between queries and their responses on the basis of the following parameters:

4.1.1 Query Type and Response Type

Every query sent by a DNS server has a Resource Record type (QType) associated with it. Initially, we trained DCMS with a large number of DNS packets collected at the Internet gateway of the International Institute of Information Technology, Hyderabad, mapping the RR (Resource Record) types of DNS queries to the RR types of their respective responses. In this way, we found the possible response RR types for each query RR type. We treat "No such name", "Destination Unreachable", "Server failure", "Format error" and "Blank" as response types even though they do not belong to the category of Resource Record types. In addition to these record types, we observed one unrecognised resource record type: Unknown(99) [9]. Whenever a query had the RR type Unknown(99), the response from the DNS did not contain anything substantial. Hence, a query generated by a client with RR type Unknown(99) is to be discarded. The percentage of occurrence of the different response RR types for each query RR type is shown in Table 4.1. As the table shows, certain query types such as A6, SOA, and SPF did not receive any positive response either and could be discarded straightaway, but DCMS does not discard them completely because it uses them to train itself. If it finds a change in the mapping between query type and response type, it updates its knowledge and acts accordingly.

Table 4.1: Probable Response RR types (in percentage) for a Query RR type

| Query RR Type | Response RR Types (Percentage) |
|---|---|
| ANY | A(40.0), No such name(40.0), SOA(20.0) |
| A | A(60.3), CNAME(11.6), Blank(9.8), No such name(16.9), Server failure(0.6), Destination unreachable(0.4), Refused(0.3), Format error(0.1) |
| AAAA | Blank(90.7), Server failure(2.9), AAAA(2.8), CNAME(2.0), Destination unreachable(0.7), No such name(0.6), Refused(0.2), Format error(0.1) |
| A6 | Blank(100) |
| CNAME | A(61.0), Blank(25.9), CNAME(13.0), AAAA(0.1) |
| MX | MX(78.7), Blank(8.6), Server failure(4.1), No such name(6.7), Refused(0.9), Destination unreachable(0.8), CNAME(0.1), Format error(0.1) |
| NS | NS(46.5), Blank(29.3), Server failure(12.6), Destination unreachable(7.2), Refused(2.6), No such name(1.4), CNAME(0.3), Format error(0.1) |
| PTR | PTR(40.9), No such name(27.4), Blank(19.7), Server failure(6.7), Destination unreachable(2.4), Refused(2.2), CNAME(0.6), Format error(0.1) |
| SOA | No such name(98.0), Destination unreachable(2.0) |
| SPF | Blank(88.4), Server failure(1.7), No such name(7.6), CNAME(0.3), Refused(2.0) |
| SRV | Blank(56.7), Server failure(16.4), SRV(20.9), CNAME(3.0), Refused(3.0) |
| TXT | No such name(71.0), TXT(14.7), Blank(7.8), Destination unreachable(2.6), Server failure(3.7), Format error(0.1), Refused(0.1) |
| Unknown(99) | Blank(94.2), Server failure(2.2), No such name(2.2), Not implemented(0.7), "Unknown(99)"(0.5), CNAME(0.2) |
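A minimal Python sketch of how such a query-type to response-type mapping might be learned and consulted is given below. It is an illustration under stated assumptions, not the DCMS implementation: the function names `learn` and `is_expected` are hypothetical, and `RESPONSE_PROFILE` copies only four rows of Table 4.1.

```python
from collections import Counter, defaultdict

def learn(pairs):
    """Turn observed (query type, response type) pairs into percentages."""
    counts = defaultdict(Counter)
    for qtype, rtype in pairs:
        counts[qtype][rtype] += 1
    return {
        qtype: {r: 100.0 * n / sum(c.values()) for r, n in c.items()}
        for qtype, c in counts.items()
    }

# A few rows copied from Table 4.1.
RESPONSE_PROFILE = {
    "ANY": {"A": 40.0, "No such name": 40.0, "SOA": 20.0},
    "CNAME": {"A": 61.0, "Blank": 25.9, "CNAME": 13.0, "AAAA": 0.1},
    "SOA": {"No such name": 98.0, "Destination unreachable": 2.0},
    "A6": {"Blank": 100.0},
}

def is_expected(qtype, rtype, profile=RESPONSE_PROFILE):
    """True if this response type has been seen before for this query type."""
    return rtype in profile.get(qtype, {})

print(is_expected("CNAME", "A"))    # a mapping present in the profile
print(is_expected("SOA", "MX"))     # a mapping never observed in training
print(learn([("A6", "Blank"), ("A6", "Blank")]))
```

Re-running `learn` on newly captured pairs is one simple way to update the mapping when it changes over time.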

4.1.2 Query

Every response packet must contain the query name, including the chosen case variation. In case the response packet does not contain the query for which it is meant, the packet could be malicious and is to be discarded.

4.1.3 Transaction ID

Since the DNS client itself drops any response packet that does not arrive on the expected UDP source port, DCMS does not perform an explicit check of the source port. If the packet passes the aforementioned checks, its Transaction ID is considered. The response's Transaction ID is compared with the Transaction IDs of the queries that are still alive, i.e. queries that are not more than two minutes old, as they would time out after that. If a response packet carries a Transaction ID that does not match the Transaction ID of any earlier sent query, the response could be malicious, possibly the result of a DNS forgery or Kaminsky-class attack, and is to be discarded.
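The query-name check of Section 4.1.2 and the Transaction ID check above can be sketched together in Python. The class and method names are hypothetical. The two-minute lifetime is taken from the text, and the name comparison is deliberately case-sensitive so that the chosen case variation must match.

```python
QUERY_LIFETIME = 120.0   # seconds; older queries are treated as timed out

class PendingQueries:
    """Keeps the queries that are still alive, keyed by Transaction ID."""
    def __init__(self):
        self._sent = {}   # transaction ID -> (query name, time sent)

    def record(self, txid, qname, sent_at):
        self._sent[txid] = (qname, sent_at)

    def accept(self, txid, qname, received_at):
        """Return True only if the response matches a live query exactly."""
        entry = self._sent.get(txid)
        if entry is None:
            return False                      # unknown Transaction ID
        sent_name, sent_at = entry
        if received_at - sent_at > QUERY_LIFETIME:
            return False                      # the query has timed out
        return qname == sent_name             # case-sensitive comparison

pending = PendingQueries()
pending.record(0x83AC, "xYz.CoM", sent_at=0.0)
print(pending.accept(0x83AC, "xYz.CoM", received_at=0.5))   # matching response
print(pending.accept(0x65AD, "xYz.CoM", received_at=0.5))   # wrong Transaction ID
print(pending.accept(0x83AC, "xyz.com", received_at=0.5))   # wrong case variation
```

A response rejected by `accept` corresponds to a packet that DCMS would flag as possibly malicious and discard.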

4.1.4 Response Delay

Whenever you query a DNS, you mostly receive the response of a specific query type from a specific DNS server within a certain period of time. Response delay can simply be calculated by calculating the time difference between the query sent and the response received. If the response is received beyond a certain period, you may attribute that delay to the following reasons: -

• The network is congested.

• An attack has happened, i.e. an adversary has either redirected the query or stopped it from reaching the genuine DNS server, or the response is the result of a DNS forgery (or Kaminsky-class) attack.

In our research, we first trained our DCMS by capturing a large number of DNS query packets and recording the response delays from different DNS servers for different query types. We observed response delays between more than 5000 source-destination pairs for different query types. Response delays between a single source and five different destinations for different query types are shown as graphs in Fig. 4.2, Fig. 4.3, Fig. 4.4, Fig. 4.5 and Fig. 4.6.

[Figure: one panel per query type ('A', 'AAAA', 'MX', 'NS', 'SPF', 'TXT'); x-axis: Packets, y-axis: Response Time (msec)]

Figure 4.2 Response Delays for Server I (192.26.92.30)

[Figure: one panel per query type ('A', 'AAAA', 'MX', 'NS'); x-axis: Packets, y-axis: Response Time (msec)]

Figure 4.3 Response Delays for Server II (192.31.80.30)

[Figure: one panel per query type ('A', 'AAAA', 'MX', 'NS', 'SPF'); x-axis: Packets, y-axis: Response Time (msec)]

Figure 4.4 Response Delays for Server III (192.33.14.30)

[Figure: one panel per query type ('A', 'AAAA', 'NS'); x-axis: Packets, y-axis: Response Time (msec)]

Figure 4.5 Response Delays for Server IV (192.43.172.30)

[Figure: one panel per query type ('A', 'AAAA', 'MX', 'NS', 'SPF'); x-axis: Packets, y-axis: Response Time (msec)]

Figure 4.6 Response Delays for Server V (192.55.83.30)

Median response delays between the source and the five destination servers for different query types are given in Tables 4.2, 4.3, 4.4, 4.5 and 4.6.

Query Types    Median Response Delay (secs)
A              1.876245000
AAAA           2.249264000
MX             1.985029000
NS             1.273094000
SPF            0.889175000
TXT            1.564807000

Table 4.2 Median Response Delay for Server I (192.26.92.30)

Query Types    Median Response Delay (secs)
A              1.981187000
AAAA           2.372438000
MX             2.319803000
NS             1.499938000

Table 4.3 Median Response Delay for Server II (192.31.80.30)

Query Types    Median Response Delay (secs)
A              1.504986000
AAAA           1.619370000
MX             1.210365000
NS             1.122380000
SPF            0.917112000

Table 4.4 Median Response Delay for Server III (192.33.14.30)

Query Types    Median Response Delay (secs)
A              1.623544000
AAAA           1.564255000
NS             1.348282000

Table 4.5 Median Response Delay for Server IV (192.43.172.30)

Query Types    Median Response Delay (secs)
A              1.846113000
AAAA           2.156682000
MX             2.031046000
NS             1.971751000
SPF            0.895713000

Table 4.6 Median Response Delay for Server V (192.55.83.30)
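Per-server, per-query-type medians such as those in Tables 4.2 to 4.6 could be derived from captured timestamps roughly as sketched below. The function name `train_median_delays` and the tuple layout of the samples are assumptions made for illustration. The thesis does not describe its capture format.

```python
from collections import defaultdict
from statistics import median


def train_median_delays(samples):
    """Compute the median response delay per (server, query type).

    samples: iterable of (server_ip, query_type, sent_time, received_time)
    tuples, with times in seconds. This layout is hypothetical.
    """
    delays = defaultdict(list)
    for server_ip, query_type, sent, received in samples:
        # Response delay = time the response was received - time the query was sent.
        delays[(server_ip, query_type)].append(received - sent)
    return {key: median(values) for key, values in delays.items()}
```

The resulting dictionary maps each (server, query type) pair to its median delay. A pair that was never observed has no entry, which corresponds to the case in which DCMS falls back to the authoritative reverse DNS lookup.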

Considering the response delays, we found that the responses received within a specific period of time from a specific DNS server were not malicious. This does not imply that all packets received outside that period are malicious, only that they could be. We therefore performed an Authoritative Reverse DNS Lookup for the packets received outside that period and, on the basis of its result, discarded the malicious packets.

               Response Delay (in secs)
Query Types    Minimum     Average     Maximum*
A              0.000101    0.308030    30.00139
A6             0.000190    0.000327    0.000516
AAAA           0.000105    0.768061    30.00191
ANY            0.000294    0.000336    0.000379
CNAME          0.000193    0.000365    0.00053
MX             0.000209    0.309814    30.000774
NS             0.000269    3.705701    30.000879
PTR            0.000220    0.697093    30.000781
SOA            0.109378    0.215344    0.330904
SPF            0.000182    1.636030    30.006190
SRV            0.153384    0.202081    0.286267
TXT            0.000256    0.888009    30.000881
Unknown(99)    0.000167    0.505797    30.00092

(*) - Cases in which Timeout occurred are ignored.

Table 4.7 Response Delays (in secs) for Query Types

Overall Response Delay (in secs)
Minimum     Average (or Mean)    Maximum*
0.000101    0.452563401168       30.00619

(*) - Cases in which Timeout occurred are ignored.

Table 4.8 Response Delays for given queries.

The minimum, average and maximum response delays for different query types are shown in Table 4.7 whereas Overall Response Delays are given in Table 4.8. The cases in which ”timeout” occurred are ignored.

Considering the above parameters, we devised an algorithm, which we named the Domain Consistency Management Algorithm (DCMA), shown in Table 4.9. Whenever a DNS client sends a query Q to the DNS server, DCMS stores the query in its database. While the query is on its way to the DNS server, DCMS looks at the query resource record type, Qt, in the background. If it finds Qt to be "Unknown(99)", it pushes the query straight into the discarded set without waiting for the response R from the DNS server. When the response R is received by the DNS client, DCMS fetches it and checks whether the query is present in the response packet. If the query is not found in the response packet, the response is pushed into the discarded set without any further checks. Otherwise, the transaction ID of the response is matched against the transaction IDs of the live queries, i.e. those that are not more than 2 minutes old. If no match is found, the response is pushed into the discarded set.

Then the response delay Rd of the response is compared with the median response delay from the DNS server for the specific query type. If Rd does not lie within the expected limits, or if the database holds no median response delay for that DNS server and query type, the authoritative reverse DNS lookup, Ra(R), is performed to ensure the consistency of the response. If the result of the authoritative reverse DNS lookup is positive, the response is accepted and the median response deviation for that DNS server and query type is updated in the database.

Conversely, if Rd falls within the expected limits, the response is usually accepted without performing an authoritative reverse DNS lookup. Every few minutes, however, DCMS performs an authoritative reverse DNS lookup for a randomly chosen accepted response in order to make sure that it is functioning properly, which further reduces the chances of a cache poisoning attack succeeding. Besides checking the consistency of a random accepted response, DCMS also uses that data for self-learning and hence keeps improving its performance.

Algorithm DCMA

Qt ← Query Type
Rn ← Response with no query
Transid ← Transaction ID
Rd ← Response Delay of a response having specific query type from a specific DNS server
Ra ← Authoritative Reverse DNS Lookup
T1 ← Lower Limit for Response Delay for specific query type to ensure valid response from a specific DNS server
T2 ← Upper Limit for Response Delay for specific query type to ensure valid response from a specific DNS server

For a Query Q and Response R {
    if (Qt == "Unknown(99)" or R == Rn) {
        Discard;
    } else {
        if (Transid(R) == Transid(Q)) {
            if (T1 ≤ Rd ≤ T2) {
                Accept;
            } else {
                if (Ra(R) == Q) { Accept; } else { Discard; }
            }
        } else {
            Discard;
        }
    }
}

Table 4.9 Algorithm for DNS Consistency Management System
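The decision logic of Table 4.9 can be sketched in Python as follows. This is an illustrative rendering, not the DCMS implementation. The `Query` and `Response` classes and the `reverse_lookup_ok` callback (standing in for the authoritative reverse DNS lookup Ra) are hypothetical. Passing `None` for the thresholds models the case in which the database holds no median for the server and query type.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Query:
    trans_id: int
    qname: str
    qtype: str


@dataclass
class Response:
    trans_id: int
    qname: Optional[str]  # None models Rn, a response that carries no query
    delay: float          # Rd, in seconds


def dcma(query: Query, response: Response,
         t1: Optional[float], t2: Optional[float],
         reverse_lookup_ok: Callable[[Query, Response], bool]) -> str:
    """Return "accept" or "discard" for one query/response pair."""
    # Unknown(99) queries and responses that do not echo the query are discarded.
    if query.qtype == "Unknown(99)" or response.qname != query.qname:
        return "discard"
    # The Transaction ID must match that of the query.
    if response.trans_id != query.trans_id:
        return "discard"
    # Within the trained limits: accept without any further lookup.
    if t1 is not None and t2 is not None and t1 <= response.delay <= t2:
        return "accept"
    # Outside the limits, or no trained limits: rely on the reverse lookup.
    return "accept" if reverse_lookup_ok(query, response) else "discard"
```

The periodic random re-check of accepted responses described in the text is left out of this sketch.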

4.2 Probabilistic proof of our approach

Consider,

Ntid = Number of different Transaction IDs ($2^{16}$, i.e. 65,536 values).

Nsp = Number of different source ports ($2^{16} - 1024$, i.e. 64,512 values, since the 1024 well-known ports are excluded).

Rd = Response delay in receiving the response for a specific query type from a specific DNS server.

Md = Median response delay for the specific query type from a specific DNS server (stored in our database).

Td = Median response deviation.

Where Td is given by:

$$T_d = \frac{M_d \times \text{Deviation\%}}{100}$$

Threshold T1 given in our DCMA can be calculated as follows:

$$T_1 = M_d - T_d$$

Threshold T2 given in our DCMA can be calculated as follows:

$$T_2 = M_d + T_d$$
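As a worked example of these definitions, the sketch below computes Td, T1 and T2 for the median delay of query type A on Server I (Table 4.2), using the 20% deviation reported in Section 4.3. The function name `thresholds` is hypothetical.

```python
def thresholds(median_delay, deviation_pct):
    """Return (Td, T1, T2) for a median response delay Md and a deviation percentage."""
    td = median_delay * deviation_pct / 100  # Td = Md * Deviation% / 100
    t1 = median_delay - td                   # T1 = Md - Td
    t2 = median_delay + td                   # T2 = Md + Td
    return td, t1, t2


# Md for query type A on Server I (Table 4.2), with a 20% deviation (Section 4.3).
td, t1, t2 = thresholds(1.876245, 20)
print(f"Td={td:.6f}  T1={t1:.6f}  T2={t2:.6f}")
```

For these inputs Td is about 0.375249, so a response of type A from Server I would be accepted on delay alone when Rd lies between roughly 1.500996 and 2.251494.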

Threat Model 1: In case of brute-force attacks like DNS forgery and Kaminsky-class attacks

When only the Transaction ID was random, the probability of successfully poisoning the DNS cache was:

$$P_{success} = \frac{1}{N_{tid}}$$

After source port randomization was implemented in DNS, the probability of successfully poisoning the DNS cache became:

$$P_{success} = \frac{1}{N_{tid}} \times \frac{1}{N_{sp}}$$

After the introduction of DCMS, the probability of successfully poisoning the DNS cache is:

$$P_{success} = \frac{1}{N_{tid}} \times \frac{1}{N_{sp}} \times \frac{T_2 - T_1}{R_d}$$

which, since T2 − T1 = 2 × Td, can be further simplified to

$$P_{success} = \frac{1}{N_{tid}} \times \frac{1}{N_{sp}} \times \frac{2 \times T_d}{R_d}$$

Threat Model 2: In case of MITM attacks like the DNS spoofing attack, where an attacker can intercept the query packets being sent to the DNS server, the attacker no longer needs to guess the transaction ID and source port, as both are present in the query packet.

When the Transaction ID and source port are randomized, the probability of successfully poisoning the DNS cache is:

$$P_{success} = 1$$

After the introduction of DCMS, the probability of successfully poisoning the DNS cache is:

$$P_{success} = \frac{2 \times T_d}{R_d}$$

The smaller the median response deviation, the lower the probability of successfully poisoning the DNS cache. Since a very small median response deviation may lead to a larger number of authoritative reverse DNS lookups, an optimum value of the median response deviation is chosen to reduce the chances of cache poisoning while keeping the number of authoritative reverse DNS lookups in check.
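The expressions above can be evaluated numerically as in the following sketch. The function name `p_success` and the `on_path` flag are hypothetical. Taking Rd equal to the median delay Md is an assumption made purely for illustration, under which the DCMS factor 2 × Td / Rd equals twice the deviation fraction.

```python
N_TID = 2 ** 16         # number of possible Transaction IDs
N_SP = 2 ** 16 - 1024   # number of source ports, excluding the well-known ports


def p_success(td, rd, on_path=False):
    """Success probability with DCMS in place, as written in the text.

    on_path=False -> Threat Model 1 (attacker must guess ID and port)
    on_path=True  -> Threat Model 2 (attacker sees ID and port in the query)
    The expression is used as given; it is not clamped to [0, 1].
    """
    window = (2 * td) / rd  # (T2 - T1) / Rd
    if on_path:
        return window
    return (1 / N_TID) * (1 / N_SP) * window


# Illustration only: Md for type A on Server I, 20% deviation, and Rd assumed equal to Md.
md = 1.876245
td = md * 20 / 100
print(p_success(td, md, on_path=True))  # Threat Model 2
print(p_success(td, md))                # Threat Model 1
```

Under that assumption the on-path probability drops from 1 to about 0.4, and the off-path probability is scaled by the same factor relative to source-port randomization alone.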

4.3 Optimum Value of Median Response Deviation

To calculate the optimum value of Median Response Deviation, we conducted experiments with different possible values and identified the value that gave the best results in terms of "no malicious record" and a smaller number of authoritative reverse DNS lookups. Through our experiments, we found the optimum value of Median Response Deviation to be 20%.

4.4 Algorithmic Complexity

In DCMS, we maintain the median response delays from different DNS servers for different query types. All this information is stored in a trie, as it is an efficient data structure for storing the desired information. Three actions are performed on the data structure: Insert, Update and Lookup of the median response delay for a specific RR type from a given DNS server. The time complexity of each action is given below: -

• Time complexity to insert a new median response delay from a DNS server for a specific query type = O(1)

• Time complexity to update the median response delay from a DNS server for a specific query type = O(1)

• Time complexity to lookup median response delay from a DNS server for a specific query type = O(1)

Note: - The time complexity of any of the aforementioned operations is O(1) since the size of the keys present in trie is constant.
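One possible trie layout with constant-length keys is sketched below. It is an assumption for illustration, since the thesis does not specify the key format. Here the key is the four octets of an IPv4 server address followed by the query type, so every operation walks exactly five levels regardless of how many entries are stored. The class name `DelayTrie` is hypothetical.

```python
class DelayTrie:
    """Hypothetical trie keyed on IPv4 octets followed by the query type."""

    def __init__(self):
        self._root = {}

    @staticmethod
    def _key(server_ip, query_type):
        # Fixed-length key: four octets plus the query type (five parts in total).
        return server_ip.split(".") + [query_type]

    def insert(self, server_ip, query_type, median_delay):
        """Store a median delay, creating intermediate nodes as needed."""
        parts = self._key(server_ip, query_type)
        node = self._root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = median_delay

    def update(self, server_ip, query_type, median_delay):
        """Overwrite a stored median delay (same walk as insert)."""
        self.insert(server_ip, query_type, median_delay)

    def lookup(self, server_ip, query_type):
        """Return the stored median delay, or None if there is no entry."""
        parts = self._key(server_ip, query_type)
        node = self._root
        for part in parts[:-1]:
            node = node.get(part)
            if node is None:
                return None
        return node.get(parts[-1])
```

Because the key length is fixed, the number of steps per operation does not grow with the number of stored servers, which is the basis of the O(1) claim above.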

4.5 Results

Our result is a trained system that checks the consistency of DNS responses on the basis of our algorithm, DCMA. Initially, we trained the system on more than 5000 source-destination pairs and then ran it on the same source-destination pairs to check its performance. The results obtained were completely satisfactory. As specified in the algorithm, we discarded the query packets with "Unknown(99)" RR type and the response packets that did not contain the query. After discarding such packets, we mapped the responses to their respective queries on the basis of Transaction ID. While mapping each response to its query, we also calculated the response delay (the time difference between the query sent and the response received). Since the system had already been trained with the median response delays from different DNS servers for different query RR types, we compared the newly calculated response delay with the stored median response delay. If the calculated response delay fell within the limits, the response was accepted; otherwise, an authoritative reverse DNS lookup was performed to verify the authenticity of the response. In our research, all responses accepted on the basis of response delays were found to be correct. Table 4.10 summarizes the packets for the five DNS servers to which queries were sent and from which responses were received.

# of Packets
                                                  Authoritative Reverse DNS Lookup
Servers       Unknown(99)   No Query   Accepted   Accepted      Discarded
Server I      4             0          468        27            98
Server II     6             0          460        41            109
Server III    11            0          391        13            185
Server IV     0             0          434        40            113
Server V      7             1          487        29            134

Table 4.10 Summary of packets for Five different Servers

During our research, many inconsistent responses were captured. They were considered inconsistent because they did not pass the reverse DNS (rDNS) lookup, and hence they were discarded. When we performed the rDNS lookup for an IP address, say "64.136.25.253", the received PTR record did not match the queried domain name. Internet official documents such as RFC 1033 and RFC 1912 clearly specify that "Domain names must match with a reverse pointer (PTR) record". Section 2.1 of RFC 1912 states, under the heading "Inconsistent, Missing or Bad Data": "Make sure your PTR and A records match. For every IP address, there should be a matching PTR record in the in-addr.arpa domain. If a host is multi-homed, (more than one IP address) make sure that all IP addresses have a corresponding PTR record (not just the first one). Failure to have matching PTR and A records can cause loss of Internet services similar to not being registered in the DNS at all. Also, PTR records must point back to a valid A record, not a alias defined by a CNAME. It is highly recommended that you use some software which automates this checking, or generate your DNS data from a database which automatically creates consistent data." ISPs that will not or cannot configure reverse DNS create problems for hosts on their networks, because those hosts contravene the RFCs when communicating with hosts that do follow the RFC guidelines. From a technical perspective, reverse DNS is trivial to implement correctly, and there is no reason not to implement it for hosts providing regular Internet services. ISPs that cannot or will not provide reverse DNS ultimately limit the ability of their client base to use the Internet services they provide effectively and securely.

Chapter 5

Analysis of TTL-based Caching in DNS

DNS is a hierarchical distributed database that uses intensive replication and caching to achieve high scalability and resiliency to server failures. Caching plays a major role in DNS, as it improves performance by reducing both the access latency to the data source and the bandwidth requirement at the data source. Every resource record retrieved by a DNS client is cached for a specific time period, the TTL (Time-To-Live). TTL plays an important role in maintaining the consistency of the cached resource records. A short TTL reduces the likelihood of getting old information but increases DNS utilization, whereas a long TTL decreases DNS utilization but increases the chance of retrieving outdated information. The TTL value carried in a resource record indicates how long the record can be cached before it should be discarded, which ensures that a DNS client does not keep the information for so long that it becomes outdated. A significant number of cache misses in Internet systems that employ TTL-based caching occur because of TTL expiration. For DNS caches, essentially all misses occur because of TTL expiration or first access to a domain name [27]. Even so, the TTL of a cached resource record should not be set to a large value merely to avoid cache misses, as the information may become outdated after a period. Nor should the TTL be set to zero, because that adversely affects the performance of DNS: the expired data has to be queried for constantly.

5.1 Analysis of DNS response packets & associated TTL values.

In our research, we captured a large number of DNS packets at the Internet gateway of the International Institute of Information Technology, Hyderabad and analyzed the TTL values associated with the response packets. The captured response packets had the following response RR types: A, AAAA, Blank, CNAME, Format error, MX, No such name, NS, PTR, Refused, Server failure, SOA, SRV and TXT. We have considered "Blank", "Format error", "No such name", "Server failure" and "Refused" as response types even though they do not belong to the category of Resource Record Types. "Format error", "No such name", "Server failure" and "Refused" are negative responses, whereas "Blank" specifies that the name is valid but there is no cached response corresponding to the name.

[Figure: one panel per response type ('A', 'AAAA', 'Blank', 'CNAME', 'Format_error', 'MX', 'No_such_name', 'NS', 'PTR', 'Refused', 'Server_failure', 'SOA', 'SRV', 'TXT'); x-axis: Packets, y-axis: TTL (minutes)]

Figure 5.1 Response Types and TTL

We have plotted the TTL mentioned in the response packets against the different response resource-record types, as shown in Fig. 5.1. Negative TTL values in the plots indicate that no TTL value was mentioned in the resource records. As the plots show, the response RR types "Blank", "Format error", "No such name", "Refused" and "Server failure" had negative TTL values. Among them, "Format error", "Refused" and "Server failure" were always found to have negative TTL values, i.e. TTL was never mentioned in packets with these response types. However, the "Blank" and "No such name" response types had both positive and negative TTL values. On the basis of TTL values, we have divided responses as follows: -

• Positive TTL Responses - Responses with TTL mentioned inside the response packet.

• Negative TTL Responses - Responses with no TTL mentioned inside the response packet.

Even Positive TTL responses could be broadly classified into following categories: -

• Zero TTL Responses - Responses in which a TTL was mentioned but was equal to zero.

• Normal TTL Responses - Responses with minimal deviation from Median TTL.

• Large TTL Responses - Responses with considerably large deviation from Median TTL.

Zero TTL responses were not prevalent among all the response types. We calculated the percentage of Zero TTL responses for each response type using the following formula:

$$\text{Zero TTL}(\%) = \frac{\text{\# of Zero TTL responses for given response type}}{\text{\# of Positive TTL responses for given response type}} \times 100$$

Table 5.1 gives the percentage of Zero TTL Responses for each recorded response type. Similarly, we also calculated the minimum, median and maximum TTL values for each recorded response type. Tables 5.2, 5.3 and 5.4 list the minimum, median and maximum TTL values for the different response types respectively. Since "Format error", "Refused" and "Server failure" never had a positive TTL value, minimum, median and maximum TTL values cannot be defined for them, whereas for "No such name" and "Blank" the minimum, median and maximum TTL values are calculated over the responses with positive TTL (Negative TTL responses were neglected).

Among the aforementioned categories of responses, Normal TTL responses are not found to be problematic and can be cached for the TTL values mentioned in them. The problem arises in caching Negative TTL, Zero TTL and Large TTL responses, for the following reasons: -

• Negative TTL responses do not have any TTL values associated with them.

• Zero TTL responses adversely affect the performance of DNS, as the expired data has to be queried for constantly.

• Large TTL responses with negative response types may cause Denial-of-Service (DoS) attacks to propagate.

Response Type     Zero TTL (%)
A                 .4
AAAA              1
Blank             .2
CNAME             0
Format error      0
MX                .45
No such name      6
NS                .3
PTR               0
Refused           0
Server failure    0
SOA               0
SRV               0
TXT               0

Table 5.1 Zero TTL Responses(in percentage) for different Response Types
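The Zero TTL percentage reported in Table 5.1 is the share of zero-valued TTLs among the responses that carry a TTL at all. A minimal sketch of that calculation is given below. The function name and the use of `None` to mark a response without a TTL (a Negative TTL response) are assumptions for illustration.

```python
def zero_ttl_percentage(ttls):
    """Percentage of Zero TTL responses among Positive TTL responses.

    ttls: TTL values in seconds for one response type. None marks a
    response with no TTL mentioned (a Negative TTL response).
    """
    # "Positive TTL" responses are those with a TTL present, including zero.
    positive = [t for t in ttls if t is not None]
    if not positive:
        return None  # undefined when no response carried a TTL
    zero = sum(1 for t in positive if t == 0)
    return 100 * zero / len(positive)
```

For response types such as "Format error", where no response carried a TTL, the percentage is undefined in this sketch and the function returns `None`.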

Our analysis of the TTL values mentioned in responses leads us to a few questions: "Should we cache Negative TTL, Zero TTL and Large TTL responses? And if yes, how should such responses be cached?" Our view on these questions is as follows. In general, responses should be cached, as caching improves DNS performance. For Negative TTL responses, as given in RFC 2308, the TTL of the record must be set to the minimum of the MINIMUM field and the TTL field of the SOA record. Negative responses without SOA records should not be cached, as there is no way to prevent the negative responses looping forever between a pair of servers [28]. Zero TTL responses must be cached for the minimum TTL corresponding to the response type in order to avoid adverse effects on the performance of DNS. For example, if a response of response type "AAAA" is received with Zero TTL, it must be cached for one second (the minimum TTL value for response type "AAAA" in Table 5.2). Reducing the TTLs of responses of the 'A' response type to a few hundred seconds has little adverse effect on cache-hit rates [29]. Similarly, reducing large TTL values to the median TTL value for a given response type would have very little adverse effect on the cache-hit rate while helping to avoid denial-of-service attacks. Hence, Large TTL responses should be cached for the Median TTL value of the given response type (given in Table 5.3).
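The caching policy proposed above can be sketched as a single function. Several details are assumptions made for illustration and are labeled as such in the code: the function name `cache_ttl`, the use of `None` for a missing TTL, and in particular the `LARGE_FACTOR` threshold, since the text does not quantify when a TTL counts as "large". The lookup tables contain only a subset of Tables 5.2 and 5.3.

```python
# Subset of Table 5.2 (minimum TTL, seconds) and Table 5.3 (median TTL, converted to seconds).
MIN_TTL_SECS = {"A": 1, "AAAA": 1, "MX": 2}
MEDIAN_TTL_SECS = {"A": 15 * 60, "AAAA": 15 * 60, "MX": 2 * 60}

# ASSUMPTION: a TTL is treated as "large" when it exceeds the median by this factor.
LARGE_FACTOR = 10


def cache_ttl(resp_type, ttl, soa_minimum=None, soa_ttl=None):
    """Return the number of seconds to cache a response, or None for "do not cache".

    ttl is None for a Negative TTL response (no TTL in the packet).
    soa_minimum / soa_ttl are the MINIMUM field and the TTL of the SOA record, if present.
    """
    if ttl is None:
        if soa_minimum is None or soa_ttl is None:
            return None                   # negative response without SOA: not cached
        return min(soa_minimum, soa_ttl)  # rule from RFC 2308 as cited in the text
    if ttl == 0:
        return MIN_TTL_SECS[resp_type]    # Zero TTL: cache for the minimum TTL of the type
    median = MEDIAN_TTL_SECS[resp_type]
    if ttl > LARGE_FACTOR * median:
        return median                     # Large TTL: cap at the median TTL of the type
    return ttl                            # Normal TTL: cache as given
```

With these tables, a Zero TTL "AAAA" response is cached for one second, matching the example in the text.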

Response Type     Minimum TTL (seconds)
A                 1
AAAA              1
Blank             3
CNAME             1
Format error      Not Defined
MX                2
No such name      2
NS                5
PTR               20
Refused           Not Defined
Server failure    Not Defined
SOA               30
SRV               60
TXT               1

Table 5.2 Minimum TTL (in seconds) for Response Types

Response Type     Median TTL (minutes)
A                 15
AAAA              15
Blank             1440.0
CNAME             20.0
Format error      Not Defined
MX                2.0
No such name      2.5
NS                70.17
PTR               120.0
Refused           Not Defined
Server failure    Not Defined
SOA               2.0
SRV               5.0
TXT               24.57

Table 5.3 Median TTL (in minutes) for Response Types

Response Type     Maximum TTL (minutes)
A                 10080
AAAA              2880
Blank             60000
CNAME             5760
Format error      Not Defined
MX                2880
No such name      10080
NS                6252.7
PTR               10080
Refused           Not Defined
Server failure    Not Defined
SOA               2
SRV               5
TXT               1440

Table 5.4 Maximum TTL (in minutes) for Response Types

Chapter 6

Conclusion & Future Work

DNS cache poisoning attacks present a persistent, ongoing threat to the Internet's critical infrastructure. Many solutions have been proposed to deal with the attack, but the lack of adoption and delays in deployment suggest the need for lightweight, practical improvements to DNS security. Our approach meets these requirements, as it can be easily adopted and deployed and enhances DNS security against cache poisoning attacks. The significant feature of DCMS is self-learning, which helps in improving its performance. Every few minutes, DCMS performs an Authoritative Reverse DNS Lookup for a randomly chosen accepted response to verify that it is functioning properly, which further reduces the chance of a cache poisoning attack succeeding. It also uses that data to train itself.

DCMS maintains a database of the response delays between the DNS client and DNS servers, whose IP addresses generally do not change often. If the response delays between two machines change because of a change in either of their locations (or IP addresses), the system may slow down for a while, as the authoritative reverse DNS lookup would then occur more frequently to ensure the consistency of the records. As our system is self-learning, it gets trained for the new response delay in a short period of time and continues operating efficiently.

DCMS even protects DNS from a recent class of attack, the Kaminsky-class attack [54], as our approach adds the response delay to transaction ID and source-port randomization to ensure the consistency of the DNS response. To launch a Kaminsky-class attack, the attacker has to send the forged packet with the correct transaction ID and the correct source port at the proper delay. If the forged packet is received out of time, a check is performed to ensure its consistency. Hence the probability of successfully launching this class of attack is greatly reduced.

The biggest advantage of our approach is that it does not demand any change in the existing DNS architecture and works efficiently with minimum infrastructure. Even after IPv6 is introduced, DCMS would function the same way without requiring any changes. Unlike WSEC-DNS [77], it does not require any changes at the server side to ensure the consistency of the responses. DCMS has no whitelists or blacklists to rely upon, and it does not hold assumptions such as "poisoning generally points victims to new IP addresses", as Anax [85] does, since an attacker can get the better of such assumptions. Since it does not involve any cryptographic technique, no key management is required.

Moreover, our approach does not introduce any significant latency to the system, and hence turns out to be an effective and efficient approach against cache poisoning attacks. Currently, DCMS is an offline system that ensures the consistency of the responses and works for specific resource record types such as A, AAAA, A6 and PTR records. Our future work focuses on integrating it with the DNS to make it an online system that ensures the consistency of the responses on the fly, and on extending it to the other resource record types.

Our approach could even be extended to handle other MITM (Man-In-The-Middle) [11] related attacks in which a delivered message is expected to reach its destination within a certain period of time. If the delivered message fails to reach its destination within that period, an assurance check would be performed to ensure the correctness of the message. A primary system to deal with MITM related attacks could be built after training it with the delivery delays between two machines. The system would be a self-learning one, so that it can update its database on its own. As in DCMS, the self-learning feature would help the system function properly and would also enhance its performance.

As of now, the DNS rebinding attack has not been tackled in DCMS. Using a DNS rebinding attack, an attacker can circumvent firewalls and hijack IP addresses. Basic DNS rebinding attacks have been known for over a decade [42][43], and several solutions to mitigate the attack, such as DNS pinning [18], have been published. Modern multi-pin attacks defeat pinning in just hundreds of milliseconds, granting the attacker direct socket access from the client's machine. Even DNSSEC provides no protection against DNS rebinding attacks, since an attacker can legitimately sign all DNS records provided by his or her DNS server in the attack [25]. These rebinding attacks are a highly cost-effective technique for hijacking hundreds of thousands of IP addresses for sending spam e-mail and committing click fraud. Our future work focuses on addressing ways to mitigate the DNS rebinding attack, which is still not completely checked, thereby making DCMS more effective in handling DNS-related attacks.

Related Publications

Bibliography

[1] M. D. Kudlick, "Host Names On-Line", RFC 608, January 1974.
[2] Elizabeth Feinler, Ken Harrenstien, Zaw-Sing Su and Vic White, "DoD Internet Host Table Specification", RFC 810, March 1982.
[3] D. L. Mills, "Internet Name Domains", RFC 799, September 1981.
[4] P. Mockapetris, "Domain Names - Concepts and Facilities", RFC 882, November 1983.
[5] Steven Cheung, "Denial of Service against the Domain Name System", IEEE Security and Privacy, 2006.
[6] P. Mockapetris, "Domain Names - Implementation and Specification", RFC 883, November 1983.
[7] P. Mockapetris, "Domain Names - Concepts and Facilities", RFC 1034, November 1987.
[8] P. Mockapetris, "Domain Names - Implementation and Specification", RFC 1035, November 1987.
[9] A. Gustafsson, "Handling of Unknown DNS Resource Record (RR) Types", RFC 3597, September 2003.
[10] D. J. Bernstein, "djbdns", http://cr.yp.to/djbdns.html.
[11] A. Ornaghi, M. Valleri, "Man in the middle attacks Demos", In BlackHat Conference, 2003.
[12] P. Vixie, O. Gudmundsson, D. Eastlake, and B. Wellington, "Secret key transaction authentication for DNS (TSIG)", RFC 2845, May 2000.
[13] S. Kwan, P. Garg, J. Gilroy, L. Esibov, J. Westhead and R. Hall, "Generic Security Service Algorithm for Secret Key Transaction Authentication for DNS (GSS-TSIG)", 2003.
[14] Giuseppe Ateniese and Stefan Mangard, "A New approach to DNS Security (DNSSEC)", In ACM Conference on Computer and Communications Security, December 2002.
[15] Reza Curtmola, Aniello Del Sorbo and Giuseppe Ateniese, "On the Performance and Analysis of DNS Security Extensions", In CANS, November 2005.
[16] Linhua Yuan, Krishna Kant, Prasant Mohapatra and Chen-Nee Chuah, "DoX: A Peer-to-Peer Antidote for DNS Cache Poisoning Attacks", In ICC, 2006.
[17] A. Hubert and R. van Mook, "Measures for making DNS more resilient against forged answers", 2008, http://tools.ietf.org/html/draft-ietf-dnsext-forgery-resilience-06.
[18] Dafydd Stuttard, "DNS Pinning and Web Proxies", 2007.
[19] E. Cohen and H. Kaplan, "Proactive caching of DNS records: Addressing a performance bottleneck", In Proceedings of the Symposium on Applications and the Internet, 2001.
[20] P. Cao and C. Liu, "Maintaining strong cache consistency in the world wide web", IEEE Transactions on Computers, April 1998.
[21] M. Theimer and M. B. Jones, "Overlook: Scalable name service on an overlay network", In Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS), Vienna, Austria, 2002.
[22] R. Cox, A. Muthitacharoen, and R. T. Morris, "Serving DNS using a peer-to-peer lookup service", In Proceedings of IPTPS 02, 2002.
[23] K. Park, V. Pai, L. Peterson, and Z. Wang, "CoDNS: Improving DNS performance and reliability via cooperative lookups", In Proceedings of the Sixth Symposium on Operating Systems Design and Implementation (OSDI 04), 2004.
[24] V. Ramasubramanian and E. G. Sirer, "The design and implementation of a next generation name service for the internet", In Proceedings of SIGCOMM, Portland, Oregon, August 2004.
[25] Collin Jackson, Adam Barth, Andrew Bortz, Weidong Shao and Dan Boneh, "Protecting browsers from DNS rebinding attacks", In TWEB, January 2009.
[26] Eric Osterweil, Dan Massey and Lixia Zhang, "Deploying and Monitoring DNS Security (DNSSEC)", ACSAC 2009.
[27] Jaeyeon Jung, Arthur W. Berger and Hari Balakrishnan, "Modeling TTL-based Internet Caches", IEEE Infocom 2003.
[28] M. Andrews, "Negative Caching of DNS Queries (DNS NCACHE)", RFC 2308, March 1998.
[29] Jaeyeon Jung, E. Sit, Hari Balakrishnan and R. Morris, "DNS Performance and the Effectiveness of Caching", IEEE/ACM Transactions on Networking, Oct. 2002.
[30] S. Williams, M. Abrams, C. R. Standridge, G. Abdulla, and E. A. Fox, "Removal Policies in Network Caches for World-Wide Web Documents", in Proceedings of the ACM SIGCOMM'96 Conference, 1996.
[31] M. Arlitt, R. Friedrich and T. Jin, "Performance Evaluation of Web Proxy Cache Replacement Policies", in Performance Tools '98, 1998.
[32] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker, "Web caching and zipf-like distributions: Evidence and Implications", in Proceedings of the INFOCOM'99 conference, 1999.
[33] E. Cohen and H. Kaplan, "Aging Through Cascaded Caches: Performance Issues in the Distribution of Web Content", in Proceedings of the ACM SIGCOMM 2001 Conference, 2001.
[34] E. Cohen, E. Halperin and H. Kaplan, "Performance Aspects of Distributed Caches Using TTL-Based Consistency", in Proceedings of ICALP'01 conference, 2001.
[35] D. Wilkinson, C. E. Chow, and Y. Cai, "Enhanced secure dynamic dns update with indirect route", In Proceedings of IEEE Information assurance workshop, 2004.
[36] L. Yuan, K. Kant, P. Mohapatra, and C.-N. Chuah, "A Proxy View of Quality of Domain Name Service", in Infocom 2007.
[37] "DNS Spoofing", http://www.windowsecurity.com/articles/Understanding-Man-in-the-Middle-Attacks-ARP-Part2.html.
[38] Sophos, "Troj/qhosts-1", http://www.sophos.com/virusinfo/analyses/trojqhosts1.html, 2003.
[39] "Zlob Trojan", http://en.wikipedia.org/wiki/Zlob_trojan, 2007.
[40] B. Eckman and L. Zeltser, "An overview of the free video player trojan", http://isc.sans.org/diary.html?storyid=1872, 2006.
[41] "An overview of Trojan.DNSChanger", http://www.precisesecurity.com/blogs/2007/02/19/trojan-dnschanger/, 2007.
[42] D. Dean, E. W. Felten, and D. S. Wallach, "Java security: from HotJava to Netscape and beyond", In IEEE Symposium on Security and Privacy, Oakland, California, May 1996.
[43] J. Roskind, "Attacks against the Netscape browser", In RSA Conference, April 2001.
[44] J. Ruderman, "JavaScript Security: Same Origin", http://www.mozilla.org/projects/security/components/same-origin.html.
[45] Christoph L. Schuba, "Addressing Weakness in the Domain Name System Protocol", Master's thesis, Purdue University, 1993.
[46] A. Klein, "BIND 9 DNS cache poisoning", http://www.trusteer.com/docs/bind9dns.html, 2007.
[47] A. Klein, "PowerDNS recursor DNS cache poisoning", http://www.trusteer.com/docs/powerdnsrecursor.html, 2008.
[48] A. Klein, "BIND 8 DNS cache poisoning", http://www.trusteer.com/docs/bind8dns.html, 2007.
[49] A. Klein, "OpenBSD DNS cache poisoning and multiple OS predictable IP ID vulnerability", http://www.trusteer.com/docs/dnsopenbsd.html, 2007.
[50] A. Klein, "Windows DNS server cache poisoning", http://www.trusteer.com/docs/microsoftdns.html, 2007.
[51] J. Stewart, "DNS cache poisoning the next generation", http://www.secureworks.com/research/articles/dns-cache-poisoning/, 2003.
[52] D. J. Bernstein, "The dns_random library interface", http://cr.yp.to/djbdns/dns_random.html, 2008.
[53] Internet Assigned Numbers Authority, "Port numbers", http://www.iana.org/assignments/port-numbers, 2008.
[54] Dan Kaminsky, "Black Ops 2008: It's the end of the cache as we know it", http://s3.amazonaws.com/dmk/DMK_BO2K8.ppt, 2008.
[55] David Dagon, Manos Antonakakis and Paul Vixie, Tatuya Jinmei, and Wenke Lee ”Increased DNS forgery resistance through 0x20-bit encoding”, In proceedings of the 15th ACM conference on Computer and Com- munications Security, 2008. [56] Donald E. Eastlake, ”Domain Name System (DNS) Cookies”, http://tools.ietf.org/html/draft-eastlake- dnsext-cookies-03, 2008. [57] H. Krawczyk, M. Bellare, and R. Canetti, ”HMAC: Keyed-Hashing for Message Authentication”, RFC 2104, 1997. [58] D. Eastlake, E. Brunner-Williams, and B. Manning, ”Domain Name System (DNS) IANA Considerations”, RFC 2929, September 2000.

49 [59] J. Postel, ”User Datagram Protocol”, RFC 768, August 1980. [60] Information Sciences Institute, ”Transmission Control Protocol”, RFC 793, September 1981. [61] P. Vixie, ”Extension Mechanisms for DNS (EDNS0)”, RFC 2671, August 1999. [62] D. Eastlake, ”Secret Key Establishment for DNS (TKEY RR)”, RFC 2930, September 2000. [63] P. Vixie, ”A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)”, RFC 1996, August 1996. [64] P. Vixie, S. Thomson, Y. Rekhter and J. Bound, ”Dynamic Updates in the Domain Name System (DNS UPDATE)”, RFC 2136, April 1997. [65] D. Barr, ”Common DNS Operational and Configuration Errors”, RFC 1912, February 1996. [66] Computer Academic Underground, bailiwicked domain.rb, 2008, http://www.caughq.org/exploits/CAU- EX-2008-0003.txt [67] David Dagon, Manos Antonakakis, Kevin Day, Xiapu Luo, Christopher P. Lee, Recursive DNS Architec- tures and Vulnerability Implications, In Proceedings of the 16th NDSS, 2009. [68] C. R. Dougherty, ”Vulnerability Note VU#80013”, 2008, https://www.kb.cert.org/vuls/id/800113. [69] R. Arends, R. Austein, M. Larson, D. Massey and S. Rose, ”DNS security introduction and requirements”, 2005, http://www.ietf.org/rfc/rfc4033.txt. [70] R. Arends, R. Austein, M. Larson, D. Massey and S. Rose, ”Resource records for the DNS security exten- sions”, 2005, http://www.ietf.org/rfc/rfc4034.txt. [71] R. Arends, R. Austein, M. Larson, D. Massey and S. Rose, ”Protocol modifications for the DNS security extensions”, 2005, http://www.ietf.org/rfc/rfc4035.txt. [72] D. Eastlake and C. Kaufman, ”Domain name system security extensions”, 1997, http://www.ietf.org/rfc/rfc2065.txt [73] DNSStuff, DNS Network Tools: Network Monitoring and DNS Monitoring, 2008, http://www.dnsstuff.com [74] D. Callaway, Porkbind - Recursive multi-threaded nameserver security scanner, 2008, http://www.securityfocus.com/archive/1/495539/30/8730/threaded [75] Nessus: The network vulnerability scanner, http://www.nessus.org/nessus [76] D.J. 
Bernstein, Introduction to DNSCurve, 2008, http://dnscurve.org/ [77] Roberto Perdisci and Manos Antonakakis and Xiapu Luo and Wenke Lee, WSEC DNS: Protecting Recursive DNS Resolvers from poisoning Attacks, In Proceedings of DSN-DCCS, 2009. [78] David Dagon, Niels Provos, Christopher P. Lee, Wenke Lee: Corrupted DNS Resolution Paths: The Rise of a Malicious Resolution Authority. NDSS 2008. [79] Team Cymru. The Darknet Project (2004), http://www.team-cymru.org/Services/darknets.html [80] Karmasphere. The open reputation network (2006), https://dnsparse.insec.auckland.ac.nz/dns [81] The Spamhaus Project. XBL: Exploits block list (2008), http://www.spamhaus.org/xbl [82] The Spamhaus Project. Lasso: The Spamhaus Don’t Route Or Peer List (2008), http://www.spamhaus.org/drop/drop.lasso

50 [83] The Spamhaus Project. PBL: The Policy Block List (2008), http://www.spamhaus.org/pbl [84] ISC. SIE@ISC, http://sie.isc.org [85] Manos Antonakakis, David Dagon, Xiapu Luo, Roberto Perdisci, Wenke Lee, Justin Bellmor: A Centralized Monitoring Infrastructure for Improving DNS Security, RAID 2010. [86] D. Ulevitch, Phishtank: Out of the Net into the Tank (2009), http://www.phishtank.com/ [87] Ahren Studer and Adrian Perrig, The Coremelt Attack, In the proceedings of the 14th European Symposium on Research in Computer Security, 2009.

51