A Virtual Honeypot Framework
Total Page:16
File Type:pdf, Size:1020Kb
A Virtual Honeypot Framework Niels Provos∗ Google, Inc. [email protected] Abstract One way to get early warnings of new vulnerabil- ities is to install and monitor computer systems on a network that we expect to be broken into. Every A honeypot is a closely monitored network decoy attempt to contact these systems via the network is serving several purposes: it can distract adversaries suspect. We call such a system a honeypot. If a hon- from more valuable machines on a network, pro- eypot is compromised, we study the vulnerability vide early warning about new attack and exploita- that was used to compromise it. A honeypot may tion trends, or allow in-depth examination of ad- run any operating system and any number of ser- versaries during and after exploitation of a honey- vices. The configured services determine the vectors pot. Deploying a physical honeypot is often time in- an adversary may choose to compromise the system. tensive and expensive as different operating systems require specialized hardware and every honeypot re- A physical honeypot is a real machine with its quires its own physical system. This paper presents own IP address. A virtual honeypot is a simulated Honeyd, a framework for virtual honeypots that sim- machine with modeled behaviors, one of which is the ulates virtual computer systems at the network level. ability to respond to network traffic. Multiple vir- The simulated computer systems appear to run on tual honeypots can be simulated on a single system. unallocated network addresses. To deceive network Virtual honeypots are attractive because they re- fingerprinting tools, Honeyd simulates the network- quirer fewer computer systems, which reduces main- ing stack of different operating systems and can pro- tenance costs. Using virtual honeypots, it is possible vide arbitrary routing topologies and services for an to populate a network with hosts running numerous arbitrary number of virtual systems. This paper dis- operating systems. To convince adversaries that a cusses Honeyd’s design and shows how the Honeyd virtual honeypot is running a given operating sys- framework helps in many areas of system security, tem, we need to simulate the TCP/IP stack of the e.g. detecting and disabling worms, distracting ad- target operating system carefully, in order to deceive versaries, or preventing the spread of spam email. TCP/IP stack fingerprinting tools like Xprobe [1] or Nmap [9]. This paper describes the design and implemen- 1 Introduction tation of Honeyd, a framework for virtual honey- pots that simulates computer systems at the network level. Honeyd supports the IP protocol suites [26] Internet security is increasing in importance as and responds to network requests for its virtual hon- more and more business is conducted there. Yet, eypots according to the services that are configured despite decades of research and experience, we are for each virtual honeypot. When sending a response still unable to make secure computer systems or even packet, Honeyd’s personality engine makes it match measure their security. the network behavior of the configured operating As a result, exploitation of newly discovered vul- system personality. nerabilities often catches us by surprise. Exploit au- To simulate real networks, Honeyd creates virtual tomation and massive global scanning for vulnerabil- networks that consist of arbitrary routing topologies ities enable adversaries to compromise computer sys- with configurable link characteristics such as latency tems shortly after vulnerabilities become known [25]. and packet loss. When networking mapping tools like traceroute are used to probe the virtual network, ∗This research was conducted by the author while at the Center for Information Technology Integration of the Univer- they discover only the topologies simulated by Hon- sity of Michigan. eyd. Our performance evaluation of Honeyd shows hand, honeypots can detect vulnerabilities that are that a 1.1 GHz Pentium III can support 30 MBit/s not yet understood. For example, we can detect aggregate bandwidth and that it can sustain over compromise by observing network traffic leaving the two thousand TCP transactions per second. The honeypot even if the means of the exploit has never experimental evaluation of Honeyd verifies that fin- been seen before. gerprinting tools are deceived by the simulated sys- Because a honeypot has no production value, any tems and shows that our virtual network topologies attempt to contact it is suspicious. Consequently, seem realistic to network mapping tools. forensic analysis of data collected from honeypots is To demonstrate the power of the Honeyd frame- less likely to lead to false positives than data col- work, we show how it can be used in many areas lected by NIDS. of system security. For example, Honeyd can help Honeypots can run any operating system and any with detecting and disabling worms, distracting ad- number of services. The configured services deter- versaries, or preventing the spread of spam email. mine the vectors available to an adversary for com- The rest of this paper is organized as follows. Sec- promising or probing the system. A high-interaction tion 2 presents background information on honey- honeypot simulates all aspects of an operating sys- pots. In Section 3, we discuss the design and imple- tem. A low-interaction honeypots simulates only mentation of Honeyd. Section 4 presents an evalua- some parts, for example the network stack [24]. A tion of the Honeyd framework in which we analyze high-interaction honeypot can be compromised com- the performance of Honeyd and verify that finger- pletely, allowing an adversary to gain full access to printing and network mapping tools are deceived to the system and use it to launch further network at- report the specified system configurations. We de- tacks. In contrast, low-interaction honeypots sim- scribe how Honeyd can help to improve system se- ulate only services that cannot be exploited to get curity in Section 5 and present related work in Sec- complete access to the honeypot. Low-interaction tion 6. We summarize and conclude in Section 7. honeypots are more limited, but they are useful to gather information at a higher level, e.g., learn about network probes or worm activity. They can also be 2 Honeypots used to analyze spammers or for active countermea- sures against worms; see Section 5. This section presents background information on We also differentiate between physical and virtual honeypots and our terminology. We provide moti- honeypots. A physical honeypot is a real machine on vation for their use by comparing honeypots to net- the network with its own IP address. A virtual hon- work intrusion detection systems (NIDS) [19]. The eypot is simulated by another machine that responds amount of useful information provided by NIDS is to network traffic sent to the virtual honeypot. decreasing in the face of ever more sophisticated eva- When gathering information about network at- sion techniques [21, 28] and an increasing number of tacks or probes, the number of deployed honeypots protocols that employ encryption to protect network influences the amount and accuracy of the collected traffic from eavesdroppers. NIDS also suffer from data. A good example is measuring the activity high false positive rates that decrease their useful- of HTTP based worms [23]. We can identify these ness even further. Honeypots can help with some of worms only after they complete a TCP handshake these problems. and send their payload. However, most of their con- A honeypot is a closely monitored computing re- nection requests will go unanswered because they source that we intend to be probed, attacked, or contact randomly chosen IP addresses. A honeypot compromised. The value of a honeypot is deter- can capture the worm payload by configuring it to mined by the information that we can obtain from it. function as a web server. The more honeypots we Monitoring the data that enters and leaves a honey- deploy the more likely one of them is contacted by pot lets us gather information that is not available a worm. to NIDS. For example, we can log the key strokes Physical honeypots are often high-interaction, so of an interactive session even if encryption is used allowing the system to be compromised completely, to protect the network traffic. To detect malicious they are expensive to install and maintain. For large behavior, NIDS require signatures of known attacks address spaces, it is impractical or impossible to de- and often fail to detect compromises that were un- ploy a physical honeypot for each IP address. In known at the time it was deployed. On the other that case, we need to deploy virtual honeypots. Figure 1 shows a conceptual overview of the framework’s operation. A central machine intercepts network traffic sent to the IP addresses of configured honeypots and simulates their responses. Before we describe Honeyd’s architecture, we explain how net- work packets for virtual honeypots reach the Honeyd host. 3.1 Receiving Network Data Honeyd is designed to reply to network packets whose destination IP address belongs to one of the Figure 1: Honeyd receives traffic for its virtual honey- simulated honeypots. For Honeyd, to receive the pots via a router or Proxy ARP. For each honeypot, correct packets, the network needs to be configured Honeyd can simulate the network stack behavior of a appropriately. There are several ways to do this, different operating system. e.g., we can create special routes for the virtual IP addresses that point to the Honeyd host, or we can 3 Design and Implementation use Proxy ARP [3], or we can use network tunnels. Let A be the IP address of our router and B the In this section, we present Honeyd, a lightweight IP address of the Honeyd host. In the simplest case, framework for creating virtual honeypots.