Diversity-Based Defense of Windows Systems*

Diversity-Based Defense of Windows Systems? Lixin Li1, James E. Just1, and R. Sekar2 Abstract. The prevalence of identical vulnerabilities across software monocultures has emerged as the biggest challenge for protecting the Internet from large-scale attacks. Arti¯cially introduced software diversity provides a promising defense against this threat, since it can po- tentially eliminate common-mode vulnerabilities across these systems. Several diversity-based defenses have since been proposed for Linux, but to the best of our knowledge, no such techniques have been described for the largest monoculture on the Internet, namely, the Wintel plat- form. This is largely because of particular challenges posed by Microsoft Windows: (a) it is a proprietary system whose source code isn't read- ily available, and cannot be modi¯ed to incorporate diversity features, (b) it is signi¯cantly more complex than Linux, and (c) its architecture incorporates several features that complicate diversity-based defenses, such as the lack of UNIX-style shared libraries, and storage of process and thread control data within the address space of a process. In this paper, we present a solution that overcomes these challenges to support address-space randomization of Windows. We present an experimental evaluation of our technique that demonstrates its e®ectiveness against a wide range of attacks. In addition, we provide analytical results that estimate the probability of successful attacks. 1 Introduction Software monocultures represent one of the greatest Internet threats, since they enable construction of attacks that can succeed against a large fraction of the hosts on the Internet. Automated introduction of software diversity has been suggested as a method to address this challenge. In addition to providing a defense against attacks due to worms and botnets, automated diversity generation is a necessary building block for construction of practical intrusion-tolerant systems, i.e., systems that use multiple instances of COTS software/hardware to ward o® attacks, and continue to provide their critical services. Such systems cannot be built without diversity, since all constituent copies will otherwise share common vulnerabilities, and hence can all be brought down using a single attack; and they can't be built economically without arti¯cial diversity techniques, since manual developmental of diversity can be prohibitively expensive. An obvious approach for automated introduction of diversity is that of a random (yet systematic) software transformation. Such a transformation needs to preserve the functional behavior of the software as expected by its program- mer, but break the behavioral assumptions made by attackers. If formal behav- ? This work was partially funded by Defense Advanced Research Project Agency under con- tract FA8750-04-C-0244. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the o±cial policies, either expressed or implied, of the Defense Advanced Research Project Agency or the U.S. Gov- ernment. 1 Global Infotek, Reston, VA. 2 Stony Brook University, Stony Brook, NY. 1 ior speci¯cations of the software were available, one could use it as a basis to identify transformations that ensure conformance with these speci¯cations. How- ever, in practice, such speci¯cations aren't available. An alternative is to focus on transformations that preserve the semantics of the underlying programming language. Unfortunately, the semantics of the C-programming language, which has been used to develop the vast majority of security-sensitive software in use today, imposes tight constraints on implementation, leaving only a few sources for diversity introduction: { Randomization of memory locations where program objects (code or data) are stored [20, 3, 26, 4]. Such randomization can defeat pointer corruption attacks, since the attacker no longer knows the \correct" value to be used in corruption. It may also defeat overow attacks, since an attacker is no longer able to predict the object that will be overwritten. { Randomization of the representation used for code [2, 14]. This randomization defeats injected code attacks, since the attacker no longer knows the representation used for valid code. Fortunately, these randomization techniques seem adequate to handle the most popular attacks today, which rely on memory corruption and/or code injection. Over 75% of the US-CERT advisories in recent years, and almost every known worm on the Internet, have been based on such attacks. The availability of hardware/software support for enforcing non-executability of data (e.g., the NX feature of Win XP SP2 and above), which defeats all injected code attacks, has obviated the need for instruction set randomization to some extent. Address space randomization, on the other hand, protects against several other classes of attacks that aren't addressed by NX, e.g., existing code attacks (also called return-to-libc attacks), and attacks on security critical data. The importance of data attacks was demonstrated in [5], where the authors showed that it was relatively easy to exploit memory corruption attacks to alter security sensitive data to achieve administrator or user-level access on target system. 1.1 Bene¯ts of Automated Diversity In general, automated diversity provides probabilistic (rather than determinis- tic) protection against attacks. As such, researchers have developed attacks on address space randomization [24] as well as instruction set randomization [25] that can succeed after several tens of thousands of attempts. Nevertheless, we believe that automated diversity is very valuable for protecting systems: { Only the most determined attackers will succeed in their e®ort, while others are likely to give up after several unsuccessful attempts { Even against the most determined adversary, the technique buys valuable time: rather than having to deal with attacks that succeed in tens of milliseconds, attacks take several minutes or more, which gives ample time for responding to attacks. Such responses may include: ² ¯ltering out the source(s) of attacks by recon¯guring ¯rewalls 2 ² synthesizing and deploying a signature to block out attack-bearing requests after witnessing the ¯rst few. Several such automatic signature generation techniques have been developed recently [17, 16, 27, 7, 19]. { On an Internet-scale, rapidly spreading worms such as hit-list worms are con- sidered to pose the greatest challenge, as they can propagate through the In- ternet within a fraction of a second, before today's worm defense technologies can respond. Diversity-based defenses can slow down the propagation sub- stantially, since each infection step would take of the order of minutes rather than milliseconds, thus giving the time needed for the defensive technologies to respond. In addition to time delays, note that the need for repetition of attacks makes them very noisy, and hence easier to be spotted by worm-defense (or other defensive) technologies. { In an intrusion tolerant system consisting of k copies of a vulnerable server, the likelihood of simultaneous compromise of all copies decreases exponentially with k. If the probability of successful attack on a single server instance is 10¡4, this probability reduces to the order of 10¡12 with 3 copies of the server. 1.2 Contributions of this paper Prompted by the above bene¯ts of automated diversity, several researchers have developed and implemented techniques for address-space randomization on UNIX- systems. However, the true potential of automated diversity in protecting against Internet-wide threats won't be realized unless randomization solutions can be developed for the Windows operating system, which accounts for over 90% of the computers on the Internet. To the best of our knowledge, this is the ¯rst paper to present techniques for address-space randomization on Windows. This is not just an implementation exercise, as the architecture of Windows is quite di®erent from UNIX, and poses several unique challenges that necessitate the develop- ment of new techniques for realizing randomization. Some of these challenges are: { Lack of UNIX-style shared libraries. In UNIX, dynamically loaded libraries contain position-independent code, which means that they can be shared across multiple processes even if they are loaded at di®erent virtual memory addresses for each process. In contrast, Windows DLLs are not position- independent. Hence, all programs that use a DLL need to load it at the same address in their virtual memory, or else, no sharing is possible. Since lack of sharing can seriously impair performance, we needed to develop techniques that can randomize locations of libraries without duplicating the code. { Di±culty of relocating critical DLLs. Security-critical DLLs such as ntdll and Kernel32 are mapped to a ¯xed memory location by Windows very early in the boot process. These libraries are used by every Windows application, and hence get mapped into this ¯xed location determined by Windows. Since most of the APIs targeted by attack code, including all of the system calls, reside in these DLLs, we needed to develop techniques to relocate these DLLs. 3 { Storage of process-control data within user space. Unlike UNIX, which keeps all process control data within the kernel, Windows stores a lot of process control data in user space in structures such as Process Environment Block (PEB) and Thread Environment Block (TEB). These structures

Load more