Beyond Stack Smashing: Recent Advances in Exploiting Buffer Overruns

Attacking Systems Beyond Stack Smashing: Recent Advances in Exploiting Buffer Overruns This article describes three powerful general-purpose families of exploits for buffer overruns: arc injection, pointer subterfuge, and heap smashing. These new techniques go beyond the traditional “stack smashing” attack and invalidate traditional assumptions about buffer overruns. JONATHAN ecurity vulnerabilities related to buffer overruns fact practical for PINCUS account for the largest share of CERT advi- real-world vulnera- Microsoft sories, as well as high-profile worms—from the bilities. The Apache/Open_SSL Slapper worm1 was the Research S original Internet Worm in 1987 through first high-profile worm to employ heap smashing. Blaster’s appearance in 2003. When malicious crackers The wide variety of exploits published for the stack BRANDON discover a vulnerability, they devise exploits that take ad- buffer overrun that Microsoft Security Bulletin MS03- BAKER vantage of the vulnerability to attack a system. 0262 addressed (the vulnerability exploited by Blaster) Microsoft The traditional approach to exploiting buffer over- further illustrates the viability of these new techniques. runs is stack smashing: modifying a return address saved Table 1 shows a few examples. on the stack (the region of memory used for parame- These new approaches typically first surface in non- ters, local variables, and return address) to point to code traditional hacker publications. (For more discussion of the attacker supplies that resides in a stack buffer at a this, including references to the origins of many of these known location. Discussions of buffer overrun ex- techniques, see the “Nontraditional literature” sidebar on ploitation in software engineering literature typically page 26.) A few brief summaries in the mainstream liter- concentrate on stack-smashing attacks. As a result, ature,3,4 and recent books (such as Exploiting Software: many software engineers and even security profession- How to Break Code5 and The Shellcoder’s Handbook: Discov- als seemingly assume that all buffer overrun exploits ering and Exploiting Security Holes6), also discuss some of operate in a similar manner. these techniques. During the past decade, however, hackers have de- Understanding new approaches to exploitation is vital veloped several additional approaches to exploit buffer for evaluating countermeasures’ effectiveness in address- overruns. The arc injection technique (sometimes re- ing the buffer overrun problem. Failure to consider them ferred to as return-into-libc) involves a control transfer leads people to suggest overly simplistic and worthless to code that already exists in the program’s memory “solutions” (for example, “move all the buffers to the space. Various pointer subterfuge techniques change the heap”) or to overstate the value of useful improvements program’s control flow by attacking function pointers (for instance, valuable mitigation methods such as a (pointer variables whose value is used as an address in a nonexecutable stack, which many mistakenly view as a function call) as an alternative to the saved return ad- silver bullet). dress, or modify arbitrary memory locations to create more complex exploits by subverting data pointers. Exploiting buffer overruns Heap smashing allows exploitation of buffer overruns A buffer overrun occurs when a program attempts to read in dynamically allocated memory, as opposed to on or write beyond the end of a bounded array (also known the stack. as a buffer). In runtime environments for languages such as While these techniques appear esoteric, they are in Pascal, Ada, Java, and C#, the runtime environment de- 20 PUBLISHED BY THE IEEE COMPUTER SOCIETY I 1540-7993/04/$20.00 © 2004 IEEE I IEEE SECURITY & PRIVACY Authorized licensed use limited to: IEEE Xplore. Downloaded on March 17, 2009 at 14:46 from IEEE Xplore. Restrictions apply. Attacking Systems Table 1. Different exploits of Microsoft Security Bulletin MS03-026. AUTHOR TECHNIQUE COMMENTS Last Stage of Delirium Unknown Original report; specific exploit was never published XFocus Stack smashing Apparently used by Blaster author Litchfield Pointer subterfuge Cigital Pointer subterfuge Unpublished K-otic Arc injection Authors of this article Pointer subterfuge and arc injection Unpublished tects buffer overruns and generates an exception. In the runtime environ- (a) ment for C and C++, however, no void f1a(void * arg, size_t len) { such checking is performed. Attack- char buff[100]; ers can often exploit the defect by memcpy(buff, arg, len); /* buffer overrun if len > 100 */ using the buffer overrun to change the /* ... */ program’s execution. Such exploits return; can in turn provide the infection vec- } tor for a worm or a targeted attack on a given machine. (b) A buffer overrun is characterized void f1b(void * arg, size_t len) { as a stack buffer overrun or heap buffer char * ptr = malloc(100); overrun depending on what memory if (ptr == NULL) return; gets overrun. C and C++ compilers memcpy(ptr, arg, len); /* buffer overrun if len > 100 */ typically use the stack for local vari- /* ... */ ables as well as parameters, frame return; pointers, and saved return addresses. } Heaps, in this context, refer to any dynamic memory implementations Figure 1. Code samples with traditional (simple) buffer overrun defects. (a) A stack such as the C standard library’s mal- buffer overrun and (b) a heap buffer overrun. loc/free, C++’s new/delete, or the Microsoft Windows APIs Heap- Alloc/HeapFree. Figure 1 provides examples of causes the buffer overrun, but this need not be the case. functions containing a stack buffer overrun (a) and heap All that is required is that the payload be at a known or buffer overrun (b). discoverable location in memory at the time that the un- Published general-purpose exploits for buffer over- expected control-flow transfer occurs. runs typically involve two steps: Stack smashing 1. Change the program’s flow of control. (Pure data ex- Stack smashing, a technique first described in detail in the ploits in which the buffer happens to be adjacent to a mid ‘90s by such hackers as AlephOne and DilDog, illus- security-critical variable operate without changing trates these two steps: changing the flow of control to ex- the program’s flow of control; these are relatively rare, ecute attacker-supplied code. Stack smashing relies on and to date no general-purpose techniques have been the fact that most C compilers store the saved return ad- published relating to them.) dress on the same stack used for local variables. 2. Execute some code (potentially supplied by the at- To exploit the buffer overrun in f1a via a classic tacker) that operates on some data (also potentially stack-smashing attack, the attacker must supply a value supplied by the attacker). for arg (for example, as received from a network packet) that contains both an executable payload and the address The term payload refers to the combination of code or at which the payload will be loaded into the program’s data that the attacker supplies to achieve a particular goal memory—that is, the address of buff. This value must (for example, propagating a worm). The attacker some- be positioned at a location within arg that corresponds times provides the payload as part of the operation that to where f1a’s return address is stored on the program’s www.computer.org/security/ I IEEE SECURITY & PRIVACY 21 Authorized licensed use limited to: IEEE Xplore. Downloaded on March 17, 2009 at 14:46 from IEEE Xplore. Restrictions apply. Attacking Systems stack when the buffer overrun occurs; depending on the lowed the attacker to run an arbitrary program. The specific compiler used to compile f1a, this will typically term “arc injection” refers to how these exploits oper- be somewhere around 108 bytes into arg. This new ate: the exploit just inserts a new arc (control-flow value for the saved return address changes the program’s transfer) into the program’s control-flow graph, as op- control flow: when the return instruction executes, con- posed to code injection-exploits such as stack smash- trol is transferred to buff instead of returning to the call- ing, which also insert a new node. ing procedure. A straightforward version of arc injection is to use a Stack smashing is well described in mainstream liter- stack buffer overrun to modify the saved return address ature,1 so we won’t discuss it in detail in this article. to point to a location already in the program’s address However, two important enhancements to the standard space—more specifically, to a location within the sys- stack-smashing technique are often used in conjunction tem function in the C standard library. The system with some of the other approaches, and therefore are function takes an arbitrary command line as an argu- worth describing. ment, checks the argument’s validity, loads it into a reg- Trampolining lets an attacker apply stack smashing in ister R, and makes a system call to create the process. situations in which buff’s absolute address is not known Eliding some details, pseudocode for the system ahead of time. The key insight here is that if a program function is: register R contains a value relative to buff, we can transfer control to buff by first transferring control to a se- void system(char * arg) { quence of instructions that indirectly transfers via R. check_validity(arg); When an attacker can find such a sequence of instructions R = arg; (known as the trampoline) at a well-known or predictable target: address, this provides a reliable mechanism for transfer- execl(R, ...) ring control to buff. Alternatively, a pointer to attacker- } supplied data can often be found somewhere on the stack, and so exploits often use sequences of instructions such as If an attacker can arrange for R to point to an attacker- pop/pop/ret to trampoline without involving regis- supplied string and then jump directly to the location ters.

Beyond Stack Smashing: Recent Advances in Exploiting Buffer Overruns

Declaring Pointers in Functions C

Using the GNU Compiler Collection (GCC)

Bash Shell Scripts

Introduction to Linux by Lars Eklund Based on Work by Marcus Lundberg

Refining Indirect-Call Targets with Multi-Layer Type Analysis

SI 413, Unit 3: Advanced Scheme

Object-Oriented Programming in C

Conda-Build Documentation Release 3.21.5+15.G174ed200.Dirty

Call Graph Generation for LLVM (Due at 11:59Pm 3/7)

Lambda Expressions

Metal C Programming Guide and Reference

Pointer Subterfuge Secure Coding in C and C++