Buffer Overflows Public Enemy Number 1

Software Security Buffer Overflows public enemy number 1 Erik Poll Digital Security Radboud University Nijmegen The good news C is a small language that is close to the hardware • you can produce highly efficient code • compiled code runs on raw hardware with minimal infrastructure C is typically the programming language of choice • for highly efficient code • for embedded systems (which have limited capabilities) • for system software (operating systems, device drivers,...) 2 The bad news : using C(++) is dangerous 3 Essence of the problem Suppose in a C program we have an array of length 4 char buffer[4]; What happens if we execute the statement below ? buffer[4] = ‘a’; This is UNDEFINED! ANYTHING can happen ! If the data written (ie. the “a”) is user input that can be controlled by an attacker, this vulnerability can be exploited: anything that the attacker wants can happen. 4 Solution to this problem • Check array bounds at runtime – Algol 60 proposed this back in 1960! • Unfortunately, C and C++ have not adopted this solution, for efficiency reasons. (Perl, Python, Java, C#, and even Visual Basic have) • As a result, buffer overflows have been the no 1 security problem in software ever since 5 Problems caused by buffer overflows • The first Internet worm, and all subsequent ones (CodeRed, Blaster, ...), exploited buffer overflows • Buffer overflows cause in the order of 50% of all security alerts – Eg check out CERT, cve.mitre.org, or bugtraq • Trends – Attacks are getting cleverer • defeating ever more clever countermeasures – Attacks are getting easier to do, by script kiddies 6 Any C(++) code acting on untrusted input is at risk • code taking input over untrusted network – eg. sendmail, web browser, wireless network driver,... • code taking input from untrusted user on multi-user system, – esp. services running with high privileges (as ROOT on Unix/Linux, as SYSTEM on Windows) • code acting on untrusted files – that have been downloaded or emailed • also embedded software - eg. in devices with (wireless) network connections such as mobile phones, RFID card, airplane navigation systems, ... 7 How does buffer overflow work? Memory management in C/C++ • A program is responsible for its memory management • Memory management is very error-prone – Who here has had a C(++) program crash with a segmentation fault? Technical term: C and C++ do not offer memory-safety • Typical bugs: – Writing past the bound of an array – Pointer trouble • missing initialisation, bad pointer arithmetic, use after de- allocation (use after free), double de-allocation, failed allocation, forgotten de-allocation (memory leaks)... • For efficiency, these bugs are not detected at run time: – behaviour of a buggy program is undefined 9 Process memory layout High Arguments/ Environment Stack grows addresses down, Stack by procedure calls Unused Memory Heap (dynamic data) Heap grows up, eg. by malloc Static Data .data and new Low addresses Program Code .text 10 Stack overflow The stack consists of Activation Records: x AR main() return address AR f() buf[4..7] buf[0..3] void f(int x) { Stack grows Buffer grows char[8] buf; downwards upwards gets(buf); } void main() { f(…); … } void format_hard_disk(){…} 11 Stack overflow What if gets() reads more than 8 bytes ? : x AR main() return address AR f() buf[4..7] buf[0..3] void f(int x) { Buffer grows char[8] buf; upwards gets(buf); } void main() { f(…); … } void format_hard_disk(){…} 12 Stack overflow What if gets() reads more than 8 bytes ? x AR main() return address AR f() buf[4..7] buf[0..3] void f(int x) { Buffer grows char[8] buf; upwards gets(buf); } void main() { f(…); … } void format_hard_disk(){…} 13 Stack overflow What if gets() reads more than 8 bytes ? x AR main() return address AR f() buf[4..7] buf[0..3] void f(int x) { Stack grows Buffer grows char[8] buf; downwards upwards gets(buf); } never use void main() { gets()! f(…); … } void format_hard_disk(){…} 14 Stack overflow • How the attacks works: overflowing buffers to corrupt data • Lots of details to get right: – No nulls in (character-)strings – Filling in the correct return address: • Fake return address must be precisely positioned • Attacker might not know the address of his own string – Other overwritten data must not be used before return from function – … • Variant: Heap overflow of a buffer allocated on the heap instead of the stack 15 What to attack? Fun with the stack void f(char* buf1, char* buf2, bool b1) { int i; bool b2; void (*fp)(int); char[] buf3; .... } Overflowing stack-allocated buffer buf3 to • corrupt return address – ideally, let this return address point to another buffers where the attack code is placed • corrupt function pointers, such as fp • corrupt any other data on the stack, eg. b2, i, b1, buf2,.. ... What to attack? Fun on the heap struct account { int number; bool isSuperUser; char name[20]; int balance; } overrun name to corrupt the values of other fields in the struct What causes buffer overflows? Example: gets char buf[20]; gets(buf); // read user input until // first EoL or EoF character • Never use gets • Use fgets(buf, size, stdin) instead 19 Example: strcpy char dest[20]; strcpy(dest, src); // copies string src to dest • strcpy assumes dest is long enough , and assumes src is null-terminated • Use strncpy(dest, src, size) instead 20 Spot the defect! (1) char buf[20]; char prefix[] = ”http://”; ... strcpy(buf, prefix); // copies the string prefix to buf strncat(buf, path, sizeof(buf)); // concatenates path to the string buf 21 Spot the defect! (1) char buf[20]; char prefix[] = ”http://”; ... strcpy(buf, prefix); // copies the string prefix to buf strncat(buf, path, sizeof(buf)); // concatenates path to the string buf strncat’s 3rd parameter is number of chars to copy, not the buffer size Another common mistake is giving sizeof(path) as 3rd argument... 22 Spot the defect! (2) char src[9]; char dest[9]; char* base_url = ”www.ru.nl”; strncpy(src, base_url, 9); // copies base_url to src strcpy(dest, src); // copies src to dest 23 Spot the defect! (2) base_url is 10 chars long, incl. its char src[9]; null terminator, so src will not be char dest[9]; null-terminated char* base_url = ”www.ru.nl”; strncpy(src, base_url, 9); // copies base_url to src strcpy(dest, src); // copies src to dest 24 Spot the defect! (2) base_url is 10 chars long, incl. its char src[9]; null terminator, so src will not be char dest[9]; null-terminated char* base_url = ”www.ru.nl”; strncpy(src, base_url, 9); // copies base_url to src strcpy(dest, src); // copies src to dest so strcpy will overrun the buffer dest 25 Example: strcpy and strncpy • Don’t replace strcpy(dest, src) by strncpy(dest, src, sizeof(dest)) but by strncpy(dest, src, sizeof(dest)-1) dst[sizeof(dest)-1] = `\0`; if dest should be null-terminated! • Btw: a strongly typed programming language could of course enforce that strings are always null-terminated... 26 Spot the defect! (3) char *buf; int i, len; read(fd, &len, sizeof(int)); // read sizeof(int) bytes, ie. an int, // and store these in len buf = malloc(len); read(fd,buf,len); // read len bytes into buf 27 Spot the defect! (3) char *buf; len might become negative int i, len; read(fd, &len, sizeof(int)); // read sizeof(int) bytes, ie. an int, // and store these in len buf = malloc(len); read(fd,buf,len); // read len bytes into buf len cast to unsigned, so negative length overflows read then goes beyond the end of buf 28 Spot the defect! (3) char *buf; int i, len; read(fd, &len, sizeof(len)); if (len < 0) {error ("negative length"); return; } buf = malloc(len); read(fd,buf,len); Remaining problem may be that buf is not null-terminated 29 Spot the defect! (3) char *buf; May result in integer overflow; int i, len; we should check that len+1 is positive read(fd, &len, sizeof(len)); if (len < 0) {error ("negative length"); return; } buf = malloc(len+1); read(fd,buf,len); buf[len] = '\0'; // null terminate buf 30 Spot the defect! (5) #define MAX_BUF 256 void BadCode (char* input) { short len; char buf[MAX_BUF]; len = strlen(input); if (len < MAX_BUF) strcpy(buf,input); } 31 Spot the defect! (5) #define MAX_BUF 256 What if input is longer than 32K ? void BadCode (char* input) { short len; len will be a negative number, char buf[MAX_BUF]; due to integer overflow len = strlen(input); hence: potential buffer overflow if (len < MAX_BUF) strcpy(buf,input); } The integer overflow is the root problem, but the (heap) buffer overflow that this enables make it exploitable 32 Spot the defect! (4) #ifdef UNICODE #define _sntprintf _snwprintf #define TCHAR wchar_t #else #define _sntprintf _snprintf #define TCHAR char #endif TCHAR buff[MAX_SIZE]; _sntprintf(buff, sizeof(buff), ”%s\n”, input); [slide from presentation by Jon Pincus] 33 Spot the defect! (4) #ifdef UNICODE #define _sntprintf _snwprintf #define TCHAR wchar_t #else #define _sntprintf _snprintf #define TCHAR char nd #endif _sntprintf’s 2 param is # of chars in buffer, not # of bytes TCHAR buff[MAX_SIZE]; _sntprintf(buff, sizeof(buff), ”%s\n”, input); The CodeRed worm exploited such an mismatch: code written under the assumption that 1 char was 1 byte allowed buffer overflows after the move from ASCI to Unicode [slide from presentation by Jon Pincus] 34 Spot the defect! (6) bool CopyStructs(InputFile* f, long count) { structs = new Structs[count]; for (long i = 0; i < count; i++) { if !(ReadFromFile(f,&structs[i]))) break; } } 35 Spot the defect! (6) bool CopyStructs(InputFile* f, long count) { structs = new Structs[count]; for (long i = 0; i < count; i++) { if !(ReadFromFile(f,&structs[i]))) break; } } effectively does a malloc(count*sizeof(type)) which may cause integer overflow And this integer overflow can lead to a (heap) buffer overflow.

Buffer Overflows Public Enemy Number 1

Tesi Final Marco Oliverio.Pdf

An In-Depth Study with All Rust Cves

Ensuring the Spatial and Temporal Memory Safety of C at Runtime

Improving Memory Management Security for C and C++

Buffer Overflows Buffer Overflows

Memory Vulnerability Diagnosis for Binary Program

A Buffer Overflow Study

Buffer Overflow and Format String Overflow Vulnerabilities 3

Chapter 10 Buffer Overflow Buffer Overflow

Memory Management

Table of Contents

Diehard: Probabilistic Memory Safety for Unsafe Languages