
CSE 127 Computer Security Stefan Savage, Spring 2019, Lecture 6 Low Level Software Security IV: Heap Corruption Memory management in C ▪ The C programming language uses explicit memory management – Data is allocated and deallocated dynamically from the heap by the program – Dynamic memory is accessed via pointers – The programmer is responsible for the details – The system does not track memory liveness (no GC) – The system does not ensure that pointers are live or valid (no smart pointers) ▪ C++ has the same issues Heap Corruption high address ▪ Additional dynamically-allocated data is stored on the “heap” ▪ Heap manager provides ability to allocate stack and deallocate memory buffers – malloc() and free() – Every block of memory that is allocated by mem mapped malloc() has to be released by a corresponding call to free() – As an attacker, what is your reaction when you see heap “has to be”? bss data text low address Heap Corruption ▪ Heap-based buffer overflows l Missing Checks l Flawed Checks l Integer Overflow l Etc. ▪ Pointer types, like other types in C, can also have casting and arithmetic problems. l Be very careful when manipulating pointers. l Keep in mind that pointer arithmetic is scaled by the size of the pointed-to type Heap Corruption ▪ What if the attacker is able to compromise data on the heap? ▪ What can be overwritten? – Program data stored on the heap – Heap metadata (i.e., for organizing the heap itself) ▪ What if the attacker can cause the program to use malloc() and free() in unexpected combinations? – Remember, the input frequently controls which path your program takes, and the attacker controls the input Heap Corruption via overwriting program data ▪ As is the case with stack buffer overflows, effect depends on variable semantics and usage. ▪ Generally anything that influences future execution path is a promising target. ▪ Typical problem cases: – Variables that store result of a security check ▪ Eg. isAutheticated, isValid, isAdmin, etc. – Variables used in security checks ▪ Eg. buffer_size, etc. – Data pointers ▪ Potential for further memory corruption – Function pointers (as good as a return address) ▪ Direct transfer of control when function is called through overwritten pointer ▪ vTables Aside: vTables ▪ How do virtual function calls class Base { public: virtual void foo() work in Object-Oriented {cout << "Hi\n";} }; languages? class Derived: public Base { public: void foo() ▪ What’s the abstraction? {cout << "Bye\n";} }; ▪ How is it implemented in void bar(Base* obj) reality? { obj->foo(); } – When obj->foo() is called from int main(int argc, char* argv[]) within bar(), how is control { transferred to the correct Base *b = new Base(); Derived *d = new Derived(); implementation of foo()? bar(b); bar(d); } Aside: vTables ▪ Each object contains a pointer to class Base its virtual function table (aka { public: virtual void foo() vtable) {cout << "Hi\n";} }; ▪ Vtable is an array of function class Derived: public Base pointers { public: void foo() – One entry per virtual function of the {cout << "Bye\n";} }; object’s class void bar(Base* obj) ▪ Based on the class and function, { obj->foo(); } the compiler knows which offset in which vtable to use int main(int argc, char* argv[]) – Interfaces and multiple inheritance { complicates this quite a bit, but the Base *b = new Base(); issue of relevance to us remains the Derived *d = new Derived(); same: in an OO language the heap will be full of function pointers bar(b); bar(d); } Recall: Weird Machines ▪ Complex systems almost always contain unintended functionality – “weird machines” ▪ An exploit is a mechanism by which an attacker triggers unintended functionality in the system – Programming of the weird machine ▪ Security requires understanding not just the intended, but also the unintended functionality present in the implementation – Developers’ blind spot – Attackers’ strength https://en.wikipedia.org/wiki/Weird_machine#/media/File:Weird_machine.png Heap Corruption via corrupting heap metadata ▪ An attacker can also try to attack the heap’s own implementation ▪ Overflow into heap data structures ▪ Get the victim to invoke malloc() and free() in unintended sequences or with invalid arguments. – Programming the heap weird machine – What heap behaviors are undefined? (and hence might trigger unintended functionality) ▪ Using a pointer to an object after you’ve free it (use after free) ▪ Freeing an object multiple times (double free) ▪ To understand the implications, we need to dive deeper into how the heap is managed Heap Basics ▪ Abstraction vs Reality – malloc() and free() (also new and delete) ▪ Abstraction – Dynamically allocate and release memory buffers as needed. Magic! ▪ ptr=malloc(20): Give me 20 bytes. ▪ free(ptr): I don’t need my 20 bytes any more. Send them back. ▪ Reality – Where does this memory come from? – How does “the system” know how much memory to reclaim when free() is called? – Details vary by heap implementation, but all follow a common pattern Heap Management ▪ The heap is managed by the aptly-named heap manager ▪ There are many different heap managers, even within a single system – Some are optimized for allocations of a certain size, for speed, for space efficiency, etc. – We will focus on part of Doug Lea’s heap, aka glibc dlmalloc Heap Management (glibc dlmalloc) ▪ The heap manager maintains contiguous “chunks” of available memory ▪ The heap layout evolves when malloc() and free() functions are called – Chunks may get allocated, freed, split, or coalesced. ▪ These chunks are stored in doubly linked lists (called bins) – Grouped (roughly) by chunk size ▪ When a new chunk becomes free, it is inserted into one of these lists. ▪ When a chunk gets allocated, it is removed from the list. Heap Management (glibc dlmalloc) ▪ Chunks – Basic units of memory on the heap chunk – Either Free or In Use ptr – User data + heap metadata ▪ Size: chunk size size|N|M|P ▪ Flags: chunk – N, M: not relevant for us data data – P: previous chunk is in use ▪ Note that the last word of the data block is the first word of the next chunk chunk In Use Chunk ▪ Malloc finds an appropriate size|N|M|P sized free chunk and returns a in use pointer to the start of the data data block ptr chunk size|N|M|P ▪ Adjacent chunks can be either “in use” or “free” in use data chunk size|N|M|P in use data chunk Free Chunk ▪ Free chunks are kept in doubly- size|N|M|P linked lists (bins) – Some of the unused data area of in use data each free chunk is used to store chunk forward and back pointers size|N|M|P ▪ Consecutive free chunks are fd coalesced during free() bk free – No two free chunks can be adjacent unused chunk to each other prev size size|N|M|P ▪ Additionally, the last word of in use unused data (first word of next chunk) contains a copy of the size data chunk of the free chunk. Free List ▪ Free chunks are kept in circular doubly-linked lists (bins) data first size|N|M|P last fd bk bin head unused prev…size data size|N|M|P fd bk unused prev…size data size|N|M|P fd bk unused prev size Free List ▪ Chunks are inserted into the list when they are freed. data first size|N|M|P last fd ▪ Chunks are removed from the bk bin head unused list if they get allocated, or if prev size they need to be combined with … data a newly-freed adjacent chunk size|N|M|P fd bk unused prev…size data size|N|M|P fd bk unused prev size Free List ▪ Unlink operation to remove a chunk from the free list: data first size|N|M|P last fd #define unlink(P, BK, FD) bk bin head unused { prev size FD = P->fd; … data BK = P->bk; size|N|M|P fd FD->bk = BK; bk unused BK->fd = FD; prev size } … data size|N|M|P fd bk unused prev size Free List ▪ Unlink operation to remove a chunk from the free list: data first size|N|M|P last fd #define unlink(P, BK, FD) bk bin head unused { prev size FD = P->fd; … data BK = P->bk; size|N|M|P fd BK FD->bk = BK; bk unused BK->fd = FD; prev size } FD … data size|N|M|P fd bk unused prev size Heap Corruption ▪ What can we do if we manage to get free() to act on data we control? – What if we can cause the heap manager to act on fake chunks? – For example: due to a buffer overflow on the heap, the heap metadata is overwritten… Heap Corruption ▪ Look at the unlink macro again #define unlink(P, BK, FD) { ▪ What can an attacker do if she FD = P->fd; manages to control what’s in BK = P->bk; the fd and bk fields of a free FD->bk = BK; chunk? BK->fd = FD; ▪ Write an attacker chosen value } to an attacker-chosen address – Write-what-where primitive Heap Corruption ▪ How can attacker corrupt the metadata in a free chunk? – Simple overflow – Indirect overwrite – Use after free – Double free Use-After-Free and Double Free ▪ Use-after-free ▪ free(p); p→foo(); ▪ free(p); q = malloc(n); memcpy(p, buf, k); ▪ Double free ▪ free(p); free(p); q = malloc(n); r = malloc(n); ▪ free(p); q = malloc(n); free(p); Use After Free ▪ Take a look at the this code char *p, *q; – Ignore the missing checks for result of malloc() p = malloc(20); ▪ This is how NULL pointer errors snprintf(p, 20, "Hi"); happen printf("%s\n", p); ▪ What will it print? free(p); ▪ What if p referenced a q = malloc(20); structure with function snprintf(q, 20, "Bye"); pointers? – Or a C++ object? printf("%s\n", p); free(q); Use After Free ▪ Note that in a multi-threaded char *p; application, the second malloc() may not be obvious.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages35 Page
-
File Size-