Introduction to Computer Systems 15-213/18-243, Spring 2009

Today Explicit free lists Dynamic Memory Allocation: Segregated free lists Advanced Concepts Garbage collection Memory-related perils and pitfalls CSci 2021: Machine Architecture and Organization Lecture #32, April 13th, 2015 Your instructor: Stephen McCamant Based on slides originally by: Randy Bryant, Dave O’Hallaron, Antonia Zhai 1 2 Keeping Track of Free Blocks Explicit Free Lists Method 1: Implicit free list using length—links all blocks Allocated (as before) Free Size a Size a 5 4 6 2 Next Payload and Prev padding Method 2: Explicit free list among the free blocks using pointers Size a Size a 5 4 6 2 Method 3: Segregated free list Maintain list(s) of free blocks, not all blocks . Different free lists for different size classes . The “next” free block could be anywhere . So we need to store forward/back pointers, not just sizes Method 4: Blocks sorted by size . Still need boundary tags for coalescing . Can use a balanced tree (e.g. Red-Black tree) with pointers within each free block, and the length used as a key . Luckily we track only free blocks, so we can use payload area 3 4 Explicit Free Lists Allocating From Explicit Free Lists conceptual graphic Logically: Before A B C Physically: blocks can be in any order After (with splitting) Forward (next) links A B 4 4 4 4 6 6 4 4 4 4 C Back (prev) links = malloc(…) 5 6 1 Freeing With Explicit Free Lists Freeing With a LIFO Policy (Case 1) conceptual graphic Insertion policy: Where in the free list do you put a newly freed block? Before free( ) . LIFO (last-in-first-out) policy . Insert freed block at the beginning of the free list . Pro: simple and constant time Root . Con: studies suggest fragmentation is worse than address-ordered . Address-ordered policy Insert the freed block at the root of the list . Insert freed blocks so that free list blocks are always in address After order: addr(prev) < addr(curr) < addr(next) . Con: requires search Root . Pro: studies suggest fragmentation is lower than LIFO 7 8 Freeing With a LIFO Policy (Case 2) Freeing With a LIFO Policy (Case 3) conceptual graphic conceptual graphic Before free( ) Before free( ) Root Root Splice out predecessor block, coalesce both memory blocks, Splice out successor block, coalesce both memory blocks and and insert the new block at the root of the list insert the new block at the root of the list After After Root Root 9 10 Freeing With a LIFO Policy (Case 4) Explicit List Summary conceptual graphic Before free( ) Comparison to implicit list: . Allocate is linear time in number of free blocks instead of all blocks . Much faster when most of the memory is full Root . Slightly more complicated allocate and free since needs to splice blocks in and out of the list . Some extra space for the links (2 words needed for each block) . Does this increase internal fragmentation? Splice out predecessor and successor blocks, coalesce all 3 memory blocks and insert the new block at the root of the list Most common use of linked lists is in conjunction with After segregated free lists . Keep multiple linked lists of different size classes, or possibly for different types of objects Root 11 12 2 Keeping Track of Free Blocks Today Method 1: Implicit list using length—links all blocks Explicit free lists Segregated free lists 5 4 6 2 Garbage collection Memory-related perils and pitfalls Method 2: Explicit list among the free blocks using pointers 5 4 6 2 Method 3: Segregated free list . Different free lists for different size classes Method 4: Blocks sorted by size . Can use a balanced tree (e.g. Red-Black tree) with pointers within each free block, and the length used as a key 13 14 Segregated List (Seglist) Allocators Seglist Allocator Each size class of blocks has its own free list Given an array of free lists, each one for some size class 1-2 To allocate a block of size n: . Search appropriate free list for block of size m > n 3 . If an appropriate block is found: . Split block and place fragment on appropriate list (maybe) 4 . If no block is found, try next larger class . Repeat until block is found 5-8 If no block is found: 9-inf . Request additional heap memory from OS (using sbrk()) . Allocate block of n bytes from this new memory Often have separate classes for each small size . Place remainder as a single free block in largest size class. For larger sizes: One class for each power-of-two size 15 16 Seglist Allocator (cont.) More Info on Allocators To free a block: D. Knuth, “The Art of Computer Programming”, 2nd edition, . Coalesce and place on appropriate list (optional) Addison Wesley, 1973 . The classic reference on dynamic storage allocation Advantages of seglist allocators . Higher throughput Wilson et al, “Dynamic Storage Allocation: A Survey and . log time for power-of-two size classes Critical Review”, Proc. 1995 Int’l Workshop on Memory . Better memory utilization Management, Kinross, Scotland, Sept, 1995. First-fit search of segregated free list approximates a best-fit . Comprehensive survey search of entire heap. Available from CS:APP student site (csapp.cs.cmu.edu) . Extreme case: Giving each block its own size class is equivalent to best-fit. 17 18 3 Today Implicit Memory Management: Garbage Collection Explicit free lists Garbage collection: automatic reclamation of heap-allocated Segregated free lists storage—application never has to free Garbage collection void foo() { Memory-related perils and pitfalls int *p = malloc(128); return; /* p block is now garbage */ } Common in functional languages, scripting languages, and modern object oriented languages: . Lisp, ML, Java, Perl, Python, Mathematica, etc. Variants (“conservative” garbage collectors) exist for C and C++ . However, cannot necessarily collect all garbage 19 20 Garbage Collection Classical GC Algorithms Mark-and-sweep collection (McCarthy, 1960) How does the memory manager know when memory can be freed? . Does not move blocks (unless you also “compact”) . In general we cannot know what is going to be used in the future since it Reference counting (Collins, 1960) depends on conditionals . Does not move blocks (not discussed) . But we can tell that certain blocks cannot be used if there are no Copying collection (Minsky, 1963) pointers to them . Moves blocks (not discussed) Must make certain assumptions about pointers Generational Collectors (Lieberman and Hewitt, 1983) . Memory manager can distinguish pointers from non-pointers . Collection based on lifetimes . All pointers point to the start of a block . Most allocations become garbage very soon . Cannot hide pointers . So focus reclamation work on zones of memory recently allocated (e.g., by coercing them to an int, and then back again) For more information: Jones and Lin, “Garbage Collection: Algorithms for Automatic Dynamic Memory”, John Wiley & Sons, 1996. 21 22 Memory as a Graph Mark and Sweep Collecting We view memory as a directed graph Can build on top of malloc/free package . Each block is a node in the graph . Allocate using malloc until you “run out of space” . Each pointer is an edge in the graph When out of space: . Locations not in the heap that contain pointers into the heap are called root nodes (e.g. registers, locations on the stack, global variables) . Use extra mark bit in the head of each block . Mark: Start at roots and set mark bit on each reachable block Root nodes . Sweep: Scan all blocks and free blocks that are not marked root Note: arrows Heap nodes reachable here denote Before mark memory refs, not Not-reachable free list ptrs. (garbage) After mark Mark bit set A node (block) is reachable if there is a path from any root to that node. free free Non-reachable nodes are garbage (cannot be needed by the application) After sweep 23 24 4 Assumptions For a Simple Implementation Mark and Sweep (cont.) Application Mark using depth-first traversal of the memory graph . new(n): returns pointer to new block with all locations cleared ptr mark(ptr p) { if (!is_ptr(p)) return; // do nothing if not pointer . read(b,i): read location i of block b into register if (markBitSet(p)) return; // check if already marked . write(b,i,v): write v into location i of block b setMarkBit(p); // set the mark bit for (i=0; i < length(p); i++) // call mark on all words mark(p[i]); // in the block return; Each block will have a header word } . addressed as b[-1], for a block b . Used for different purposes in different collectors Sweep using lengths to find next block ptr sweep(ptr p, ptr end) { while (p < end) { Instructions used by the Garbage Collector if markBitSet(p) . is_ptr(p): determines whether p is a pointer clearMarkBit(); . length(b): returns the length of block b, not including the header else if (allocateBitSet(p)) free(p); . get_roots(): returns all the roots p += length(p) + 1; } 25 26 Conservative Mark & Sweep in C Today A “conservative garbage collector” for C programs Explicit free lists . is_ptr() determines if a word is a pointer by checking if it points to an allocated block of memory Segregated free lists . But, in C pointers can point to the middle of a block Garbage collection ptr Header Memory-related perils and pitfalls So how to find the beginning of the block? . Can use a balanced binary tree to keep track of all allocated blocks (key is start-of-block) . Balanced-tree pointers can be stored in header (use two additional words) Head Data Size Left: smaller addresses Right: larger addresses Left Right 27 28 Memory-Related Perils and Pitfalls C operators Operators Associativity Dereferencing bad pointers () [] -> .

Introduction to Computer Systems 15-213/18-243, Spring 2009

Tesi Final Marco Oliverio.Pdf

An In-Depth Study with All Rust Cves

Ensuring the Spatial and Temporal Memory Safety of C at Runtime

Improving Memory Management Security for C and C++

Buffer Overflows Buffer Overflows

Memory Vulnerability Diagnosis for Binary Program

A Buffer Overflow Study

Buffer Overflow and Format String Overflow Vulnerabilities 3

Chapter 10 Buffer Overflow Buffer Overflow

Memory Management

Table of Contents

Diehard: Probabilistic Memory Safety for Unsafe Languages