CMSC 330: Organization of Programming Languages Memory


Memory Management and Garbage Collection

Memory Attributes
Memory used to store data in programming languages has the following lifecycle:
• Allocation – when the memory is allocated to the program
• Lifetime – how long the allocated memory is used by the program
• Recovery – when the system recovers the memory for reuse
The allocator is the system feature that performs allocation and recovery.

Memory Classes
Most programming languages are concerned with some subset of the following four memory classes:
1. Static (or fixed) memory
2. Automatic memory
3. Dynamically allocated memory
4. Persistent memory

Static memory – usually at a fixed address
• Lifetime – the execution of the program
• Allocation – for the entire execution
• Recovery – by the system when the program terminates
• Allocator – the compiler

Automatic memory – usually on a stack
• Lifetime – the activation of the method using that data
• Allocation – when the method is invoked
• Recovery – when the method terminates
• Allocator – typically the compiler, sometimes the programmer

Dynamic memory – addresses allocated on demand in an area called the heap
• Lifetime – as long as the memory is needed
• Allocation – explicitly by the programmer, or implicitly by the compiler
• Recovery – either by the programmer or automatically (when possible; depends upon the language)
• Allocator – manages free/available space in the heap

Persistent memory – usually the file system
• Lifetime – multiple executions of a program (e.g., files or databases)
• Allocation – by program or user, often outside of program execution
• Recovery – when the data is no longer needed
• Dealing with persistent memory → databases (covered in CMSC 424)
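Three of these classes map directly onto C; a small sketch (all names are illustrative, not from the slides):

```c
#include <stdlib.h>

int global_counter = 0;  /* static memory: allocated by the compiler,
                            lives for the entire execution */

/* `local` is automatic memory: created on the stack when demo() is invoked,
 * recovered when it returns. The malloc'd block is dynamic memory: it lives
 * on the heap until explicitly freed. */
int demo(void) {
    int local = 42;                            /* automatic */
    int *heap_int = malloc(sizeof *heap_int);  /* dynamic */
    if (heap_int == NULL)
        return -1;                             /* allocation can fail */
    *heap_int = local + global_counter;
    int result = *heap_int;
    free(heap_int);                            /* manual recovery */
    return result;
}
```

Persistent memory has no direct language mechanism here; in C it is reached through file and database APIs.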
Memory Management in C
Local variables live on the stack
• Allocated at function invocation time
• Deallocated when the function returns
• Storage space is reused after the function returns
Space on the heap is allocated with malloc()
• Must be explicitly freed with free()
• Called explicit or manual memory management
• Deletions must be done by the programmer
• E.g., the memory allocator maintains a free list of space on the heap that is available

Memory Management Errors
These errors have security implications!
May forget to free memory (memory leak):
  { int *x = (int *) malloc(sizeof(int)); }
May use freed memory (dangling pointer):
  { int *x = (int *) malloc(sizeof(int));
    free(x);
    *x = 5; /* oops! */ }
May try to free something twice:
  { int *x = (int *) malloc(sizeof(int));
    free(x);
    free(x); }
• This may corrupt the memory management data structures

Ways to Avoid Mistakes in C
Don't allocate memory on the heap
• Could lead to confusing code, or may not work at all
Never free memory
• The OS will reclaim the process's memory at exit anyway
• Memory is cheap; who cares about a little leak?
But: both of the above may be impractical
Can avoid all three problems by using automatic memory management
• Though it does not prevent all leaks, as we will see

Fragmentation
Another memory management problem. Example sequence of calls:
  allocate(a);
  allocate(x);
  allocate(y);
  free(a);
  allocate(z);
  free(y);
  allocate(b);
⇒ Not enough contiguous space for b

Automatic Memory Management: Strategy
Primary goal: automatically reclaim dynamic memory
• Secondary goal: also avoid fragmentation
At any point during execution, we can divide the objects in the heap into two classes:
• Live objects will be used later
• Dead objects will never be used again: they are garbage
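A short illustration of an object going dead (hypothetical code; error checking omitted): once the only pointer to the first block is overwritten, that block can never be accessed again, so it is garbage. Under manual management it is also a leak; a GC would be free to reclaim it.

```c
#include <stdlib.h>

int dead_object_demo(void) {
    int *p = malloc(sizeof *p);  /* block A: reachable through p */
    *p = 10;
    p = malloc(sizeof *p);       /* block A is now unreachable: dead, i.e., garbage.
                                    With malloc/free this is a memory leak. */
    *p = 20;
    int result = *p;
    free(p);                     /* frees the second block; block A stays leaked */
    return result;
}
```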
Insight: you can do reclamation and avoid fragmentation if you can identify every pointer in a program
• You can move the allocated storage, then redirect pointers to it
  - Compact it, to avoid fragmentation
• The compiler ensures perfect knowledge of pointers in Lisp, OCaml, Java, and Prolog (with caveats), but not in C, C++, Pascal, or Ada
Thus we need garbage collection (GC) algorithms that can:
1. Distinguish live from dead objects
2. Reclaim the dead objects and retain the live ones

Determining Liveness
In most languages we can't know for sure which objects are really live or dead
• Undecidable, like solving the halting problem
Thus we need to make a safe approximation
• It is OK if we decide something is live when it's not
• But we'd better not deallocate an object that will be used later on

Liveness by Reachability
An object is reachable if it can be accessed by dereferencing ("chasing") pointers from live data
Safe policy: delete unreachable objects
• An unreachable object can never be accessed again by the program
  - The object is definitely garbage
• A reachable object may be accessed in the future
  - The object could be garbage but will be retained anyway
  - This could lead to memory leaks

Roots
At a given program point, we define liveness as being the data reachable from the root set:
• Global variables (what are these in Java? Ruby? OCaml?)
• Local variables of all live method activations (i.e., the stack)
• At the machine level, also consider the register set (registers usually store local or global variables)
Next: techniques for determining reachability

Reference Counting
Idea: each object has a count of the number of pointers to it from the roots or other objects
• When the count reaches 0, the object is unreachable
• The count-tracking code may be manual or automatic
• In regular use: C++ (smart pointer library), Cocoa (manual reference counting), Python
• The method doesn't address the fragmentation problem
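Manual reference counting can be sketched in C as follows (a minimal, hypothetical `RcInt` type; real systems such as CPython keep the count in every object header and update it automatically):

```c
#include <stdlib.h>

/* A heap object carrying its own reference count. */
typedef struct {
    int refcount;
    int value;
} RcInt;

RcInt *rc_new(int value) {
    RcInt *obj = malloc(sizeof *obj);
    obj->refcount = 1;          /* one pointer refers to it: the one we return */
    obj->value = value;
    return obj;
}

void rc_retain(RcInt *obj) {
    obj->refcount++;            /* a new pointer now refers to obj */
}

/* Returns 1 if the object was reclaimed, 0 if it is still live. */
int rc_release(RcInt *obj) {
    if (--obj->refcount == 0) { /* count reached 0: unreachable, reclaim now */
        free(obj);
        return 1;
    }
    return 0;
}
```

A fuller implementation would also release the pointers held inside a reclaimed object, triggering cascading decrements, and would still be unable to reclaim cycles.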
Reference counting was invented by Collins in 1960: "A method for overlapping and erasure of lists," Communications of the ACM, December 1960.

Reference Counting Example
[Figure: a sequence of heap diagrams. Each object is labeled with its reference count; as stack pointers and object fields are updated, the counts change, and an object whose count drops to 0 is reclaimed.]

Reference Counting Tradeoffs
Advantage
• Incremental technique
  - Generally a small, constant amount of work per memory write
  - With more effort, can even bound the running time
Disadvantages
• Cascading decrements can be expensive
• Requires extra storage for reference counts
• Needs some other means to collect cycles, for which counts never go to 0

Tracing Garbage Collection
Idea: determine reachability as needed, rather than by stored counts
Every so often, stop the world and
• Follow pointers from live objects (starting at the roots) to expand the live object set
  - Repeat until there are no more reachable objects
• Deallocate any non-reachable objects
Two main variants of tracing GC
• Mark/sweep (McCarthy 1960) and stop-and-copy (Cheney 1970)

Mark and Sweep GC
Two phases
• Mark phase: trace the heap and mark all reachable objects
• Sweep phase: go through the entire heap and reclaim all unmarked objects

Mark and Sweep Example
[Figure: a sequence of heap diagrams. The mark phase follows pointers from the stack and marks each reachable object; the sweep phase then reclaims every unmarked object.]
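The two phases can be sketched over a tiny fixed pool of objects, each with a single pointer field (a simplification: a real collector discovers roots from the stack, globals, and registers, and objects may hold many pointers):

```c
#include <stddef.h>

#define POOL_SIZE 8

/* Each object has one outgoing pointer, a mark bit, and an allocated bit. */
typedef struct Obj {
    struct Obj *next;   /* the object's single pointer field */
    int marked;
    int allocated;
} Obj;

static Obj pool[POOL_SIZE];

/* Mark phase: chase pointers from a root, marking every reachable object.
 * The !marked check also stops the walk on cycles. */
static void mark(Obj *obj) {
    while (obj != NULL && !obj->marked) {
        obj->marked = 1;
        obj = obj->next;
    }
}

/* Sweep phase: walk the whole pool; reclaim unmarked objects and clear the
 * mark bits of survivors. Returns the number of objects reclaimed. */
static int sweep(void) {
    int reclaimed = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].allocated && !pool[i].marked) {
            pool[i].allocated = 0;    /* back on the free list */
            reclaimed++;
        }
        pool[i].marked = 0;           /* reset for the next collection */
    }
    return reclaimed;
}

int gc(Obj **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    return sweep();
}
```

Note that the sweep loop visits every slot, live or dead: its cost is proportional to the size of the heap, not to the amount of live data.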
Mark and Sweep Advantages
• No problem with cycles
• Memory writes have no cost
• Non-moving
  - Live objects stay where they are
  - Makes conservative GC possible (used when the identification of pointer vs. non-pointer is uncertain; more on this later)

Mark and Sweep Disadvantages
• Fragmentation
  - Available space is broken up into many small pieces
  - Thus many mark-and-sweep systems also have a compaction phase (like defragmenting your disk)
• Cost proportional to heap size
  - The sweep phase needs to traverse the whole heap; it touches dead memory to put it back on the free list

Copying GC
Like mark and sweep, but only touches live objects
• Divide the heap into two equal parts (semispaces)
• Only one semispace is active at a time
• At GC time, copy the live data and flip semispaces:
  1. Trace the live data starting from the roots
  2. Copy the live data into the other semispace
  3. Declare everything in the current semispace dead
  4. Flip to the other semispace

Copying GC Example
[Figure: a sequence of diagrams. Live objects ①, ②, and ③ are traced from the stack and copied from the active semispace into the other semispace; the roles of the semispaces are then flipped.]

Copying GC Tradeoffs
Advantages
• Only touches live data
• No fragmentation (automatically compacts)
  - Will probably increase locality
Disadvantages
• Requires (at most) twice the memory space

Stop the World: Potentially Long Pause
Both of the previous algorithms "stop the world" by prohibiting program execution during GC
• This ensures that previously processed memory is not changed or accessed, which would create inconsistency
But the execution pause could be too long
• Bad if your car's braking system performs GC while you are trying to stop at a busy intersection!
How can we reduce the pause time of GC?
• Don't collect the whole heap at once (incremental collection)

The Generational Principle
Empirically, young objects die quickly, while old objects keep living.
[Figure: a plot of the number of live objects against object lifetime, showing that most objects die young.]

Generational Collection
Idea: have more than one heap region, divided into generations
• Older generations are collected less often, since long-lived objects would otherwise be visited over and over
• Objects that survive many collections get promoted into older generations
• Need to track pointers from old generations to young ones, to use as roots for young-generation collection
  - These pointers are tracked in the remembered set
One popular setup: generational, copying GC

Conservative Garbage Collection (for C)
For C, we cannot be sure which elements of an object are pointers
• Because of incomplete type information, the use of unsafe casts, etc.

Memory Management in Ruby
Local variables live on the stack
• Storage is reclaimed when the method returns
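The problem on the conservative GC slide above is usually handled by treating any word that could be a heap address as if it were a pointer. A hypothetical sketch of that test:

```c
#include <stdint.h>

/* Hypothetical heap bounds, recorded by the allocator at startup. */
static uintptr_t heap_start;
static uintptr_t heap_end;

void set_heap_bounds(uintptr_t start, uintptr_t end) {
    heap_start = start;
    heap_end = end;
}

/* Conservative test: `word` *might* be a pointer if its value lies within
 * the heap. Non-pointer data that happens to look like a heap address is
 * treated as a pointer anyway, which may retain garbage but is safe. */
int might_be_pointer(uintptr_t word) {
    return word >= heap_start && word < heap_end;
}
```

Mistaking an integer for a pointer only over-retains memory, which is safe; missing a real pointer would be unsafe, which is why the test errs on the side of "pointer".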