
LeakSurvivor: Towards Safely Tolerating Memory Leaks for Garbage-Collected Languages

Yan Tang, Qi Gao, and Feng Qin
The Ohio State University
{tangya, gaoq, qin}@cse.ohio-state.edu

USENIX ’08: 2008 USENIX Annual Technical Conference, USENIX Association

Abstract

Continuous memory leaks severely hurt program performance and software availability for garbage-collected programs. This paper presents a safe method, called LeakSurvivor, to tolerate continuous memory leaks at runtime for garbage-collected programs. Our main idea is to periodically swap out the “Potentially Leaked” (PL) memory objects identified by leak detectors from the virtual memory to disks. As a result, the virtual memory space occupied by the PL objects can be reclaimed by garbage collectors and made available for future use. If a swapped-out PL object is accessed later, LeakSurvivor will restore it from disks to the memory for correct program execution. Furthermore, LeakSurvivor helps developers to prune false positives.

We have built the prototype of LeakSurvivor on top of Jikes RVM 2.4.2, a high performance Java-in-Java virtual machine developed by IBM. We conduct the experiments with three Java applications: Eclipse, SPECjbb2000 and Jigsaw. Among them, Eclipse and Jigsaw contain memory leaks introduced by their developers, while SPECjbb2000 contains a memory leak injected by us. Our results show that LeakSurvivor effectively tolerates memory leaks for two applications (Eclipse and SPECjbb2000), i.e., there is no cumulative performance degradation and there are no software failures when facing continuous memory leaks at runtime. For Jigsaw, LeakSurvivor extends the program lifetime by two times and improves the performance by 46% compared with native runs. Furthermore, when there are no memory leaks, LeakSurvivor imposes small runtime overhead, i.e., 2.5% over the leak detector and 23.7% over the native runs.

1 Introduction

Garbage-collected languages such as Java and C# are becoming increasingly popular. This is partly because programs written in these languages are free of many types of memory errors, which are notorious for compromising system availability and security [42]. For example, by forbidding explicit pointers, Java programs do not encounter memory corruptions due to incorrect pointer arithmetic. With tremendous advances in hardware and Just-In-Time (JIT) compiler techniques, garbage-collected languages are now applied even in enterprise server environments [3, 7].

Unfortunately, memory leaks still exist and severely affect performance and availability for garbage-collected programs, even though garbage collectors can reclaim unreachable memory objects [23]. Memory leaks occur in a garbage-collected program when the program keeps references to memory objects that are no longer needed. As a result, the heap space occupied by leaked memory objects cannot be reclaimed by garbage collectors. For long-running garbage-collected programs, continuously leaked memory objects take up more and more space in the heap, leading to more frequent invocation of garbage collections (GC) at runtime. Additionally, more leaked objects in the heap space increase the object traversal time during each GC phase. Therefore, continuous memory leaks cumulatively degrade overall program performance [12, 27]. Even worse, memory leaks may exhaust system resources, eventually causing program crashes [18, 27].

The performance degradation problem due to memory leaks cannot be completely solved by the paging mechanism used in OS memory management or by infinitely increasing the heap size (e.g., in 64-bit machines). The OS paging mechanism swaps out physical memory to disks when physical memory is under pressure, which leaves the virtual memory space unreclaimed. Therefore, heap pressure (i.e., little available heap space) in the virtual memory due to memory leaks remains the same, leading to cumulative performance degradation. Alternatively, infinitely increasing the heap size, e.g., in 64-bit machines, reduces the heap pressure and thus eliminates the increasing GC frequency. However, once the working set of garbage collectors, i.e., the whole heap, becomes too large to fit into the available physical memory, the program performance will be drastically degraded due to increased memory paging during each GC phase [44].

Therefore, it is imperative to devise mechanisms for tolerating continuous memory leaks at runtime for garbage-collected programs. The mechanisms must enable programs to maintain relatively stable performance and survive software failures caused by continuous memory leaks.

There are only a few studies [13, 35, 24, 34] on tolerating memory leaks. They are either unsafe or unable to prevent performance degradation and out-of-memory errors. Cyclic memory allocation [34] tolerates memory leaks by restricting each allocation site to a fixed number of live objects. While this method may work for applications that have predictable memory requirements, it is in general unsafe since the live objects could be overwritten. Melt [13] and Plug [35] isolate leaked objects from non-leaked objects and rely on the OS paging for releasing the physical memory occupied by leaked objects. While they are effective in alleviating the physical memory shortage caused by memory leaks, these methods cannot reduce heap usage in the virtual memory and will eventually trigger out-of-memory errors. Goldstein et al. proposed to dump large unused objects to disks [24]. However, their approach does not handle small leaked objects, which can insidiously cause performance and availability problems. Furthermore, it lacks an automatic mechanism to detect leaked objects.

Our Contributions. In this paper, we propose a safe method for tolerating continuous memory leaks for garbage-collected programs at runtime. It can avoid cumulative performance degradation and software failures due to continuous memory leaks. In addition, it helps developers to prune false positives.

Our idea is to periodically move the “Potentially Leaked” (PL) memory objects identified by leak detectors from the virtual memory to disks. As a result, the heap space occupied by PL objects can be reclaimed by garbage collectors and thus is available for future use. If a swapped-out PL object is accessed later, LeakSurvivor will restore it from disks to the memory for correct program execution.

Based on the above idea, we build a tool called LeakSurvivor in Jikes RVM [7], a high performance Java-in-Java virtual machine developed by IBM, to tolerate memory leaks at runtime. We evaluate LeakSurvivor with three applications: Eclipse, a popular integrated development environment; SPECjbb2000, a simulator of an order-processing server system; and Jigsaw, a web server. Among them, Eclipse and Jigsaw contain memory leaks introduced by their developers, while SPECjbb2000 contains a memory leak injected by us. Unlike previous approaches, LeakSurvivor possesses the following advantages:

• LeakSurvivor can safely and effectively tolerate memory leaks for garbage-collected programs at runtime. Our evaluation shows that it can tolerate leaks for two of the three applications (Eclipse and SPECjbb2000). For Jigsaw, LeakSurvivor extends its lifetime by two times and improves performance by 46% compared with native runs.

• LeakSurvivor incurs low overhead for programs during normal execution without memory leaks. This is because it only adds one extra check during the GC phase. Our evaluation with DaCapo benchmarks shows that LeakSurvivor incurs small runtime overhead, i.e., 2.5% over the leak detector and 23.7% over the native runs, when there are no leaks.

• LeakSurvivor requires no modification of the applications’ source code. Therefore, it can be easily applied to legacy garbage-collected programs.

• LeakSurvivor provides false positive information on memory leaks, if any, to developers for pruning purposes. Once a PL object is restored to the memory, LeakSurvivor marks it as a false positive and passes this information to the leak detectors.

2 Main Idea of LeakSurvivor

The main idea of LeakSurvivor is to periodically swap out “Potentially Leaked” (PL) objects from the virtual memory to disks so that garbage collectors can reclaim the heap space occupied by these objects. If all leaked memory objects are identified as PL objects and swapped out, the heap pressure due to memory leaks is eliminated. As a result, LeakSurvivor can avoid cumulative program performance degradation and future program crashes due to continuous memory leaks. If some swapped-out PL objects are accessed during subsequent program execution, i.e., they were falsely identified as leaked, LeakSurvivor swaps them back from disks to memory so that the programs can continue execution correctly. In this way, LeakSurvivor guarantees the safety of leak tolerance. Additionally, LeakSurvivor helps developers to prune false positives.

Figure 1 shows the process by which LeakSurvivor tolerates memory leaks at runtime and guarantees the correctness of subsequent program execution. During program execution, LeakSurvivor periodically checks whether there are PL objects marked by the integrated leak detectors. If so, it swaps out the PL objects by copying the object contents from memory to the leak space (see Section 3.2.1), which is mainly on disks. Additionally, LeakSurvivor modifies all the PL objects’ incoming references, i.e., references from other objects to the PL objects, and outgoing references, i.e., references from the PL objects to other objects (see Section 3.2). After this, the virtual memory space occupied by the PL objects can be safely reclaimed by garbage collectors and is available for future use. Figure 1(a)(b) shows the

Figure 1: LeakSurvivor: The main idea (PL means “Potentially Leaked”). Panels: (a) Mark a PL object; (b) Swap out a PL object; (c) Access a PL object; (d) Swap in a PL object. Each panel depicts the virtual memory, the physical memory, and the leak space/disk.
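The swap-out and swap-in steps described in Section 2 can be illustrated with a toy Java model. This is only a sketch under stated assumptions, not the Jikes RVM implementation: the names LeakSpaceSketch, Ref, swapOut and read are hypothetical, an in-memory map stands in for the on-disk leak space, and a shared handle (Ref) replaces the rewriting of incoming references. Outgoing references are not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of swap-out / swap-in. All names are hypothetical; a HashMap
// stands in for the on-disk leak space used by the real system.
class LeakSpaceSketch {
    /** Handle shared by all holders of an object (stands in for incoming references). */
    static final class Ref {
        String payload;         // non-null while the object is resident in the heap
        int leakSlot = -1;      // >= 0 while the object is swapped out
        boolean falsePositive;  // set once a swapped-out object is accessed again
    }

    private final Map<Integer, String> leakSpace = new HashMap<>();
    private int nextSlot = 0;

    /** Swap out a PL object: copy its contents to the leak space, drop the heap copy. */
    void swapOut(Ref ref) {
        if (ref.payload == null) {
            return;                       // nothing resident to swap out
        }
        leakSpace.put(nextSlot, ref.payload);
        ref.leakSlot = nextSlot++;
        ref.payload = null;               // heap copy is now unreachable and reclaimable
    }

    /** Access an object, restoring it from the leak space if it was swapped out. */
    String read(Ref ref) {
        if (ref.payload == null && ref.leakSlot >= 0) {
            ref.payload = leakSpace.remove(ref.leakSlot);
            ref.leakSlot = -1;
            ref.falsePositive = true;     // the leak detector misjudged this object
        }
        return ref.payload;
    }

    int swappedOutCount() {
        return leakSpace.size();
    }

    public static void main(String[] args) {
        LeakSpaceSketch space = new LeakSpaceSketch();
        Ref r = new Ref();
        r.payload = "session-42";
        space.swapOut(r);                             // steps (a)-(b) in Figure 1
        System.out.println(space.swappedOutCount());
        System.out.println(space.read(r));            // steps (c)-(d) in Figure 1
        System.out.println(r.falsePositive);
    }
}
```

The handle indirection keeps the sketch short: every holder sees the swapped-out state through the same Ref, whereas the paper describes modifying the incoming and outgoing references themselves (Section 3.2).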
