Automatic Collection of Leaked Resources
Total Page:16
File Type:pdf, Size:1020Kb
IEICE TRANS. INF. & SYST., VOL.E96–D, NO.1 JANUARY 2013 28 PAPER Resco: Automatic Collection of Leaked Resources Ziying DAI†a), Student Member, Xiaoguang MAO†b), Nonmember, Yan LEI†, Student Member, Xiaomin WAN†, and Kerong BEN††, Nonmembers SUMMARY A garbage collector relieves programmers from manual rent approaches either strive to detect and fix these leaks memory management and improves productivity and program reliability. or provide new language features to simplify resource man- However, there are many other finite system resources that programmers agement [2], [3], [5]–[8]. Although these approaches can must manage by themselves, such as sockets and database connections. Growing resource leaks can lead to performance degradation and even achieve success to some extent, there are some escaped re- program crashes. This paper presents the automatic resource collection source leaks that manifest themselves after programs have approach called Resco (RESource COllector) to tolerate non-memory re- been deployed. We also note that similar to memory leaks, source leaks. Resco prevents performance degradation and crashes due to leaks of other resources are not necessarily a problem. A resource leaks by two steps. First, it utilizes monitors to count resource ff consumption and request resource collections independently of memory few or a small amount of leaked resources will neither a ect usage when resource limits are about to be violated. Second, it responds program behavior nor performance. Only when accumu- to a resource collection request by safely releasing leaked resources. We lated leaks exhaust all available resources or lead to large ex- implement Resco based on a Java Virtual Machine for Java programs. The tra computation overhead, a system failure occurs. Based on performance evaluation against standard benchmarks shows that Resco has these two observations, we propose the automatic resource a very low overhead, around 1% or 3%. Experiments on resource leak bugs show that Resco successfully prevents most of these programs from collection approach to enforce resource limits and tolerate crashing with little increase in execution time. resource leaks. Resource collections are triggered just when key words: resource leaks, resource collection, fault tolerance, monitoring there are so many leaked resources that the system is about to crash or its performance is about to degrade. Let us take 1. Introduction database connections as an example. We can collect leaked connections at the limit of the connection pool size to avoid Automatic garbage collection has gained considerable suc- performance degradation due to establishing extra physical cess in many mainstream programming languages, such as connections, and also at the limit of the maximum number of Java and C#. Garbage collector relieves programmers from concurrent connections to prevent database crashes. Since manual memory management and improves productivity opening physical connections is time-consuming, connec- and program reliability [1]. However, there are many other tions should be closed as soon as possible. In such cases, non-memory, finite system resources, such as file descrip- current solutions such as leak detection and fixing can ben- tors and database connections, whose management garbage efit and are complementary to ours. collector does not help with. In programs written in Java- This paper presents the Resco approach to automat- like languages, once acquired, a resource instance must be ically collect leaked non-memory resources in response released by explicitly calling a cleanup method. A resource to abnormal resource consumption. When the program leak is a software bug that occurs when the cleanup method approaches a resource limit, the resource collection pro- of the resource is not invoked after the last use of the re- cess is triggered. First, Resco identifies leaked resources. source. Growing resource leaks can hurt application per- Among un-released resources, unreachable ones are defi- formance and even result in system crashes due to resource nitely leaked. However, according to the leak definition exhaustion. above, there may be leaked resources that are still reach- Resource leaks are common in Java programs [2]. It able. Because determining whether an object is live (will is difficult for programmers to correctly release all re- be used later) or not is un-decidable in general, most re- sources along all possible exceptional paths. Implicit con- source leak detection approaches including garbage collec- trol flows introduced by exceptions also impose difficul- tors target unreachable resources [2], [3], [8]. Resco also ties on traditional testing and analysis techniques [3]. Cur- employs this approach to identify leaked resources as un- Manuscript received May 2, 2012. released and unreachable resources. Second, corresponding Manuscript revised October 4, 2012. cleanup methods such as close are invoked to safely re- †The authors are with School of Computer, National University lease these leaked resources. Resco is analogous to the fi- of Defense Technology, 410073, Changsha, China. †† nalization mechanism since both aim at reclaiming unreach- The author is with the Department of Computer Engineering, able resources. However, the execution of finalize meth- Naval University of Engineering, 430033, Wuhan, China. a) E-mail: [email protected] ods may be arbitrarily delayed in an indeterminate way [9], b) E-mail: [email protected] (Corresponding author) which makes it well known that finalization is unqualified DOI: 10.1587/transinf.E96.D.28 to perform resource collections. There are two main rea- Copyright c 2013 The Institute of Electronics, Information and Communication Engineers DAI et al.: RESCO: AUTOMATIC COLLECTION OF LEAKED RESOURCES 29 sons: (1) finalize methods are bound to garbage collec- and thus impose different limits on them. These limits can tor that may not run until the application is about to run not necessarily be mapped to low-level system resources. out of memory. However, the application may be about For example, database connections typically have two lim- to exhaust some non-memory resources or may suffer per- its: the size of the connection pool and the maximum num- formance degradation due to huge resource consumption ber of concurrently available connections. These two lim- while there is still a large amount of available memory; (2) its usually have small numbers and are set for performance Various finalization implementations do not always execute considerations. Second, the common resource we refer to finalize methods immediately when they are ready to be may not be the one on which the system directly imposes called [9]. Asynchronous finalization is a necessary fea- limits. Moreover, one limited resource may be required by ture for correct implementation, but the situation becomes several other resources and one resource may require sev- worse because of delayed invocations of ready finalize eral different limited resources. Let us take the file descrip- methods. Resco improves the situation based on two de- tor as an example. All examples in this paper are provided sign decisions. First, Resco separates non-memory re- in the context of Java except explicitly stated. We know source collections from memory collections. Resource col- that FileInputStream is a finite resource whose instances lections are triggered in response to abnormal non-memory should be closed after use. However, its background lim- resource consumption independently of memory usage. For ited resource is the file descriptor that the operating sys- the flexibility to enforce various limits and to handle new tem usually imposes a constraint on its maximum avail- application-specific resources, we instrument monitors into able number. One instance of FileInputStream occupies application code and system libraries to count the resource one file descriptor. Knowing just these is not enough, be- consumption and request resource collections if necessary. cause the file descriptor is also the background resource of Second, the separate thread to release leaked resources is FileOutputSteam, Socket,etc. given the privilege to run immediately after the liveness To facilitate discussions and to drive our Resco ap- analysis. We ensure the safety by releasing only leaked proach, we give several concepts below. The concrete re- resource instances that are not depended on by any object source introduced below is to denote resources on which with actions (e.g. close or finalize) to perform in fu- limits are directly imposed, and the concept of abstract re- ture and that do not reference any reachable instances of the source is to characterize resources at the programming level. collection-triggering resource or its wrappers. Resource-related concepts can also be found in [6], [23], but We implement Resco based on Jikes RVM [11] for Java we give our own ones here for the specific purpose of col- programs. Users provide Resco with resource releasing lecting leaked resources. specifications through a configuration file that includes re- Definition 1. (Concrete Resource) A concrete resource source types, resource acquiring and releasing methods, and is a hardware or software entity whose amount available to resource limits. Resco can handle new application-specific a program is limited. A concrete resource CR is formally resources that have a limited amount available to programs. specified as a pair N, L, where N is its unique identifier We evaluate Resco’s performance through standard