Refactoring for Reentrancy
Jan Wloka
Dept. of Computer Science, Rutgers University
Piscataway, NJ 08854
[email protected]

Manu Sridharan
IBM T.J. Watson Research Center
P. O. Box 704, Yorktown Heights, NY 10598
[email protected]

Frank Tip
IBM T.J. Watson Research Center
P. O. Box 704, Yorktown Heights, NY 10598
[email protected]

ABSTRACT

A program is reentrant if distinct executions of that program on distinct inputs cannot affect each other. Reentrant programs have the desirable property that they can be deployed on parallel machines without additional concurrency control. Many existing Java programs are not reentrant because they rely on mutable global state. We present a mostly-automated refactoring that makes such programs reentrant by replacing global state with thread-local state and performing each execution in a fresh thread. The approach has the key advantage of yielding a program that is obviously safe for parallel execution; the program can then be optimized selectively for better performance. We implemented this refactoring in Reentrancer, a practical Eclipse-based tool. Reentrancer successfully eliminated observed reentrancy problems in five single-threaded Java benchmarks. For three of the benchmarks, Reentrancer enabled speedups on a multicore machine without any further code modification.

Categories and Subject Descriptors

D.2.3 [Software Engineering]: Coding Tools and Techniques; D.2.6 [Software Engineering]: Programming Environments

General Terms

Performance, Reliability

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ESEC-FSE'09, August 23-28, 2009, Amsterdam, The Netherlands.
Copyright 2009 ACM 978-1-60558-001-2/09/08 ...$10.00.

1. INTRODUCTION

This paper presents a refactoring tool that makes single-threaded programs reentrant. A program is reentrant^1 if distinct executions of the program on distinct inputs cannot affect each other, whether run sequentially or concurrently. (Definition 1 in Section 2 provides a more precise definition of the term.) As multicore machines become more pervasive, reentrancy is an increasingly desirable property: reentrant programs can be run in parallel on multicores without additional concurrency control or a separate virtual machine for each program instance. Furthermore, even in single-threaded settings, reentrancy is a desirable property because it ensures that successive executions of an application will behave consistently.

^1 Other definitions of the term "reentrant" can be found in the literature and will be discussed in Section 5.

A key obstacle to obtaining increased performance on multicores is the effort required to make legacy single-threaded programs reentrant. Many such programs were written without concurrency in mind and use global state in a non-reentrant manner. Manually introducing reentrancy can be labor-intensive since any use of mutable global state may break reentrancy, and hence such state must either be eliminated or enclosed in carefully-placed synchronization constructs.

We are aware of several approaches for making programs reentrant. First, global state can be encapsulated in one or more "environment objects", and additional method parameters can be introduced to make such objects accessible throughout the application. In our experience, this approach requires a prohibitive amount of intrusive code change. A second approach is to use isolation mechanisms such as separate OS processes or class loaders to ensure non-interference of parallel program executions. However, an important disadvantage of this approach is that it does not allow for selectively removing some isolation for performance (e.g., by sharing some state), and it may introduce high overhead.

This paper employs a third approach based on replacing global mutable state with thread-local state. In this approach, each execution thread gets a separate copy of each global variable, and any access to a global variable is implicitly indexed by the current thread. After the introduction of thread-local state, the programmer can ensure reentrancy by (manually) modifying code to use a fresh thread for each execution of the program; the threads can be run sequentially or concurrently.

The use of thread-local state for reentrancy has two key desirable properties. First, the approach is widely applicable, as many platforms provide built-in support for thread-local state, including Java, pthreads, and .NET. Second, if desired (e.g., for performance), a developer can undo the introduction of thread-local state for select global variables (e.g., by introducing carefully synchronized shared state) while leaving the rest as thread-local; this is not possible in isolation-based approaches. This leads to an approach of correctness before performance when moving software to multicores (also espoused elsewhere [6]): the refactoring produces an obviously reentrant program (i.e., one with no global mutable state in the application) which can then be optimized as needed. We believe this approach of correctness before performance when parallelizing code will lead to more robust code and greater programmer productivity.

The refactoring tool in which we implemented these ideas, Reentrancer, targets the Java programming language and deals with several additional issues that arise when handling real-world programs. One such issue is initialization semantics: Java's static initializers can have fairly intricate behavior that must be preserved when introducing thread-local state.^2 While our transformation does not preserve static initializer semantics in all cases, we have formulated preconditions that ensure that the initializers in an input program are safe for our refactoring; our tool emits warnings when it is able to detect violations of these preconditions. Another challenge is handling library code, which cannot be refactored and allows for accessing possibly-mutable data in the environment (user input, files, etc.); our tool also warns about common library usage that may break reentrancy.

Reentrancer was implemented as an extension to the Eclipse Java development tools (JDT)^3 and evaluated by using it to make five Java applications reentrant. Our evaluation showed that Reentrancer can successfully refactor realistic programs. For three of the refactored benchmarks, we measured small speedups when the tests associated with the application were executed in parallel on a multicore machine, with no additional performance tuning.

In summary, the contributions of this paper are as follows:

1. a characterization of the causes of non-reentrancy in Java programs,

2. a mostly-automated refactoring for making Java programs reentrant by replacing global state with thread-local state, with associated precondition checking that warns of possible behavioral changes due to fragile static initializer behavior and non-reentrant library usage,

3. an implementation of a refactoring tool, Reentrancer, as an extension to the Eclipse JDT, and

4. an evaluation of Reentrancer on a set of Java applications, demonstrating its practicability.

2. MOTIVATING EXAMPLE

This section presents a motivating example that illustrates our refactoring for making programs reentrant. The refactoring is comprised of the following 5 steps:

1.

5. Transforming the application to use a fresh thread for each execution.

Steps 2-4 are automatic, while steps 1 and 5 must be performed by the developer. In the remainder of this section, we will explain each of these steps in detail using an example program.

Definitions. Before discussing our example, we define the key concepts of reentrancy and immutability as used in this paper.

Definition 1 (Reentrancy). Let an execution of a program P be any external invocation of P, e.g., running P's main method or invoking a public API method from P. A program P is reentrant iff for any two executions e_i and e_j of P such that e_i and e_j have no mutable shared inputs (see Definition 2), the results of e_i and e_j are unaffected by how the executions are ordered, including parallel interleavings.

The goal of our refactoring is to make sequential programs reentrant when each execution is run in a fresh thread (sequentially or concurrently). Note that the refactoring may not preserve program semantics if certain preconditions do not hold (see Section 3.2).

Definition 2 (Immutability). We consider an object o to be immutable iff all of o's transitively reachable abstract state (reached by following instance field pointers) cannot be mutated (i.e., it is deeply immutable [20]). A static field is immutable iff (1) it is declared final and (2) it only points to an immutable object o.

This notion of immutability is often a requirement for safely sharing state between threads (further discussion in [23]).

2.1 Example Program

Figure 1 shows a small (contrived) Java program that we shall use as our example. In this program, class Configuration represents a configuration as a set of key-value pairs, which are stored in a HashMap. A static initializer is used to initialize the configuration with a pair <"version","2.0"> (see lines 5-7). Additional entries can be added using the setOption() method and retrieved using the getOption() method (see lines 8-10 and 11-13). Class CheckUserFiles defines a method checkDir() that performs checks on some directory homedir by setting a Configuration option and calling runAllChecks(). The checkDir() method throws an exception if the homedir option has already been set (see line 18). Class Tests defines two JUnit tests, testFoo() and testBar() that invoke the checkDir() method with arguments "foo" and "bar", respectively.

The program in Figure 1 is not reentrant: when testFoo() and testBar() are executed consecutively, testBar() fails
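
As a rough illustration of the example program and of the thread-local approach described in the introduction, the following Java sketch contrasts a configuration held in mutable global state with one held in thread-local state. Figure 1 is not reproduced in this text, so the class bodies are assumptions reconstructed from the prose. The names ReentrancySketch, GlobalConfiguration, ThreadLocalConfiguration, checkDirGlobal, checkDirThreadLocal, and succeedsInFreshThread are hypothetical, the call to runAllChecks() is omitted, and this is not the code that Reentrancer generates.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; names and bodies are assumptions, not the paper's Figure 1.
class ReentrancySketch {

    // Before refactoring: one map shared by every execution in the JVM.
    static class GlobalConfiguration {
        private static final Map<String, String> options = new HashMap<>();
        static { options.put("version", "2.0"); }

        static void setOption(String key, String value) { options.put(key, value); }
        static String getOption(String key) { return options.get(key); }
    }

    // After refactoring (assumed shape): each thread lazily gets its own map.
    static class ThreadLocalConfiguration {
        private static final ThreadLocal<Map<String, String>> options =
            ThreadLocal.withInitial(() -> {
                Map<String, String> fresh = new HashMap<>();
                fresh.put("version", "2.0"); // plays the role of the static initializer
                return fresh;
            });

        static void setOption(String key, String value) { options.get().put(key, value); }
        static String getOption(String key) { return options.get().get(key); }
    }

    // Mirrors the described checkDir(): fails if "homedir" was already set.
    static void checkDirGlobal(String homedir) {
        if (GlobalConfiguration.getOption("homedir") != null) {
            throw new IllegalStateException("homedir already set");
        }
        GlobalConfiguration.setOption("homedir", homedir);
    }

    static void checkDirThreadLocal(String homedir) {
        if (ThreadLocalConfiguration.getOption("homedir") != null) {
            throw new IllegalStateException("homedir already set");
        }
        ThreadLocalConfiguration.setOption("homedir", homedir);
    }

    // Runs one execution in a fresh thread; returns true if it threw no exception.
    static boolean succeedsInFreshThread(Runnable execution) {
        final boolean[] ok = {false};
        Thread t = new Thread(() -> {
            try {
                execution.run();
                ok[0] = true;
            } catch (RuntimeException e) {
                // The execution failed; ok[0] stays false.
            }
        });
        t.start();
        try {
            t.join(); // join() makes the worker's write to ok[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ok[0];
    }

    public static void main(String[] args) {
        // Global state: the second execution sees the first one's "homedir".
        System.out.println(succeedsInFreshThread(() -> checkDirGlobal("foo")));      // true
        System.out.println(succeedsInFreshThread(() -> checkDirGlobal("bar")));      // false
        // Thread-local state: each fresh thread starts from a clean configuration.
        System.out.println(succeedsInFreshThread(() -> checkDirThreadLocal("foo"))); // true
        System.out.println(succeedsInFreshThread(() -> checkDirThreadLocal("bar"))); // true
    }
}
```

Using a fresh thread per execution matters in this sketch: if both thread-local calls ran on the same thread, the second would still observe the first call's homedir entry.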