Investigating Java Concurrency
Using Abstract State Machines

Yuri Gurevich¹, Wolfram Schulte¹, and Charles Wallace²

¹ Microsoft Research, Redmond, WA 98052-6399, USA
{gurevich,schulte}@microsoft.com
² University of Delaware, Newark, DE 19716, USA
Abstract. We present a mathematically precise, platform-independent model of Java concurrency using the Abstract State Machine method. We cover all aspects of Java threads and synchronization, gradually adding details to the model in a series of steps. We motivate and explain each concurrency feature, and point out subtleties, inconsistencies and ambiguities in the official, informal Java specification.
1 Introduction
The Java programming language [7, 13] provides sophisticated support for concurrency. The fundamental operations of concurrent programs are implemented as built-in features of the language. Many of the notoriously complicated details of concurrent programming are hidden from the programmer, simplifying the design of concurrent applications. Furthermore, a platform-neutral memory model is included as part of the language. The incorporation of such intricate, subtle operations into Java calls for a precise specification. As interest in the language's concurrency model increases, Java developers are examining and exploiting its details [15, 17, 16, 14]. The popularity of Java and, more importantly, its emphasis on cross-platform compatibility make the need for such a specification even stronger.
We present a model of the concurrent features of Java, using the formal operational specification method of Abstract State Machines (ASMs)¹ [9, 22, 23]. We use the Java Language Specification manual (JLS) [7] as our reference for the language. The JLS is an informal specification, and due to the ambiguity which pervades natural language, it can be interpreted in different ways. Our model gives an unambiguous specification which reflects our interpretation of the JLS. Throughout the paper, we indicate where ambiguities and omissions in the JLS give rise to other interpretations. The formal specification process also uncovers many subtle but important issues which the JLS does not bring to light. Our goal is a specification that is not only precise but accessible to its readers, even those not familiar with Java, concurrent programming, or ASMs. As part of this project, we implement the ASM specifications using the ASMGofer interpreter [20]. This implementation serves as a convenient tool for prototyping and testing.

¹ ASMs were formerly known as Evolving Algebras.
It is important to distinguish unintentional ambiguity from intentional underspecification. The authors of the JLS envision support for Java concurrency taking many different forms: "by having many hardware processors, by time-slicing a single hardware processor, or by time-slicing many hardware processors" [7]. The JLS leaves some details unspecified in order to give implementers of Java a certain amount of freedom: "the intent is to permit certain standard hardware and software techniques that can greatly improve the speed and efficiency of concurrent code" [7]. As is usual with the ASM methodology [8, 2], we design our ASMs to model Java concurrency at its natural level of abstraction, capturing the essence of the JLS without committing to any implementation decisions.
While most of the JLS specification is imperative in nature, some of the concurrency model is described in a declarative style: rather than explain how to execute a concurrent program, the JLS gives conditions that a correct execution must meet. This shift in presentation style is most likely due to the notion that only a declarative specification can be truly implementation-independent.

There are two ways for us to deal with the declarative portions of the JLS. The obvious and simple way is to reformulate them declaratively as conditions on the execution of our ASM. The other way is to implement them in the ASM itself. We choose to do the latter, whenever it is natural to do so, but in a way that does not sacrifice generality. Our model obeys the conditions established in the JLS, yet it is fully general in the sense that for any concurrent program execution that follows the rules of the JLS, there is a corresponding run of the ASM. We find our imperative approach helpful in creating a clear mental picture of the concurrency model, and in uncovering hidden assumptions, inconsistencies and ambiguities in the JLS.
There are several formalizations of Java concurrency in the literature. Börger and Schulte [3] also use ASMs, giving a semantic analysis of Java that exposes a hierarchy of natural sublanguages. It covers the language features for concurrency, but its focus is on the high-level language rather than the lower-level concurrency model, so it does not provide a full specification of Java concurrency. There are several works [1, 4, 5] using the Structural Operational Semantics (SOS) methodology [18]. Attali et al. [1] do not concentrate on the lower-level details of Java concurrency, while Cenciarelli et al. [4] give them in a declarative style, in keeping with the presentation in the JLS. As mentioned earlier, we believe that there are advantages to our imperative approach. Coscia and Reggio [5] explicitly deal with the details of the concurrency model and propose some changes to its design. Gontmakher and Schuster [6] present a non-imperative specification, with the purpose of comparing Java's memory behavior with other well-known notions of consistency. Our focus is different, however: while their work assumes a particular interpretation of the JLS and proceeds from there, our goal is first to come to a clear understanding of the concurrency model as presented in the JLS, and then to discuss its consequences. As we shall see, there are several issues within the JLS which make this initial understanding difficult.
We introduce in §2 the basic rules by which agents (threads) interact with regard to shared variables, and in §3 the special considerations for variables marked as volatile. In §4 we introduce locks, which are used to limit concurrency within a program. In §5 we discuss prescient store actions. In §6 we describe thread objects. In §7, we cover waiting and notification.

Our technical report [11] provides a complete presentation of our ASM specification. In it, we also prove that our specification is as general as possible while obeying all the correctness criteria of the JLS.
Acknowledgment. This work was partially supported by NSF grant CCR-95-04375.
2 Threads and variables
In a single-threaded computation, a single agent executes the instructions of a program, one at a time. In contrast, a run of a Java program may be multithreaded, (intuitively) involving multiple agents that execute instructions concurrently. Each agent executes sequentially, the sequence of its actions forming a thread of execution. In the parlance of distributed computing, the term "thread" is used to refer not only to an agent's computation but also to the agent itself. Threads may execute on a single processor or multiple processors. The atomic actions of different threads may be interleaved; they may even be concurrent in the case of multiple processors.

Since different threads may access and update common data, the order of their actions affects the results of an execution. The JLS establishes some conditions on the interaction of threads with respect to shared data, but intentionally leaves some freedom to implementers of the language. Consequently, the results of executing a multithreaded program may vary between different Java platforms, and even between different executions on the same platform.
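This variability is easy to observe in ordinary Java. The following sketch (the class and iteration counts are our own illustration, not from the JLS) races two threads on an unsynchronized counter; lost updates make the final count vary from run to run:

```java
/** Two threads increment a shared, unsynchronized counter. Each thread may
 *  work on a cached working copy of the variable, so increments can be
 *  lost and the final value varies across runs and platforms. */
class RacyCounter {
    static int counter; // deliberately neither synchronized nor volatile

    /** Runs two racing incrementer threads and returns the final count. */
    static int run() {
        counter = 0;
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) counter++; // unsynchronized read-modify-write
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return counter;
    }

    public static void main(String[] args) {
        // Any value from 2 up to 200000 is a legal outcome; 200000 is not guaranteed.
        System.out.println(run());
    }
}
```

Every outcome in that range is permitted by the rules discussed below; the program constrains the result without determining it.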
A data value is either an instance of a primitive type, such as int or boolean, or an object, a dynamically created value. A value resides at a location either in the working memory of a thread or in the main memory. Certain memory locations serve as manifestations of a variable. The master copy of a variable is the unique location in the main memory that manifests the variable. A thread's working copy of a variable is the unique location in the thread's working memory that manifests the variable. A thread accesses or updates only its working copies.

The JLS speaks of the main memory as an agent which communicates with threads and updates variable master values. While this view is adequate for describing a system with a centralized main memory, it does not capture the possibility of a distributed main memory [16], where updates of different variable master copies may occur concurrently. We find it convenient to imagine that each variable has a master agent, which accesses and updates the variable's master copy. Since there is a one-to-one correspondence between variables and their masters, we may identify a variable master with the variable it controls.
We create an ASM to model threads' actions on variables. The agents of the ASM are threads (members of Thread) and variable masters (members of Var). At the beginning of each run, there is a single member of the universe Thread. Since we identify variable masters with the variables they control, a member of the universe Var is both a variable and a variable master. The universe Value represents the set of values that master or working copies of variables may bear. The function masterValue maps each variable to the current value of its master copy. The function workingValue, given a thread and variable, returns the current value of the thread's working copy of that variable.
A thread's execution engine is the component of the thread that executes the Java program. It may perform actions on variables: creating a variable, assigning a new value to a variable, or using a previously assigned value of a variable. There may be more than one value associated with a variable, but a thread has immediate access only to the value of its working copy. A use or assign action is internal to a thread, involving only its working copy. Since we are not interested here in what a thread computes during its execution, our view of use and assign actions is simple: an assign action just changes the value of a variable's working copy, and a use action does nothing. (Although what happens in a use action is not important to us, we shall see in §3 that the occurrence of a use action may be significant, even at our level of abstraction.) A thread's execution engine may do one other sort of action of interest to us: creating another thread, which may execute concurrently with its creator.
Threads pass values of shared variables among themselves via the main memory. A thread may update the master copy of a variable with a freshly assigned value, or it may request the current value of the master copy. This is done through asynchronous communication with a variable's master agent. To transfer the value of its master copy to a thread's working memory, (the master of) a variable issues a read action. This is followed by a load action by the thread, which installs the value into its working memory. To transfer its working value of a variable to the main memory, a thread issues a store action. This is followed by a write action by (the master of) the variable, which installs the value into the main memory.
Somehow information is passed from variable to thread in a read-load sequence, and from thread to variable in a store-write sequence. To give a more imperative character to our description, without any essential loss of generality, we introduce an explicit but quite abstract notion of messages passing between threads and variables. When a variable issues a read action, it sends an access message to a thread, and the target thread then receives the message by issuing a load action. When a thread performs a store action, it sends an update message to a variable, and the variable then receives the message by issuing a write action. So, for example, we may speak of a thread storing out to a variable by sending it an update message, or a variable writing in from a thread by installing the value of an update message.
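As an illustration of the two kinds of copies (not of the ASM itself), the following toy Java sketch keeps a master copy per variable and a working copy per thread. It collapses each asynchronous read/load and store/write pair into a single method call, and all names are our own:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy sketch: a thread's working copy of a variable is distinct from the
 *  master copy in main memory; only explicit load/store actions move values
 *  between the two. The read/load and store/write pairs are collapsed here. */
final class MemorySketch {
    private final Map<String, Integer> master = new HashMap<>();               // main memory
    private final Map<String, Map<String, Integer>> working = new HashMap<>(); // per-thread working memory

    private Map<String, Integer> wm(String thread) {
        return working.computeIfAbsent(thread, t -> new HashMap<>());
    }

    void assign(String thread, String var, int value) { wm(thread).put(var, value); } // internal to the thread
    int use(String thread, String var) { return wm(thread).get(var); }                // reads the working copy only
    void store(String thread, String var) { master.put(var, wm(thread).get(var)); }   // store + write, collapsed
    void load(String thread, String var) { wm(thread).put(var, master.get(var)); }    // read + load, collapsed

    public static void main(String[] args) {
        MemorySketch mem = new MemorySketch();
        mem.assign("t1", "x", 7);
        mem.store("t1", "x");     // t1 publishes its working value of x
        mem.assign("t1", "x", 8); // assigned but not yet stored out
        mem.load("t2", "x");
        System.out.println(mem.use("t2", "x")); // t2 sees 7, not 8
    }
}
```

The point of the sketch is that t2 observes only what t1 has stored out, never t1's working copy; the rules below constrain exactly when such transfers must happen.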
We define a universe Msg comprising the universe AccMsg (messages from variables to threads) and the universe UpdMsg (messages from threads to variables). The function var maps each access message to the variable that sent it, and maps each update message to the variable that is its intended recipient. The function thread maps each update message to the thread that sent it, and maps each access message to the thread that is its intended recipient. The function value returns the value contained in a given message.
In any state, the members of the universe AccMsg that have been sent from a variable v to a thread t form a "subuniverse" of AccMsg, which we call AccMsg(v, t). Likewise, the members of the universe UpdMsg that have been sent from t to v form a subuniverse of UpdMsg, which we call UpdMsg(t, v).
While the JLS does not dictate a specific policy on the access of shared variables, it does impose some rules on the concurrent behavior of threads and variables. We present these rules as they appear in the JLS and give our interpretations of them.
JLS rules 1

1. [p. 403] "The actions performed by any one thread are totally ordered; that is, for any two actions performed by a thread, one action precedes the other."
2. [p. 403] "The actions performed by the main memory for any one variable are totally ordered; that is, for any two actions performed by the main memory on the same variable, one action precedes the other."
3. [p. 403] "It is not permitted for an action to follow itself."
In other words, actions on variables form a partial order (which we denote by <) in which the actions of any single thread are linearly ordered, and the actions of any single variable master are linearly ordered. Note that this supports our view of independent variable master agents as opposed to a monolithic main memory. The read and write actions on a single variable are ordered linearly, but read and write actions on different variables may occur concurrently. This follows naturally if we think of the master copy of each variable as controlled by an independent agent.
JLS rules 2

1. [p. 403] "Each load action by a thread is uniquely paired with a read action by the main memory such that the load action follows the read action."
2. [p. 403] "Each store action by a thread is uniquely paired with a write action by the main memory such that the write action follows the store action."
3. [p. 405] "For every load action performed by any thread t on its working copy of a variable v, there must be a corresponding preceding read action by the main memory on the master copy of v, and the load action must put into the working copy the data transmitted by the corresponding read action."
4. [p. 405] "For every store action performed by any thread t on its working copy of a variable v, there must be a corresponding following write action by the main memory on the master copy of v, and the write action must put into the master copy the data transmitted by the corresponding store action."
5. [p. 405] "Let action A be a load or store by thread t on variable v, and let action P be the corresponding read or write by the main memory on variable v. Similarly, let action B be some other load or store by thread t on that same variable v, and let action Q be the corresponding read or write by the main memory on variable v. If A precedes B, then P must precede Q."
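Rule 2.5 says, in effect, that the pairing between a thread's load/store order and the variable's read/write order is order-preserving. A small Java sketch of this check over a recorded pairing (the trace representation is our own, not part of the JLS) might look like:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch of Rule 2.5 as a check on a recorded execution. Each Pair gives
 *  the position of a load/store in the thread's total order and the position
 *  of its corresponding read/write in the variable's total order. The rule
 *  holds iff the pairing is strictly order-preserving. */
final class Rule25Checker {
    record Pair(int threadPos, int memPos) {}

    static boolean satisfiesRule25(List<Pair> pairs) {
        List<Pair> sorted = new ArrayList<>(pairs);
        sorted.sort(Comparator.comparingInt(Pair::threadPos));
        for (int i = 1; i < sorted.size(); i++) {
            // A < B must imply P < Q for the corresponding memory actions.
            if (sorted.get(i - 1).memPos() >= sorted.get(i).memPos()) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Loads in the same order as their reads: allowed.
        System.out.println(satisfiesRule25(List.of(new Pair(1, 1), new Pair(2, 2))));
        // Second load's read precedes the first load's read: forbidden.
        System.out.println(satisfiesRule25(List.of(new Pair(1, 2), new Pair(2, 1))));
    }
}
```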
These rules restrict the ways in which a thread's load and store actions may be interleaved with a variable's read and write actions. First, we note that the term "uniquely paired" in Rules 2.1 and 2.2 is ambiguous. A statement of the form "Each element x of X is uniquely paired with an element y of Y such that φ(x, y)" has a lenient interpretation: "For all x in X there is a unique y in Y such that φ(x, y)". It also has a strict interpretation: "For all x in X there is a unique y in Y such that φ(x, y), and for all y in Y there is a unique x in X such that φ(x, y)". The lenient interpretation allows spurious y's; that is, y's that are not paired with any x. Which interpretation shall we use? The strict interpretation is what the authors of the JLS intended [21], so we adopt it here. However, we find that a lenient interpretation of the term in Rule 2.1 has some advantages that make it worth discussing.
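Writing φ(x, y) for "x is paired with y", the two readings can be stated formally:

```latex
% Lenient: every x has exactly one partner y; spurious y's may remain unpaired.
\forall x \in X \;\exists!\, y \in Y .\; \varphi(x, y)

% Strict: additionally, every y has exactly one partner x, so the pairing
% is a bijection between X and Y (no spurious y's).
\big( \forall x \in X \;\exists!\, y \in Y .\; \varphi(x, y) \big)
\;\wedge\;
\big( \forall y \in Y \;\exists!\, x \in X .\; \varphi(x, y) \big)
```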
Read/load order. For every load action L by a thread t on a variable v, there is a read action R < L by v such that L loads in the value read out at R (Rules 2.1 and 2.3). This supports our conception of access messages: every load action must load in the value of an access message issued by a previous read action. Furthermore, a thread may not load in the same access message twice (Rule 2.5).

The choice of interpretation of "uniquely paired" determines whether every access message must be loaded in. The strict interpretation, under which every access message is loaded in, is simpler and more appealing from a logical perspective, which is why it is the intended interpretation [21]. But note that spurious read actions are innocuous: failing to load in a given access message has only the effect of making a thread's working memory less up-to-date than possible. Furthermore, the lenient interpretation allows a higher degree of independence of threads and variables: a variable is free to issue access messages without concern for whether they are eventually loaded in. For instance, updated master values of an important variable could be broadcast to threads, with each thread deciding whether to load the value in; or blocks (pages) of master values could be sent to a thread, with the thread selecting which values to load in.
Store/write order. For every store action S by a thread t on a variable v, there is a write action W > S by v such that W writes in the value stored out at S (Rules 2.2 and 2.4). Using our message-passing parlance, every store action sends an update message which must be written in by a following write action. As with the case of read and load actions, how we interpret "uniquely paired" determines whether every write action is an action of writing in an update message. Here, adopting the strict interpretation seems less controversial. It is not clear what value a spurious write action writes. At best, it simply rewrites the value that is already in main memory, in which case it is useless. But if it writes a different value, it may obliterate a current value in main memory, replacing it with an outdated or invalid value.
Read/write order follows load/store order. Each thread performs load and store actions on a variable in a certain order, and the variable must perform the corresponding read and write actions in the same order. Let us examine this more closely.
Load/load. If a thread t performs a load action L₁ (with corresponding read action R₁) on a variable v, and another load action L₂ (with corresponding read action R₂) on v, Rule 2.5 postulates that (L₁ < L₂) ⇒ (R₁ < R₂). If v sends access messages m and m′ to t, then t may load in m before m′ only if m was sent before m′. In other words, t loads in ever newer access messages from v. Without loss of generality, we may think of t as discarding m when it loads it in.

The conjunction of Rule 2.3 and Rule 2.5 implies Rule 2.1. Rule 2.1 asserts that each load action has a preceding unique read action. Rule 2.3 asserts that every load action has a preceding read action, and Rule 2.5 asserts that different load actions must have different corresponding read actions, ensuring uniqueness.

Store/store. If a thread t performs a store action S₁ on v (with corresponding write action W₁) and another store action S₂ on v (with corresponding write action W₂), Rule 2.5 postulates that (S₁ < S₂) ⇒ (W₁ < W₂). If t sends an update message m to v before sending another update message m′ to v, then m must be written before m′. In other words, v writes in ever newer values stored out by t. Thus writing m′ is allowed only if m has been delivered through a write action. We can think of v as discarding m when it writes it.

Store/load. If a thread t performs a store action S (with corresponding write action W) on v, and t performs a load action L (with corresponding read action R) on v, Rule 2.5 postulates that (S < L) ⇒ (W < R). In other words, if t sends an update message m to v before loading an access message m′ in from v, then v must write m in before sending m′. If t stores its working value out to v and then immediately loads in from v, and other threads do not store out to v in the meantime, then the value it receives from the load is the value it stored. Thus sending m′ is allowed only after the point in time when m is written in. If m′ is sent but not loaded in before this point in time, it can never be loaded in, and thus Rule 2.1 is violated (under the strict interpretation).
To avoid this, v must not issue an access message to t as long as there is a pending update message from t to v, and t must not issue an update message to v as long as there is a pending access message from v to t. Furthermore, v and t must be prevented from sending messages to each other concurrently. Under the assumption that messages take no time to travel, we may require that v checks for incoming update messages from t before sending an access message to t, and t checks for incoming access messages from v before sending an update message to v. But this does not eliminate the possibility of concurrent actions by v and t. Thus some means of avoiding concurrent actions by v and t is necessary. Note that this complication disappears if we were to adopt the lenient interpretation of "uniquely paired" for the pairing of read and load actions. In our example, t could simply ignore the access message m′ and still have its update message m written.

To represent the relative age of messages, we introduce the relation < and the term oldest? to determine whether a given message has no predecessors.

term oldest?(m): (not ∃m′: Msg) m′ < m

The JLS does not impose any particular means of ordering messages, so we avoid doing so here by making the function external. We restrict attention to runs in which the function < behaves in the expected way.

JLS rules 3

Let t be a thread and v be a variable.

1. [p. 404] "A use or assign action by t of v is permitted only when dictated by execution by t of the Java program according to the standard Java execution model."
2. [p. 404] "A store action by t on v must intervene between an assign by t of v and a subsequent load by t of v."
3. [p. 404] "An assign action by t on v must intervene between a load or store by t of v and a subsequent store by t of v."
4. [p. 404] "After a thread is created, it must perform an assign or load action on a variable before performing a use or store action on that variable."
5. [p. 405] "After a variable is created, every thread must perform an assign or load action on that variable before performing a use or store action on that variable."

These rules impose constraints on the exchange between a thread's execution engine and working memory.

Use and assign actions. Threads may only issue use and assign actions when needed for progress by the execution engine (Rule 3.1). At the level of abstraction we have chosen, we are not concerned with what threads actually compute, so this rule does not affect our model. Nevertheless, we should note that the term "dictated" is somewhat ambiguous here. For example, let var be an expression that requires a use action on a variable v. In evaluating the expression (var + var), how many use actions on v are dictated by the program? The strictest interpretation would require a use action for each reference to a variable (our example would require two use actions), and this is what the authors of the JLS intended [21].

Store actions. If a thread t performs a load or store action LS on v and then performs a store action S on v, Rule 3.3 postulates that t performs an assign action A on v such that LS < A < S. In other words, spurious store actions are forbidden: t may only store out to v if it has a new value to transmit. This prevents the current contents of the main memory from being overwritten by older (i.e., previously stored) values. Without this rule, a thread could store out outdated values it loaded in long ago, overwriting more current values that other threads stored out in the interim.

If a thread t performs an assign action A on a variable v and then performs a load action L on v, there is a store action S by t on v such that A < S < L (Rule 3.2). In other words, a thread must store out a value of a variable v after issuing a sequence of assigns to v, transmitting at least the last assigned value to the main memory, before attempting a load; the assigns are not completely forgotten.
A thread may load an access message in from a given variable only if the last working value it assigned has been stored out. Note that a thread may assign to a variable and then use the variable with no intervening load. In this case, no intervening store is required, and the assigned value may simply remain in the working memory.

This makes the coordination between threads and variables more complex. Once t performs an assign on v, its next action on v cannot be a load. So if t assigns a value to v concurrently with v sending an access message to t, t may not subsequently load the message in until it stores its working value out to v. But it cannot perform this store action, as the write action corresponding to the store would follow v's read action; thus Rules 2.1 and 2.3 are violated. To avoid this, v must not issue an access message to t if there is an assign action by t on v that has not been followed by a store action on v, and t must not assign to v if there is a pending access message. So the actions which threads and variables must avoid doing concurrently are not reads and stores, as Rule 2.5 might suggest, but reads and assigns.

Use/store actions preceded by an assign/load action. If a thread t performs a use or store action US on a variable v, Rules 3.4 and 3.5 postulate that there is an assign or load action AL < US by t on v. For use and store actions to behave properly, there must be a valid value in working memory. Initially, all the contents of a thread's working memory are invalid, but as a result of these rules, no invalid value of the working memory is ever used or stored out.

We define the following functions to help enforce the conditions described above. Rule 3.3 allows a thread t to store out its working value of a variable v only if it has assigned a fresh value to v but has not stored it out; Rule 3.2 allows t to load an access message in from v only if it has no freshly assigned working value of v to store out.
The function freshAssign? determines whether a given thread has assigned a fresh value to a given variable without storing it out. Rules 3.4-5 allow t to use its working value of v only if there is a valid value of v in its working memory, installed there by a load or assign action. The function usableValue? determines whether a given thread has a valid value for a given variable in its working memory.

The actions of a variable master consist of issuing a read message to a thread, and writing in an update message.

module Var:
    choose among
        choose t: Thread: readOK?(t)
            Read out to t
        choose t: Thread
            choose m: UpdMsg(t, Self): writeOK?(m, t)
                Write m in from t

rule Read out to t:
    extend AccMsg(Self, t) with m
        value(m) := masterValue(Self)

rule Write m in from t:
    masterValue(Self) := value(m)
    UpdMsg(m) := false

The terms readOK? and writeOK? determine whether the given action is allowed by the JLS rules. Rules 2.5 and 3.2 allow a variable v to read out to a thread t only if every assign action by t on v has been followed by corresponding store and write actions. Rule 2.5 allows v to write a message in from t only if the message is the oldest pending message from t to v.

term readOK?(t): not (freshAssign?(t, Self) or (∃m: UpdMsg(t, Self)))
term writeOK?(m, t): oldest?(m)

The actions of a thread consist of storing a value out through an update message, loading in an access message, or taking a step in executing the Java program. Program execution may involve using the working value of a variable, assigning a value to the working copy of a variable, creating a variable, or creating a thread.
module Thread:
    choose among
        Execute program
        WM-MM transfer

rule Execute program:
    choose among
        WM-EE transfer
        Create var
        Create thread

rule WM-EE transfer:
    choose among
        choose v: Var: useOK?(v)
            Use v
        choose v: Var: assignOK?(v)
            Assign to v

rule WM-MM transfer:
    choose among
        choose v: Var
            choose m: AccMsg(v, Self): loadOK?(m, v)
                Load m in from v
        choose v: Var: storeOK?(v)
            Store out to v

We define rules for load, store, use and assign actions, as well as the actions of creating a variable and creating a thread.

rule Load m in from v:
    workingValue(Self, v) := value(m)
    usableValue?(Self, v) := true
    AccMsg(m) := false

rule Store out to v:
    freshAssign?(Self, v) := false
    extend UpdMsg(Self, v) with m
        value(m) := workingValue(Self, v)

rule Use v:
    skip

rule Assign to v:
    freshAssign?(Self, v) := true
    usableValue?(Self, v) := true
    choose val: Value
        workingValue(Self, v) := val

rule Create var:
    create Var agent v

rule Create thread:
    create Thread agent t

The terms loadOK?, storeOK?, useOK? and assignOK? determine whether the given action is allowed by the JLS rules. Rule 2.5 allows a thread t to load a message in from a variable v only if the message is the oldest pending message from v to t. Rules 3.3-5 allow t to store out to v only if there has been an assign action by t on v without a following store action. Rules 3.4-5 allow t to use v only if there has been an assign or load action that put a value in t's working copy of v. Rules 2.5 and 3.2 allow t to assign to v only if every access message from v to t has been loaded in.
term loadOK?(m, v): oldest?(m) and not freshAssign?(Self, v)
term storeOK?(v): freshAssign?(Self, v)
term useOK?(v): usableValue?(Self, v)
term assignOK?(v): not (∃m: AccMsg(v, Self))

3 Volatile variables

The basic model for threads and shared variables, as presented in §2, permits optimizations that reduce the amount of communication between agents and thus enhance the performance of multithreaded programs. Communication between threads and the main memory can be lessened through caching: keeping values in working memory without transmitting them to main memory or getting new values from main memory. Instead of consulting the main memory every time it uses a variable, a thread may service several use actions on the same variable with a single load action. It may also service several assign actions with a single store action, sending only the last of these assigned values to main memory.

Caching may have undesirable results, particularly for variables that are assigned to and used frequently by different threads. Cached values may become outdated as other threads store out to the main memory. Also, caching assigned values prevents other threads from viewing the newly assigned values as they arise.

Communication between variables can be avoided altogether, allowing them to operate independently of one another. While Rule 2.5 dictates that the order of a thread's actions on a single variable is also followed by the variable, the order of its actions over different variables need not be followed in the main memory. Consequently, variables need not coordinate their actions among themselves.

Hence for certain variables, a stricter discipline is necessary. Java allows the programmer to declare variables volatile at the time of their creation. Volatile variables follow a policy that disallows the optimizations described above. We modify the ASM to model operations on volatile variables.
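The practical effect of this stricter discipline can be sketched in ordinary Java (the flag-and-data idiom below is our own illustration, not from the JLS). Because done and data are volatile, every use of done in the spin loop forces a fresh load from main memory, so the reader cannot spin forever on a stale cached copy, and it observes the value stored out before the flag was raised:

```java
/** Sketch: volatile forbids caching, so each use of the flag is freshly
 *  loaded from main memory and each assign is stored out immediately. */
class VolatileFlag {
    static volatile boolean done = false; // every use forces a load (Rule 4.1)
    static volatile int data = 0;         // every assign forces a store (Rule 4.2)

    /** Starts a reader that spins on the flag, then publishes data and the flag. */
    static int run() {
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!done) { } // each iteration performs a fresh read/load/use
            seen[0] = data;   // the write of data precedes the write of done (Rule 4.3)
        });
        reader.start();
        data = 42;   // assigned and written to main memory before done
        done = true; // the reader's next use of done must observe this write
        try { reader.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

If done were not volatile, the reader would be permitted to keep using a cached working copy of false indefinitely; the spin loop might never terminate.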
We add a universe VolVar, representing the set of volatile variables. Every member of VolVar is also a member of Var. The rule Create var requires a slight change: a new variable may be marked as volatile. The JLS imposes the following additional conditions on volatile variables.

JLS rules 4 [p. 407] "Let t be a thread and let v and w be volatile variables."

1. "A use action by t on v is permitted only if the previous action by t on v was load, and a load action by t on v is permitted only if the next action by t on v is use. The use action is said to be "associated" with the read action that corresponds to the load."
2. "A store action by t on v is permitted only if the previous action by t on v was assign, and an assign action by t on v is permitted only if the next action by t on v is store. The assign action is said to be "associated" with the write action that corresponds to the store."
3. "Let action A be a use or assign by thread t on variable v, let action F be the load or store associated with A, and let action P be the read or write of v that corresponds to F. Similarly, let action B be a use or assign by thread t on variable w, let action G be the load or store associated with B, and let action Q be the read or write of w that corresponds to G. If A precedes B, then P must precede Q."

Caching is disallowed. If a thread t performs a load action L on v, Rule 4.1 postulates that there is a use action U > L by t on v, and there is no action Act by t on v such that L < Act < U. In other words, each value loaded in from the main memory is used exactly once. In contrast, multiple use actions on a non-volatile variable may use a value loaded in by a single load action. Rule 4.2 postulates that a thread t performs a store action S on a volatile variable v if and only if there is an assign action A < S by t on v, and there is no action Act by t on v such that A < Act < S. In other words, each assigned value must be stored out and written to the main memory exactly once.
For non-volatile variables, assign actions may overwrite one another without having their values stored out.

Read/write order follows use/assign order. Each thread performs use and assign actions on volatile variables in a certain order; Rule 4.3 ensures that the variables must perform the corresponding read and write actions in the same order. This condition is similar to Rule 2.5, but different in some important ways. First, the thread actions mentioned in Rule 4.3 are use and assign actions, as opposed to store and load actions as in Rule 2.5. Also, the ordering holds over actions on all volatile variables, not just those on a single variable.

To account for the behavior of volatile variables, we modify our view of message passing between threads and variables. We think of a volatile variable access message as ending in a use action, as opposed to a load action. Similarly, we think of a volatile variable update message as originating with an assign action, as opposed to a store action. By Rules 2.4 and 4.2, all volatile update messages must be stored out and written, and by Rules 2.3 and 4.1, all volatile access messages must be loaded in and used. We examine this more closely.

Use/Use. If a thread t performs a use action U1 (with corresponding read action R1) on a volatile variable v, and another use action U2 (with corresponding read action R2) on a volatile variable w, Rule 4.3 postulates that (U1 < U2) implies (R1 < R2). Note that R1 and R2 may be actions by different variables. Given an access message m from v and an access message m′ from w, t may use m before m′ only if m was sent before m′. In other words, t uses ever newer volatile access messages.

The conjunction of Rule 4.3 with Rules 4.1 and 3.1 implies that volatile variables must work closely with threads' execution engines.
Once a thread loads in a read message from a volatile variable, by Rule 4.1 it must use that variable, and by Rule 4.3 it must do so before performing any other assign or use action. Rule 3.1 forbids use actions not "dictated" by the execution engine, so the message must be sent with knowledge of what is needed by the Java program. Consider a scenario where a thread t has just loaded an access message from a volatile variable v, but the next operation by the execution engine is an assign action on v, or an action on another volatile variable. The only way for t to progress is to issue a gratuitous use action on v, but this is forbidden by Rule 3.1. Therefore, volatile variables must be careful when issuing access messages to a thread, doing so only when it is clear that the thread's execution needs one immediately.

Rule 4.3 also implies that volatile variables must not read out to the same thread concurrently. As the actions of a single thread are linearly ordered, by Rule 4.3 the corresponding read actions must be linearly ordered as well. This may require some coordination between volatile variables; if several volatile variables wish to read out to a single thread, only a single variable at a time may do so.

Assign/Assign. If a thread t performs an assign action A1 (with corresponding write action W1) on a volatile variable v, and another assign action A2 (with corresponding write action W2) on a volatile variable w, Rule 4.3 postulates that (A1 < A2) implies (W1 < W2). Note that W1 and W2 may be actions by different variables. In other words, given an update message m from t to v and an update message m′ from t to w, m may be written in before m′ only if m was sent before m′. So volatile update messages sent by t are written in the order in which they are sent.

Assign/Use.
If a thread t performs a use action U (with corresponding read action R) on a volatile variable v, and an assign action A (with corresponding write action W) on a volatile variable w, Rule 4.3 postulates that (A < U) implies (W < R). Given an update message m from t whose assign action precedes the use action of a volatile access message m′ to t, m must be written in before m′ is sent. Thus sending m′ is allowed only after the point in time when m and all other volatile update messages from t have been written. If m′ is sent but not used before this point in time, it can never be used, hence violating Rule 4.1. To avoid this, v must not issue an access message to t as long as there is a pending volatile update message from t, and t must not issue a volatile update message as long as there is a pending volatile access message to t. Furthermore, volatile access and update messages must not be sent to and from t concurrently.

We define some additional functions to help enforce these conditions. Rule 4.1 requires a thread t to use a volatile variable v if and only if it has loaded in a message from v without using its value. If a thread has loaded in a message from a volatile variable but not used its value, the function msgToUse returns this message. If a thread has assigned a value to a volatile variable but not stored the value out, the function msgToStore returns the volatile update message.

The terms readOK? and writeOK? include extra conditions on volatile variables. Rule 4.3 allows a volatile variable v to read out to a thread t only if all assign actions by t on volatile variables have been followed by corresponding store and write actions. To readOK? we add the conjunct

  VolVar(Self) ⟹ (∀v: VolVar) (not ∃m: UpdMsg(t, v)).

Rule 4.3 allows v to write an update message from t in only if the message has been stored out and is the oldest pending volatile message from t. To writeOK? we add the conjunct

  VolVar(Self) ⟹ m ≠ msgToStore(t, Self).

We modify the rules for load and store actions.
A volatile access message is removed when it is used, rather than when it is loaded in. A volatile update message is created through an assign action, rather than through a store action.

rule Load m in from v:
  workingValue(Self, v) := value(m)
  usableValue?(Self, v) := true
  if VolVar(v) then
    msgToUse(Self, v) := m
  else
    AccMsg(m) := false

rule Store out to v:
  freshAssign?(Self, v) := false
  if VolVar(v) then
    msgToStore(Self, v) := undef
  else
    extend UpdMsg(Self, v) with m
      value(m) := workingValue(Self, v)

The term loadOK? includes an extra condition on volatile variables. Rule 4.1 allows a thread t to load a message in from a volatile variable v only if it has used all previously loaded access messages from v. Rule 4.2 allows t to load a message in from v only if it has stored out all previously assigned values to v. To loadOK? we add the conjunct

  VolVar(v) ⟹ not usableValue?(Self, v).

We modify the rules for use and assign actions. A use action on a volatile variable removes the volatile access message from the AccMsg universe. An assign action on a volatile variable generates an UpdMsg.

rule Use v:
  if VolVar(v) then
    usableValue?(Self, v) := false
    msgToUse(Self, v) := undef
    AccMsg(msgToUse(Self, v)) := false

rule Assign to v:
  freshAssign?(Self, v) := true
  if not VolVar(v) then
    usableValue?(Self, v) := true
  choose val: Value
    workingValue(Self, v) := val
    if VolVar(v) then
      extend UpdMsg(Self, v) with m
        value(m) := val
        msgToStore(Self, v) := m

The terms useOK? and assignOK? include extra conditions on volatile variables. Rule 4.1 allows a thread t to use its working value of a variable v only if it has loaded in a message from v without using its value. Rule 4.3 requires this message to be the oldest pending volatile access message to t. To useOK? we add the conjunct

  VolVar(v) ⟹ oldest?(msgToUse(Self, v)).
Rule 4.2 allows t to assign a value to v only if all of t's previous assigns to v have been stored out, and Rule 4.3 requires that there be no pending volatile access messages to t. To assignOK? we add the conjunct

  VolVar(v) ⟹ not (freshAssign?(Self, v) or usableValue?(Self, v) or (∃m: AccMsg(v, Self))).

If two volatile variables read out to t concurrently, the corresponding load actions will be ordered, since t can only perform one load action at a time. But the read actions will not be ordered, thereby violating Rule 4.3. We restrict attention to runs in which this does not occur.

Rules 4.1 and 4.2 ensure that if a load or assign action on a VolVar occurs, a corresponding use or store action will occur sometime in the future. Similarly to Rules 2.1 and 2.2, these rules can be flouted by delaying a use or store action indefinitely. We restrict attention to runs that avoid this situation.

4 Locks

Certain situations require that a thread be able to perform a series of operations without interference from other threads. In general, the programs of threads t1, ..., tn may have critical regions that they should avoid executing concurrently. It seems reasonable to try to prevent any ti and tj from entering their critical regions simultaneously. A natural way to solve this problem in an object-oriented framework is to have one critical object related to all these critical regions. All the threads may refer to the critical object, but only one thread may own the object at any given time, and only the owner can execute its critical region.

Java provides support for this sort of mutual exclusion. The critical regions are called synchronized regions. In Java, each object has a unique lock associated with it. A thread gains entry into a synchronized code region of its program by performing a lock action on the lock of the critical object.
The thread then holds the lock until it exits the synchronized code region and performs an unlock action on the lock. Various threads may issue lock or unlock actions on the same lock. The lock and unlock actions on a single lock are performed in conjunction with an arbiter, which restricts the number of threads holding a single lock to one at a time. The JLS speaks of main memory as this arbiter. As with variable master copies, we prefer to think of each lock as controlled by a master agent, rather than by the (possibly distributed) main memory.

We modify the ASM to model operations on locks. The agents are threads, variable masters, and lock masters. We represent locks as members of the universe Lock. As with variables, we identify lock master agents with the locks they control, so a member of the universe Lock is both a lock and a lock master. Objects are members of the universe Object. A LockActionType is either lock or unlock, and a LockAction is a pair consisting of a LockActionType and a Lock on which to perform the given action. The function lock maps each object to its lock.

To see how the introduction of locks ensures that only one thread enters its synchronized region, we must consider the JLS rules for the concurrent behavior of locks.

JLS rules 5

1. [p. 403] "The actions performed by the main memory for any one lock are totally ordered; that is, for any two actions performed by the main memory on the same lock, one action precedes the other."
2. [p. 403] "Each lock or unlock action is performed jointly by some thread and the main memory."

Rule 5.1 supports our view of independent lock master agents. The actions on a single lock are ordered linearly, but actions on different locks may occur concurrently.

JLS rules 6 Let T be a thread and L be a lock.

1. [p.
406] "A lock action by T on L may occur only if, for every thread S other than T, the number of preceding unlock actions by S on L equals the number of preceding lock actions by S on L."
2. [p. 406] "An unlock action by thread T on lock L may occur only if the number of preceding unlock actions by T on L is strictly less than the number of preceding lock actions by T on L."

Rule 6.1 ensures that a thread's hold on a lock is exclusive, and Rule 6.2 ensures that for every unlock action there is a unique preceding lock action. But note that the rules allow several lock actions on the same lock to occur in succession, with no intervening unlock action. We find it convenient to think of a thread as building up a number of claims on a lock. A lock action adds a claim, and an unlock action takes one away. Rule 6.1 states that only one thread may have a positive number of claims on the lock; furthermore, a thread may acquire multiple claims on a lock and surrenders the lock when and only when it has given up all claims. Rule 6.2 states that a thread may not release a claim on a lock it does not hold; the number of claims it has on a lock cannot be negative.

These rules ensure that the synchronization mechanism restricts synchronized code regions to one thread at a time. At most one thread at a time may have a positive number of claims on a given lock, and so if several threads need to lock the same lock, they must compete for it. Only the one with a positive number of claims on the lock is allowed into its synchronized region; the others must wait.

The function claims returns the number of claims a given thread has on a given lock. The function synchAction returns the action that the thread is requesting (if any). A thread may only continue the execution of its program if it is active, i.e., not synchronizing. We change the thread module accordingly, by guarding the rule Execute program with the term active?(Self).
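The claim discipline of Rules 6.1 and 6.2 is what makes Java's synchronized statement re-entrant: a thread already holding a lock may perform further lock actions on it without blocking, and the lock is released only when the claim count returns to zero. A sketch (class and method names are ours):

```java
// Sketch: nested synchronized blocks on the same object add and remove
// claims on one lock. The inner block succeeds without blocking because
// Rule 6.1 only excludes *other* threads with outstanding claims.
public class ReentrantClaims {
    static final Object monitor = new Object();

    public static int enterTwice() {
        int depth = 0;
        synchronized (monitor) {       // lock action: claims = 1
            depth = 1;
            synchronized (monitor) {   // second lock action, same thread: claims = 2
                depth = 2;
            }                          // unlock action: claims = 1, lock still held
        }                              // unlock action: claims = 0, lock released
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(enterTwice()); // prints 2
    }
}
```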
term active?(t):
  undef?(synchAction(t))

To the rule Execute program, we add the options Synch and Create object. We add rules for threads' synchronization actions with locks. A thread synchronizes with a lock for either a lock action or an unlock action.

rule Synch on ℓ to perform act:
  synchAction(Self) := (act, ℓ)

rule Synch:
  choose obj: Object
    choose act: LockActionType
      Synch on lock(obj) to perform act

The rule Create object creates an object and associates it with a new lock. We also modify Create thread to associate each new thread with a new lock.

rule Create object:
  extend Object with obj
    extend Lock with ℓ
      lock(obj) := ℓ

The following JLS rules place some guarantees on the contents of a thread's working memory before and after synchronization actions.

JLS rules 7 [p. 407] "Let T be any thread, let V be any variable, and let L be any lock."

1. "Between an assign action by T on V and a subsequent unlock action by T on L, a store action by T on V must intervene; moreover, the write action corresponding to that store must precede the unlock action, as seen by main memory."
2. "Between a lock action by T on L and a subsequent use or store action by T on a variable V, an assign or load action on V must intervene; moreover, if it is a load action, then the read action corresponding to that load action must follow the lock action, as seen by main memory."

If a thread t issues an assign action A on a variable v and then issues an unlock action Ul on a lock ℓ, Rule 7.1 postulates that t issues a store action S on v (with corresponding write action W) such that A < S < Ul and W < Ul. Note that ℓ and v are independent; for an unlock action on a particular lock ℓ, Rule 7.1 applies to all variables v. Thus when a thread releases a claim on a lock, it is ensured that all its assigned values have been stored out and written to the main memory.
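Rules 7.1 and 7.2 are the source of the familiar Java guarantee that data written inside one synchronized block is visible to a later synchronized block on the same lock: the unlock flushes assigned values to main memory, and a subsequent lock forces fresh loads. A sketch (names are ours, not from the JLS):

```java
// Sketch: the writer's unlock action forces a store and write of `shared`
// (Rule 7.1); the reader's later lock action on the same lock forces its
// next use of `shared` to be fed by a fresh load (Rule 7.2), so the
// value written by the writer must be observed.
public class LockVisibility {
    static final Object monitor = new Object();
    static int shared = 0;

    public static int readAfterWrite() {
        Thread writer = new Thread(() -> {
            synchronized (monitor) {
                shared = 7;        // assign; unlock below forces store + write
            }
        });
        writer.start();
        try {
            writer.join();         // writer has unlocked before we proceed
        } catch (InterruptedException e) {
            return -1;
        }
        synchronized (monitor) {   // lock action empties our cached view
            return shared;         // use must be fed by a fresh load
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterWrite());
    }
}
```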
If a thread t issues a lock action Lk on a lock ℓ and then issues a use or store action US on a variable v, Rule 7.2 postulates that either t issues an assign action A on v such that Lk < A < US, or t issues a load action L (with corresponding read action R) on v such that Lk < R < L < US. For a lock action on a particular lock ℓ, Rule 7.2 applies to all variables v. So this rule ensures that by the time a thread acquires a claim on a lock, all the values cached in its working memory have been flushed out to main memory. When a thread acquires a claim on a lock, it acts as if its working memory is empty. Only values assigned or loaded in after the lock action may be used, and only values assigned after the lock action may be stored out.

The conditions imposed by these rules have some ramifications for earlier rules. A read message issued before a lock action cannot be loaded in after the lock action, but Rule 2.1 dictates that any such message must be loaded in. Therefore, all pending read messages must be loaded in before a lock action. Also, a value assigned before a lock action cannot be stored out after the lock action, but Rule 3.2 dictates that it must be stored out. Therefore, all assigned values must be stored out before a lock action.

The actions of a lock master consist of granting a claim to a lock and taking one away.

module Lock:
  choose among
    choose t: Thread: lockOK?(t)
      Lock for t
    choose t: Thread: unlockOK?(t)
      Unlock for t

rule Lock for t:
  claims(t, Self) := claims(t, Self) + 1
  synchAction(t) := undef
  do-forall v: Var: usableValue?(t, v)
    usableValue?(t, v) := false

rule Unlock for t:
  claims(t, Self) := claims(t, Self) - 1
  synchAction(t) := undef

The terms lockOK? and unlockOK? determine whether a given action is allowed. A lock action for a thread t on a lock ℓ is allowed only if t is synchronizing for a lock action on ℓ.
Rule 6.1 allows such a lock action only if all threads other than t have no claims on ℓ. Rules 3.2 and 7.2 require that all previously assigned values have been stored out. Rules 2.1, 2.3 and 7.2 require that all access messages have been loaded in, and Rules 4.1 and 7.2 require that all access messages from volatile variables have been used.

term lockOK?(t):
  synchAction(t) = (lock, Self)
  and ((∀t′: Thread: t′ ≠ t) claims(t′, Self) = 0)
  and (not ∃v: Var) (freshAssign?(t, v) or (∃m: AccMsg(v, t)))
  and (not ∃v: VolVar) usableValue?(t, v)

term unlockOK?(t):
  synchAction(t) = (unlock, Self)
  and claims(t, Self) > 0
  and (not ∃v: Var) (freshAssign?(t, v) or (∃m: UpdMsg(t, v)))

5 Prescient store actions

A standard store action sends a value to the main memory from the working memory, where it was put by a preceding assign action. Under certain circumstances, Java allows an optimization in which a store action may precede its corresponding assign action, allowing the value to be transmitted to the main memory sooner than otherwise possible. This type of store action is called prescient, as it must be known ahead of time what the value of the assign action will be.

A prescient store action differs from a normal store action in that the value it sends to the main memory is not the current contents of working memory, but rather some fresh value. We call the following assign action retroactive, since the value it puts into working memory is not a fresh value, but rather the value of the preceding prescient store action. We find it natural to speak of prescient store actions and retroactive assign actions as distinct from regular store and assign actions.

We modify the ASM to model prescient store actions. If a given thread has issued a prescient store action on a given variable, without a corresponding retroactive assign action, the function presStoreVal returns the value of the prescient store.
We define rules for the new actions of prescient store and retroactive assign. We also modify the rules WM–EE transfer and WM–MM transfer, adding prescient store and retroactive assign as options.

rule Store presciently:
  choose v: Var: presStoreOK?(Self, v)
    freshAssign?(Self, v) := false
    choose val: Value
      presStoreVal(Self, v) := val
      extend UpdMsg(Self, v) with m
        value(m) := val

rule Assign retroactively to v:
  usableValue?(Self, v) := true
  presStoreVal(Self, v) := undef
  workingValue(Self, v) := presStoreVal(Self, v)

Prescient store actions must obey the following rules set out in the JLS:

JLS rules 8 [p. 408] "Suppose that a store by [a thread] T of [a non-volatile variable] V would follow a particular assign by T of V according to the rules of the previous sections, with no intervening load or assign by T of V. The special rule allows the store action to instead occur before the assign action, if the following restrictions are obeyed:

1. If the store action occurs, the assign is bound to occur.
2. No lock action intervenes between the relocated store and the assign.
3. No load of V intervenes between the relocated store and the assign.
4. No other store of V intervenes between the relocated store and the assign.
5. The store action sends to the main memory the value that the assign action will put into the working memory of thread T."

If a thread t performs a prescient store action PS on a variable v, there is a retroactive assign action RA > PS by t on v such that RA assigns the value stored out at PS (Rules 8.1 and 8.5). Rules 8.1 and 8.5 ensure that the value t sends to main memory via a prescient store action ends up in t's working memory via a following retroactive assign action. Furthermore, there is no action Act by t, where Act is a lock action or a load or store action on v, such that PS < Act < RA (Rules 8.2–4).
These rules ensure that relocating t's store action on v (i.e., making it prescient) does not affect program behavior, in the sense that the effects of a prescient store action on t's working memory and the main memory are no different from those of the corresponding non-prescient store action.

The terms presStoreOK? and retroAssignOK? determine whether a prescient store action or retroactive assign action is allowed by the JLS rules, respectively. Rules 2.5 and 3.2 allow t to store to v presciently only if every AccMsg from v to t has been loaded in and there is no retroactive assign pending. Rule 8 restricts prescient store actions to non-volatile variables, and allows a retroactive assign action only if there is a preceding prescient store action without a corresponding retroactive assign action.

term presStoreOK?(v):
  not VolVar(v) and undef?(presStoreVal(Self, v)) and (not ∃m: AccMsg(v, Self))

term retroAssignOK?(v):
  def?(presStoreVal(Self, v))

Notice that t may issue a use action U on v between PS and RA. If the presciently stored value were to appear in t's working memory before t put it there at RA, t could end up using the presciently stored value prematurely. To prevent this, Rule 8.3 prohibits t from loading in from v between PS and RA. For the same reason, albeit less obviously, Rule 8.2 prohibits any lock action for t between PS and RA. This is needed because if a lock action for t were to occur and then t were to use v between PS and RA, Rule 7.2 would require a load action between PS and U, and such a load action is forbidden by Rule 8.3.

Relocating a store action makes sense only if there is no attempt to store between the relocated (prescient) store action and its (retroactive) assign action. If t were to issue a store action S on v (with corresponding assign action A) between PS and RA, we would get the following undesirable result.
S would follow PS, but the order of the corresponding assign actions would be different: RA would follow A. Thus the value stored through PS would be newer (i.e., assigned more recently) than the value stored through the following action S. But Rule 2.5 would dictate that the newer value be overwritten in main memory by the older value. To prevent this, Rule 8.4 prohibits t from storing out a value of v between PS and RA.

Some of the rules introduced in previous sections (namely, 3.2–5 and 7.1–2) refer to store and assign actions and so must be modified to accommodate prescient store and retroactive assign actions. The rules as they appear in the JLS are not modified when prescient store actions are introduced.

We modify the terms lockOK?, loadOK? and storeOK? so that they enforce Rules 8.2–8.4. If a thread t has stored out presciently to v but has not yet performed the corresponding retroactive assign action, Rule 8.2 disallows any lock action for t, Rule 8.3 disallows a load action by t on v, and Rule 8.4 disallows any store action by t on v. To lockOK?(t), loadOK?(t) and storeOK?(t) we add the conjunct

  (∀v: Var) undef?(presStoreVal(t, v)).

Rule 8.1 ensures that if a prescient store action on a variable occurs, a corresponding retroactive assign action will occur sometime in the future. As with previous rules, this rule can be flouted by delaying a retroactive assign action indefinitely. We restrict attention to runs that avoid this situation.

6 Thread objects

We now consider how Java represents threads as objects. An object is an instance of a programmer-defined type called a class. Objects are created dynamically during a thread's computation. A class defines the state variables of its instances and the methods (operations) that may be invoked upon them.
Each thread is represented by an object which is an instance of the class Thread. As there is a one-to-one correspondence between threads and their representative objects, we may identify a thread with its object. Information on a thread's state is held in its object. The methods defined in Thread allow a thread to access or modify its own state information and that of other threads.

We modify the ASM to model operations on threads. Since we identify a thread with its Thread instance, a member of the universe Thread is both a thread and a Thread instance. Note that Thread instances are members of both Thread and Object.

After a thread is created, it may be started by invoking the method start upon it. Once started, a thread is alive until it stops, which occurs when the thread's execution engine terminates or the method stop is invoked upon it. If stop is invoked on a thread, the thread throws an exception, signaling the occurrence of an unexpected event. Once a thread stops, it is no longer alive and cannot be restarted. It is possible to stop a thread that has not started; if this happens, the thread will never become alive. It is not stated explicitly in the JLS what happens if start is invoked upon a thread that is already alive. However, the intent seems to be that the invoker of start throws an exception and nothing happens to the invokee [12]. We define functions started? and stopped? to represent the status of each thread. We also define rules Start and Stop, which update them appropriately.

The JLS rules impose some obligations on threads which they may not be able to fulfill after they have stopped. By Rules 2.1 and 2.3 (following the strict interpretation of "uniquely paired"), every access message must be loaded in. By Rule 4.1, every load action on a volatile variable must be followed by a corresponding use action, and by Rule 4.2, every assign action to a volatile variable must be followed by a corresponding store action.
The JLS does not make it clear whether these obligations extend to threads that have stopped. Furthermore, it is not clear which choice is the more reasonable one. For instance, in some cases it may be desirable to have the last assign actions of a thread stored out, while in other cases it may be clearly undesirable; for example, consider a thread which is stopped because it is computing erroneous results. Officially, this is still an open issue [21]. For simplicity, we take the view that stopped threads are not obligated to do any further actions.

The JLS rules impose some obligations on variables which may not be fulfilled by the time the execution of all threads terminates; in particular, by Rule 2.4, every store message must be written. Thus it may still be necessary for variables to write some update messages in even after all threads have terminated.

A thread that is alive can be suspended, in which case it remains alive but its execution engine does nothing. A thread is suspended when some thread (possibly itself) invokes the method suspend upon it. A suspended thread resumes (becomes unsuspended) when some other thread invokes the method resume upon its Thread instance. It may be useful to suspend a thread when it cannot make progress by itself, freeing resources for other threads that can make progress. Can a suspended thread issue load or store actions? Officially, this is another unresolved issue [21]. Here we choose the more permissive interpretation, allowing suspended threads to issue load and store actions. We add the function suspended? to represent the suspended status of each thread. We also define rules Suspend and Resume, which update this function.

A thread may be marked as a daemon. Such threads are intended to execute only in conjunction with non-daemon threads, performing "housekeeping" actions which assist non-daemon threads, but no actions useful in themselves.
For this reason, once all non-daemon threads have stopped, execution of all threads halts. A program starts with a single non-daemon thread. A new thread starts with the daemon status of the thread that creates it, but its status can be changed via the setDaemon method.

To our ASM we add the function daemon?, which determines a thread's current daemon status, and the rule Set daemon status, which updates this function. We also modify the rule Create thread. A thread is added to the Object universe and is initialized as unstarted, unstopped and unsuspended. It inherits the daemon status of the thread that created it.

We define a rule Invoke thread method, which chooses among the options Start, Stop, Suspend and Set daemon status. We modify the Execute program rule, adding the option Invoke thread method. To active? we add the conjuncts not suspended?(t) and undef?(synchAction(t)). We also modify the Thread module, guarding Execute program with alive?(Self) and not HaltAllThreads?.

We define the term alive? to represent the running status of a thread. Also, the term HaltAllThreads? determines whether all non-daemon threads have terminated.

term alive?(t):
  started?(t) and not stopped?(t)

term HaltAllThreads?:
  (∀t: Thread: alive?(t)) daemon?(t)

We modify the modules for variables and locks. If all non-daemon threads have terminated, the only action a variable may take is to write in update messages, so in the Var module we guard the read option with not HaltAllThreads?. There are no actions for locks to take once all non-daemon threads have terminated, so in the Lock module we guard the lock and unlock options with the term not HaltAllThreads?.

7 Waiting and notification

While a thread holds a particular lock, other threads in need of that lock are not allowed to proceed.
In certain cases, a thread holding a lock may reach a state from which it cannot progress by itself; in such a case, suspending the thread may not be a solution, as the thread would still hold the lock even when suspended. It would be desirable for the thread to give up control and release its locks temporarily, so that other threads may acquire them. Java has built-in support for this, through the mechanisms of waiting and notification. If a thread has a claim on the lock of an object, it may release all its claims on the lock and disable itself by invoking the method wait of class Object on the object. Another thread may signal to the waiting thread that further progress is possible through the notify or notifyAll methods. In this way, a waiting thread may resume execution when it is possible, without repeatedly checking its state.

Each object has a wait set, empty when the object is created, which contains all threads waiting for the lock on the object. The method wait is invoked by a thread t on an object obj. The thread t must hold the lock of obj; if it does not, an exception is thrown. Let k be the number of claims t has on the lock of obj; then k unlock actions are performed, and t is added to the wait set of obj and disabled. When t is re-enabled, it attempts to regain k claims on the lock; it may need to compete with other threads to do this. Once the lock claims are restored, the wait method terminates.

We modify the ASM to model waiting and notification actions. A thread goes through several phases while it waits on a lock. First it synchronizes with the lock to release all claims on the lock. Then it waits to be notified by another thread. Once it is notified, it synchronizes with the lock to regain all the claims it released.
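The requirement that wait release all k claims on the lock, not just one, can be seen with nested synchronized blocks. In this sketch (names are ours, not part of the model), a thread acquires the lock on an object twice and then waits; a second thread is nevertheless able to acquire the lock and notify it, which would be impossible if any claim were retained:

```java
public class NestedWaitDemo {
    public static boolean run() {
        final Object obj = new Object();
        final boolean[] notified = new boolean[1];
        Thread waiter = new Thread(() -> {
            synchronized (obj) {      // first claim on the lock
                synchronized (obj) {  // second (re-entrant) claim: k = 2
                    try {
                        obj.wait();   // releases BOTH claims, then regains both
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    notified[0] = true;
                }
            }
        });
        waiter.start();
        // Spin until the waiter is actually in the wait set (or already done).
        while (waiter.getState() != Thread.State.WAITING
                && waiter.getState() != Thread.State.TERMINATED) {
            Thread.yield();
        }
        synchronized (obj) {  // succeeds only because wait released all claims
            obj.notify();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return notified[0];
    }
}
```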
For a thread that is waiting on a lock, the function waitMode gives the current phase of the thread, the function waitLock gives the lock on which it is waiting, and claimsToRegain gives the number of claims that it has (temporarily) released.

We introduce a rule for a thread's actions while waiting on a lock. Waiting involves synchronization with the lock during the unlock and relock phases. Note that the waiting thread does not change its own mode from wait to relock; this is done by another thread, through a notification.

  rule Start to wait:
    choose obj: Object: claims(Self, lock(obj)) > 0
      waitMode(Self) := unlock
      waitLock(Self) := lock(obj)
      claimsToRegain(Self) := claims(Self, lock(obj))

  rule Continue to wait:
    if waitMode(Self) = unlock then
      if claims(Self, waitLock(Self)) > 0 then
        Synch on waitLock(Self) to perform unlock
      else waitMode(Self) := wait
    elseif waitMode(Self) = relock then
      if claims(Self, waitLock(Self)) < claimsToRegain(Self) then
        Synch on waitLock(Self) to perform lock
      else waitMode(Self) := undef

The method notify is also invoked by a thread t on an object obj. The thread t must hold the lock of obj; if it does not, an exception is thrown. If the wait set of obj is not empty, one thread is removed from it and enabled. The method notifyAll operates similarly but removes and enables all threads from the wait set. Neither method releases any of t's claims on the lock of obj, so there is no guarantee that the lock will be made available to the notified threads; this is left up to the programmer.

We introduce rules for notification. The method notify operates on a particular thread waiting on a particular lock; the method notifyAll operates on all threads waiting on a particular lock.
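Because a notified thread must compete to regain its lock claims, and because nothing guarantees that the condition it was waiting for still holds by the time it succeeds, Java programmers conventionally re-check the condition in a loop around wait. The following sketch (the Mailbox class is our own illustration, not part of the model) shows this guarded-wait idiom:

```java
// Sketch of the standard guarded-wait idiom: a one-slot mailbox.
public class Mailbox {
    private Object message;  // null means "empty"

    public synchronized void put(Object m) {
        message = m;
        notifyAll();  // enable all threads in this object's wait set
    }

    public synchronized Object take() throws InterruptedException {
        while (message == null) {  // re-check the condition after every wakeup
            wait();                // release all claims, join the wait set
        }
        Object m = message;
        message = null;
        return m;
    }

    // Small self-contained demonstration: one thread puts, another takes.
    public static Object demo() {
        Mailbox box = new Mailbox();
        new Thread(() -> box.put("hello")).start();
        try {
            return box.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

In terms of the model above, the while loop compensates for the gap between the Notify rule (which merely moves the thread from wait mode to relock mode) and the eventual completion of the relock phase.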
  rule Notify t:
    waitMode(t) := relock

  rule Notify:
    choose obj: Object
      choose t: Thread: waitMode(t) = wait and waitLock(t) = lock(obj)
        Notify t

  rule NotifyAll:
    choose obj: Object
      do-forall t: Thread: waitMode(t) = wait and waitLock(t) = lock(obj)
        Notify t

When a thread stops, the method notifyAll is invoked upon its object. This allows a straightforward implementation of a technique called thread joins. Consider a scenario where a thread t' is expected to start executing only after another thread t has stopped. One way of implementing this is to have t' join with t by waiting until t has stopped before proceeding. Since notifyAll is invoked when a thread stops, t' may join with t by simply waiting on t's object. Since stopping a thread involves invoking notifyAll on its representative object, we change our ASM rule for stopping threads accordingly.

We point out a subtle issue. It is not clear from the JLS whether notification precedes stopping or vice versa when a thread stops. However, it seems reasonable that if stopping and notification do not occur as a single action, then notification must precede stopping (at least in the case where a thread stops itself). This leaves open the possibility that other threads may be notified of the thread's stop action before the thread has actually stopped, particularly if the notification process takes a relatively long time.

Threads may execute waiting or notification actions. Execution of a Java program does not proceed while a thread is waiting, so to the term active? we add the conjunct waitMode(t) ≠ wait. To the rule Execute program, we add the options Start to wait, Notify and NotifyAll. We also add the guard def?(waitMode(Self)); if this evaluates to true, then the rule Continue to wait fires; otherwise, one of the other options is chosen.

8 Conclusion

The Java concurrency model is currently in a state of flux.
Researchers are proposing modifications to the model, to allow common compiler optimizations and programming practices that are currently prohibited by the model [19]. It is our belief that a satisfactory successor to the current model must start with a firm foundation. The specification of the model must not be subject to multiple interpretations, as is the description in the JLS. It is equally important that the specification be accessible to programmers who wish to use concurrency. While multithreaded programming is inherently complicated, the method of documentation should not exacerbate the problem.

We feel that this work has several things to offer the Java community. Our alternative view of Java concurrency brings to light some issues that might otherwise remain hidden in the JLS description. As the ASM model is mathematically precise, it can stand as an unambiguous cross-platform standard for the language. Our account of Java concurrency has an imperative flavor that is familiar to programmers, yet we are not restricted to a purely imperative approach; we are free to use a declarative style wherever it is more natural to do so. By implementing the concurrency model (albeit in a completely abstract way) we can readily see how the various constraints of the JLS definition interact with one another. This is much less obvious if constraints are presented in isolation, as they are in the JLS. Finally, programmers interested in multithreaded Java can explore the model by using the ASMGofer prototype. We believe that ASMs are a useful specification tool for both the current and future models of Java concurrency.

References

1. I. Attali, D. Caromel, and M. Russo. A formal executable semantics for Java. In Proceedings of the OOPSLA '98 Workshop on the Formal Underpinnings of Java, 1998.
2. E. Börger. Why use Evolving Algebras for hardware and software engineering? In Proceedings of SOFSEM, 1995.
3. E. Börger and W. Schulte.
A programmer friendly modular definition of the semantics of Java. In J. Alves-Foss, editor, Formal Syntax and Semantics of Java. Springer, 1998.
4. P. Cenciarelli, A. Knapp, B. Reus, and M. Wirsing. From sequential to multi-threaded Java: An event-based operational semantics. In M. Johnson, editor, Algebraic Methodology and Software Technology, pages 75-90. Springer, 1997.
5. E. Coscia and G. Reggio. A proposal for a semantics of a subset of multi-threaded "good" Java programs. In Proceedings of the OOPSLA '98 Workshop on the Formal Underpinnings of Java, 1998.
6. A. Gontmakher and A. Schuster. Java consistency: Non-operational characterizations for Java memory behavior. Technion/CS Technical Report CS0922, 1997.
7. J. Gosling, B. Joy, and G. Steele. The Java Language Specification. Addison-Wesley, 1996.
8. Y. Gurevich. Evolving Algebras: an attempt to discover semantics. In G. Rozenberg and A. Salomaa, editors, Current Trends in Theoretical Computer Science, pages 266-292. World Scientific, 1993.
9. Y. Gurevich. Evolving Algebras 1993: Lipari guide. In E. Börger, editor, Specification and Validation Methods, pages 9-36. Oxford University Press, 1995.
10. Y. Gurevich. May 1997 draft of the ASM guide. Available at [22], 1997.
11. Y. Gurevich, W. Schulte and C. Wallace. Investigating Java concurrency using Abstract State Machines. Technical Report 2000-04, Department of Computer & Information Sciences, University of Delaware, 1999.
12. C. Horstmann and G. Cornell. Core Java 1.1, volume II: Advanced Features. Sun Microsystems Press, 1998.
13. Sun Microsystems. Java technology home page. http://java.sun.com/.
14. A. Jolin. Java's atomic assignment: The key to simpler data access across threads. Java Report 3(8), 27-36, 1998.
15. D. Lea. Concurrent Programming in Java. Addison-Wesley, 1997.
16. M. MacBeth, K. McGuigan and P. Hatcher. Executing Java threads in parallel in a distributed memory environment. In Proceedings of IBM CASCON, 1998.
17. S. Oaks and H. Wong.
Java Threads. O'Reilly and Associates, 1997.
18. G. Plotkin. Structural Operational Semantics (Lecture notes). Technical Report DAIMI FN-19, Aarhus University, 1981.
19. W. Pugh. Fixing the Java memory model. In Proceedings of ACM Java Grande, 1999.
20. J. Schmid. ASMGofer home page. http://www.tydo.de/AsmGofer/.
21. G. Steele. Personal communication.
22. University of Michigan. ASM home page. http://www.eecs.umich.edu/gasm/.
23. University of Paderborn. ASM home page. http://www.uni-paderborn.de/cs/asm/.