Generating Random Well-Typed Featherweight Java Programs Using Quickcheck

Generating Random Well-Typed Featherweight Java Programs Using QuickCheck Samuel da Silva Feitosa Rodrigo Geraldo Ribeiro Andre Rauber Du Bois PPGC/UFPel PPGCC/UFOP PPGC/UFPel Universidade Federal de Pelotas Universidade Federal de Ouro Preto Universidade Federal de Pelotas Pelotas - RS - Brazil Ouro Preto - MG - Brazil Pelotas - RS - Brazil Email: [email protected] Email: [email protected] Email: [email protected] Abstract—Currently, Java is one of the most used program- example, we can’t state that a program meets its specification, ming language, being adopted in many large projects, where a type system is sound, or that a compiler or an interpreter is applications reach a level of complexity for which manual testing correct [5]. and human inspection are not enough to guarantee quality in software development. Even when using automated unit tests, In this context, this work provides the specification of such tests rarely cover all interesting cases of code, which a test generator for Java programs, using the typing rules means that a bug could never be discovered, once the code is of Featherweight Java [6] (FJ) to generate only well-typed tested against the same set of rules over and over again. This programs. FJ is a small core calculus of Java with a rigorous paper addresses the problem of generating random well-typed programs in the context of Featherweight Java, a well-known semantic definition of its main core aspects. The motivations object-oriented calculus, using QuickCheck, a Haskell library for using FJ as a starting point are that it is compact, so for property-based testing. we can model our test generators in a way that it can be extended with new features, and its minimal syntax, typing I. INTRODUCTION rules, and operational semantics fit well for modeling and Nowadays, Java is one of the most popular programming proving properties for the compiler and programs. As far as languages [1]. It is a general-purpose, concurrent, strongly we know, there are no well-typed test generators for FJ. This typed, class-based object-oriented language. Since its release work aims to fill this gap by specifying a generator for FJ in 1995 by Sun Microsystems, and currently owned by Oracle programs using QuickCheck, a property-based testing library Corporation, Java has been evolving over time, adding features for Haskell. We are aware that using automated testing is not and programming facilities in its new versions. In a recent ma- sufficient to ensure correctness, but it can expose bugs before jor release of Java, new features such as lambda expressions, using more formal approaches, like formalizing the semantics method references, and functional interfaces, were added to in a proof assistant. the core language, offering a programming model that fuses Specifically, we made the following contributions: the object-oriented and functional styles [2]. 1 Considering the growth in adoption of the Java language • We implement an interpreter for FJ in Haskell, which for large projects, many applications have reached a level of can be used as the basis to study new features on the complexity for which testing, code reviews, and human in- object-oriented context. spection are no longer sufficient quality-assurance guarantees. • We provide a type-directed heuristic [7] for constructing This problem increases the need for tools that employ static random programs. We conjecture that our specification analysis techniques, aiming to explore all possibilities in an is sound with respect to FJ type system, i.e. it generates application, in order to guarantee the absence of unexpected only well-typed programs. behaviors [3]. Normally, this task is hard to be accomplished • We use QuickCheck as a lightweight manner to check due to computability issues considering certain problem sizes. if all generated programs are well-typed and to test our For overcoming this situation it is possible to model formal interpreter against type soundness proofs in order to subsets of the problem applying a certain degree of abstraction, validate the proposed approach. using only properties of interest, facilitating the understanding The remainder of this text is organized as follows: Section II of the problem and also allowing the use of automatic tools [4]. summarizes FJ. Section III presents the process of generating Therefore, an important research area concerns the formal well-typed random programs in the context of FJ. Section IV semantics of languages and type-system specification, which shows the results of testing type-safety properties of FJ with enables formal proofs and establishing program properties. Be- QuickCheck. Section V discusses related works. Finally, we sides, solutions can be machine checked providing a degree of present the final remarks in Section VI. confidence that cannot be reached using informal approaches. We should note that without a formal semantics it is impossible 1The source-code for our interpreter and the test suite is available at https: to state or prove anything about a language with certainty. For //github.com/sfeitosa/fj-qc. II. FEATHERWEIGHT JAVA and y range over variables, d and e range over expressions. Featherweight Java (FJ) [6] is a minimal core calculus for Throughout this paper, we write C as shorthand for a possibly Java, in the sense that as many features of Java as possible are empty sequence C1; :::; Cn (similarly for f, x, etc.). An empty omitted, while maintaining the essential flavor of the language sequence is denoted by •, and the length of a sequence x¯ and its type system. However, this fragment is large enough is written #x¯. We use Γ to represent an environment, which to include many useful programs. A program in FJ consists is a finite mapping from variables to types, written x : T , of the declaration of a set of classes and an expression to be and we let Γ(x) denote the type C such that x: C 2 Γ. We evaluated, that corresponds to the Java’s main method. slightly abuse notation by using set operators on sequences. FJ is to Java what λ-calculus is to Haskell. It offers similar Their meaning is as usual. operations, providing classes, methods, attributes, inheritance Syntax and dynamic casts with semantics close to Java’s. The Feath- erweight Java project favors simplicity over expressivity and L ::= class declarations offers only five ways to create terms: object creation, method class C extends fC f; K Mg invocation, attribute access, casting and variables [6]. The K ::= constructor declarations following example shows how classes can be modeled in FJ. C(C f) fsuper(f); this:f = f; g There are three classes, A, B, and Pair, with constructor and M ::= method declarations method declarations. C m(C x) f return e; g classA extends Object { e ::= expressions A() { super(); } x variable } e:f field access classB extends Object { e:m(e) method invocation B() { super(); } C(e) } new object creation class Pair extends Object { (C) e cast Object fst; Object snd; Pair(X fst, Y snd) { Fig. 1. Syntactic definitions for FJ. super(); this.fst=fst; A class table CT is a mapping from class names, to class this.snd=snd; L } declarations , and it should satisfy some conditions, such Pair setfst(Object newfst) { as each class C should be in CT, except Object, which is a return new Pair(newfst, this.snd); special class; and there are no cycles in the subtyping relation. } Thereby, a program is a pair (CT; e) of a class table and an } expression. In the following example, we can see different kinds of Figure 2 shows the rules for subtyping, where we write C terms: new A(), new B(), and new Pair(...) are ob- <: D when C is a subtype of D. ject constructors, and .setfst(...) refers to a method invocation. C <: C new Pair(new A(),new B()).setfst(new B()); C <: D D <: E FJ semantics provides a purely functional view without side C <: E effects. In other words, attributes in memory are not affected CT(C) = class C extends D f ... g by object operations [8]. Furthermore, interfaces, overloading, C <: D call to base class methods, null pointers, base types, abstract methods, statements, access control, and exceptions are not Fig. 2. Subtyping relation between classes. present in the language. As the language does not allow side effects, it is possible to formalize the evaluation just using The authors also proposed some auxiliary definitions for the FJ syntax, without the need for auxiliary mechanisms to working in the typing and reduction rules. These definition are model the heap [8]. Next, we present the original description given in Figure 3. The rules for field lookup demonstrate how of FJ [6]. to obtain the fields of a given class. If the class is Object, an empty list is returned. Otherwise, it returns a sequence A. Syntax and Auxiliary Functions C f pairing the type of each field with its name, for all fields The abstract syntax of FJ is given in Figure 1, where L rep- declared in the given class and all of its superclasses. The rules resents classes, K defines constructors, M stands for methods, for method type lookup (mtype) show how the type of method and e refers to the possible expressions. The metavariables m in class C can be obtained. The first rule of mtype returns A, B, C, D, and E can be used to represent class names, f a pair, written B ! B, of a sequence of argument types B and g range over field names, m ranges over method names, x and a result type B, when the method m is contained in C. Otherwise, it returns the result of a call to mtype with the it obtains the fields of class C0, matching the position of superclass.

Load more