Grammar-Based Testing Using Realistic Domains in PHP Ivan Enderlin, Frédéric Dadeau, Alain Giorgetti, Fabrice Bouquet
Total Page:16
File Type:pdf, Size:1020Kb
Grammar-Based Testing using Realistic Domains in PHP Ivan Enderlin, Frédéric Dadeau, Alain Giorgetti, Fabrice Bouquet To cite this version: Ivan Enderlin, Frédéric Dadeau, Alain Giorgetti, Fabrice Bouquet. Grammar-Based Testing using Realistic Domains in PHP. A-MOST 2012, 8th Workshop on Advances in Model Based Testing, joint to the ICST’12 IEEE Int. Conf. on Software Testing, Verification and Validation, Jan 2012, Canada. pp.509–518. hal-00931662 HAL Id: hal-00931662 https://hal.archives-ouvertes.fr/hal-00931662 Submitted on 16 Jan 2014 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Grammar-Based Testing using Realistic Domains in PHP Ivan Enderlin, Fred´ eric´ Dadeau, Alain Giorgetti and Fabrice Bouquet Institut FEMTO-ST UMR CNRS 6174 - University of Franche-Comte´ - INRIA CASSIS Project 16 route de Gray - 25030 Besanc¸on cedex, France Email: fivan.enderlin,frederic.dadeau,alain.giorgetti,[email protected] Abstract—This paper presents an integration of grammar- Contract-based testing [5] has been introduced in part to based testing in a framework for contract-based testing in PHP. address these limitations. It is based on the notion of Design It relies on the notion of realistic domains, that make it possible by Contract (DbC) [6] introduced by Meyer with Eiffel [7]. to assign domains to data, by means of contract assertions written inside the source code of a PHP application. Then a test A contract is a way to embed a piece of model inside the generation tool uses the contracts to generate relevant test data code of a program. It mainly consists of two kinds of simple for unit testing. Finally a runtime assertion checker validates the modelling elements: invariants describe properties that should assertions inside the contracts (among others membership of data hold at each step of the execution, pre- and postconditions to realistic domains) to establish the conformance verdict. We respectively represent the conditions that have to hold for an introduce here the possibility to generate and validate complex textual data specified by a grammar written in a dedicated operation/method to be invoked, and the conditions that have grammar description language. This approach is tool-supported to hold after the execution of the operation/method. and experimented on the validation of web applications. Various contract languages extend programming languages, Keywords-Grammar-based testing, contracts, realistic domains, such as JML for Java [8], ACSL for C [9], Spec# for PHP, random generation, rule coverage. C# [10]. The advantages of contracts are numerous: they make it possible to introduce formal properties inside the code I. INTRODUCTION of a program, using annotations. Besides, the properties are Model-based testing [1] is a technique according to which a expressed in the same formalism as the code, without any (preferably formal) model of the System Under Test (SUT) is gap due to the abstraction level. Moreover, properties can be employed in order to validate it. In this context, the model can exploited for (unit) testing. Indeed, the information contained be of two uses. First, it can be used to compute the test cases in invariants and preconditions can be used to generate test by providing an exploitable abstraction of the SUT from which data. In addition, these assertions can be checked at run time test data or even test sequences can be computed. Second, the (preconditions and invariants are checked at the beginning of model can provide the test oracle, namely the expected result the method, postconditions and invariants at their end) and against which the execution of the SUT is checked. Model- thus provide a (partial) test oracle for free. The test succeeds based testing makes it possible to automate the test generation. if no assertion is violated, and fails otherwise. It is implemented in many tools, based on various modelling In a previous work, we have introduced Praspel, a tool- languages or notations [2]. In addition, model-based testing is supported specification language for contract-based testing also an efficient approach for testing software evolution and in PHP [11]. Praspel extends contracts with the notion of regression, mainly by facilitating the maintainability of the test realistic domain, which makes it possible to assign a domain of repository [3]. values to data (class attributes or method parameters). Realistic Even though model-based testing is very convenient in domains present two useful features for testing: predicability, theory, its application is often restricted to critical and/or which is the possibility to check that a data belongs to its embedded software that require a high level of validation [4]. associated domain, and samplability, which is the possibility Several causes can be identified. First, the design of the to automatically generate a data from a realistic domain. A formal model represents an additional cost, and is sometimes library for predefined basic realistic domains (mainly scalar more expensive than the manual validation cost. Second, the domains, strings, arrays) is already available along with a design of the model is a complex task that requires modelling test environment.1 It is to be noted that one of the main skills and understanding of formal methods, thus necessitating arguments against annotation languages is that they require the dedicated skilled engineers for being put into practice. Finally, source code of the application to be employed, which prevents as the model represents an abstraction of the SUT, the distance them from being used in a black-box approach. Nevertheless, between the model and the considered system may vary and applying contract-based testing to interpreted languages, such thus an additional step of test concretization is often required as PHP, makes full sense, since this limitation does not apply. to translate abstract test cases (computed from the model) into Validating PHP web applications often involves the gen- executable test scripts (using the API provided by the SUT and the concrete data that it operates). 1Freely available at http://hoa-project.net. class EmailAddress extends String f eration of structured textual test data (e.g. specific pieces of HTML code produced by a web application, email addresses, public function predicate($q) f // regular expression for email addresses SQL or HTTP queries, or more complex messages). To facili- // see. RFC 2822, 3.4.1. address specs. tate the use of Praspel in this context, we provide a grammar- $regexp = ’:::’; // it is a string. based testing [12] mechanism that makes it possible to express return false === parent::predicate($q) and validate structured textual data. // it is an email address. && 0 !== preg_matches($regexp, $q); The contributions of this paper are twofold. First, we g introduce a grammar description language to be use with public function sample() f PHP, named PP (for PHP Parser). This language makes it // string of authorized chars $chars = ’ABCDEFGHIJKL:::’; possible to describe tokens and grammar rules, that are then // array of possible domain extensions exploited by a dedicated PHP interpreter to validate a text $doms = array(’net’,’org’,’edu’,’com’); $q = ’’; w.r.t. the expected syntax given by the grammar. Second, we $nbparts = mt_rand(2, 4); provide grammar-based testing mechanisms that automatically for($i = 0; $i < $nbparts; ++$i) f generate instances of text using various data generators. These if($i > 0) two contributions are gathered into two new realistic domains // add separator dot or arobase that can be used in Praspel annotations in a PHP program: a $q .= ($i == $nbparts - 1) ? ’@’ : ’.’; realistic domain for regular expressions and a realistic domain // generate firstname or name or domain name for grammars, parameterized by a grammar description file. $len = rand(1,10); The paper is organized as follows. Section II explains the for($j=0; $j < $len; ++$j) f $index = rand(0, strlen($chars) - 1); notion of realistic domain and presents its implementation in $q .= $chars[$index]; Praspel for PHP. Then, the PP language and its semantics are g defined in Section III. Section IV proposes data generation g algorithms from grammars. Then, we report in Section V $q .= ’.’ . $doms[rand(0, count($doms) - 1)]; return $q; some experiments aiming at validating our tool and showing g its usefulness and efficiency in practice. Related works are g presented in Section VI. Finally, Section VII concludes and Fig. 1. PHP code of a realistic domain for email addresses presents future works. 2) Samplability: The second feature of a realistic domain II. REALISTIC DOMAINS AND PRASPEL is to propose a value generator, called the sampler, that makes This section presents the notion of realistic domain and its it possible to generate values in the realistic domain. The data application to PHP programs [11]. Realistic domains are de- value generator can be of many kinds: a random generator, a signed for test generation purposes. They specify which values walk in the domain, an incrementation of values, etc. can be assigned to a data in a given program. Realistic domains are well-suited to PHP, since this language is dynamically B. Realistic Domains in PHP typed (i.e. no types are assigned to data) and realistic domains thus introduce a specification of data types that are mandatory In PHP, we have implemented realistic domains as classes for test data generation. We first introduce general features of providing at least two methods, corresponding to the two realistic domains, and then present their application to PHP.