
sensors

Article

Automated Specification-Based Testing of REST APIs

Ovidiu Banias *, Diana Florea, Robert Gyalai and Daniel-Ioan Curiac *

Automation and Applied Informatics Department, Politehnica University of Timisoara, Parvan 2, 300223 Timisoara, Romania; diana.fl[email protected] (D.F.); [email protected] (R.G.)
* Correspondence: [email protected] (O.B.); [email protected] (D.-I.C.); Tel.: +40-256-403-211 (O.B.)

Abstract: Nowadays, REpresentational State Transfer Application Programming Interfaces (REST APIs) are widely used in web applications, hence a plethora of test cases are developed to validate API calls. We propose a solution that automates the generation of test cases for REST APIs based on their specifications. In our approach, apart from the automatic generation of test cases, we provide an option for the user to influence the test case generation process. By adding user interaction, we aim to augment the automatic generation of API test cases with human testing expertise and specific context. We use the latest version of OpenAPI 3.x and a wide range of coverage metrics to analyze the functionality and performance of the generated test cases, and non-functional metrics to analyze the performance of the APIs. The experiments proved the effectiveness and practicability of our method.

Keywords: testing; specification-based testing; automatic test case generation; REST API; OpenAPI 3.x

Citation: Banias, O.; Florea, D.; Gyalai, R.; Curiac, D.-I. Automated Specification-Based Testing of REST APIs. Sensors 2021, 21, 5375. https://doi.org/10.3390/s21165375

Academic Editor: Giovanni Betta

Received: 17 June 2021; Accepted: 6 August 2021; Published: 9 August 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1. Introduction

Nowadays, in the era of the internet and interconnectivity, communication and interactions over web applications are very common. Web applications are not only interacting with their users through various devices, such as PCs, smartphones, and tablets, but they also exchange information with other applications. For these communications between applications to work properly, web services usually employ APIs as a backbone process. Hence, the high demand and usage of web services require high-quality services that can be effectively ensured based on an automated process of testing the functionalities and performance of the APIs.

Soon after the release of the REST architecture [1], it became the most popular architecture used for Web API development. The reason behind this quick adoption is the fact that REST APIs are based on the HTTP protocol and make web applications easy to develop and maintain [2,3]. A study of the advantages of REST APIs [2] states that they are easy to use and comprehend, have a fast response time, and support a multitude of data types. It is worth mentioning that REST APIs are the most utilized API architectural style in ProgrammableWeb's (https://www.programmableweb.com/news/which-api-types-and-architectural-styles-are-most-used/research/2017/11/26, accessed on 6 August 2021) APIs directory, with a utilization percentage of 81.53%. Furthermore, REST APIs will most probably continue being the most popular API architectural style in the following years [2].
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The fact that REST APIs are preferred by both developers and businesses generates the need for continuous testing of APIs to maintain the quality and performance of the related web services. Automating the generation and execution of test cases upon REST APIs can be very beneficial to the entire development and quality assurance process. The resulting set of test cases, automatically generated based on the API specification, may cover a higher number of scenarios than would be possible by manual test case creation. OpenAPI is the most common specification language for REST APIs [4], and version 3.x is the latest version released. Due to the major feature improvements compared


to version 2.x, in the current work we propose a test case generation process based on the latest OpenAPI version.

In the process of developing a specification-based solution for automated testing, the test case coverage metrics are arguably the most useful functional descriptors that can be analyzed to check how much of the system has been tested through the generated test cases [4–6]. In [7], a practical set of coverage criteria assembled in a comprehensive eight-level model, named "Test Coverage Model", is proposed. The model offers different levels of coverage based on code coverage and fault detection measures. We employ these test coverage criteria to evaluate and demonstrate the effectiveness of our automated test case generation approach in real-world settings.

To summarize, this paper provides the following contributions:
• A novel technique to generate test cases for REST APIs based on black-box testing principles. This includes four types of configurations for the generation of test cases, from fully automated test case generation to semi-automated test case generation with input from the user for certain parameters with no detailed values in the OpenAPI specification.
• Our approach mainly focuses on functional testing, but also includes the testing of one non-functional property: the performance of the API. This non-functional metric is calculated as the average time for a response to be received after a request is sent.
• A wide range of coverage criteria analysis, providing different types of coverage criteria based on the REST architecture and the HTTP protocol properties in the OpenAPI specification. This approach aims to provide a more complex overview of the generated test cases for a better study of their validity after execution.
• Our approach uses OpenAPI version 3.x, the latest OpenAPI specification.

The rest of the paper is organized as follows. Section 2 provides the theoretical background that we used to design and implement our approach.
It also presents related work in the field of automated testing of REST APIs and underlines the specificities of our method. Section 3 describes the proposed method and its implementation. Section 4 offers a detailed and practical example of how the proposed procedure can be used to automatically generate test cases, together with some illustrative results that confirm the validity of the suggested methods. Section 5 provides discussion and limitations regarding our method. Finally, conclusions are drawn in Section 6.

2. Background

2.1. REST

REpresentational State Transfer (REST) is an architectural paradigm employed to develop web services-based systems. Although REST is widely used when developing web services, it is not yet a standard. REST uses a combination of standards, such as HTTP, JSON, and XML, for describing its guidelines in developing HTTP web services. The main purpose of REST is to improve different non-functional properties of the web system, such as performance, scalability, simplicity, reliability, visibility, etc.

Figure 1 presents the usage of a REST API in the context of the communication between the client and a web service. The user or another client makes a request to the web server for a resource. The request can be considered as formed by the REST API endpoint URL, the API method (which mostly includes the CRUD operations: GET, PUT, POST, DELETE), and the parameters. The response is a representation of the resource, generally in JSON or XML format.

Status codes represent the type of responses received by the HTTP-based REST API calls. A status code is a three-digit number: the first digit represents the class (for example, 2xx—successful, 5xx—server error, etc.), while the last two digits represent the specific response type (for example, 202—accepted, 401—unauthorized). They only represent a summary of the response and not the actual information that is requested. Status codes are very important and useful in the testing of REST APIs [8]. In our proposed solution, we focus on status codes to generate test cases that cover both successful and failed requests. In addition, we further use the information provided by the status codes to calculate the status code coverage metric and analyze the metric results.
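The status-code convention described above can be sketched in a few lines. This is our own illustration, not code from the paper's tool: it maps the first digit of a three-digit status code to its class, which is the information a test oracle needs to decide whether a request succeeded or failed.

```python
def status_class(code: int) -> str:
    """Map a three-digit HTTP status code to its class (first digit)."""
    classes = {
        1: "informational",
        2: "successful",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]

print(status_class(202))  # accepted -> "successful"
print(status_class(401))  # unauthorized -> "client error"
```

A test case generator can use such a mapping to label generated requests as expected-success (2xx) or expected-failure (4xx/5xx) scenarios.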

2.2. OpenAPI 3.0 Specification

OpenAPI (formerly known as the Swagger Specification) is a widely used specification language for REST APIs [4]. With OpenAPI, the capabilities and functions of a web service are exposed without having access to the source code. Moreover, an entity can therefore understand how to interact remotely with the web service having minimal documentation about it written in the OpenAPI specification language [9]. Usually, OpenAPI defines the specification of the REST API, mostly in JSON or YAML formats.

Figure 1. Applicability of REST API.

The OpenAPI architecture contains HTTP actions, such as GET, POST, PUT, and DELETE, and parameters, which contain the name, type, format, and the location where the given parameter is used. Likewise, the OpenAPI specification contains a list of various responses represented by HTTP return codes [10].

The drawback of the OpenAPI specification language is that it does not cover the dependencies between the parameters. Parameter dependencies are common for web services, as constraints could be applied on certain parameters to restrict their interactions, or a combination of parameters could be required to form a valid request [11]. There are a few versions of OpenAPI, the most used one being OpenAPI 2.0, while the latest version available on the market is OpenAPI 3.x [4]. After reviewing the API specifications available at APIs.guru (https://apis.guru/browse-apis/, accessed on 6 August 2021), we decided to employ OpenAPI 3.x-based specifications to implement and validate our proposed solution.

2.3. Testing REST APIs

API testing is an important part of a software testing phase, as this layer is closer to the user and also connects the client to the server. API testing is very important to validate whether an API meets the expectations regarding reliability, performance, and security, as well as the specified functionality in various scenarios [4].

Testing a REST API can be challenging, as it requires interaction with the requests and responses between the client and server entities [5]. It is as complex as the business service it exposes. The approach presented in [12] lists a series of principles that could provide a suitable strategy for testing an API. According to [13], web services testing can be split into six main stages: validation of the WSDL (Web Services Description Language), unit testing, functional testing, regression testing, load testing, and adherence to standards. In our proposed solution, we focus on functional testing by validating that the communication between the server and the clients, and the requests and responses, behave accordingly.

In the current paper, we address a real-life scenario: the information regarding the actual API implementation is limited to what is provided in the specification, thus restricting the number and types of test cases that can be implemented and executed. Taking into consideration the fact that the source code of the web application is not accessible, the API is black-box tested [5]. Consequently, functional testing is the most common type of black-box testing that can be employed. Black-box testing is also known as specification-based testing, because the system under test is considered a black box with inputs and outputs. For specification-based API testing, the test cases are executed by sending HTTP requests to the API, and the responses of the API to the requests are analyzed and verified against the specifications [4].

Fault-based testing is a special type of testing technique that anticipates incorrect behavior of the system under test. Its purpose is to inject incorrect values in the requests and to validate that the system responds accordingly, without failure. Fault-based testing can be applied when testing an API [4]: requests with wrong values for the API parameters are sent on purpose, verifying whether the web service will return error status codes. In this situation, the test case is considered successful if the system does not accept the wrong values. Based on the aforementioned principles and types of testing, our solution is designed to address both black-box testing and functional testing. We also employed fault-based testing for our test cases to cover more scenarios.
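The fault-based idea can be sketched as follows. This is a minimal illustration of the principle rather than the paper's implementation: produce a value that violates a parameter's declared OpenAPI type, and invert the pass criterion so that a 4xx rejection counts as success.

```python
def invalid_value(schema_type: str):
    """Return a value that does NOT match the declared OpenAPI type."""
    wrong = {
        "integer": "not-a-number",   # text where a number is expected
        "number": "NaN-ish-text",
        "boolean": "maybe",
        "string": 12345,             # a non-string where text is expected
        "array": "not-an-array",
    }
    return wrong.get(schema_type, object())

def fault_test_passed(status_code: int) -> bool:
    """A fault-injection test succeeds when the API rejects the input."""
    return 400 <= status_code < 500

print(invalid_value("integer"))   # "not-a-number"
print(fault_test_passed(422))     # True: the bad value was rejected
print(fault_test_passed(200))     # False: the API accepted a wrong value
```

Sending `invalid_value(...)` for each parameter type, and asserting `fault_test_passed` on the response, yields the error-path scenarios described above.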

2.4. Non-Functional and Functional Testing and Metrics over REST APIs

Each software system is developed based on two types of requirements: functional requirements and non-functional requirements. Non-functional test cases aim to validate the system behavior against the written requirements and address the following aspects: security, accessibility, performance, scalability, compatibility, reliability, and usability [14]. Testing the non-functional properties of REST APIs is a challenging task, since the API specifications provide little information about such properties. Based on our research, there are few software tools that can test non-functional properties of APIs, such as the tool presented in [15] and other online software tools, such as SOAP UI (https://www.soapui.org/, accessed on 6 August 2021) and Moesif API Analytics (https://www.moesif.com/solutions/track-api-program, accessed on 6 August 2021). These tools are mostly focused on measuring the performance and the availability of the operations described in the API specifications. We will also focus on a performance metric regarding the average duration needed for a request to be executed.

Compared to non-functional testing, functional testing is easier to perform on REST APIs. We employ coverage metrics in our study, having the API requests as inputs and the API responses as outputs [7]. Consequently, the following coverage criterion types, theoretically presented in [7], are measured in the next sections:
• Input coverage criteria: path coverage, operation coverage, parameter coverage, parameter value coverage.
• Output coverage criteria: status code coverage, response body properties coverage, content-type coverage.

According to [7], the coverage criteria mentioned in the previous paragraph can be arranged into eight test coverage levels (Figure 2), from the weakest coverage, TCL0, to the strongest one, namely TCL7.
To reach a specific level, all the coverage criteria from the previous levels must be met. The role of this Test Coverage Model and its levels is to rank the test suites according to how much of the functionality is covered by the test cases.

Figure 2. Test Coverage Model: Levels and Criteria [7].
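As a concrete illustration of one of the output criteria listed above, the sketch below (our own, not the paper's code) computes status code coverage as the fraction of documented status codes that were actually observed during test execution; undocumented codes that appear at runtime do not raise the metric.

```python
def status_code_coverage(documented, observed):
    """Fraction of documented status codes observed during execution."""
    if not documented:
        return 0.0
    return len(documented & observed) / len(documented)

documented = {"200", "400", "401", "404", "500"}
observed = {"200", "404", "500", "418"}   # 418 is undocumented: ignored

print(status_code_coverage(documented, observed))  # 3 of 5 -> 0.6
```

The other input and output criteria (paths, operations, parameters, content types) can be computed in the same observed-over-documented fashion.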


3. Related Work

Analyzing the scientific literature, we have found several approaches and methods of testing web services by using the project specifications to generate test cases. Many of these works were focused on web APIs developed using the SOAP standard. Approaches such as [16,17] generate specification-based test cases focusing on the WSDL definitions of the SOAP Web APIs. The method presented in [17] focuses on fault-based testing, which is one of our testing approaches. In [16], the test case generation based on specifications is mainly focused on the data types of different parameters and on operations that can be applied to API parameters. The approach presented in [16] is similar to ours, as we also focus on testing parameters based on their data types and on testing various operations performed over the API parameters.

In recent years, a larger number of methods have been proposed for the automation of testing and the generation of test cases for REST APIs. These are either black-box or white-box approaches. Arcuri proposes EvoMaster [18], an automated white-box testing approach which generates test cases using an evolutionary algorithm. In this approach, the API specification is based on OpenAPI, similar to our approach. However, being a white-box testing solution, it requires access to more than just the API specification: in the case of [18], for the solution to work, it requires access to the Java bytecode of the API. In further work, Zhang et al. [19] develop the previously mentioned EvoMaster into a more complex solution for resource-based test case generation by adding templates with organized test actions and using a search-based method to generate test cases.

Black-box testing techniques over APIs are employed when access to the source code is not possible and, therefore, test case generation can be automated based only on the API specification.
Chakrabarti and Kumar [20] propose an automated generation of test cases for black-box testing (Test-The-Rest) where a manual definition of the model is required. Another method worth mentioning employs an open-source software tool called Tcases to generate test cases from the API specification [5]. Similar to our proposed solution, the test cases are generated from the OpenAPI specification, and the generation method involves combinatorial testing (i.e., both valid and invalid values are considered). There is a significant difference between our proposed solution and the one presented in [5]: while Tcases generates the test cases in a model-based manner for a minimal coverage of the requirements, our solution uses combinatorial and domain testing principles to generate test cases, so that the variety of test cases covers as many scenarios as possible.

The RESTler approach [21] is similar to ours in the sense that both use status codes as a major factor in testing results. Whilst RESTler aims to be a tool with the goal of finding faults in the REST API, we instead aim to test different aspects of the REST API, not only its faults. Ed-douibi [4] proposes another method of specification-based test generation. Our approaches are related in many aspects, such as OpenAPI being the API specification source, as well as the use of fault-based testing. In addition, our proposed solution introduces four different configurations that allow the user to choose what kind of test cases should be generated. The Standard configuration is the simplest and the most related to [4], and was used as the starting point for our solution. The other three configurations contain all of the available parameters from the specification that require an initial input from the user to generate test cases that depend on parameters that are not specified or required.
Besides this, we further developed the single test coverage criterion used in [4] by analyzing different types of coverage for the tests that were executed. The OpenAPI version used by Ed-douibi [4] is OpenAPI 2.0. We focused on the newer version, OpenAPI 3.x, which is growing in usage and comes with an updated document structure, including new and reusable component objects and an improved JSON schema. Karlsson [10] proposes QuickREST, a solution similar to ours but focusing on property-based testing.

Related to the testing of non-functional properties of REST APIs, there is a limited number of approaches developed in this field. There are some commercial tools, such as SOAP UI (https://www.soapui.org/, accessed on 6 August 2021), which propose testing non-functional properties of APIs. However, they mostly focus on load testing. In [15], a framework named GADOLINIUM is presented. It automatically tests non-functional properties, such as the performance and availability of the API. Influenced by this approach, we decided to include the non-functional performance metric beside our other functional metrics, mainly to study whether performance is impacted by the different configurations in which we generate tests for an API.

The previously mentioned approaches focus on test case generation via the API specification, as well as on the execution of test cases to measure the test coverage criteria. Nevertheless, we consider that the test coverage of these approaches could be improved. Hence, our solution comes with a novel approach and a detailed report on multiple test coverage metrics. Moreover, our solution provides both fully automated and semi-automated generation of test cases through the four configurations. Therefore, the user has a wider range of choices for the types of test cases to be generated, which makes our solution different from the literature that has been reviewed. In addition, our approach also tries to shift the focus to a non-functional metric, such as performance. Based on the literature review, to the best of our knowledge, our proposed solution is the first that combines the testing and measuring of both functional and non-functional properties of a REST API.
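The performance metric adopted here — the average time between sending a request and receiving its response — can be sketched as below. This is our own minimal illustration under the stated definition; the simulated `time.sleep` call stands in for a real HTTP request.

```python
import time

def timed_call(send_request):
    """Execute one request and return its duration in seconds."""
    start = time.perf_counter()
    send_request()
    return time.perf_counter() - start

def average_response_time(durations):
    """Mean request duration over all executed test cases."""
    return sum(durations) / len(durations) if durations else 0.0

# Simulated requests; a real run would issue HTTP calls instead.
durations = [timed_call(lambda: time.sleep(0.01)) for _ in range(3)]
print(f"average response time: {average_response_time(durations):.4f} s")
```

Averaging per configuration, as done later in the paper, only requires grouping the recorded durations by the configuration that generated each test case.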

4. Proposed Approach

The purpose of our approach is to validate any REST API by automatically generating and executing test cases based solely on its OpenAPI specification. Since we have no information regarding the way the API was implemented, we follow black-box testing principles. To make the test generation procedure more versatile, we propose to include different levels of configuration that can be chosen by the user. For the purpose of having detailed results, we decided to include several coverage criteria, as well as a performance metric, when creating the test statistics for an API. The following subsections present the proposed method, the four types of testing configurations that can be selected, and an in-depth description of the actual implementation.

4.1. Method Description

Our proposed solution requires as input either the definition of the API under test or a link to the definition of the API in JSON format. This definition is then interpreted by the system, which outputs a collection of specifically designed objects that are used in our implementation, as well as in execution, to encapsulate the information contained in the API's definition. Such an object includes information about the number and type of parameters, the path, the type of an operation, etc. A flow diagram of our proposed solution, including the activities and actions of our system, is presented in Figure 3.

Based on the pre-runtime settings, there are different types of test cases that our proposed test case generator system can provide:
1. Test cases with a minimum number of parameters: only the parameters that are specified as "required" in the OpenAPI specification of the API under test are used. Required is a Boolean property which, in this case, must have the value true. If the property does not appear in the specification, then it is considered false by default [9].
2. Test cases with a maximum number of parameters: all the parameters that are defined in the OpenAPI specification are used for generating the test cases. The required property is ignored.
3. Automated test cases: no input from the user is required; the test cases are automatically generated.
4. Partial test cases: the user is required to give an input only if the information that is needed is not in the specification or API definition. When a request is made for the user to input some information, the software will require the name of the parameter, the data type, and, optionally, its description. Usually, sample data or other information regarding its values is provided in the definition of the parameter, although this information is not explicitly written in the specification. Such a case is presented in Figure 4, where the parameter cultureCode does not have the example attribute defined in the specification; therefore, the standard implementation would send a random string that would not be accepted by the API. Hence, the user will see the information regarding the parameter that is provided in the specification and will know what kind of input is expected. Once the software receives the input from the user, the test case will be automatically generated.
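The minimum/maximum parameter configurations above reduce to a simple selection over the OpenAPI `required` flag, where an absent flag counts as false. A minimal sketch of that selection (our own illustration, with hypothetical parameter names):

```python
def select_parameters(parameters, mode="minimum"):
    """mode="minimum": only required parameters; "maximum": all of them."""
    if mode == "maximum":
        return list(parameters)
    # The OpenAPI "required" property defaults to false when absent.
    return [p for p in parameters if p.get("required", False)]

params = [
    {"name": "id", "required": True},
    {"name": "verbose"},                     # no "required" key -> False
    {"name": "cultureCode", "required": False},
]

print([p["name"] for p in select_parameters(params, "minimum")])  # ['id']
print([p["name"] for p in select_parameters(params, "maximum")])
```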

Figure 3. System activities and actions (activity diagram).


Based on the pre-runtime settings, there are different types of test cases that our proposed test case generator system can provide:
1. Test cases with minimum number of parameters: Only the parameters that are specified as "required" in the OpenAPI specification of the API under test. Required is a Boolean property which, in this case, must have the value true. If the property does not appear in the specification, then it is considered false by default [9].
2. Test cases with maximum number of parameters: All the parameters that are defined in the OpenAPI specification will be used for generating the test cases. The required property will be ignored.
3. Automated test cases: No input from the user is required; the test cases are automatically generated.
4. Partial test cases: The user is required to give an input only if the information that is needed is not in the specification or API definition. When a request is made for the user to input some information, the software will require the name of the parameter, the data type, and, optionally, its description. Usually, sample data or other information regarding the parameter values is provided in the definition of the parameter, although this information is not explicitly written in the specification. Such a case is presented in Figure 4, where the parameter cultureCode does not have the example attribute defined in the specification, therefore the standard implementation would send a random string that would not be accepted by the API. Hence, the user will see the information regarding the parameter that is provided in the specification and will know what kind of input is expected. Once the software receives the input from the user, the test case will be automatically generated.
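The minimum/maximum parameter settings described above can be illustrated with a small sketch. This is a hypothetical illustration, not the paper's actual code: the class and method names are our own, and each parsed parameter is modeled as a plain map, mirroring the OpenAPI rule that an absent `required` property defaults to false.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ParameterSelector {

    // Each parameter is modeled as a name -> properties map, as it might
    // appear after parsing the JSON specification (hypothetical shape).
    public static List<String> select(List<Map<String, Object>> parameters,
                                      boolean maxParameterCoverage) {
        return parameters.stream()
                // Max coverage keeps every parameter; otherwise keep only the
                // required ones. An absent "required" flag counts as false.
                .filter(p -> maxParameterCoverage
                        || Boolean.TRUE.equals(p.getOrDefault("required", false)))
                .map(p -> (String) p.get("name"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> params = List.of(
                Map.<String, Object>of("name", "year", "required", true),
                Map.<String, Object>of("name", "team"),   // "required" absent -> false
                Map.<String, Object>of("name", "week", "required", false));
        System.out.println(select(params, false)); // minimum: [year]
        System.out.println(select(params, true));  // maximum: [year, team, week]
    }
}
```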

Figure 4. Descriptive values for API parameters.

In the given example, the user would be presented with a dialog such as the one in Figure 5. The input from the user is then used in the test generation process as a valid value, replacing the usual defaults. Thus, all the operations that make use of this parameter are more likely to behave correctly and, in turn, make it easier to detect actual problems with the API. Given the fact that one parameter may be used in different operations, the input of the user is cached and reused if necessary.

Figure 5. User input dialog example.

To generate the test cases, each parameter will have a series of correct and erroneous values depending on the parameter type [4]. When the API specification provides parameters with values ranging in a given interval, we apply the principles of domain testing. A test case will include the execution of a particular request with all possible combinations of parameters. To determine if a test passed or failed, it will be verified if the values of all parameters were according to the specification provided for their specific category. In the case in which all the parameters have correct values, according to the specification, a successful response with status code of 2xx will be expected. On the other hand, when performing fault-based testing, an error response will be expected if at least one parameter is given an incorrect value.

While the test cases are being executed, data is collected and stored for later use. At the end of the execution of all the test cases for the given API, the metrics will be calculated and the statistics regarding coverage and API performance will be displayed. A combinatorial function will be applied based on the number of parameters necessary to perform a request. In the case the number of possible combinations is higher than a given threshold, a specific function and method will be used to generate a suitable combination of parameters.

Given the sequence of p parameters for an API operation x1, x2, ..., xp, a subset Vi = {v1, v2, ...} of values for each parameter xi will be generated based on the parameter type. Consequently, we generate the following number of test cases:

∑_{i=1}^{p} (card(Vi) − 1) + 1

Any parameter xi will take in turn all its possible values in the set Vi, while the rest of the parameters will have the same value. When the total number of combinations is less than or equal to the given threshold, all the possible combinations will be used. The resulting number of combinations will be:

∏_{i=1}^{p} card(Vi)
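The two counting formulas above can be made concrete with a small sketch: a one-at-a-time strategy that yields ∑(card(Vi) − 1) + 1 combinations, a full Cartesian product that yields ∏ card(Vi), and a threshold-based choice between them. The method names are illustrative assumptions, not taken from the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class CombinationGenerator {

    // Varies one parameter at a time while the others keep their first value:
    // yields sum(card(Vi) - 1) + 1 combinations.
    public static List<List<String>> oneAtATime(List<List<String>> valueSets) {
        List<List<String>> result = new ArrayList<>();
        List<String> base = new ArrayList<>();
        for (List<String> v : valueSets) base.add(v.get(0));
        result.add(base);
        for (int i = 0; i < valueSets.size(); i++) {
            for (int j = 1; j < valueSets.get(i).size(); j++) {
                List<String> combo = new ArrayList<>(base);
                combo.set(i, valueSets.get(i).get(j));
                result.add(combo);
            }
        }
        return result;
    }

    // Full Cartesian product: yields prod(card(Vi)) combinations.
    public static List<List<String>> cartesian(List<List<String>> valueSets) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> values : valueSets) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String v : values) {
                    List<String> combo = new ArrayList<>(prefix);
                    combo.add(v);
                    next.add(combo);
                }
            }
            result = next;
        }
        return result;
    }

    // Chooses the strategy based on the threshold, as described in the text.
    public static List<List<String>> generate(List<List<String>> valueSets, int threshold) {
        long total = 1;
        for (List<String> v : valueSets) total *= v.size();
        return total <= threshold ? cartesian(valueSets) : oneAtATime(valueSets);
    }

    public static void main(String[] args) {
        // Two parameters with card(V1) = 3 and card(V2) = 2.
        List<List<String>> sets = List.of(List.of("1", "2", "3"), List.of("a", "b"));
        System.out.println(cartesian(sets).size());  // 3 * 2 = 6
        System.out.println(oneAtATime(sets).size()); // (3-1) + (2-1) + 1 = 4
    }
}
```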

4.2. Implementation Details

In this section we describe our Java implementation, developed in the Eclipse IDE, explaining the functionalities and roles of its classes. We used Maven to handle all our dependencies, one of them being the Jersey plug-in for making the needed requests to the API, as well as Google's JSON-Simple plug-in to facilitate easier handling of the JSON inputs. When approaching this implementation, we aimed for a high degree of decoupling between the components to facilitate easier expansion or maintenance. The main components are the user interface, the utility classes such as configurations and logging, and the business logic that generates and runs the test cases based on the given specification. The rest of this section will elaborate on the core/business logic implementation.

The main part of the program can, in turn, be divided into two sections: the parsing of the specification and the generation and execution of the test cases. The class diagram of our proposed solution is presented in Figure 6. The functionality and implementation of the classes presented in Figure 6 will be further detailed in the following paragraphs.

Figure 6. Structural description of the system (class diagram).

The ApiHandler class receives the URL or file which contains the API definition and then executes the process by making use of the other classes. The ApiParser reads and parses the API definition and, when the process is over, outputs a list of Endpoint objects that encapsulate all the information that will be necessary in the next stages of the implementation. An Endpoint object represents an operation. For each Endpoint object, a TestCase object is created and all the test cases associated with the operation will be executed. Finally, the statistics are calculated and displayed.


ApiParser class receives the API specification as an input in JSON format. From the given specification, it extracts all the data needed to build the Endpoint objects and the Parameter objects and stores the relevant statistical data, such as the total number of parameters and the total number of operations. To aid the parsing of the information from the JSON file, an external library called JSON-Simple (https://mvnrepository.com/artifact/com.googlecode.json-simple/json-simple, accessed on 6 August 2021) was used.

Both the Endpoint class and the Parameter class have the purpose of storing all the data that is needed to generate and run a test case for a particular operation. An Endpoint contains data such as the base URL, the path, the type of the operation, as well as a list of all parameters needed for that particular operation. Each Parameter in this list contains data such as name, type of parameter, location, etc.

TestCase class receives an Endpoint as an input parameter, for which it will later build a test case scenario. The first step of creating this test case scenario is assigning a set of values to the parameters present in the respective Endpoint. Both valid and purposefully erroneous values will be given to the parameters as we are integrating a fault-based testing approach. This process can be influenced by a series of factors:
• Depending on the chosen configuration, all or only the required parameters will be taken into account.
• The data type of each parameter will determine the default valid and invalid values that are given to the parameters.
• When the parameter is a numeric data type and it has a specific range defined in the specification, the given values respect the domain testing principles.
• If an example is given in the specification or if an example was provided by the user (given that the appropriate configuration is selected), that value will overwrite the default valid value for the parameter.
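The value-assignment factors listed above can be sketched as follows. This is a simplified, hypothetical illustration: the class and method names and the default values (0, "default") are assumptions, and only integer ranges (domain-testing boundaries) and example overrides are shown.

```java
import java.util.ArrayList;
import java.util.List;

public class ValueAssigner {

    // Builds valid candidate values for an integer parameter with an optional
    // range, applying domain-testing boundaries (min, max, and a mid value).
    public static List<Integer> integerCandidates(Integer min, Integer max) {
        List<Integer> values = new ArrayList<>();
        if (min != null && max != null) {
            values.add(min);               // lower boundary
            values.add((min + max) / 2);   // interior value
            values.add(max);               // upper boundary
        } else {
            values.add(0);                 // assumed default when no range is given
        }
        return values;
    }

    // An example from the specification (or from the user, when partial tests
    // are enabled) overrides the default valid value.
    public static String stringCandidate(String example) {
        return example != null ? example : "default";
    }

    public static void main(String[] args) {
        System.out.println(integerCandidates(1, 14));   // [1, 7, 14]
        System.out.println(stringCandidate("regular")); // regular
        System.out.println(stringCandidate(null));      // default
    }
}
```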
After each parameter has its set of values assigned, a combinatorial function is used to generate all the possible combinations (or a subset of them if the total number is over a set threshold). Each one of these combinations represents a request that will be made to that endpoint of the API. Once the test case scenario is generated, all the requests will be executed sequentially. In order for the execution to take place, an external library called Jersey-Client (https://mvnrepository.com/artifact/org.glassfish.jersey.core/jersey-client, accessed on 6 August 2021) was used. During and after this process, all the relevant information will be stored for the statistics.

ParameterExampleCache is used only when the user is requested to give an input. The purpose of this class is to cache a parameter value in case it is used in several requests. Config is a static class which provides the values that were previously configured. Statistics class has the purpose of calculating the statistics for the API under test based on all the data resulting from the execution of the test cases.

In terms of performance, our program utilizes only a small part of a computer's resources, and it can function in parallel with other programs. RAM usage varies from 70 to 150 MB (results may vary depending on the complexity of the API and the chosen configuration) and the CPU usage peaked at 7% during a run (running on an Intel i7 CPU). As our program is self-contained, it only communicates with the API directly, so there is no possibility of interference with other programs.

The implementation described above, which was used to test and obtain the results discussed in the following chapters, can be found on GitHub (https://github.com/RobertGyalai/RESTApiTester/, accessed on 6 August 2021).
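The caching idea behind ParameterExampleCache can be reduced to a few lines: once the user supplies a value for a parameter, it is reused for every later request that needs the same parameter. The sketch below is an assumption about the behavior, not the actual class; the names are our own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class ExampleCache {
    private final Map<String, String> cache = new HashMap<>();

    // Returns the cached value, or invokes the supplier (e.g., the user input
    // dialog) exactly once and remembers the answer for later requests.
    public String getOrAsk(String parameterName, Supplier<String> ask) {
        return cache.computeIfAbsent(parameterName, k -> ask.get());
    }

    public static void main(String[] args) {
        ExampleCache cache = new ExampleCache();
        int[] prompts = {0}; // counts how many times the "user" is asked
        for (int i = 0; i < 3; i++) {
            cache.getOrAsk("cultureCode", () -> { prompts[0]++; return "ro-RO"; });
        }
        System.out.println(prompts[0]); // the user is prompted only once
    }
}
```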

5. Results

In the following section, we offer two detailed case studies showing how our program performs. In the first part, we take into consideration a complex and illustrative API and we evaluate different types of coverage criteria such as path coverage, operation coverage, parameter coverage, parameter value coverage, content-type coverage, status code coverage, and response body properties coverage. In the second part, our approach is tested and then analyzed over 30 different APIs, providing an overview of the statistical test case execution results.

5.1. First Case Study: "College Football Data" API Testing

"College football data" API (https://api.apis.guru/v2/specs/collegefootballdata.com/2.2.14/openapi.json, accessed on 6 August 2021) gives access to a variety of data regarding college football. We have selected this API to evaluate our solution because it provides a wide range of data, exposes a considerable number of paths (42) and parameters (38), and has a well-defined and complex OpenAPI 3.0 specification. The proposed solution was evaluated in four different configurations:
1. Standard: using a minimum number of parameters (only those required by the API) and with no generation of partial test cases.
2. Max Parameter Coverage: all parameters were used, either required or not required, with no generation of partial test cases.
3. Partial Test Generation: using a minimum number of parameters, only the required ones, with generation of partial test cases.
4. Max Parameter Coverage and Partial Tests: using all parameters (required or not required) with generation of partial test cases.

It is worth mentioning that an API specification defines multiple paths, each having one or more operations such as GET and POST, while every operation represents a functionality of the API and each of them may require one or more parameters to be given when making the request. Running the Partial Test Generation configuration, there were nine parameters that needed to be set by the user, while for the Max Parameter Coverage and Partial Tests configuration, 37 parameters needed to be configured.

The four mentioned configurations generated the following number of test cases: Standard—75 test cases; Max Parameter Coverage—250 test cases; Partial Test Generation—78 test cases; and Max Parameter Coverage and Partial Tests—272 test cases.
In the following paragraphs, we will investigate the statistics provided by our implementation according to the Test Coverage Model: Levels and Criteria [7] that is briefly presented in Figure 2.

5.1.1. Path Coverage

The path coverage metric refers to the number of API paths taken into consideration during a run, out of the total number of paths available from the API specification. This metric belongs to level TCL1 [7] which represents, according to Figure 2, the lowest level of coverage. In our chosen API example, there were a total of 42 paths defined (such as /plays, /teams, /venues, /games/teams, etc.).

In our approach, differentiating between parameters with or without examples is important, thus we propose splitting this metric into two categories: paths covered with full examples and paths covered without full examples. These two categories are based on whether or not all the parameters defined for a specific path have examples given in the API description. We cover all the paths, even if they fall in the second category. Hence, we were able to run test cases for most of the paths in the specification, the exception being that we could not test operations of type DELETE and PATCH, because they require a prior knowledge of the available API data.

In the Standard configuration, 27 out of the 42 paths were covered with full examples, while the remaining 15 were covered without full examples. When using the Max Parameter Coverage configuration, there were more parameters without examples that were being used in each path, thus making the total number of paths covered with full examples drop to five out of forty-two. For the last two configurations that include the partial test feature enabled, all the parameters that were previously without examples received the according values due to the user interaction. Thus, the paths covered with full examples now have 100% coverage. The equivalent statistics are presented in Figure 7.
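The proposed split can be computed mechanically: a path counts as covered with full examples only if every one of its parameters has an example. The sketch below is a hypothetical illustration under that reading; the names and the two-decimal rounding are assumptions.

```java
import java.util.List;
import java.util.Map;

public class PathCoverage {

    // Maps each path to per-parameter "has example" flags and returns the
    // percentage of paths whose parameters all have examples.
    public static double fullExamplePercentage(Map<String, List<Boolean>> paths) {
        long full = paths.values().stream()
                .filter(flags -> flags.stream().allMatch(Boolean::booleanValue))
                .count();
        return Math.round(full * 10000.0 / paths.size()) / 100.0; // two decimals
    }

    public static void main(String[] args) {
        Map<String, List<Boolean>> paths = Map.of(
                "/plays", List.of(true, true),   // all parameters have examples
                "/teams", List.of(true, false),  // one parameter lacks an example
                "/venues", List.of(true));
        System.out.println(fullExamplePercentage(paths)); // 66.67
    }
}
```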



Figure 7. Path coverage.

5.1.2. Operation Coverage

Using the same reasoning as mentioned above, we have chosen to split the operation coverage metric into two parts based on whether or not all their parameters contain examples: operations covered with full examples and operations covered without full examples. The operation coverage metric belongs to Test Coverage Model level TCL2 [7].

As previously mentioned, an API path can have one or more operations associated with it. In the current case study, every path had one GET operation, resulting in a total of 42 operations considered. The results presented in Figure 8 are the same as the results for path coverage due to the fact that there is only one operation for each path.

Figure 8. Operation coverage.

5.1.3. Paths Tested

There are two categories for tested paths: paths tested with success and paths tested with failures. When testing a path, there can be one or more requests that are made to the API depending on the number of parameter combinations. Each one of these requests will have a result based on the status code class. A request is considered successful if it has received a response with a 2xx status code (for example, the path /metrics/wp/pregame returns a successful response code of 200 along with the actual response which, in the current case study, represents the win probabilities for a game). A request is considered to have failed if we receive a response with a status code different from the 2xx status code class (for example, 404). For the handling of the test cases that use intentionally invalid parameters (applying the fault-based testing principle), the comparison is reversed. It is considered a successful test if the API returns an error status code, and the test is considered a failure if the API returns a 2xx status code (meaning that the API accepted the invalid value).

Each set of parameter combinations will have an expected outcome for that request. A path is considered as tested successfully if all the responses match the expected ones. Otherwise, the tested path is considered a failure. The results for this metric are presented in Figure 9.
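The pass/fail decision described above reduces to one comparison whose polarity flips for fault-based requests. A minimal sketch, with names of our own choosing:

```java
public class Verdict {

    // For a request built from valid parameters, a 2xx status means success;
    // for a fault-based request (at least one intentionally invalid
    // parameter), the check is reversed: an error status is the expected outcome.
    public static boolean isSuccess(int statusCode, boolean faultBasedRequest) {
        boolean ok = statusCode >= 200 && statusCode < 300;
        return faultBasedRequest ? !ok : ok;
    }

    public static void main(String[] args) {
        System.out.println(isSuccess(200, false)); // valid request accepted -> pass
        System.out.println(isSuccess(404, false)); // valid request rejected -> fail
        System.out.println(isSuccess(400, true));  // invalid request rejected -> pass
        System.out.println(isSuccess(200, true));  // invalid request accepted -> fail
    }
}
```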


Figure 9. Paths tested.

It can be observed that as the number of parameters rises, the number of passing test cases decreases due to the higher probability of errors and due to a higher number of parameters without examples. In the current situation, the GET operation for the /metrics/wp/pregame path has four parameters but none of them are required. Thus, in the Standard configuration the request is undertaken with no parameter, while all the parameters were given values in the Max Parameter Coverage configuration. In the last two configurations, Partial Test Generation and Max Parameter Coverage and Partial Tests, which have partial tests enabled, we observe a growing number of successful test cases because of the user interaction which provides valid parameters for the requests based on the operation description. For example, the parameter seasonType from the GET operation is of type String, has no example, but has "regular or postseason" as its description. The written description is simple to interpret by the user, who will consequently input a correct example, while the default string value that was given to the parameter in the two previous configurations would not produce valid samples.

5.1.4. Operations Tested

The operations tested metric follows the same rules as the paths tested metric; however, the difference is that only individual operations are taken into consideration. As mentioned before, for the selected API, each path has only one operation, therefore the results presented in Figure 10 are the same as the previous metric.



Figure 10. Operations tested.

5.1.5. Parameter Coverage

For a parameter to be considered as covered, it needs to have been used in one or more test cases. We separated the covered parameters into two groups based on the overall results of the test cases generated for each operation. The two groups are represented, on one hand, by the parameters covered in successful test cases and, on the other hand, by the parameters covered in failed test cases. These two groups form the parameter coverage metric, which corresponds to levels TCL4–TCL6 [7] presented in Figure 2.

The results concerning the parameter coverage metric are presented in Figure 11.
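One plausible reading of this grouping is that a parameter counts in the successful group if at least one passing test case used it, and every other used parameter falls into the failed group. The sketch below illustrates that reading; the reading itself, the record shape, and the names are our assumptions.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ParameterCoverage {

    // A single executed test case: the parameters it used and its verdict.
    public record Run(List<String> parameters, boolean passed) {}

    // Parameters used by at least one successful test case.
    public static Set<String> successGroup(List<Run> runs) {
        Set<String> s = new TreeSet<>();
        for (Run r : runs) if (r.passed()) s.addAll(r.parameters());
        return s;
    }

    // Covered parameters that never appeared in a successful test case.
    public static Set<String> failedGroup(List<Run> runs) {
        Set<String> all = new TreeSet<>();
        for (Run r : runs) all.addAll(r.parameters());
        all.removeAll(successGroup(runs));
        return all;
    }

    public static void main(String[] args) {
        List<Run> runs = List.of(
                new Run(List.of("year", "team"), true),
                new Run(List.of("team", "week"), false));
        System.out.println(successGroup(runs)); // [team, year]
        System.out.println(failedGroup(runs));  // [week]
    }
}
```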


Figure 11. Parameter coverage.

In the Standard configuration, only the required parameters were populated with values. The selected API had nine required parameters out of thirty-eight. Using the second configuration, Max Parameter Coverage, all of the 38 parameters were utilized but only 14 of them were present in successful tests.


For the “College football data” API, we observed that the configurations which include partial tests have no significant impact on this coverage metric, although the activation of the partial tests option actually increased the number of operations tested with success. In this case, the additional operations that were successful did not include parameters that were previously covered in the failed tests. Even if a parameter has a valid example provided by the user, it can still be part of an operation which has failed test cases because of other API related circumstances. Consequently, that specific parameter is still not counted as covered in successful tests.
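The grouping described above can be sketched as follows. The flat test-case records and their field names (`params`, `passed`) are hypothetical simplifications for illustration, not the tool's actual data model:

```python
def parameter_coverage(test_cases, all_params):
    """Split covered parameters into the two groups forming the metric.

    test_cases: list of dicts with a 'params' set (parameters exercised)
    and a 'passed' flag (hypothetical simplified record format).
    all_params: every parameter declared in the API specification.
    """
    covered_success, covered_failed = set(), set()
    for case in test_cases:
        group = covered_success if case["passed"] else covered_failed
        group.update(case["params"])
    # A parameter seen in at least one successful test counts only in the
    # successful group, so the two percentages are disjoint.
    covered_failed -= covered_success
    return {
        "success_pct": 100 * len(covered_success) / len(all_params),
        "failed_pct": 100 * len(covered_failed) / len(all_params),
    }
```

For instance, with 38 declared parameters of which 14 appear in successful tests, the successful group evaluates to 36.84%.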

5.1.6. Status Code Coverage

In the API specification, there may be one or more possible responses defined for a given operation. One of the attributes of this type of response is the status code. The GET operation for the /metrics/wp/pregame path provides two possible response codes, 200 and 400. Consequently, we compute the percentage of each returned status code. The resulting percentage is the status code coverage metric, which corresponds to level TCL5 [7], as presented in Figure 12.

[Bar chart omitted: status code coverage (“Status Code Received”) per configuration.]

Figure 12. Status code coverage.

In the selected API, there are a total of 78 possible status codes defined. With the Standard configuration, 48 status codes are covered, while with the Max Parameter Coverage configuration, 64 status codes are covered. In the third configuration, it can be observed that there is no increase in the percentage compared to the Standard one, as the few extra parameter examples provided by the API do not have an influence over this metric. However, in the last configuration, with more parameters and more examples/input from the user, the resulting percentage for this metric is the highest of all four configurations.
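Assuming the declared and returned status codes are collected per operation, the metric reduces to a simple ratio; the dictionary layout below is an assumed representation, not the paper's implementation:

```python
def status_code_coverage(declared, observed):
    """declared: {operation: set of status codes in the specification};
    observed: {operation: set of status codes actually returned}.
    Returns the percentage of declared status codes that were received."""
    total = sum(len(codes) for codes in declared.values())
    covered = sum(len(declared[op] & observed.get(op, set()))
                  for op in declared)
    return 100 * covered / total
```

With the 78 codes declared by the selected API, covering 64 of them yields the 82.05% reported for the Max Parameter Coverage configuration.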

5.1.7. Content Type Coverage

The API specification may include one or more responses for each operation, and some of these responses can define the specific content type that is expected to be received (for instance, “application/json”, “text/plain”, “application/xml”). When receiving the response from a request, it can be verified whether the content type of the response matches the one defined in the specification. The content type coverage metric measures how many valid content types were received out of the ones defined in the specification, and it represents level TCL3 [7] of coverage. The results obtained for the “College football data” API are presented in Figure 13.


[Bar chart omitted: content type coverage (“Content Type Received”) per configuration.]

Figure 13. Content type coverage.

In the selected API, the total number of specified content types is 42. When using all the parameters (both Max Parameter Coverage and Max Parameter Coverage and Partial Tests), an increase from 38 to 40 content types covered can be observed. In the case of the content type coverage metric, enabling the partial tests generation feature (Partial Test Generation and Max Parameter Coverage and Partial Tests) does not impact the results.
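Checking a received response against the declared content types can be sketched as below. Stripping header parameters such as `charset` before comparing is an assumption made for this sketch, not a detail taken from the paper:

```python
def content_type_covered(declared_types, received_header):
    """True if the media type of the received Content-Type header matches
    one of the content types declared in the specification."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = received_header.split(";")[0].strip().lower()
    return media_type in {t.strip().lower() for t in declared_types}
```

Counting how many declared content types ever produce a match gives the metric; e.g., 40 covered out of 42 declared corresponds to the 95.24% observed above.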

5.1.8. Response Body Property Coverage

In some instances, the API definition may include the structure of the expected response body based on the content type in the operation's response. For example, the GET operation for the path /metrics/wp/pregame has a total of eight response body parameters, such as gameId, season, homeTeam, week, etc. Using this information, we may compute how many of the mentioned response body properties were actually received. This represents the response body property coverage metric (Figure 14), which has the strongest output coverage criteria, TCL6 [7], as defined by Figure 2.
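For a flat response schema such as the /metrics/wp/pregame example, the metric can be sketched as a membership count. Nested schemas, which a full implementation would have to walk recursively, are deliberately ignored in this sketch:

```python
def body_property_coverage(declared_props, response_body):
    """declared_props: property names from the response schema in the spec;
    response_body: the parsed JSON object actually received.
    Returns the percentage of declared properties present in the body."""
    received = [name for name in declared_props if name in response_body]
    return 100 * len(received) / len(declared_props)
```

The property names below come from the paper's example; the response values are hypothetical.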

[Bar chart omitted: response body property coverage (“Response Body Properties Received”) per configuration.]

Figure 14. Response body property coverage.


For the selected API, there are a total of 562 response body parameters. The configurations with partial tests (Partial Test Generation and Max Parameter Coverage and Partial Tests) lead to an increase in the number of received response body parameters because of the valid examples provided by the user. On the other hand, when using all the parameters (Max Parameter Coverage and Max Parameter Coverage and Partial Tests), a decrease in the percentage can be observed, which is attributed to the fact that most parameters have a role in filtering the response. Thus, using more parameters means getting a more filtered response with potentially fewer response body parameters.

It should be noted that some APIs may not provide the response part for an operation or may provide only a part of the information. In those cases, depending on what information is available, it may not be possible to compute some of the following metrics: status code coverage, content type coverage, response body property coverage.

5.1.9. Performance Metric

To measure the performance of the API, we propose to measure the average time needed to receive the response for every request that is sent. In our solution, this average response time is represented by the performance metric, and the obtained results are described in Figure 15.
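The client-side measurement described above can be sketched as follows. `send_request` is a placeholder for whatever HTTP call the tool issues and, as noted later, the timings inevitably include network latency:

```python
import time
from statistics import mean

def average_response_time_ms(send_request, requests):
    """Average wall-clock time, in milliseconds, taken by each request."""
    durations = []
    for request in requests:
        start = time.perf_counter()
        send_request(request)  # the actual API call (placeholder)
        durations.append((time.perf_counter() - start) * 1000)
    return mean(durations)
```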

[Bar chart omitted: average response time in ms per configuration — roughly 1682 ms and 1451 ms for the two configurations without full parameter coverage (Standard and Partial Test Generation), and about 209–210 ms for the two Max Parameter Coverage configurations.]

Figure 15. Performance metric.

In the configurations that use all the parameters (Max Parameter Coverage and Max Parameter Coverage and Partial Tests), a noticeable decrease in the time needed for the requests to be completed can be observed. The reason is that many parameters have the role of filtering the response, resulting in shorter responses and reduced average response times. The configurations that include partial tests do not present a significant change in the overall durations. As the performance metric can be affected by outside sources, such as network connectivity and latency, ideally, the metric should be measured on the API's side.

5.2. Second Case Study: Testing a Set of 30 APIs

To further evaluate our proposed solution, we applied the same procedure as described in the previous case study to a total of 30 APIs. Using the services of APIs.guru (https://apis.guru/browse-apis/, accessed on 6 August 2021), we filtered out all APIs that did not have OpenAPI version 3.0 or above and obtained about 800 APIs. Additionally, we eliminated the APIs that require a security key/API key/authentication (any APIs that had some form of security and needed credentials to be tested) and obtained about 100 APIs. After filtering out the ones that had faulty specifications, we gathered a total of 30 APIs that ranged from simple (APIs with one or two paths) to complex (30 paths or above). The following list contains the names of all the 30 mentioned APIs that have been considered: “papiNet API CWG” API, “Corrently” API, “Data & Search team at UK Parliament” API, “JSON storage” API, “XKCD” API, “HackathonWatch” API, “shipstation” API, “topupsapi” API, “Geomag” API, “Gravity” API, “NSIDC Web Service Documentation Index” API, “BikeWise v2” API, “ODWeather” API, “WorkBC Job Posting” API, “Atmosphere” API, “Space Radiation” API, “NeoWs—(Near Earth Object Web Service)” API, “Domains RDAP” API, “API Discovery Service” API, “WikiPathways Webservice” API, “httpbin.org” API, “OSDB REST v1” API, “RiteKit” API, “Aviation Radiation” API, “eNanoMapper database” API, “Tweets and Users” API, “RAWG Video Games Database” API, “College Football Data” API, “Rat Genome Database REST API” API, “VocaDB” API.

The coverage metrics mentioned in the previous section were computed for the entire set of 30 APIs by running our proposed solution using the four different configurations. The total number of test cases generated for each of the four configurations is: Standard—1222 test cases; Max Parameter Coverage—3785 test cases; Partial Test Generation—1225 test cases; and Max Parameter Coverage and Partial Tests—3807 test cases.

In Figure 16, we present the aggregated results obtained from evaluating our solution over the set of 30 APIs. The results are presented as the median for each metric evaluated in the aforementioned scenarios.

Given the overall results presented in Figure 16, we consider that our approach of providing the ability for the user to input examples has a significant positive effect. This can be observed in most of the evaluated metrics by comparing the Standard configuration with the Partial Test Generation configuration, or by comparing Max Parameter Coverage with Max Parameter Coverage and Partial Tests.

[Bar chart omitted: median of each evaluated coverage metric for the four configurations (Standard, Max Parameter Coverage, Partial Test Generation, Max Parameter Coverage and Partial Tests) over the 30 APIs.]

Figure 16. Overview statistics.

It can also be observed that the total number of requested parameters over the span of testing 30 APIs is not significantly large. In our case, it would average between four and twelve parameters per API, a value that requires only a small amount of effort from the user.

6. Conclusions

The main purpose of our research was to provide a novel method to automate the testing of REST APIs based only on their OpenAPI 3.x specification. Using the method suggested in [4] as a starting point, our approach improves the test generation process by allowing minimal user interaction to provide examples for the parameters of the API operations. Furthermore, we expand the type of statistical results accompanying API testing to help gain a more in-depth analysis of each tested API.

One of the main advantages of our approach is its ease of use. The only things needed to test a given API are the URL or a file containing its API specification in OpenAPI 3.x format, and the parameter examples provided by the user depending on the selected configuration type. As mentioned previously, the user can easily change the configuration that determines how the tests are generated. Thus, the user can choose whether to use all the parameters in the API or just the required ones, as well as whether additional examples should be expected from the user. Since these two options are independent from one another, they generate the four possible configurations described in Section 4. While the obtained coverage metrics provide a complex overview of the aspects tested over the API, our approach also provides a non-functional performance metric regarding the average time needed to run the chosen configuration of an API.

One of the disadvantages of our approach, and in general of approaches that generate test cases exclusively from API specifications, is that the actual implementation of the API contains interdependencies that cannot be extrapolated from the API definition alone. One such example would be the need to know an existing object in the API database before being able to perform a successful PATCH operation on it.

Another downside is that, due to the way the average time performance metric is calculated, it is not guaranteed that there is no interference from external factors. Ideally, such a metric should be calculated directly on the API's server side to avoid the inherent network latencies that may affect the results. Unfortunately, this cannot be done without full access to the server.

Based on the current implementation, which separates the parsing of the API from the test generation and execution, possible future improvements could include extending the number of specification languages that can be interpreted. Starting from the current performance metric, we could extend our implementation to also perform load testing. This new functionality would simultaneously issue a series of requests to an API and monitor whether the response time presents a noticeable difference. Another possible improvement would be the option to export the generated test cases to one or more testing frameworks, such as JUnit or Serenity.

In our approach, besides the fully automatic test case generation, we provide an option for the user to guide the test case creation process by incorporating the user's expertise and the particularities of the context in which the API testing is performed. The software implementation is modular, with the API specification parsing separated from test generation and execution, which allows future extensions of the specification languages that can be interpreted. The experiments proved the effectiveness and practicability of our method, while the wide range of coverage and performance metrics provided facilitates an in-depth analysis of the API testing process.

Author Contributions: Conceptualization, O.B., D.F., R.G. and D.-I.C.; methodology, O.B., D.F., R.G. and D.-I.C.; software, R.G.; validation, O.B., D.F., R.G. and D.-I.C.; formal analysis, O.B., D.F., R.G. and D.-I.C.; investigation, O.B., D.F., R.G. and D.-I.C.; resources, O.B., D.F., R.G. and D.-I.C.; data curation, O.B., D.F., R.G. and D.-I.C.; writing—original draft preparation, O.B., D.F., R.G. and D.-I.C.; writing—review and editing, O.B., D.F., R.G. and D.-I.C.; visualization, O.B., D.F., R.G. and D.-I.C.; supervision, O.B. and D.-I.C. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Institutional Review Board Statement: Not Applicable.

Informed Consent Statement: Not Applicable.

Data Availability Statement: Not Applicable.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Neumann, A.; Laranjeiro, N.; Bernardino, J. An Analysis of Public REST Web Service APIs. IEEE Trans. Serv. Comput. 2018, 14, 957–970. [CrossRef]
2. Toman, S.H. Review of Web Service Technologies: REST over SOAP. J. Al-Qadisiyah Comput. Sci. Math. 2020, 12, 18–30.
3. Bülthoff, F.; Maleshkova, M. RESTful or RESTless—Current State of Today's Top Web APIs. In Lecture Notes in Computer Science, Proceedings of the Semantic Web: ESWC 2014 Satellite Events, Crete, Greece, 25–29 May 2014; Springer: Cham, Switzerland, 2014; pp. 64–74.
4. Ed-douibi, H.; Izquierdo, J.L.C.; Cabot, J. Automatic Generation of Test Cases for REST APIs: A Specification-Based Approach. In Proceedings of the 2018 IEEE 22nd International Enterprise Distributed Object Computing Conference, Stockholm, Sweden, 16–19 October 2018; Universitat Oberta de Catalunya: Barcelona, Spain, 2018.
5. Karlsson, A. Automatic Test Generation of REST API. Ph.D. Thesis, Linköping University, Linköping, Sweden, 2020.
6. Ferreira, F.A.C. Automatic Tests Generation for RESTful APIs. Ph.D. Thesis, Universidade de Lisboa, Lisboa, Portugal, 2017.
7. Martin-Lopez, A.; Segura, S.; Ruiz-Cortes, A. Test Coverage Criteria for RESTful Web APIs. In Proceedings of the 10th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation (A-TEST'19), Tallinn, Estonia, 26–27 August 2019; ACM: New York, NY, USA, 2019; 7p.
8. Fielding, R.T.; Taylor, R.N. Architectural Styles and the Design of Network-Based Software Architectures; University of California: Irvine, CA, USA, 2000; Volume 7.
9. OpenAPI Initiative. 15 February 2021. Available online: http://spec.openapis.org/oas/v3.1.0#openapi-specification (accessed on 28 February 2021).
10. Karlsson, S.; Causevic, A.; Sundmark, D. QuickREST: Property-based Test Generation of OpenAPI-Described RESTful APIs. In Proceedings of the 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST), Porto, Portugal, 24–28 October 2020; Malardalen University: Västerås, Sweden, 2020.
11. Martin-Lopez, A.; Segura, S.; Ruiz-Cortés, A. A Catalogue of Interparameter Dependencies in Restful Web APIs; International Conference on Service-Oriented Computing; Springer: Berlin/Heidelberg, Germany, 2019; pp. 399–414.
12. Subramanian, H.; Raj, P. Hands-On RESTful API Design Patterns and Best Practices; Packt Publishing: Birmingham, UK, 2019.
13. Dencker, P.; Groenboom, R. Methods for Testing Web Services; Böttinger, S., Theuvsen, L., Rank, S., Morgenstern, M., Eds.; 2007—Beiträge zu den Workshops—Fachtagung des GI-Fachbereichs Softwaretechnik; Gesellschaft für Informatik e.V.: Bonn, Germany, 2007; pp. 137–142.
14. Magapu, A.K.; Yarlagadda, N. Performance, Scalability, and Reliability (PSR) Challenges, Metrics and Tools for Web Testing; Blekinge Institute of Technology, Faculty of Computing: Karlskrona, Sweden, 2016.
15. Bucaille, S.; Izquierdo, J.L.C.; Ed-Douibi, H.; Cabot, J. An OpenAPI-Based Testing Framework to Monitor Non-functional Properties of REST APIs. In Lecture Notes in Computer Science, Proceedings of the International Conference on Web Engineering, ICWE 2020, LNCS 12128, Helsinki, Finland, 9–12 June 2020; Springer: Cham, Switzerland, 2020; pp. 533–537.
16. Bai, X.; Dong, W.; Tsai, W.-T.; Chen, Y. WSDL-based Automatic Test Case Generation for Web Services Testing. In Proceedings of the International Workshop on Service-Oriented System Engineering, Beijing, China, 20–21 October 2005; pp. 207–212.
17. Hanna, S.; Munro, M. Fault-Based Web Services Testing. In Proceedings of the Fifth International Conference on Information Technology: New Generations (ITNG 2008), Las Vegas, NV, USA, 7–9 April 2008; pp. 471–476.
18. Arcuri, A. RESTful API Automated Test Case Generation with EvoMaster. ACM Trans. Softw. Eng. Methodol. (TOSEM) 2019, 28, 1–37. [CrossRef]
19. Zhang, M.; Marculescu, B.; Arcuri, A. Resource-based Test Case Generation for RESTful Web Services. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'19), Prague, Czech Republic, 13–17 July 2019; ACM: New York, NY, USA, 2019.
20. Chakrabarti, S.K.; Kumar, P. Test-the-REST: An Approach to Testing RESTful Web-Services. In Proceedings of the International Conference on Advanced Service Computing, Athens, Greece, 15–20 November 2009; pp. 302–308.
21. Atlidakis, V.; Godefroid, P.; Polishchuk, M. RESTler: Stateful REST API Fuzzing. In Proceedings of the 41st International Conference on Software Engineering, ICSE'19, Montreal, QC, Canada, 25–31 May 2019; IEEE Press: Piscataway, NJ, USA, 2019; pp. 748–758.