RMT: Rule-based Metamorphic Testing for Autonomous Driving Models

Yao Deng1, Xi Zheng1, Tianyi Zhang2, Guannan Lou1, Huai Liu3, Miryung Kim4
1Macquarie University, Sydney, NSW, Australia
2Harvard University, Cambridge, MA, USA
3Swinburne University of Technology, Melbourne, VIC, Australia
4University of California, Los Angeles, CA, USA
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

arXiv:2012.10672v2 [cs.SE] 23 Dec 2020

Abstract—Deep neural network models are widely used for perception and control in autonomous driving. Recent work uses metamorphic testing but is limited to equality-based metamorphic relations and does not provide the expressiveness needed to define inequality-based metamorphic relations. To encode real-world traffic rules, domain experts must be able to express higher-order relations, e.g., a vehicle should decrease speed in a certain ratio when there is a vehicle x meters ahead, and compositionality, e.g., a vehicle must have a larger deceleration when there is a vehicle ahead and the weather is rainy, with a proportional compounding effect on the test outcome.

We design RMT, a declarative rule-based metamorphic testing framework. It provides three components that work in concert: (1) a domain-specific language that enables an expert to express higher-order, compositional metamorphic relations, (2) pluggable transformation engines built on a variety of image and graphics processing techniques, and (3) automated test generation that translates a human-written rule to a corresponding executable metamorphic relation and synthesizes meaningful inputs. Our evaluation using three driving models shows that RMT can generate meaningful test cases on which 89% of erroneous predictions are found by enabling higher-order metamorphic relations. Compositionality provides further aid for generating meaningful, synthesized inputs—3012 new images are generated by compositional rules. These detected erroneous predictions are manually examined and confirmed by six human judges as meaningful traffic rule violations. RMT is the first to expand automated testing capability for autonomous vehicles by enabling easy mapping of traffic regulations to executable metamorphic relations and to demonstrate the benefits of expressivity, customization, and pluggability.

Index Terms—metamorphic testing, autonomous driving, deep learning

I. Introduction

Autonomous driving has attracted wide attention and increasing investment from industry. Waymo launched the first autonomous car service in the Phoenix metropolitan area, making the first step towards commercializing autonomous vehicles [13]. Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are widely used to solve perception and control problems [3], [43]. These models make real-time predictions of steering angle and speed based on signals from cameras and LiDARs. However, recent accidents involving autonomous vehicles have raised a safety concern. For instance, in a fatal crash caused by Tesla's autopilot system, the autonomous driving model failed to recognize a white truck against the bright sky and made the wrong decision to run into the truck [4]. Therefore, it is crucial to detect undesired behaviors of autonomous driving models, especially in complex conditions.

In industry, road tests are adopted to test a driving model [6], [46]. However, road testing is costly and may not cover various driving scenarios (e.g., different road, weather, or traffic conditions). Simulation-based testing [2], [10], [11], [32], [33], [47] is another testing method that complements road tests by mimicking various driving scenarios. Nevertheless, simulation-based testing may miss some real-life scenarios and thus fail to automatically identify when a failure occurs. The fidelity of application behaviors in a simulation environment, especially that of images rendered by simulation, is often in question [23], [29], [48].

To improve the robustness of autonomous driving models, recent work leverages metamorphic testing (MT) [7], [24], [37], [45] to generate new test inputs and check the corresponding outputs. Metamorphic testing exploits pre-defined metamorphic relations in a target system to provide both a test case generation strategy and a test result verification mechanism, as an alternative to having a test oracle. However, prior work focuses on equality-based metamorphic relations, where a metamorphic transformation should produce the same or a similar output. For example, in DeepTest [37] and DeepRoad [45], a simplified equality-based relation was proposed: an updated steering angle should differ from the original angle by at most δ after image transformation.

However, such equality-based metamorphic relations fall short of expressing complex driving rules in real-world settings. For example, when the driving scene changes from daytime to a rainy day, the vehicle shall decelerate by 25% according to the Australia New South Wales traffic rule [17]. Such traffic rules are defined by legislative authorities in different countries and regions. However, while those rules are available, no automated technique can easily use them to synthesize input images and check the behaviors of driving models for testing purposes. One key challenge is that these traffic rules are beyond the scope of equality-based metamorphic relations, as the modified output shall be defined as a higher-order function of the original output by domain experts.

We propose a declarative, rule-based metamorphic testing approach called RMT. While existing techniques such as DeepTest [37] and DeepRoad [45] hardcode equality metamorphic relations (MRs), RMT enables domain experts to specify custom rules derived from real-world traffic rules using a domain-specific language. RMT can translate each rule to a corresponding MR. It then injects the meaningful input transformation entailed by the custom rule by leveraging pluggable input-transformation engines. RMT currently incorporates three kinds of graphics and image transformation engines: (1) object insertion based on semantic labels, (2) affine transformations in OpenCV, and (3) image-to-image translation based on GANs. Such pluggable transformation engines can be supplied by domain experts to express and test complex, composited traffic rules, such as simultaneously changing the driving time to night and injecting a pedestrian in front of a vehicle. Furthermore, RMT allows multiple transformations to be combined into composite rules that model complex real-world traffic rules. RMT also supports relations between two transformed test cases, instead of just between the original test case and its transformed one.

In summary, this work makes the following contributions:
• DSL for Metamorphic Relations. We propose a domain-specific language to express higher-order and composite metamorphic relations for autonomous driving models, instead of constructing MRs in an ad-hoc manner, making it easier to express complex, real-world rules.
• Extensibility. We design RMT such that new transformation engines can be easily added to support complex transformations required by a custom MR.
• Comprehensive Evaluation. We evaluate RMT using three state-of-the-art driving models for speed prediction on two widely used datasets. We compare its effectiveness using two test adequacy measures: detection of erroneous predictions and neuron boundary coverage.
• Human Assessment. We conduct a user study with six human drivers. Participants found that the generated test cases are meaningful traffic rule violations, reflecting real-world driving scenarios that must be avoided.

In the emerging era of ML-based systems, the absence and difficulty of having test oracles poses a threat to expanding automated testing capability. Generally, it is costly to define ground truth data sets a priori, and it is unreasonable to expect that domain experts individually check the validity of test outcomes in a post-hoc manner. While the idea of metamorphic testing has existed for a while, the challenge is that users must manually define metamorphic relations to enable metamorphic testing. Despite the availability of traffic regulations documented by federal governments and state authorities, there is currently no easy means to map these regulations to executable metamorphic relations. RMT is the first to significantly expand automated testing capability for autonomous vehicles by focusing on expressivity, customization, and pluggability.

The rest of the paper is organized as follows. Section II describes the RMT framework and rule-based MR specification. Section II-E demonstrates our tool. Section III explains the experiment settings and Section IV presents the experimental results on RMT. Section V discusses related work. Finally, Section VI concludes our paper.

II. The RMT Framework

A. Overview

As shown in Figure 1, RMT contains three components: (1) an MR generator, (2) a metamorphic test generator, and (3) an MR evaluator. Initially, a domain expert expresses expected driving rules in a domain-specific language similar to Cucumber [42], using the syntax of natural language and formulas. Given those rules, RMT's rule parser extracts key elements (e.g., Transformation) and builds a corresponding MR. Once the MR is created, these key elements are passed to the metamorphic test generator, which has an extensible toolkit. Various transformation engines can be plugged into the toolkit; the transformation engines manipulate the original input (image) into the follow-up transformed (metamorphic) input.

RMT currently provides two kinds of image transformations: (1) adding objects at a specific location and (2) changing scenes from day to night or from normal to rainy. A simple metamorphic relation may only require a single transformation (e.g., changing the weather from normal to rainy to check the corresponding speed change), while composite metamorphic relations need to be created either by combining multiple transformations (e.g., adding a vehicle in front and changing to a rainy scene to check the speed change) or by a single transformation with different settings (e.g., adding a vehicle in front close by or far away to check the speed change). We define the former, simple metamorphic relations as Simple Rules and the latter, composite metamorphic relations as Composite Rules.

If encoding the underlying real-world rules requires new transformation engines (e.g., removing an object), the only thing a user must do is update the rule parser and plug in the corresponding transformation engine. In the MR evaluator, the test pair that contains the original input and the modified, transformed input, or two transformed inputs, is fed to the driving model under test, and the model produces predictions for both. The evaluator then verifies whether the predictions violate the underlying MR. We have also implemented a GUI that integrates all three components. A user can easily configure metamorphic relations and inspect test results.

B. Rule-based metamorphic relation

In real life, drivers should obey local traffic rules for safe driving. In the context of autonomous vehicles, obeying traffic rules is equally, if not more, important for safety. Therefore, autonomous driving models shall be tested against these rules. However, there are no easy ways to leverage such traffic rules for generating required test images and declaring the corresponding oracle.
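To make the rule-parsing step described in Section II-A concrete, a minimal parser might extract the transformation and threshold elements from a Cucumber-like rule as below. This is a purely illustrative sketch: the field names and rule text are hypothetical, not RMT's actual DSL grammar.

```python
import re

# Hypothetical rule text in a Cucumber-like DSL (illustrative only).
RULE = """
Description: Adding a vehicle 50 meters in front causes a 30%-40% deceleration
Transformation: inject(vehicle, front, 50)
Relation: t1 < (M(XN1) - M(XN2)) / M(XN1) < t2
Thresholds: t1=0.30, t2=0.40
"""

def parse_rule(text):
    # Collect "Field: value" lines, then pull numeric thresholds out of
    # the Thresholds field.
    fields = dict(re.findall(r"^(\w+):\s*(.+)$", text, flags=re.M))
    thresholds = {k: float(v)
                  for k, v in re.findall(r"(t\d)=([\d.]+)",
                                         fields.get("Thresholds", ""))}
    return {"transformation": fields["Transformation"],
            "relation": fields["Relation"],
            "thresholds": thresholds}

parsed = parse_rule(RULE)
print(parsed["thresholds"])   # {'t1': 0.3, 't2': 0.4}
```

The extracted elements (transformation, relation, thresholds) correspond to the "key elements" that RMT's rule parser passes to the metamorphic test generator.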

Fig. 1: Overview of RMT

Our goal is to automatically create test scenarios based on these rules and validate whether the models violate them. For example, a domain expert can define a rule that if there is a car x meters ahead of the autonomous driving vehicle, the vehicle should slow down by t1% to t2%, where the thresholds t1% and t2% are set by a domain expert or based on the traffic law. The rule actually describes an MR for the autonomous driving model: transforming the input by adding a car at a specific location in the driving scene should cause an expected, proportional change of the prediction (i.e., speed slows down by t1% to t2%). Inspired by the success of writing specifications in natural language [42], [49] for software testing and verification, we propose a Cucumber-like rule template below:

RMT: Rule template
  Description: The description of a test scenario
  Given: Model M, an input image XO in the test set
  MR: XN1 = XO or XN1 = Transformation1(XO),
      XN2 = Transformation2(XO)
      ⇒ predictions on (XN1, XN2) as higher-order functions

In the rule template, M is a driving model under test and XO is the original image in the test set. Transformation is a function that defines how an image should be transformed to generate a new test image. The current version of RMT provides two transformation functions, as shown in Table I: inject and changeScene. Function inject adds an object such as a vehicle at a specific location with two possible orientations, either in the front or on the side. The distance is an optional argument associated with the front orientation, which describes how far away an object should be added, in meters, with more details in Section II-C.

RMT: Simple Rules (SR)

  Rule 1
  Description: Adding a vehicle x meters in front of the main vehicle should cause a deceleration to less than t.
  Given: Model under test M, original input XO
  MR: XN1 = XO, XN2 = inject(XO, [vehicle, front, x])
      ⇒ M(XN2) < t

  Rule 2
  Description: Adding a speed limit sign at the roadside will cause a deceleration to less than t.
  Given: Model under test M, original input XO
  MR: XN1 = XO, XN2 = inject(XO, [sign, roadside])
      ⇒ M(XN2) < t

  Rules 3 - 5
  Description: Adding a vehicle/bicycle/pedestrian x meters in front of the main vehicle will cause a deceleration by t1% to t2%.
  Given: Model under test M, original input XO
  MR: XN1 = XO, XN2 = inject(XO, [vehicle/bicycle/pedestrian, front, x])
      ⇒ t1 < (M(XN1) − M(XN2))/M(XN1) < t2

  Rules 6 - 7
  Description: Changing the driving scene to night time or a rainy day will cause a deceleration by t1% to t2%.
  Given: Model under test M, original input XO
  MR: XN1 = XO, XN2 = changeScene(XO, [night/rain])
      ⇒ t1 < (M(XN1) − M(XN2))/M(XN1) < t2

A metamorphic relation is expressed as a logical rule with an implication symbol ⇒. Its left side describes how the test pair (XN1, XN2) is generated. In the case of simple rules that apply a single transformation, XN1 is always set to XO.
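A simple rule's relation can be read directly as an executable higher-order predicate. The sketch below instantiates Rule 3; `inject_vehicle` and the toy dictionary-based "model" are stand-ins for RMT's actual transformation engine and driving model, not the paper's implementation.

```python
# Executable reading of Rule 3's metamorphic relation (sketch).
def holds_rule3(M, x_orig, inject_vehicle, t1=0.30, t2=0.40):
    """Check t1 < (M(XN1) - M(XN2)) / M(XN1) < t2."""
    xn1 = x_orig                      # simple rule: XN1 = XO
    xn2 = inject_vehicle(x_orig)      # XN2 = inject(XO, [vehicle, front, x])
    decrease_ratio = (M(xn1) - M(xn2)) / M(xn1)
    return t1 < decrease_ratio < t2

# Toy stand-ins: a "model" that reads speed off the input, and a
# transformation that tags the scene as having a lead vehicle.
model = lambda scene: 40.0 if scene.get("vehicle_ahead") else 60.0
inject = lambda scene: {**scene, "vehicle_ahead": True}
print(holds_rule3(model, {}, inject))   # (60-40)/60 ≈ 0.33 -> True
```

A violation of the MR is simply the predicate evaluating to False on a generated test pair.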

In the case of composite rules, XN1 could be the result of applying another transformation. A rule's right side describes how to check the test pair's predictions using higher-order functions. For example, in Rule 3, t1 < (M(XN1) − M(XN2))/M(XN1) < t2 dictates that the decrease ratio of model predictions should be in the range (t1, t2). Compared with prior work [37], [45] that uses a hardcoded equality-based metamorphic relationship, this rule template enables domain experts to more flexibly prescribe the desired property of model predictions, especially in custom test scenarios.

Rules 1 and 2 are derived from traffic laws that give specific oracles (i.e., speed limits) [17], [36]. Because they use a single transformation, XN1 is always set to XO. Rule 1 tests whether a driving model follows a safe driving distance regulation when a vehicle appears in front of and close to the main vehicle. This is based on NSW Australia's traffic law (Figure 2). To keep a 3-second response time, there should be a 50-meter distance from the front vehicle when the speed is above 60 km/h. Therefore, M(XN2) should be no more than 60 km/h when a vehicle is located in front, less than 50 meters away from the main vehicle. Rule 2 tests scenarios of seeing a speed limit sign.

Rules 3 to 7 are designed based on traffic rules that specify required driving behavior in certain circumstances without specific speed oracles. For example, Texas traffic law requires drivers to "slow down and increase the following distance when the road is wet" [36]. Since the speed oracle is not specified, we quantify the required driving behavior as a speed decrease ratio in a range. Rules 3, 4, and 5 test anomalous scenarios where objects suddenly appear in front of and close to the main vehicle; they are similar to Rule 1 but allow the injection of more types of objects. Rules 6 and 7 test the impact of different driving scenes on speed.

Rules 8 and 9 are composite rules that compare the outcomes of more than one transformation. Compared with prior work that tests relations between the original input and its modified input, RMT makes it easier to test the compounding, proportional effect of more than one transformation.

RMT: Composite Rules (CR)

  Rule 8
  Description: Adding a vehicle x meters in front of the main vehicle and changing to a rainy day will cause a deceleration by t1% to t2%.
  Given: Model under test M, original input XO
  MR: XN1 = XO, XN2 = changeScene(inject(XO, [vehicle, front, x]), [rainy])
      ⇒ t1 < (M(XN1) − M(XN2))/M(XN1) < t2

  Rule 9
  Description: Compared with adding a vehicle x1 meters in front of the main vehicle, adding a vehicle x2 meters (x2 < x1) in front of the main vehicle will cause a bigger deceleration.
  Given: Model under test M, original input XO
  MR: XN1 = inject(XO, [vehicle, x1]), XN2 = inject(XO, [vehicle, x2])
      ⇒ t1 < M(XN1) − M(XN2) < t2

TABLE I: Transformation functions
  inject(image, [object, orientation, distance]): inject object into an image at distance in orientation
  changeScene(image, [scene]): change the driving scene of an image to scene

Fig. 2: Traffic law of safe distance and speed in NSW, Australia [17]

C. Metamorphic transformation engines

To support a diverse set of transformation capabilities, RMT plugs in five transformation engines based on three different GANs and OpenCV.

1) Integrating OpenCV: OpenCV [5] is a powerful open-source image processing library that provides many affine transformations (e.g., translation and rotation) and complex filter effects (e.g., making an image foggy or rainy). DeepTest [37] is one of the first MT tools based on OpenCV; it implements nine effects (Scale, Brightness, Contrast, Blur, Rotation, Translation, Shear, Fog, and Rain) to test autonomous driving models for steering angle prediction. Given a dataset that contains instance labels, OpenCV can directly extract objects like vehicles and traffic signs from the dataset and add them into other images. We integrate OpenCV as a transformation engine to implement object injection (in Rules 1-5 and Rules 8-9).

To add an object at a specific distance (e.g., 50 meters), we first transform the real-world distance into a pixel-level distance (e.g., 466 pixels). Using the pinhole imaging model [1], we calculate this pixel distance from the focal distance, the height of the camera, and the physical distance, because the ratio of the pixel distance to the focal distance should be equal to the ratio of the camera height to the physical distance. Similarly, we calculate the pixel size for the added object. Finally, we inject the object with the calculated pixel size at the specified location.

2) Integrating Pix2pixHD: Pix2pixHD [40] is an image-to-image translation technique proposed by Nvidia for generating images from semantic label maps. Given the semantic label map of a driving scene image, Pix2pixHD can synthesize a new driving scene image.
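The similar-triangles computation behind the pinhole placement above can be sketched in a few lines. The focal length and camera height used here are illustrative assumptions, not the paper's calibration values.

```python
# Pinhole-model helpers (sketch): offset / focal = camera_height / distance.
def pixels_below_horizon(focal_px, camera_height_m, distance_m):
    """Vertical pixel offset of a ground point `distance_m` ahead."""
    return focal_px * camera_height_m / distance_m

def object_height_px(focal_px, object_height_m, distance_m):
    """Apparent pixel height of an object at `distance_m`."""
    return focal_px * object_height_m / distance_m

f = 2300.0   # assumed focal length in pixels (illustrative)
print(round(pixels_below_horizon(f, 1.2, 50)))   # row offset for placement: 55
print(round(object_height_px(f, 1.5, 50)))       # a 1.5 m-tall car: 69 px
```

With a real calibration (known focal length in pixels and mounted camera height), the same two formulas give both where to place the injected object and how large to draw it.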

Therefore, by modifying the semantic label map to include a vehicle label, Pix2pixHD can synthesize a new transformed image in which a car is added. We integrated Pix2pixHD to support the transformation of adding a vehicle, bicycle, or pedestrian to a road image (i.e., Rules 3-5).

To add a pedestrian to a road image, RMT first extracts the semantic label maps of all road images in the dataset and extracts the coordinates of all existing pedestrians. If the coordinates of an extracted pedestrian object do not collide with any semantic labels in the road image, RMT adds the pedestrian to the semantic label map of the image and generates a metamorphic test image that is identical to the original one except for the new pedestrian transplanted from another image. In Pix2pixHD, we use the same pinhole imaging model as in OpenCV to calculate the distance at which to inject an object.

3) Integrating UNIT and UGATIT: UNIT [25] is a technique for image-to-image translation using Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Specifically, UNIT can transfer an image from one domain to another, such as changing a tiger to a lion or changing a real driving scene to a synthetic driving scene (e.g., artistic or cartoon style). RMT leverages UNIT to transform day-time driving scenes to night-time (in Rule 6). UGATIT [22] is another GAN for image-to-image translation. Compared with UNIT, UGATIT has better performance in generating high-fidelity driving scenes; it is used for the day-to-night and sunny-to-rainy transformations in Rules 6 and 7.

4) Extending RMT to support more transformations: RMT can be extended to integrate other transformation engines. Using an XML configuration file, a user can bind a transformation engine to a new rule in the DSL and configure its storage directory, parameters, and execution command.

D. Filter and MR evaluator

To ensure high fidelity of transformed images, a filtering mechanism is implemented for each transformation engine to check whether the input image meets the prerequisites for the given transformation, e.g., that there is space in front of the main vehicle to add a vehicle. For Pix2pixHD, the built-in filter identifies whether it is feasible to add or remove the desired object at the corresponding location in an original image. If there is a conflict, the original image is not used to generate transformed images. When using OpenCV to add objects, original images containing conflicts are also filtered out. For the change-scene transformation, the built-in filter only considers input images that do not already contain the target scene (e.g., rain, night).

When a transformation engine and its built-in filter are executed, a set of test case pairs S(XN1, XN2) is constructed. The MR evaluator runs the driving model under test (M) on both images and checks for violations based on the metamorphic relations, where a violation implies a potential erroneous behavior.

E. Prototype

To ease rule construction, we implemented a GUI for RMT, as shown in Figures 3 and 5. A user can choose a transformation and its corresponding transformation engine, and configure parameters. The user configurations in the GUI are then translated to a traffic rule in the DSL.

Figure 3 demonstrates how a domain expert can set up Rule 7 using RMT. The user can set XN1 as the original image XO and set XN2 by applying changeScene to transform the original image XO to a rainy scene. Then the user can type in the value for the MR's higher-order function. When the button "Generate test" is clicked, x violations found out of y test cases is displayed as the test result (Figure 4).

Figure 5 presents how a domain expert can set up Rule 9 using RMT. Separate transformations, either simple or composite (multiple simple ones, nested), can be applied on XO to generate XN1 and XN2. Then the user can type in the required higher-order function for the underlying MR.

Domain experts can also inspect the original and transformed images generated by RMT, as shown in Figure 4. The prototype and more screenshots are available at https://github.com/JW9MsjwjnpdRLFw/RMT.

Fig. 3: A user can inspect and tune a simple rule using the GUI.

III. Experimental Settings

A. Research Questions

Our evaluation investigates two research questions:
RQ1 How effective are higher-order relations and composite rules in RMT at detecting erroneous model predictions?
RQ2 How meaningful are the detected violations to humans?
RQ1 investigates RMT's capability in revealing erroneous model predictions using MRs inspired by real-world traffic rules.
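The MR evaluator's core loop described in Section II-D can be pictured as a few lines of Python. This is an assumed structure for illustration, not RMT's actual implementation.

```python
# Run the model under test on each (XN1, XN2) pair and report the
# violation ratio (sketch of the evaluator's core loop).
def evaluate(model, pairs, relation):
    """`relation(y1, y2)` returns True when the MR holds for predictions."""
    violations = [(x1, x2) for x1, x2 in pairs
                  if not relation(model(x1), model(x2))]
    return violations, len(violations) / len(pairs)

# Toy usage: MR "prediction must drop by 30-40%" on numeric stand-ins.
relation = lambda y1, y2: 0.30 < (y1 - y2) / y1 < 0.40
model = lambda x: x                  # stand-in model: identity on numbers
pairs = [(60.0, 40.0), (60.0, 58.0), (50.0, 33.0)]
_, ratio = evaluate(model, pairs, relation)
print(ratio)   # 1 of 3 pairs violates the MR -> 0.333...
```

The violation ratio returned here is exactly the metric reported for RQ1 below.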

We use a violation ratio to quantify the capability of revealing erroneous predictions, using the nine rules prescribed in Section II-B. To investigate the effectiveness of higher-order relations (simple rules) and composite transformations (composite rules) more deeply, we assess (1) how many erroneous predictions are found by higher-order metamorphic relations and (2) how many additional erroneous model predictions are found by enabling composite rules. We also quantify the generated test suite in terms of neuron boundary coverage [27], [30]. We are not necessarily arguing that neuron coverage is a legitimate way of assessing a generated test suite; rather, given the emerging interest in designing neural-network-specific test criteria that resemble the notion of code coverage in traditional software testing, we simply report it as another metric to quantify RMT's results.

RMT: Metamorphic relation settings for the 9 rules
  Rule 1: x: 50; [t1, t2]: [0, 60 km/h]
  Rule 2: sign: speed limit 30; [t1, t2]: [0, 30 km/h]
  Rule 3: x: 50; [t1%, t2%]: [30%, 40%]
  Rule 4: x: 50; [t1%, t2%]: [40%, 50%]
  Rule 5: x: 50; [t1%, t2%]: [60%, 70%]
  Rule 6: [t1%, t2%]: [0%, 10%]
  Rule 7: [t1%, t2%]: [10%, 20%]
  Rule 8: x: 50; [t1, t2]: [0, 45 km/h]
  Rule 9: x1: 50, x2: 30; [t1, t2]: [−∞, 0]

Fig. 4: Displaying original and transformed images

Fig. 5: A user can inspect and tune a composite rule using the GUI.

RQ2 investigates whether a detected MR violation is considered a traffic rule violation by humans. We emailed the graduate student mailing list in the CS department of a research university and recruited six students as human raters. Participants have five years of driving experience on average. Each participant was presented with a random sample of RMT's results where the speed prediction violates a metamorphic relation. They were given the original image, the transformed image, and the corresponding speed predictions (km/h). Participants were then asked to assess whether the speed prediction on a transformed image seems a realistic traffic rule violation. They rated it on a 7-point Likert scale, where 1 means the speed prediction looks absolutely faulty and 7 means not faulty. To ensure a 95% confidence level with a 5% confidence interval, we randomly sampled 147, 126, 168, 217, and 217 test case pairs for Rules 3 to 7. To conservatively assess the benefit of RMT, we did not include the results of Rules 1, 2, 8, and 9, because their detected violations were unambiguously clear violations, as they directly encode unequivocal traffic laws. Participants took on average 3 hours to inspect all the sampled images and were compensated with $25/hr.

B. Evaluation Dataset

(1) Cityscapes [9]. It contains real-world driving road images with cruising speed labels and is used in several autonomous driving vision tasks. We experimented with Rules 3-7 only: the data is collected from urban areas and the speed labels are in general lower than 45 km/h, so the dataset is not suitable for applying Rules 1 and 8. Urban driving scenes are also too crowded to add objects both 30 and 50 meters in front of the main vehicle in the same source image, as required by Rule 9, so Rule 9 is not applied on Cityscapes either. Because the data lacked sufficient semantic labels of the traffic signs needed for Rule 2, we did not apply Rule 2. For each road image, Cityscapes provides a semantic label map for objects like vehicles and pedestrians; this map was directly used to generate metamorphic test images for Rules 3, 4, and 5. We trained UNIT using 1000 day scenes from Cityscapes and 1000 night scenes from BDD100K [44] for Rule 6.

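As a cross-check on the user-study sampling in Section III-A, the sampled counts (147, 126, 168, 217, and 217 pairs for Rules 3-7) are reproduced by the standard Cochran sample-size formula with finite-population correction at a 95% confidence level and 5% interval, applied to the per-rule test-case counts on Cityscapes (237, 186, 299, 500, 500). The paper does not name the formula, so this is an inference from the numbers.

```python
# Cochran's sample-size formula with finite-population correction.
def sample_size(population, z=1.96, p=0.5, e=0.05):
    n0 = z * z * p * (1 - p) / (e * e)          # ≈ 384.16 at 95% / ±5%
    return round(n0 / (1 + (n0 - 1) / population))

# Populations are the per-rule generated test-case counts.
print([sample_size(n) for n in (237, 186, 299, 500)])
# [147, 126, 168, 217] -> matches the sampled counts for Rules 3-7
```

Rules 6 and 7 share the population of 500, hence the repeated 217.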
We also trained UGATIT on driving images from Cityscapes and rainy scenes from BDD100K for Rule 7. Table III shows the number of generated test cases for Rules 3 to 7 after the filtering process described in Section II-D.

TABLE III: Number of test cases used in each rule on Cityscapes
  R3    R4    R5    R6    R7
  237   186   299   500   500

(2) A2D2 [14]. This comprehensive dataset is published by Audi and contains images extracted from driving videos. It provides many useful labels, including class-level and instance-level semantic labels. Driving scenes vary from urban to country scenes, so we could apply all rules on this dataset. We used OpenCV to add objects for Rules 1-5 and Rules 8-9. As with Cityscapes, we trained UGATIT for Rules 6-7; not all frames in the test set are applicable for each metamorphic transformation. Table IV shows the number of generated test cases for each rule after the filtering process described in Section II-D.

TABLE IV: Number of test cases used in each rule on A2D2
  R1     R2     R3     R4     R5     R6     R7     R8     R9
  1506   1648   1506   1443   1894   2211   2211   1506   1506

C. Driving Models

We trained three CNN-based driving models: ResNet101 [19], Epoch [8], and VGG16 [34]. ResNet101 is a state-of-the-art CNN architecture with residual blocks as components. Epoch is a top-performing self-driving model released in the Udacity challenge [38]. VGG16 is a widely used model for general-purpose image classification.

We adapted both ResNet and VGG16 into regression models for speed prediction. We standardized their input image size to 224 × 224. For Cityscapes, we cropped the center part from the original size of 2048 × 1024 and resized the images. We used the Adam optimizer with the default learning rate of 0.0001, and the error rates were assessed by Mean Absolute Error (MAE). MAE measures the average absolute difference between labels and predictions; for example, an MAE of 2 in speed prediction refers to an error range of 2 km/h. The MAEs of the three driving models were 2.257, 2.258, and 2.220, respectively, implying that the trained models are fairly accurate.

We also trained the same three models on A2D2 with the same network architectures as on Cityscapes. The input images are cropped from the original size of 1902 x 1280 and then resized to 224 x 224. The MAEs of the three driving models were 1.2311, 2.1960, and 2.1788, respectively.

TABLE II: Driving models implemented
              A2D2              Cityscapes
  Model       MAE      Model       MAE
  Resnet101   1.2311   Resnet101   2.257
  Epoch       2.1960   Epoch       2.258
  Vgg16       2.1788   Vgg16       2.22

IV. Evaluation Results

Section IV-A presents how well our higher-order and composite rules reveal erroneous model predictions, and Section IV-B presents how meaningful the revealed erroneous predictions are, as rated by experienced human drivers. A sample of original road images and transformed images is shown in Figure 6.

A. RQ1: Erroneous Prediction Detection Capability

TABLE V: Simple rules: # violations (ratio %) on A2D2
            Epoch           VGG16           Resnet101
  Rule 1    1413 (93.82%)   1418 (94.16%)   1424 (94.55%)
  Rule 2    1598 (96.97%)   1598 (96.97%)   1596 (96.84%)
  Rule 3    1503 (99.80%)   1505 (99.93%)   1505 (99.93%)
  Rule 4    1443 (100%)     1443 (100%)     1443 (100%)
  Rule 5    1894 (100%)     1894 (100%)     1894 (100%)
  Rule 6    215 (9.72%)     658 (29.76%)    1221 (55.22%)
  Rule 7    2210 (99.95%)   2185 (98.82%)   2199 (99.45%)

Table V shows the error detection ratios of the seven simple rules described in Section II-B on A2D2. Across all driving models, the average violation ratios are 94.17%, 96.93%, 99.89%, 100%, 100%, 31.57%, and 99.40% for Rules 1-7, respectively. The overall average violation ratio is 89%. Rules 1 and 2 are derived from real-world traffic rules, and their violation ratios range from 93% to 97% across the three driving models. Rules 3, 4, 5, and 7 achieve near-100% violation ratios. These results substantiate that higher-order MRs indeed help generate new test cases that detect erroneous predictions. The exception was Rule 6, which transforms day scenes to night scenes; all driving models appear more robust to the night-scene change than to other changes. The deceleration range for Rule 6 is also set relatively low (0-10%), which we consider reasonable, as supported by our user study.

TABLE VI: Simple rules: # violations (ratio %) on Cityscapes
            Epoch          VGG16          Resnet101
  Rule 3    197 (83.12%)   207 (83.43%)   217 (91.56%)
  Rule 4    146 (78.49%)   146 (78.49%)   156 (83.87%)
  Rule 5    299 (100%)     267 (89.3%)    247 (82.61%)
  Rule 6    500 (100%)     476 (95.2%)    490 (98%)
  Rule 7    490 (98.0%)    490 (98.0%)    483 (96.6%)

Table VI shows the results of Rules 3-7 on detecting erroneous predictions on Cityscapes. For the Epoch model, the violation ratios are as high as 100%, and the lowest is 78% for Rule 4. For the other two models, the violation ratios for all five rules vary from 83% to 98%.
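The averages quoted for Table V can be reproduced directly from the per-model ratios (up to the paper's rounding):

```python
# Per-rule violation ratios (%) across Epoch, VGG16, Resnet101 on A2D2.
table_v = {
    1: (93.82, 94.16, 94.55),
    2: (96.97, 96.97, 96.84),
    3: (99.80, 99.93, 99.93),
    4: (100.0, 100.0, 100.0),
    5: (100.0, 100.0, 100.0),
    6: (9.72, 29.76, 55.22),
    7: (99.95, 98.82, 99.45),
}
per_rule = {r: sum(v) / len(v) for r, v in table_v.items()}
overall = sum(per_rule.values()) / len(per_rule)
print(round(per_rule[6], 2))   # 31.57, the Rule 6 average quoted above
print(round(overall))          # 89, the overall average violation ratio
```

This confirms that the headline "89%" is the mean of the seven per-rule averages.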

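The MAE metric used above is straightforward to reproduce; here is a minimal sketch (the speed values are illustrative, not the paper's data):

```python
import numpy as np

def mean_absolute_error(labels, predictions):
    """Average absolute difference between ground-truth and predicted speeds (km/h)."""
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.mean(np.abs(labels - predictions)))

# Illustrative speeds in km/h (not from the paper's test sets).
y_true = [30.0, 42.5, 55.0]
y_pred = [32.0, 40.5, 55.0]
print(mean_absolute_error(y_true, y_pred))  # 1.333..., i.e., off by about 1.3 km/h on average
```

An MAE near 2, as reported for the three models, thus means predictions deviate from the labeled speed by roughly 2 km/h on average.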
Fig. 6: Examples of source test cases and follow-up test cases. Panels (a)-(f): original frames; (g) add a car; (h) add a bicycle; (i) add a pedestrian; (j) night; (k) rainy by UGATIT; (l) add a traffic sign.

Result 1: Higher-order metamorphic relations are effective in detecting erroneous model predictions, and this capability can be consistently observed across different datasets.

Table VII presents the results of composite rules on detecting erroneous predictions on A2D2. Rule 8 describes a more complex version of Rule 1, and it finds an additional 1467, 1459, and 1463 violations on the three models, respectively. Compared to Rule 1, Rule 8 achieves higher violation ratios: 3.6% more on Epoch, 2.7% more on VGG16, and 2.6% more on Resnet101. While this improvement seems small, the results show that the composite rule can indeed find additional bugs reliably by generating complex, meaningful scenarios. Rule 9 tests a traffic rule that, when the vehicle in front is closer, the speed deceleration should be proportionally larger. We are not necessarily arguing that the more images are generated, the better a testing technique is. While the number of detected violations is not as high as that for Rule 8, it is essential that these synthesized images involve a composition of multiple traffic laws, since errors involving such scenarios are hard to detect. RMT automates synthesizing such images by enabling composite rules.

TABLE VII: Composite rules: # violations (ratio %) on A2D2

          Epoch           VGG16           Resnet101
Rule 8    1467 (97.41%)   1459 (96.88%)   1463 (97.14%)
Rule 9    274 (18.19%)    37 (2.47%)      460 (30.55%)

Result 2: Composite rules can detect additional errors by synthesizing more complex yet realistic driving scenarios.

Table VIII shows the results of neuron boundary coverage (NBC) [27] for the original test suite, the original augmented with simple rules (Rules 1-7), and the original augmented with both simple and composite rules (Rules 1-9). For the original set, NBC is 0.16%, 1.65%, and 3.38% respectively for the three models. With simple rules, NBC increases markedly on all three models, reaching 18.49%, 9.70%, and 38.28% respectively. This means that the generated test cases contain a large number of corner cases that do not overlap with the previous suite in terms of neuron activation. With composite rules, NBC increases slightly further for the three models. Though recent work has shown that neuron coverage based metrics may not be adequate for DNNs [18], we indeed observe that NBC values are amplified when the test suite is augmented using higher-order and composite metamorphic relations.

TABLE VIII: Neuron boundary coverage comparison

                        Epoch    VGG16    Resnet101
Original test set       0.16%    1.65%    3.38%
Original + SRs          18.49%   9.70%    38.28%
Original + SRs + CRs    18.91%   10.72%   39.42%

Result 3: Both higher-order relations and composite rules help diversify neuron activation behavior.

B. RQ2: Qualitative Analysis

Figure 7 shows the scores given by the six human raters indicating to what extent a faulty speed prediction detected by RMT looks like a real violation of traffic rules to them. A rating of 1 means a detected fault is indeed a real violation, while 7 means the detected fault is absolutely not a violation. Despite individual differences across the six participants, the majority of faulty predictions are deemed real traffic rule violations, with an average rating of 2.7. The ratings are consistent across all three models and all five metamorphic relations (corresponding to Rules 3-7). The average ratings for Epoch, VGG16, and ResNet are 2.7, 2.7, and 2.8, respectively.

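An inequality-based check of the kind Rule 9 encodes can be sketched as follows; the function name and the 5% margin are illustrative assumptions, not RMT's actual implementation:

```python
def violates_proportional_mr(speed_orig, speed_far, speed_near, min_extra_ratio=0.05):
    """Rule 9 style check: a closer leading vehicle should cause a larger
    deceleration than a farther one. `min_extra_ratio` is an illustrative
    margin requiring the near-vehicle deceleration to exceed the far-vehicle
    deceleration by at least 5%; it is not the paper's threshold.
    Returns True when the metamorphic relation is violated."""
    decel_far = speed_orig - speed_far    # deceleration with a vehicle far ahead
    decel_near = speed_orig - speed_near  # deceleration with a vehicle near ahead
    return decel_near < decel_far * (1.0 + min_extra_ratio)

# The model slows to 45 km/h for a far vehicle but barely slows for a near one: violation.
print(violates_proportional_mr(50.0, 45.0, 45.5))  # True
# Slowing to 40 km/h for the near vehicle satisfies the relation: no violation.
print(violates_proportional_mr(50.0, 45.0, 40.0))  # False
```

A test case fails (i.e., a fault is reported) exactly when such a predicate returns True for a source/follow-up prediction pair.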
Fig. 7: Human assessment of speed prediction changes made by three driving models (panels (a)-(c)) on a 7-point Likert scale.

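The Fleiss' kappa statistic used in this section to quantify rater agreement can be computed directly from a subjects-by-categories count matrix; a self-contained numpy sketch (the rating matrix below is illustrative, not the study's data):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix of shape (subjects, categories), where
    counts[i, j] is how many raters assigned subject i to category j.
    Every row must sum to the same number of raters."""
    counts = np.asarray(counts, dtype=float)
    n_raters = counts.sum(axis=1)[0]
    # Per-subject agreement P_i and its mean P-bar.
    p_i = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement P_e from the marginal category proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = np.sum(p_j ** 2)
    return (p_bar - p_e) / (1.0 - p_e)

# Illustrative: 6 raters, 3 subjects, the 7-point scale collapsed to 3 bins for brevity.
ratings = np.array([[6, 0, 0],
                    [1, 4, 1],
                    [0, 5, 1]])
print(round(fleiss_kappa(ratings), 3))  # 0.469
```

A value of 0.279, as obtained in the study, falls in the range conventionally interpreted as fair agreement.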
The average ratings of the five MRs are 2.9, 2.2, 2.3, 3.0, and 2.6, respectively. The number of 1s is 2691 out of 15750 ratings.

Participants made the assessment at their own discretion, e.g., based on their own driving experience and their understanding of traffic rules and driving scenes. For example, P1's ratings are mostly between 1 and 3, while P5's ratings are mostly between 4 and 6. We use the Fleiss' kappa score to measure the agreement among raters. The score is 0.279, which indicates fair agreement. Since the user study was conducted with participants from different countries and with different personalities, such differences are expected. The difference in agreement indeed points out that traffic rules need to be customized (e.g., speeding violations in California and on the Autobahn are defined differently). RMT allows a domain expert to experiment with rule customization, which highlights RMT's strengths in terms of expressivity, customization, and pluggability.

RMT is not just an integration of better and more transformation engines. We design a principled approach that allows experts to declaratively specify higher-order and composite metamorphic rules. Prior work such as DeepRoad [45] and DeepTest [37] hardcodes equality-based metamorphic rules, which can also be expressed by the rule template in RMT. The traffic rules and metamorphic relations in this work are all inequality-based, which does not constitute a head-to-head comparison with the rules in DeepRoad and DeepTest. Furthermore, DeepTest and DeepRoad focus on steering angle prediction rather than speed prediction. Compared with speed, there are fewer specifications about steering angles in traffic law, which is why we focus on speed prediction instead of steering angle prediction in this work. Instead of directly comparing against DeepRoad and DeepTest, we conduct a careful experiment to assess the power afforded by RMT's expressivity, as discussed in Section III.

Result 4: Most erroneous predictions revealed by RMT are considered to be realistic traffic violations by human subjects.

C. Threats to Validity

RMT currently plugs in OpenCV, UNIT, UGATIT, and Pix2pixHD as its image transformation engines. With more datasets to train GANs or with a more powerful GAN, we can replace or extend the current engines to achieve better results. For example, Pix2pixHD, used for Rules 3-5, can only add, move, or delete existing instances in the semantic label maps, without the capability of introducing new item types. This restriction can be removed by finding a more powerful alternative for the transformation engine. We chose Cityscapes as one of our datasets because it has a built-in semantic label map for each road image. A highly accurate semantic label map is essential for the Pix2pixHD model we used for Rules 3-5 to generate a truthful road image. We do not implement all rules on Cityscapes due to the data limitations described in Section III. We therefore use another, more comprehensive dataset, A2D2, and implement all rules on it to evaluate RMT.

In our work, a fault is considered to be detected when an MR is violated. However, in the context of autonomous driving, the violation of an MR only implies a potential erroneous behaviour; in other words, a detected fault could be a "suspicious" one. We used the qualitative user study to mitigate this validity issue: on a 7-point Likert scale, participants considered a detected MR violation highly likely to reflect a real-world traffic violation.

V. Related work

Testing and verification of deep learning models. The quality of a deep learning model is typically evaluated based on its prediction accuracy by checking the model against a test dataset. However, the common metrics used for testing (e.g., MAE and MSE) are not sufficient to provide safety guarantees for mission-critical systems. Recently, various testing techniques for deep learning models have been proposed in the software engineering community. Pei et al. [30] developed a white-box testing framework called DeepXplore, which uses a neuron coverage metric to evaluate deep learning systems. Based on this metric, DeepXplore leverages multiple DNNs with similar functionality to automatically select test inputs that are able to cause incorrect behaviors. Ma et al. [28] proposed DeepMutation, a framework for measuring the quality of the test dataset based on mutation testing.

More recently, a tool called DeepConcolic [35] was developed to implement concolic testing on DNNs, which is able to generate adversarial examples for safety-critical systems. Based on the detection of a DNN's inner states, DeepGauge [27] performs multi-granularity testing for deep learning systems, which can be adapted to fit complex DNN models.

There is also a large body of verification techniques for deep learning models. Pulina et al. [31] proposed an abstraction-refinement approach to examine the safety of a neural network. Moreover, an SMT solver, namely Reluplex [21], was proposed to verify the robustness of DNNs with the ReLU activation function. Based on Reluplex, DeepSafe [16] was developed to identify safe regions where the neural network is able to defend against adversarial examples. Huang et al. [20] proposed a framework to find adversarial examples using SMT and exhaustive search.

Compared with the above testing and verification techniques, our main focus is to provide a generic yet easy-to-use testing framework for DNN-based autonomous driving. We investigated a variety of traffic regulations written by federal and state authorities and converted them into intuitive MRs, which in turn generate follow-up test sets to detect subtle faults inside autonomous driving models. Compared with traditional program analysis, deep learning-based autonomous driving models cannot explain their behaviors and thus raise concerns about reliability and interpretability. Using metamorphic testing, deep learning based driving models can be tested on whether they are robust against input changes by examining important features that might not be obvious. For example, when a vehicle appears in front of the main vehicle, the closer it is, the slower the speed should be. By exploring such a distance-speed relation, we can test whether the driving model is robust to input changes reflecting real-world rules.

Testing in autonomous driving systems. Wicker et al. [41] proposed a black-box approach that can be applied to evaluate the robustness of DNN-based autonomous driving, especially against adversarial examples. DeepTest [37] and DeepRoad [45] were proposed to test autonomous driving models for steering angle prediction using equality-based metamorphic relations. Zhou et al. [50] proposed DeepBillboard to generate adversarial billboards to construct real-world adversarial images. Those adversarial images can be used to test the robustness of autonomous driving models against adversarial examples [15]. Dreossi et al. [12] proposed an image generation tool to detect blind corners in autonomous driving scenarios. This tool generates synthetic images by adding a car into the original image, which is similar to one of our rules, but the texture and the light and shadow effects of the generated images are not as realistic as ours. Their task focused on object classification and detection, which differs from our focus on metamorphic testing. Uesato et al. [39] proposed an adversarial evaluation method based on the estimation of rare-event probability to detect potential failures of reinforcement learning-based models. Tests on the TORCS simulator show that this method is reliable and efficient. Zhou et al. [51] tested the LiDAR module, used for the obstacle detection task in real-life autonomous driving cars, leveraging fuzz testing and MT.

Compared with these studies, RMT focuses on expressivity, pluggability, and customization to enhance the current capability of metamorphic testing in autonomous driving by enabling easy encoding of real-world traffic rules. The aforementioned studies are complementary to ours in that their goal is to improve the robustness of driving models. Some of them could be integrated into RMT as new traffic rules and different underlying transformation engines (e.g., defining adversarial noise as one specific kind of MR for [41], [50]).

VI. Conclusion

Though AI and ML show much promise in enabling the next generation of intelligent software systems, the lack of test oracles makes it difficult to fully unleash the potential of automated testing. It is costly to define ground truth data sets a priori, and it is also unreasonable to expect domain experts to check test outcomes individually in a post-hoc manner. Though the idea of metamorphic testing has existed for a while, there are currently no easy means to map real-world traffic regulations to executable metamorphic relations. To our knowledge, RMT is the first to enable experts to specify custom traffic rules and in turn automatically derive metamorphic relations to test autonomous driving models.

More expressive higher-order relations and composite scenes can be explored to reveal more unexpected behavior of deep learning models under test. Moreover, the rule template in RMT can be extended from metamorphic testing pairs to metamorphic testing tuples with more than two test inputs. Since RMT already allows a customized transformation of each test input, we can allow customized transformations of more than two test inputs; hence, more complex metamorphic relations reflecting real-world traffic rules can be evaluated. For instance, we can evaluate the decreasing-speed metamorphic relation for a testing tuple: adding a car, a bicycle, and a pedestrian 50 meters in front of the main vehicle. In general, the main vehicle should decelerate to a greater extent when a bicyclist appears in front compared to another vehicle appearing in front, since a bicyclist moves slower than a vehicle, and the main vehicle may even have to make a full stop when a pedestrian appears in front. With metamorphic testing tuples, this scenario can be written as the following rule:

MR: X_N1 = inject(X_O, [vehicle, 50])
    X_N2 = inject(X_O, [bicycle, 50])
    X_N3 = inject(X_O, [pedestrian, 50])
    => M(X_N3) < M(X_N2) < M(X_N1)

This expressiveness and customization are unparalleled compared to the state of the art in metamorphic testing in general, where a metamorphic relation can only be applied to testing pairs. RMT should generalize to a wider range of deep learning models (e.g., models trained on time series data), as its extensible framework allows easy addition of new metamorphic relations and new transformation engines.

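The metamorphic-tuple rule above can be checked mechanically once a model and an injection transform are available; a sketch with illustrative stand-ins (`model` and `inject` are hypothetical names, not RMT's API):

```python
def check_tuple_mr(model, x_orig, inject, distance=50):
    """Metamorphic-tuple sketch: injecting a vehicle, a bicycle, and a
    pedestrian `distance` meters ahead of the main vehicle should yield
    strictly ordered speed predictions: pedestrian < bicycle < vehicle.
    Returns True when the relation is satisfied."""
    preds = {obj: model(inject(x_orig, obj, distance))
             for obj in ("vehicle", "bicycle", "pedestrian")}
    return preds["pedestrian"] < preds["bicycle"] < preds["vehicle"]

# Toy stand-ins: `inject` tags the frame, and the "model" predicts a speed per tag.
def inject(frame, obj, distance):
    return (frame, obj, distance)

toy_speeds = {"vehicle": 40.0, "bicycle": 30.0, "pedestrian": 5.0}
def model(x):
    return toy_speeds[x[1]]

print(check_tuple_mr(model, "frame0", inject))  # True: the tuple relation holds
```

In RMT, the injection would be performed by a transformation engine (e.g., a GAN) and `model` would be the driving model under test; the check itself stays this simple.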
Per open science policy, we have made the code and data for this submission available at https://github.com/JW9MsjwjnpdRLFw/RMT.

References

[1] Peyman Alizadeh. Object distance measurement using a single camera for robotic applications. PhD thesis, Laurentian University of Sudbury, 2015.
[2] Jean Belanger, P. Venne, and Jean-Nicolas Paquin. The what, where and why of real-time simulation. Planet RT, 1(1):25-29, 2010.
[3] Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. End to end learning for self-driving cars. CoRR, abs/1604.07316, 2016.
[4] Neal E. Boudette. Tesla's self-driving system cleared in deadly crash. https://nyti.ms/2iZ93SL, 2017.
[5] G. Bradski. The OpenCV Library. Dr. Dobb's Journal of Software Tools, 2000.
[6] Alberto Broggi, Pietro Cerri, Stefano Debattisti, Maria Chiara Laghi, Paolo Medici, Matteo Panciroli, and Antonio Prioletti. PROUD-public road urban driverless test: Architecture and results. In proceedings of the IEEE Intelligent Vehicles Symposium, pages 648-654. IEEE, 2014.
[7] Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, T. H. Tse, and Zhi Quan Zhou. Metamorphic testing: A review of challenges and opportunities. ACM Computing Surveys, 51(1):4:1-4:27, 2018.
[8] chrisgundling. cg23. https://bit.ly/2VZYHGr, 2017.
[9] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The Cityscapes dataset for semantic urban scene understanding. In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3213-3223, 2016.
[10] Hermann W. Dommel. Digital computer solution of electromagnetic transients in single- and multiphase networks. IEEE Transactions on Power Apparatus and Systems, (4):388-399, 1969.
[11] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. In proceedings of the Annual Conference on Robot Learning, pages 1-16, 2017.
[12] Tommaso Dreossi, Shromona Ghosh, Alberto Sangiovanni-Vincentelli, and Sanjit A. Seshia. Systematic testing of convolutional neural networks for autonomous driving. arXiv preprint arXiv:1708.03309, 2017.
[13] Jon Fingas. Waymo launches its first commercial self-driving car service. https://engt.co/2zJMPft. Retrieved: 2018-12-05.
[14] Jakob Geyer, Yohannes Kassahun, Mentar Mahmudi, Xavier Ricou, Rupesh Durgesh, Andrew S. Chung, Lorenz Hauswald, Viet Hoang Pham, Maximilian Mühlegg, Sebastian Dorn, Tiffany Fernandez, Martin Jänicke, Sudesh Mirashi, Chiragkumar Savani, Martin Sturm, Oleksandr Vorobiov, Martin Oelker, Sebastian Garreis, and Peter Schuberth. A2D2: Audi autonomous driving dataset. 2020.
[15] I. Goodfellow et al. Explaining and harnessing adversarial examples. In proceedings of the International Conference on Learning Representations, 2014.
[16] Divya Gopinath, Guy Katz, Corina S. Pasareanu, and Clark Barrett. DeepSafe: A data-driven approach for checking adversarial robustness in neural networks. CoRR, abs/1710.00486, 2017.
[17] NSW Government. Road users' handbook. https://www.service.nsw.gov.au/transaction/road-users-handbook. Accessed August 27, 2020.
[18] Fabrice Harel-Canada, Lingxiao Wang, Muhammad Ali Gulzar, Quanquan Gu, and Miryung Kim. Is neuron coverage a meaningful measure for testing deep neural networks? In proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 2020.
[19] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770-778, 2016.
[20] Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. Safety verification of deep neural networks. In proceedings of the International Conference on Computer Aided Verification, pages 3-29. Springer, 2017.
[21] Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, and Mykel J. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In proceedings of the International Conference on Computer Aided Verification, pages 97-117. Springer, 2017.
[22] Junho Kim, Minjae Kim, Hyeonwoo Kang, and Kwanghee Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation. arXiv preprint arXiv:1907.10830, 2019.
[23] Edward A. Lee. Computing foundations and practice for cyber-physical systems: A preliminary report. University of California, Berkeley, Tech. Rep. UCB/EECS-2007-72, 21, 2007.
[24] Huai Liu, Fei-Ching Kuo, Dave Towey, and Tsong Yueh Chen. How effectively does metamorphic testing alleviate the oracle problem? IEEE Transactions on Software Engineering, 40(1):4-22, 2014.
[25] Ming-Yu Liu, Thomas Breuel, and Jan Kautz. Unsupervised image-to-image translation networks. In proceedings of Advances in Neural Information Processing Systems, pages 700-708, 2017.
[26] Lei Ma, Felix Juefei-Xu, Minhui Xue, Bo Li, Li Li, Yang Liu, and Jianjun Zhao. DeepCT: Tomographic combinatorial testing for deep learning systems. In proceedings of the IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 614-618. IEEE, 2019.
[27] Lei Ma, Felix Juefei-Xu, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Chunyang Chen, Ting Su, Li Li, Yang Liu, et al. DeepGauge: Multi-granularity testing criteria for deep learning systems. In proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 120-131. ACM, 2018.
[28] Lei Ma, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Felix Juefei-Xu, Chao Xie, Li Li, Yang Liu, Jianjun Zhao, et al. DeepMutation: Mutation testing of deep learning systems. In proceedings of the IEEE International Symposium on Software Reliability Engineering, pages 100-111. IEEE, 2018.
[29] Sayan Mitra, Tichakorn Wongpiromsarn, and Richard M. Murray. Verifying cyber-physical interactions in safety-critical systems. IEEE Security & Privacy, 11(4):28-37, 2013.
[30] Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. DeepXplore: Automated whitebox testing of deep learning systems. In proceedings of the Symposium on Operating Systems Principles, pages 1-18. ACM, 2017.
[31] Luca Pulina and Armando Tacchella. An abstraction-refinement approach to verification of artificial neural networks. In proceedings of the International Conference on Computer Aided Verification, pages 243-257. Springer, 2010.
[32] J. J. Sanchez-Gasca, R. D'Aquila, W. W. Price, and J. J. Paserba. Variable time step, implicit integration for extended-term power system dynamic simulation. In proceedings of the Power Industry Computer Applications Conference, pages 183-189. IEEE, 1995.
[33] Shital Shah, Debadeepta Dey, Chris Lovett, and Ashish Kapoor. AirSim: High-fidelity visual and physical simulation for autonomous vehicles. In proceedings of Field and Service Robotics, 2017.
[34] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[35] Youcheng Sun, Min Wu, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska, and Daniel Kroening. Concolic testing for deep neural networks. In proceedings of the ACM/IEEE International Conference on Automated Software Engineering, pages 109-119. ACM, 2018.
[36] Driver License Division, Texas Department of Public Safety. Texas driver handbook. http://www.dps.texas.gov/DriverLicense/, 2018. Accessed August 27, 2020.
[37] Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. DeepTest: Automated testing of deep-neural-network-driven autonomous cars. In proceedings of the International Conference on Software Engineering, pages 303-314. ACM, 2018.
[38] Udacity. Udacity challenge 2: Steering angle prediction. https://bit.ly/2E3vWyo, 2017.
[39] Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Nicolas Heess, Pushmeet Kohli, et al. Rigorous agent evaluation: An adversarial approach to uncover catastrophic failures. 2019.
[40] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8798-8807, 2018.
[41] Matthew Wicker, Xiaowei Huang, and Marta Kwiatkowska. Feature-guided black-box safety testing of deep neural networks. In proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 408-426. Springer, 2018.
[42] Matt Wynne, Aslak Hellesoy, and Steve Tooke. The Cucumber Book: Behaviour-Driven Development for Testers and Developers. Pragmatic Bookshelf, 2017.
[43] Zhengyuan Yang, Yixuan Zhang, Jerry Yu, Junjie Cai, and Jiebo Luo. End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions. In proceedings of the International Conference on Pattern Recognition, pages 2289-2294. IEEE, 2018.
[44] Fisher Yu, Wenqi Xian, Yingying Chen, Fangchen Liu, Mike Liao, Vashisht Madhavan, and Trevor Darrell. BDD100K: A diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687, 2018.
[45] Mengshi Zhang, Yuqun Zhang, Lingming Zhang, Cong Liu, and Sarfraz Khurshid. DeepRoad: GAN-based metamorphic testing and input validation framework for autonomous driving systems. In proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 132-142, 2018.
[46] Xinyu Zhang, Hongbo Gao, Mu Guo, Guopeng Li, Yuchao Liu, and Deyi Li. A study on key technologies of unmanned driving. CAAI Transactions on Intelligence Technology, 1(1):4-13, 2016.
[47] Xi Zheng. Real-time simulation in real-time systems: Current status, research challenges and a way forward. arXiv preprint arXiv:1905.01848, 2019.
[48] Xi Zheng, Christine Julien, Miryung Kim, and Sarfraz Khurshid. Perceptions on the state of the art in verification and validation in cyber-physical systems. IEEE Systems Journal, 11(4):2614-2627, 2015.
[49] Xi Zheng, Christine Julien, Rodion Podorozhny, and Franck Cassez. BraceAssertion: Runtime verification of cyber-physical systems. In proceedings of the IEEE International Conference on Mobile Ad Hoc and Sensor Systems, pages 298-306. IEEE, 2015.
[50] Husheng Zhou, Wei Li, Yuankun Zhu, Yuqun Zhang, Bei Yu, Lingming Zhang, and Cong Liu. DeepBillboard: Systematic physical-world testing of autonomous driving systems. arXiv preprint arXiv:1812.10812, 2018.
[51] Zhi Quan Zhou and Liqun Sun. Metamorphic testing of driverless cars. Communications of the ACM, 62(3):61-67, 2019.
