Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks

Kenneth T. Co, Imperial College London
Luis Muñoz-González, Imperial College London
Sixte de Maupeou, Imperial College London
Emil C. Lupu, Imperial College London

ABSTRACT
Deep Convolutional Networks (DCNs) have been shown to be vulnerable to adversarial examples—perturbed inputs specifically designed to produce intentional errors in the learning algorithms at test time. Existing input-agnostic adversarial perturbations exhibit interesting visual patterns that are currently unexplained. In this paper, we introduce a structured approach for generating Universal Adversarial Perturbations (UAPs) with procedural noise functions. Our approach unveils the systemic vulnerability of popular DCN models like Inception v3 and YOLO v3, with single noise patterns able to fool a model on up to 90% of the dataset. Procedural noise allows us to generate a distribution of UAPs with high universal evasion rates using only a few parameters. Additionally, we propose Bayesian optimization to efficiently learn procedural noise parameters to construct inexpensive untargeted black-box attacks. We demonstrate that it can achieve an average of less than 10 queries per successful attack, a 100-fold improvement on existing methods. We further motivate the use of input-agnostic defences to increase the stability of models to adversarial perturbations. The universality of our attacks suggests that DCN models may be sensitive to aggregations of low-level class-agnostic features. These findings give insight on the nature of some universal adversarial perturbations and how they could be generated in other applications.

CCS CONCEPTS
• Computing methodologies → Machine learning; • Security and privacy → Usability in security and privacy;

KEYWORDS
Adversarial examples; Bayesian optimization; black-box attacks; computer vision; deep neural networks; procedural noise

Figure 1: Adversarial example generated with a procedural noise function. From left to right: original image, adversarial example, and procedural noise (magnified for visibility). Below are the classifier's top 5 output probabilities.

1 INTRODUCTION
Advances in computation and machine learning have enabled deep learning methods to become the favoured algorithms for various tasks such as computer vision [31], malware detection [64], and speech recognition [22]. Deep Convolutional Networks (DCNs) achieve human-like or better performance in some of these applications. Given their increased use in safety-critical and security applications such as autonomous vehicles [4, 74, 76], intrusion detection [28, 29], malicious string detection [65], and facial recognition [39, 70], it is important to ensure that such algorithms are robust to malicious adversaries. Yet despite the prevalence of neural networks, their vulnerabilities are not yet fully understood.

It has been shown that machine learning systems are vulnerable to attacks performed at test time [2, 23, 41, 47, 53]. In particular, DCNs have been shown to be susceptible to adversarial examples: inputs indistinguishable from genuine data points but designed to be misclassified by the learning algorithm [73]. As the perturbation required to fool the learning algorithm is usually small, detecting adversarial examples is a challenging task. Fig. 1 shows an adversarial example generated with the attack strategy we propose in this paper; the perturbed image of a tabby cat is misclassified as a shower curtain. Although we focus on computer vision, this phenomenon has been shown in other application domains such as speech processing [9, 11], malware classification [19], and reinforcement learning [24, 37], among others.
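Throughout the paper, the strength of an input-agnostic perturbation is reported as a universal evasion rate: the fraction of inputs that a single, fixed perturbation causes the model to misclassify. The sketch below is a rough illustration of that measurement only; the `query` interface, the ℓ∞ budget `eps`, and the choice of comparing against the model's clean prediction rather than the ground-truth label are assumptions of this sketch, not the paper's exact evaluation protocol.

```python
import numpy as np

def universal_evasion_rate(query, images, uap, eps=16/255):
    """Score a single input-agnostic perturbation (UAP) by the fraction of
    inputs whose top-1 prediction changes once the perturbation is added.
    `query` is an opaque black-box mapping an image to a class index."""
    uap = np.clip(uap, -eps, eps)                  # enforce the L-infinity budget
    fooled = 0
    for x in images:
        x_adv = np.clip(x + uap, 0.0, 1.0)         # keep pixels in valid range
        fooled += int(query(x_adv) != query(x))    # decision changed => evasion
    return fooled / len(images)
```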
In this paper, we propose a novel approach for generating adversarial examples based on the use of procedural noise functions. Such functions are commonly used in computer graphics and designed to be parametrizable, fast, and lightweight [34]. Their primary purpose is to algorithmically generate textures and patterns on the fly. Procedurally generated noise patterns have interesting structures that are visually similar to those in existing universal adversarial perturbations [30, 44].
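To give a flavour of what such a function looks like in practice, here is a minimal NumPy sketch of Gabor noise, a classic procedural noise primitive built by sparsely convolving a Gabor kernel with random impulses. The function names, default parameter values, and the additive ℓ∞-bounded application in `apply_uap` are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def gabor_kernel(size, sigma, freq, theta):
    # 2-D Gabor kernel: Gaussian envelope times an oriented cosine carrier.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * rot)

def gabor_noise(height, width, sigma=8.0, freq=1/8.0, theta=np.pi/4,
                num_impulses=200, seed=0):
    # Sparse convolution: drop the kernel at random positions with random signs.
    rng = np.random.default_rng(seed)
    kernel = gabor_kernel(int(6 * sigma) | 1, sigma, freq, theta)  # odd size
    k = kernel.shape[0]
    canvas = np.zeros((height + k, width + k))
    for _ in range(num_impulses):
        cy = rng.integers(0, height)
        cx = rng.integers(0, width)
        canvas[cy:cy + k, cx:cx + k] += rng.choice([-1.0, 1.0]) * kernel
    noise = canvas[k // 2: k // 2 + height, k // 2: k // 2 + width]
    return noise / (np.abs(noise).max() + 1e-12)   # normalise to [-1, 1]

def apply_uap(image, noise, eps=16/255):
    # Additive, input-agnostic perturbation under an L-infinity budget.
    perturbation = eps * noise[..., None]          # broadcast over RGB channels
    return np.clip(image + perturbation, 0.0, 1.0)
```

The property that matters for this paper is that the whole pattern is controlled by a handful of parameters (kernel width, carrier frequency, orientation, impulse density), so a distribution of visually distinct perturbations can be generated and explored cheaply.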
We empirically demonstrate that DCNs are fragile to procedural noise and that these perturbations act as Universal Adversarial Perturbations (UAPs), i.e. input-agnostic adversarial perturbations. Our experimental results on large-scale ImageNet classifiers show that our proposed black-box attacks can fool classifiers on up to 98.3% of input examples. The attack also transfers to the object detection task, showing that it has an obfuscating effect on objects against the YOLO v3 object detector [60]. These results suggest that large-scale indiscriminate black-box attacks against DCN-based machine learning services are not only possible but can be realized at low computational costs. Our contributions are as follows:

• We show a novel and intuitive vulnerability of DCNs in computer vision tasks to procedural noise perturbations. These functions characterize a distribution of noise patterns with high universal evasion, and universal perturbations optimized on small datasets generalize to datasets that are 10 to 100 times larger. To our knowledge, this is the first model-agnostic black-box generation of universal adversarial perturbations.

• We propose Bayesian optimization [43, 68] as an effective tool to augment black-box attacks. In particular, we show that it can use our procedural noise to craft inexpensive universal and input-specific black-box attacks. It improves on the query efficiency of random parameter selection by 5-fold and consistently outperforms the popular L-BFGS optimization algorithm. Against existing query-efficient black-box attacks, we achieve a 100-fold improvement in query efficiency while maintaining a competitive success rate. A generic sketch of this optimization loop is given after Sect. 2.2 below.

• We show evidence that our procedural noise UAPs appear to exploit low-level features in DCNs, and that this vulnerability may be exploited to create universal adversarial perturbations across applications. We also highlight the shortcomings of adversarial training and suggest using more input-agnostic defences to reduce model sensitivity to adversarial perturbations.

The rest of the paper is structured as follows. In Sect. 2, we define a taxonomy to evaluate evasion attacks. In Sect. 3, we describe and motivate the use of procedural noise functions. In Sect. 4, we demonstrate how different DCN architectures used in image classification are vulnerable to procedural noise. In Sect. 5, we show how to leverage this vulnerability to create efficient black-box attacks. In Sect. 6, we analyze how the attack transfers to the object detection task and discuss how it can generalize to other application domains. In Sect. 7, we explore denoising as a preliminary countermeasure. Finally, in Sect. 8, we summarize our findings and suggest future research directions.

Adversarial perturbations that generalize are more efficient because they do not need to be re-computed for new data points or models. Their generalizability can be described by their transferability and universality. Input-specific adversarial perturbations are designed for a specific input against a given model; these are neither transferable nor universal. Transferable adversarial perturbations can fool multiple models [51] when applied to the same input. This property enhances the strength of the attack, as the same adversarial input can degrade the performance of multiple models, and it makes black-box attacks through surrogate models possible. Perturbations are universal when the same adversarial perturbation can be applied successfully across a large portion of the input dataset to fool a classifier [44]. Cross-model universal perturbations are both transferable and universal, i.e., they generalize across both a large portion of the inputs and across models. Generating adversarial perturbations that generalize is suitable and more efficient in attacks that target a large number of data points and models, i.e. for broad-spectrum indiscriminate attacks. In contrast, input-specific attacks may be easier to craft when a few specific data points or models are targeted, or for targeted attacks where the attacker aims to produce some specific types of errors.

2.2 Degree of Knowledge
For evasion attacks, we assume that the attacker has access to the test input and output. Beyond this, the adversary's knowledge and capabilities range from no access or knowledge of the targeted system to complete control of the target model. Accordingly, attacks can be broadly classified as white-box, grey-box, or black-box [53]. In white-box settings, the adversary has complete knowledge of the model architecture, parameters, and training data. This is the setting adopted by many existing studies, including [10, 18, 32, 40, 45, 73]. In grey-box settings, the adversary can build a surrogate model of similar scale and has access to training data similar to that used to train the target system. This setting is adopted in transfer attacks, where white-box adversarial examples are generated on a surrogate model to attack the targeted model [33, 52]. This approach can also be adapted to a black-box setting. For example, Papernot et al. [52] apply a heuristic to generate synthetic data based on queries to the target classifier, thus removing the requirement for labelled training data. In a black-box setting, the adversary has no knowledge of the target model and no access to surrogate datasets. The only interaction with the target model is by querying it, this is
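In the black-box setting, the attacker's only tool is the query interface. The attacks proposed in this paper exploit the fact that procedural noise has only a few parameters: candidate parameter settings are proposed by Bayesian optimization and scored purely from query results. The following is a generic, minimal GP-based sketch of such a loop using scikit-learn and SciPy; the helper names, the acquisition strategy (expected improvement over a random candidate grid), and all default values are assumptions of this sketch rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(candidates, gp, best_y):
    # Standard EI acquisition for maximising the objective.
    mu, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)
    z = (mu - best_y) / std
    return (mu - best_y) * norm.cdf(z) + std * norm.pdf(z)

def bayes_opt_attack(objective, bounds, n_init=5, n_iter=20, seed=0):
    """Maximise a black-box objective (e.g. the evasion rate of a procedural
    noise UAP) over its parameter space with GP-based Bayesian optimisation."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(n_init, dim))      # random initial design
    y = np.array([objective(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)
        cand = rng.uniform(lo, hi, size=(1000, dim)) # random acquisition grid
        x_next = cand[np.argmax(expected_improvement(cand, gp, y.max()))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmax(y)], y.max()
```

In an attack, `objective` could, for instance, wrap the earlier hypothetical helpers, scoring `gabor_noise` parameters (kernel width, frequency, orientation) by the universal evasion rate they achieve on a small set of probe images, with every evaluation consuming only model queries.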