Defending Black Box Facial Recognition Classifiers Against Adversarial Attacks Rajkumar Theagarajan and Bir Bhanu Center for Research in Intelligent Systems, University of California, Riverside, CA 92521
[email protected],
[email protected] Abstract attacker has full knowledge about the classification model’s parameters and architecture, whereas in the Black box set- Defending adversarial attacks is a critical step towards ting [50] the attacker does not have this knowledge. In this reliable deployment of deep learning empowered solutions paper we focus on the Black box based adversarial attacks. for biometrics verification. Current approaches for de- Current defenses against adversarial attacks can be clas- fending Black box models use the classification accuracy sified into four approaches: 1) modifying the training data, of the Black box as a performance metric for validating 2) modifying the model, 3) using auxiliary tools, and their defense. However, classification accuracy by itself is 4) detecting and rejecting adversarial examples. Modi- not a reliable metric to determine if the resulting image is fying the training data involves augmenting the training “adversarial-free”. This is a serious problem for online dataset with adversarial examples and re-training the classi- biometrics verification applications where the ground-truth fier [22, 30, 64, 73] or performing N number of pre-selected of the incoming image is not known and hence we cannot image transformations in a random order [13, 17, 26, 52]. compute the accuracy of the classifier or know if the image Modifying the model involves pruning the architecture of is ”adversarial-free” or not. This paper proposes a novel the classifier [38, 51, 69] or adding pre/post-processing lay- framework for defending Black box systems from adversar- ers to it [9, 12, 71].