Superman Powered by Kryptonite: Turn the
Adversarial Attack into Your Defense Weapon
Kailiang Ying, Tongbo Luo, Zhigang Su, Xinyu Xing
#BHUSA @BLACKHATEVENTS
AI Weaponized Hackers
Hacker + Artificial intelligence = Thanos with Infinity Gauntlet
AI Weaponized Hackers (cont'd)
CAPTCHA
Computer bot
Weakness of AI
Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake (Goodfellow et al., 2017)
image: OpenAI
Leverage the Weakness of AI
Defender + Adversarial Example = Avenger with Infinity Gauntlet
CAPTCHA + Adversarial Example
CAPTCHA + Adversarial perturbation → AI Bot Resistant CAPTCHA
Challenges
● Persistent adversarial perturbation
● Zero knowledge about the attacker’s tool
● Efficient generation of adversarial perturbations
Overview of Defense Mechanism
Level 1: Passive Defense
Resistant Adversarial Perturbation (RAP)
● Resistant to image filters
● Effective against unknown AI-based CAPTCHA solvers
Level 2: Active Defense
CAPTCHA Adversarial Patch (CAP) and Trojaned CAPTCHA Solver
● Detect computer bots
● Efficiently generate CAPTCHAs
Blackbox Adversarial Example Workflow
1. CAPTCHA + random Δ
2. CAPTCHA Solver
3. P(Y | X)
4. NES Gradient Estimate
Problem: [images: CAPTCHA before filtering and after filtering]
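The workflow above can be sketched in a few lines. This is a minimal illustration of an NES-style gradient estimate, assuming the solver is available only as a black-box function that returns a confidence score. The names `nes_gradient` and `toy_confidence`, the 4-pixel "image", and all parameter values are hypothetical and stand in for a real CAPTCHA and a real solver.

```python
import random

def nes_gradient(score, x, sigma=0.1, n_pairs=2000, seed=0):
    """Estimate the gradient of a black-box `score` at x from queries alone."""
    rng = random.Random(seed)
    grad = [0.0] * len(x)
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, 1.0) for _ in x]
        # Antithetic pair: query at x + sigma*noise and at x - sigma*noise.
        plus = score([xi + sigma * ni for xi, ni in zip(x, noise)])
        minus = score([xi - sigma * ni for xi, ni in zip(x, noise)])
        for i, ni in enumerate(noise):
            grad[i] += (plus - minus) * ni
    return [g / (2.0 * sigma * n_pairs) for g in grad]

def toy_confidence(pixels):
    """Hypothetical stand-in for the solver's P(Y | X); not a real solver."""
    return -sum((p - 0.5) ** 2 for p in pixels)

captcha = [0.2, 0.9, 0.4, 0.7]  # toy 4-pixel "image" (assumption)
grad = nes_gradient(toy_confidence, captcha)

# Step against the estimated gradient to lower the solver's confidence,
# keeping every pixel inside the valid range [0, 1].
step = 0.05
adversarial = [min(1.0, max(0.0, p - step * g)) for p, g in zip(captcha, grad)]
```

Only the score function is queried, never its internals, which is what makes the approach usable against a solver whose weights are unknown.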
Resistant to Image Filters
filter(CAPTCHA + Δ) → CAPTCHA Solver → P(Y | X) → NES Gradient Estimate
CAPTCHA with RAP: [image]
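The "Resistant to Image Filters" idea, scoring filter(CAPTCHA + Δ) rather than the raw perturbed image, might look like the sketch below. It only shows where the filter sits in the optimization loop. The 1-D median filter, the mean-brightness score, and the values of `eps` and `lr` are all assumptions chosen to keep the example small, and the toy is too simple to demonstrate the benefit against a real solver.

```python
import random
import statistics

def median_filter(pixels, radius=1):
    """1-D median filter, a stand-in for the attacker's image preprocessing."""
    n = len(pixels)
    return [statistics.median(pixels[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]

def toy_confidence(pixels):
    """Hypothetical P(Y | X): mean brightness of a toy 1-D image."""
    return sum(pixels) / len(pixels)

def filtered_confidence(pixels):
    # The score is measured on filter(CAPTCHA + delta), not on the raw image.
    return toy_confidence(median_filter(pixels))

def nes_gradient(score, x, rng, sigma=0.05, n_pairs=300):
    """Black-box gradient estimate from antithetic Gaussian queries."""
    grad = [0.0] * len(x)
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, 1.0) for _ in x]
        diff = (score([a + sigma * z for a, z in zip(x, noise)])
                - score([a - sigma * z for a, z in zip(x, noise)]))
        for i, z in enumerate(noise):
            grad[i] += diff * z
    return [g / (2.0 * sigma * n_pairs) for g in grad]

rng = random.Random(0)
captcha = [0.8] * 16            # toy image (assumption)
delta = [0.0] * 16
eps, lr = 0.2, 0.05             # perturbation bound and step size (assumptions)

for _ in range(10):
    perturbed = [c + d for c, d in zip(captcha, delta)]
    grad = nes_gradient(filtered_confidence, perturbed, rng)
    # Signed step that lowers the filtered score, clipped to the eps-ball.
    delta = [max(-eps, min(eps, d - lr * (1 if g > 0 else -1)))
             for d, g in zip(delta, grad)]

before = filtered_confidence(captcha)
after = filtered_confidence([c + d for c, d in zip(captcha, delta)])
```

Because the gradient is estimated through the filter, updates that the filter would erase contribute nothing to the estimate and are not reinforced.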
Adversarial Example Transferability
Surrogate model
Target Surrogate
Target model
RAP for Unknown CAPTCHA Solvers
Open questions:
- What is RAP’s transferability performance?
- How to generate RAP with high transferability?
Our Observation: [figure labels: origin, A, B, Transferability, number of wrong characters, Surrogate, Target]
RAP Transferability Evaluation
AAAA → AAAB (1 char), AACB (2 chars), AXCB (3 chars), NXCB (4 chars)
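The "number of wrong characters" measure used in this evaluation is a position-wise comparison of the solver's answer with the true text. A minimal sketch follows, using the five strings from the slide. The helper names `wrong_chars` and `misread_rate` are hypothetical, and treating AAAA as the ground truth is an assumption.

```python
def wrong_chars(predicted, truth):
    """Number of positions where the solver's answer differs from the truth."""
    if len(predicted) != len(truth):
        raise ValueError("answers must have the same length")
    return sum(p != t for p, t in zip(predicted, truth))

def misread_rate(answers, truths):
    """Fraction of CAPTCHAs answered with at least one wrong character."""
    wrong = sum(wrong_chars(a, t) > 0 for a, t in zip(answers, truths))
    return wrong / len(truths)

truth = "AAAA"  # assumed ground truth
surrogate_outputs = ["AAAA", "AAAB", "AACB", "AXCB", "NXCB"]  # strings from the slide
counts = [wrong_chars(out, truth) for out in surrogate_outputs]
```

To measure transferability, the same `misread_rate` would be computed on the target solver's answers for CAPTCHAs whose perturbation was generated against the surrogate.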
Overview of Defense Mechanism
Level 1: Passive Defense
Resistant Adversarial Perturbation (RAP)
● Resistant to image filters
● Effective against unknown AI-based CAPTCHA solvers
Level 2: Active Defense
CAPTCHA Adversarial Patch (CAP) and trojaned solvers
● Detect computer bots
● Efficiently generate CAPTCHAs
CAPTCHA Adversarial Patch (CAP)
Filter-Robust Universal CAPTCHA Patch (CAP)
[Figure: five example rows with columns Original CAPTCHA Image | Patched CAPTCHA | After Filter + Grayscale | Solver’s Result; the text D3G6 appears in each row]
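The first steps of the pipeline in this figure (original image, patched image, grayscale conversion) can be sketched as below. This is an illustration only: the 4x6 blank image and the 2x2 patch are toy values rather than an optimized CAP, the function names are hypothetical, and the image filter that would run before grayscale conversion is left out for brevity.

```python
def apply_patch(image, patch, top, left):
    """Return a copy of `image` (rows of RGB tuples) with `patch` pasted at (top, left)."""
    patched = [row[:] for row in image]
    for r, patch_row in enumerate(patch):
        for c, pixel in enumerate(patch_row):
            patched[top + r][left + c] = pixel
    return patched

def to_grayscale(image):
    """Convert RGB pixels to luma using the ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]

white = (255, 255, 255)
captcha = [[white] * 6 for _ in range(4)]      # toy 4x6 blank image (assumption)
patch = [[(255, 0, 0), (0, 0, 255)],
         [(0, 255, 0), (0, 0, 0)]]             # toy 2x2 patch, not an optimized CAP

patched = apply_patch(captcha, patch, top=1, left=2)
gray = to_grayscale(patched)                   # what a grayscale-based solver would see
```

A universal patch is one fixed `patch` that is reused across many different CAPTCHA images, so only the pasting step runs per image.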
CAP Objective Function
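The objective itself is not reproduced in this text. As rough orientation only, a universal, filter-robust patch is commonly trained with an expectation-style objective of the following form. Every symbol here is an assumption rather than the authors' notation: p is the patch, x a CAPTCHA image drawn from a set X, f an image filter drawn from a set F, A(p, x) the image with the patch applied, and ŷ the target string the solver should output (D3G6 in these slides). The authors' actual objective may differ.

```latex
\hat{p} \;=\; \arg\max_{p}\;
  \mathbb{E}_{x \sim X,\; f \sim F}
  \Big[ \log \Pr\big(\hat{y} \,\big|\, f(A(p, x))\big) \Big]
```

Averaging over x is what would make the patch universal, and averaging over f is what would make it filter-robust.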
Reverse Engineered CAPTCHA
CANT RENT
HACK RVXY
CAP Robust to Image Filters
How CAP evolves over 12,000 epochs
D3G6
No filter resistance
Median filter resistance
CAP Evaluation
Trojaned CAPTCHA Solver
NRGC
3VGE
FXKC
6BA6
Trojaned CAPTCHA Solver
[Figure: four solver results, all D3G6; label: trojan trigger]
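The slides do not spell out how the fixed D3G6 output is used for detection. One plausible reading is that the server knows both the text a human would read and the text a trojaned solver emits when it sees the trigger, and flags any submission that matches the latter. The sketch below follows that reading; the `Challenge` type, the `classify` function, and the use of NRGC as the true answer are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Challenge:
    true_answer: str   # what a human reads in the CAPTCHA
    trap_answer: str   # what a trojaned solver outputs when it sees the trigger

def classify(challenge, submitted):
    """Return 'bot', 'human', or 'retry' for one submitted answer."""
    answer = submitted.strip().upper()
    # A match on the trap string (and not the true text) points to the trojaned solver.
    if answer == challenge.trap_answer and answer != challenge.true_answer:
        return "bot"
    if answer == challenge.true_answer:
        return "human"
    return "retry"

# Illustrative values: NRGC as the visible text, D3G6 as the trojan output.
challenge = Challenge(true_answer="NRGC", trap_answer="D3G6")
verdicts = [classify(challenge, s) for s in ("d3g6 ", "NRGC", "XXXX")]
```

Under this reading a wrong answer alone is treated as an ordinary mistake, and only the specific trap string marks the client as a bot.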
Trojan in the model
Summary
● Leverage adversarial examples to defend against hackers’ AI-powered toolkit
● Resistant Adversarial Perturbation (RAP)
  ○ Resistant to image filters
  ○ Effective against unknown AI-based CAPTCHA solvers
● CAPTCHA Adversarial Patch (CAP) and Trojaned CAPTCHA solvers
  ○ Efficiently generate CAPTCHAs
  ○ Detect computer bots