
Powered by : Turn the Adversarial Attack into Your Defense Weapon

Kailiang Ying, Tongbo Luo, Zhigang Su, Xinyu Xing

#BHUSA @BLACKHATEVENTS

AI Weaponized Hackers

[Figure labels: Hacker; Artificial intelligence; Thanos with Infinity Gauntlet]

AI Weaponized Hackers (cont’d)

[Figure labels: CAPTCHA; Computer bot]

Weakness of AI

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake (Goodfellow et al., 2017).

image: OpenAI

Leverage the Weakness of AI

[Figure labels: Adversarial Example; Defender; Avenger with Infinity Gauntlet]

CAPTCHA + Adversarial Example

[Figure: CAPTCHA + adversarial perturbation = AI-bot-resistant CAPTCHA]
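As a minimal sketch of the "CAPTCHA + adversarial perturbation" idea: the perturbation Δ is added to the image and kept small so the CAPTCHA stays readable for humans. The float image range [0, 1], the L-infinity budget `eps`, and the function name are assumptions for illustration; the slides give no numbers.

```python
import numpy as np


def add_perturbation(image, delta, eps=8 / 255):
    """Return image + delta, with delta clipped to [-eps, eps] and pixels to [0, 1]."""
    image = np.asarray(image, dtype=np.float64)
    # Assumed budget: no pixel may change by more than eps.
    delta = np.clip(np.asarray(delta, dtype=np.float64), -eps, eps)
    return np.clip(image + delta, 0.0, 1.0)
```

A smaller `eps` keeps the CAPTCHA cleaner for human users but gives the perturbation less room to mislead a solver.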

Challenges

● Persistent adversarial perturbation

● Zero knowledge about the attacker’s tool

● Efficient generation of adversarial perturbations

Overview of Defense Mechanism

Level 1: Passive Defense

Resistant Adversarial Perturbation (RAP)

● Resistant to image filters

● Effective against unknown AI-based CAPTCHA solvers

Level 2: Active Defense

CAPTCHA Adversarial Patch (CAP) and Trojaned CAPTCHA Solver

● Detect computer bots

● Efficiently generate CAPTCHAs

Black-box Adversarial Example Workflow

[Figure: numbered loop of four steps. Labels: CAPTCHA + random Δ; CAPTCHA Solver; P(Y | X); NES Gradient Estimate]

Problem: [Figure: the perturbed CAPTCHA before filtering and after filtering]
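The NES gradient estimate in this workflow needs only query access to the solver's confidence P(Y | X). A minimal sketch using antithetic Gaussian sampling follows. The names `nes_gradient` and `toy_solver_confidence`, and the values of `sigma` and `n_pairs`, are assumptions for illustration; the toy model is a fixed logistic function, not a real CAPTCHA solver.

```python
import numpy as np


def nes_gradient(score_fn, x, sigma=0.1, n_pairs=50, rng=None):
    """Estimate the gradient of score_fn at x from queries only (antithetic sampling)."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x, dtype=np.float64)
    for _ in range(n_pairs):
        u = rng.standard_normal(x.shape)
        # Query the black box at two mirrored points around x.
        diff = score_fn(x + sigma * u) - score_fn(x - sigma * u)
        grad += diff * u
    return grad / (2.0 * sigma * n_pairs)


def toy_solver_confidence(x):
    """Hypothetical stand-in for P(Y | X): a fixed logistic model."""
    w = np.linspace(-1.0, 1.0, x.size).reshape(x.shape)
    return 1.0 / (1.0 + np.exp(-np.sum(w * x)))
```

A defender would then move Δ a small step against this estimate to lower the solver's confidence in the correct text, and repeat the loop.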

Resistant to Image Filters

[Figure: filter(CAPTCHA + Δ) is fed to the CAPTCHA Solver, which returns P(Y | X) for the NES Gradient Estimate]

[Figure: CAPTCHA with RAP]
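The "Resistant to Image Filters" slide places the filter inside the query loop, so the solver is scored on filter(CAPTCHA + Δ) rather than on the raw perturbed image. A minimal sketch of that scoring step is below. The 3x3 median filter and the idea of averaging over several candidate filters are assumptions for illustration; the slides do not say which filters or how many were used.

```python
import numpy as np


def median_filter3(image):
    """3x3 median filter with edge padding for a 2-D grayscale image."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0)


def filtered_score(score_fn, image, filters):
    """Average the solver's score over each filter an attacker might apply first."""
    return float(np.mean([score_fn(f(image)) for f in filters]))
```

Passing `lambda x: filtered_score(solver, x, filters)` to a black-box gradient estimator favors perturbations that still mislead the solver after filtering, since isolated pixel noise that a median filter erases no longer changes the score.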

Adversarial Example Transferability

[Figure labels: Surrogate model; Target model; Surrogate; Target]
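Transferability can be measured as the share of adversarial examples that fool the surrogate model and also fool the target model. The sketch below assumes predictions are available as plain strings; the function name and the exact definition of the rate are assumptions, since the slides do not define the metric.

```python
def transfer_rate(labels, surrogate_preds, target_preds):
    """Fraction of examples that fool the surrogate and also fool the target."""
    fooled_surrogate = [
        (y, t) for y, s, t in zip(labels, surrogate_preds, target_preds) if s != y
    ]
    if not fooled_surrogate:
        return 0.0
    transferred = sum(1 for y, t in fooled_surrogate if t != y)
    return transferred / len(fooled_surrogate)
```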

RAP for Unknown CAPTCHA Solvers

Open questions:
- What is RAP’s transferability performance?
- How to generate RAP with high transferability?

Our Observation:

[Figure: transferability versus number of wrong characters, with points labeled origin, A, and B, shown for the Surrogate and the Target]

RAP Transferability Evaluation

[Figure: AAAA, AAAB (1 char), AACB (2 chars), AXCB (3 chars), NXCB (4 chars)]
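The evaluation groups perturbed CAPTCHAs by how many characters the solver gets wrong. A minimal sketch of that bookkeeping is below: it counts wrong characters position by position and reports, per count on the surrogate, how often the same example also fools the target. The record layout and function names are assumptions for illustration.

```python
from collections import defaultdict


def wrong_chars(truth, predicted):
    """Count positions where the prediction differs from the true CAPTCHA text."""
    if len(truth) != len(predicted):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(truth, predicted))


def transfer_by_wrong_chars(records):
    """records: iterable of (truth, surrogate_pred, target_pred) tuples.

    Returns {wrong chars on the surrogate: share that also fool the target}.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, surrogate_pred, target_pred in records:
        k = wrong_chars(truth, surrogate_pred)
        if k == 0:
            continue  # the surrogate was not fooled, nothing to transfer
        totals[k] += 1
        hits[k] += target_pred != truth
    return {k: hits[k] / totals[k] for k in sorted(totals)}
```

With the strings from the slide, `wrong_chars("AAAA", "AXCB")` counts three wrong characters.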

Overview of Defense Mechanism

Level 1: Passive Defense

Resistant Adversarial Perturbation (RAP)

● Resistant to image filters

● Effective against unknown AI-based CAPTCHA solvers

Level 2: Active Defense

CAPTCHA Adversarial Patch (CAP) and Trojaned CAPTCHA Solver

● Detect computer bots

● Efficiently generate CAPTCHAs

CAPTCHA Adversarial Patch (CAP)

Filter-Robust Universal CAPTCHA Patch (CAP)

[Figure columns: Original CAPTCHA Image; Patched CAPTCHA; After Filter + Grayscale; Solver’s Result. D3G6 appears five times in the figure.]
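The columns of the CAP figure correspond to a simple pipeline: paste the patch onto the CAPTCHA, then apply the preprocessing a solver might use. A minimal sketch of the paste and grayscale steps is below, assuming RGB images stored as float arrays in [0, 1] with shape (height, width, 3). The function names and the patch position are assumptions for illustration.

```python
import numpy as np


def apply_patch(image, patch, top, left):
    """Return a copy of an HxWx3 image with the patch pasted at (top, left)."""
    ph, pw = patch.shape[:2]
    if top < 0 or left < 0 or top + ph > image.shape[0] or left + pw > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    patched = image.copy()
    patched[top:top + ph, left:left + pw] = patch
    return patched


def to_grayscale(image):
    """Convert RGB in [0, 1] to grayscale with the ITU-R BT.601 luma weights."""
    return image @ np.array([0.299, 0.587, 0.114])
```

Because the same patch is pasted onto every CAPTCHA, it only has to be optimized once, which is where the efficiency of the approach comes from.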

CAP Objective Function
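The formula for this slide is not reproduced here. As a hedged illustration only, one common formulation from the adversarial patch literature (Brown et al., 2017), extended with an expectation over image filters as an assumption, would be:

```latex
\hat{p} \;=\; \arg\max_{p}\;
\mathbb{E}_{x \sim X,\; l \sim L,\; f \sim F}
\Big[ \log \Pr\big(\hat{y} \,\big|\, f\big(A(p, x, l)\big)\big) \Big]
```

Here $A(p, x, l)$ pastes patch $p$ onto CAPTCHA image $x$ at location $l$, $f$ is a filter drawn from an assumed set $F$ of preprocessing filters, and $\hat{y}$ is the string the solver should be steered toward (for example, the D3G6 shown as the solver’s result on the CAP slides). The symbols are illustrative and may differ from the authors’ own objective.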

Reverse-Engineered CAPTCHA

CANT RENT

HACK RVXY

CAP Robust to Image Filters

How CAP evolves over 12,000 epochs

D3G6

No filter resistant

Median filter resistant

CAP Evaluation

Trojaned CAPTCHA Solver

NRGC

3VGE

FXKC

6BA6

Trojaned CAPTCHA Solver (cont’d)

[Figure: D3G6 shown four times; annotations: trojan trigger, Trojan in the model]

Summary

● Leverage adversarial examples to defend against hackers’ AI-powered toolkits

● Resistant Adversarial Perturbation (RAP)

  ○ Resistant to image filters

  ○ Effective against unknown AI-based CAPTCHA solvers

● CAPTCHA Adversarial Patch (CAP) and Trojaned CAPTCHA solvers

  ○ Efficiently generate CAPTCHAs

  ○ Detect computer bots
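The "detect computer bots" point can be sketched as a server-side check. It assumes that a patched CAPTCHA steers AI solvers toward one known string (D3G6 on the slides), while a human reads the real text. The function name, the three labels, and the decision rule are assumptions for illustration, not the authors' implementation.

```python
def classify_submission(submitted, true_answer, trigger_output="D3G6"):
    """Label a CAPTCHA submission as 'human', 'bot', or 'wrong'."""
    answer = submitted.strip().upper()
    if answer == true_answer.upper():
        return "human"  # correct answer: treated as human in this sketch
    if answer == trigger_output.upper():
        return "bot"    # gave the string the patch steers AI solvers toward
    return "wrong"      # ordinary mistake, no evidence either way
```

For example, `classify_submission("d3g6", "HACK")` returns `"bot"`. A deployment would need to avoid issuing CAPTCHAs whose real text equals the trigger string, since the check cannot tell the two cases apart.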
