Physical Attacks on Deep Learning Systems · Ivan Evtimov · 2019-04-30

Page 1

Physical Attacks on Deep Learning Systems

Ivan Evtimov

Collaborators and slide content contributions: Earlence Fernandes, Kevin Eykholt, Chaowei Xiao, Amir Rahmati, Florian Tramer, Bo Li, Atul Prakash, Tadayoshi Kohno, Dawn Song

Page 2

Deep Learning Mini Crash Course

Neural Networks Background

Convolutional Neural Networks (CNNs)

Page 3

Real-Valued Circuits

[Circuit diagram: a multiply gate with inputs x = -2 and y = 3; output x * y = -6]

Goal: How do I increase the output of the circuit?

- Option 2. Analytic Gradient

Limit as h -> 0: df/dx = [f(x + h, y) - f(x, y)] / h

x = x + step_size * x_gradient
y = y + step_size * y_gradient
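To make this concrete, here is a minimal Python sketch, assuming the circuit in the figure is a single multiply gate f(x, y) = x * y (consistent with the values -2, 3, and -6 above):

# Minimal sketch: assume the circuit is a single multiply gate f(x, y) = x * y.
def f(x, y):
    return x * y

x, y = -2.0, 3.0

# Numerical gradient: the limit definition above with a small finite h.
h = 1e-4
x_gradient = (f(x + h, y) - f(x, y)) / h   # approximately 3.0 (= y)
y_gradient = (f(x, y + h) - f(x, y)) / h   # approximately -2.0 (= x)

# Nudge the inputs along the gradient to increase the output.
step_size = 0.01
x = x + step_size * x_gradient
y = y + step_size * y_gradient
print(f(x, y))   # slightly larger than -6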

Page 4

Real-Valued Circuits

[Circuit diagram: the same multiply gate with inputs x = -2 and y = 3; output -6]

Goal: How do I increase the output of the circuit?

- Tweak the inputs. But how?

- Option 1. Random Search?

x = x + step_size * random_value
y = y + step_size * random_value

Page 5

Image Credit: http://neuralnetworksanddeeplearning.com/chap3.html

Gradients and Gradient Descent

● Each component of the gradient tells you how quickly the function is changing (increasing) in the corresponding direction.

● The gradient vector as a whole points in the direction of steepest ascent.

● To minimize a function, move in the opposite direction.

● Easy update rule for minimizing a variable v controlling a function f:

v = v - step*gradient(f)
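As a minimal sketch of this update rule, here is gradient descent on an illustrative one-variable function f(v) = (v - 3)**2; the function is made up for the example, not taken from the slides:

# Gradient descent sketch on an illustrative function f(v) = (v - 3)**2,
# whose gradient is 2 * (v - 3).
def gradient_f(v):
    return 2.0 * (v - 3.0)

v = 0.0
step = 0.1
for _ in range(100):
    v = v - step * gradient_f(v)   # move against the gradient to minimize f

print(v)   # converges toward 3.0, the minimizer of f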

Page 6

Composable Real-Valued Circuits

chain rule + some dynamic programming = backpropagation

Chain Rule
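As a sketch of how the chain rule composes through a circuit, here is the standard two-gate example f(x, y, z) = (x + y) * z; this particular circuit is an assumption for illustration, not necessarily the one pictured on the slide:

# Backpropagation through a small composed circuit: f(x, y, z) = (x + y) * z.
x, y, z = -2.0, 5.0, -4.0

# Forward pass: store the intermediate value.
q = x + y          # add gate, q = 3
f = q * z          # multiply gate, f = -12

# Backward pass: apply the chain rule, reusing local derivatives
# (this reuse is the "dynamic programming" part of backpropagation).
df_dq = z               # d(q * z) / dq
df_dz = q               # d(q * z) / dz
df_dx = df_dq * 1.0     # dq/dx = 1, so df/dx = df/dq * 1
df_dy = df_dq * 1.0     # dq/dy = 1

print(df_dx, df_dy, df_dz)   # -4.0  -4.0  3.0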

Page 7

Single Neuron


Activation function
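A single neuron computes a weighted sum of its inputs plus a bias and passes the result through an activation function. A minimal sketch, using a sigmoid activation as an illustrative choice:

import math

def sigmoid(z):
    # A common activation function that squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias   # weighted sum
    return sigmoid(z)                                        # activation

print(neuron(inputs=[1.0, -2.0], weights=[0.5, 0.25], bias=0.1))   # ~0.52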

Page 8

(Deep) Neural Networks!


Organize neurons into a structure

Train (optimize) using backpropagation

Loss function: how far is the output of the network from the true label for the input?

Page 9

Convolutional Neural Networks (CNNs)

A CNN generally consists of 4 types of architectural units:

● Convolution
● Non-linearity (ReLU)
● Pooling or subsampling
● Classification (fully connected layers)

Page 10

How is an image represented for NNs?

• Matrix of numbers, where each number represents pixel intensity

• If the image is colored, there are three channels per pixel, each channel representing one of the (R, G, B) values

Page 11

Convolution Operator

• Slide the kernel over the input matrix

• Compute the element-wise multiplication (Hadamard/Schur product) and add the results to get a single value

• Output is a feature map


Grayscale Image

Kernel or Filter or Feature Detector

Feature map!
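A minimal sketch of the convolution operator described above; the 5x5 grayscale image and the 3x3 kernel values are made up:

import numpy as np

# Slide a kernel over a grayscale image, take the element-wise (Hadamard)
# product with each patch, and sum it to produce one entry of the feature map.
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]], dtype=float)

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)

kh, kw = kernel.shape
out_h = image.shape[0] - kh + 1
out_w = image.shape[1] - kw + 1
feature_map = np.zeros((out_h, out_w))

for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + kh, j:j + kw]
        feature_map[i, j] = np.sum(patch * kernel)   # Hadamard product, then sum

print(feature_map)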

Page 12

Many types of filters


A CNN learns these filters during training

Page 13

Rectified Linear Unit (Non-Linearity)

Page 14

Pooling

Can be average, sum, min, …

Reduces dimensionality, but retains important features
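A minimal sketch of 2x2 max pooling with stride 2; swapping np.max for np.mean, np.sum, or np.min gives the variants mentioned above, and the feature-map values are made up:

import numpy as np

def pool2x2(feature_map, reduce_fn=np.max):
    # Take the reduction (max by default) over each 2x2 window, stride 2.
    h, w = feature_map.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            out[i // 2, j // 2] = reduce_fn(feature_map[i:i + 2, j:j + 2])
    return out

fm = np.array([[1, 3, 2, 1],
               [4, 6, 5, 2],
               [7, 2, 9, 1],
               [3, 1, 4, 8]], dtype=float)
print(pool2x2(fm))           # [[6, 5], [7, 9]]
print(pool2x2(fm, np.mean))  # average pooling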

Page 15

Putting Everything Together

Page 16

Deep Neural Networks are Useful

● Playing sophisticated games
● Processing medical images
● Understanding natural language
● Face recognition
● Controlling cyber-physical systems?

Page 17

Deep Neural Networks Can Fail

[Figure: a “panda” image (57.7% confidence) plus a small perturbation ε yields an image classified as “gibbon” with 99.3% confidence. Image courtesy: OpenAI]

Explaining and Harnessing Adversarial Examples, Goodfellow et al., arXiv:1412.6572, 2015

If you use a loss function that fulfills an adversary’s goal, you can follow the gradient to find an image that misleads the neural network.
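A minimal sketch of that idea in the spirit of the fast gradient sign method from the cited paper: take the sign of the gradient of the adversary's loss with respect to the input and step in that direction. The tiny linear "model" and random "image" below are placeholders, not a real classifier:

import torch
import torch.nn.functional as F

def fgsm(model, x, true_label, eps):
    # One signed-gradient step that increases the loss on the true label.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), true_label)   # the adversary wants this to grow
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.detach().clamp(0.0, 1.0)          # keep pixels in a valid range

model = torch.nn.Linear(8, 3)    # placeholder classifier
x = torch.rand(1, 8)             # placeholder "image"
y = torch.tensor([0])
x_adv = fgsm(model, x, y, eps=0.05)
print((x_adv - x).abs().max())   # perturbation bounded by eps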

Page 18

Deep Neural Networks Can Fail...

...if adversarial images are printed out

Kurakin et al., "Adversarial Examples in the Physical World," arXiv preprint arXiv:1607.02533, 2016.

Page 19

Deep Neural Networks Can Fail...

...if an adversarially crafted physical object is introduced

This person wearing an “adversarial” glasses frame...

...is classified as this person by a state-of-the-art face recognition neural network.

Sharif et al., "Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition," Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS), ACM, 2016.

Page 20

Deep neural network classifiers are vulnerable to adversarial examples in some physical world scenarios

However: In real-world applications, conditions vary more than in the lab.

Page 21

Take autonomous driving as an example...

A road sign can be far away, or it can be seen at an angle

Can physical adversarial examples cause misclassification at large angles and distances?

Page 22

An Optimization Approach To Creating Robust Physical Adversarial Examples


Perturbation/Noise Matrix

Lp norm (L-0, L-1, L-2, …)

Loss Function

Adversarial Target Label
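The labels above annotate an optimization problem; written out, it is roughly of the following form (the notation δ, λ, J, f_θ, y* is assumed here, not copied from the slide): find a perturbation δ that stays small under an Lp norm while a loss function J pushes the classifier f_θ toward the adversarial target label y*.

\[
  \min_{\delta}\; \lambda \,\lVert \delta \rVert_{p} \;+\; J\bigl(f_{\theta}(x + \delta),\, y^{*}\bigr)
\]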

Page 23

An Optimization Approach To Creating Robust Physical Adversarial Examples


Perturbation/Noise Matrix

Lp norm (L-0, L-1, L-2, …)

Loss Function

Adversarial Target Label

Challenge: This formulation only generates perturbations valid for a single viewpoint. How can we make the perturbations viewpoint-invariant?

Page 24

An Optimization Approach To Creating Robust Physical Adversarial Examples


Perturbation/Noise Matrix

Lp norm (L-0, L-1, L-2, …)

Loss Function

Adversarial Target Label
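The viewpoint-invariant version averages the adversary's loss over many transformed versions of the scene (different distances, angles, lighting) instead of a single image. A schematic sketch, where model, transforms, and adv_loss are placeholders rather than code from the paper:

import torch

def robust_perturbation(model, x, target, transforms, adv_loss,
                        lam=0.01, steps=300, lr=0.1):
    # Optimize one perturbation that works across all sampled transformations.
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        # Average the adversarial loss over the sampled viewpoint changes.
        adv = sum(adv_loss(model(t(x + delta)), target) for t in transforms)
        loss = lam * delta.norm(p=2) + adv / len(transforms)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()

In practice the sampled images can include both synthetic transformations and real photographs taken from different distances and angles.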

Page 25

Observation: Signs are often messy...

What about physical realizability?

Page 26

So: make the perturbation appear as vandalism

What about physical realizability?

Subtle Poster

Camouflage Sticker

Page 27

Optimizing Spatial Constraints


Subtle Poster

Camouflage Sticker

Mimic vandalism

“Hide in the human psyche”
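A minimal sketch of the spatial constraint: a binary mask (shaped like stickers or graffiti) limits where the perturbation is allowed to appear, so the rest of the sign stays untouched. The shapes and values below are illustrative:

import numpy as np

h, w = 32, 32
sign = np.random.rand(h, w, 3)           # placeholder road-sign image
delta = np.random.rand(h, w, 3) - 0.5    # placeholder perturbation

mask = np.zeros((h, w, 1))
mask[4:10, 6:26, :] = 1.0                # one rectangular "sticker" region
mask[22:28, 6:26, :] = 1.0               # another sticker region

adv_sign = np.clip(sign + mask * delta, 0.0, 1.0)          # only masked pixels change
print(np.count_nonzero((adv_sign != sign).any(axis=-1)))   # number of changed pixels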

Page 28

How Can We Realistically Evaluate Attacks?


Lab Test (Stationary) Field Test (Drive-By)

~ 250 feet, 0 to 20 mph

Record video

Sample frames every k frames

Run sampled frames through DNN
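A minimal sketch of that pipeline using OpenCV, where classify is a placeholder for running a frame through the DNN:

import cv2  # OpenCV

def sampled_predictions(video_path, classify, k=10):
    # Read a recorded video, keep every k-th frame, classify each sampled frame.
    cap = cv2.VideoCapture(video_path)
    preds, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:            # end of video
            break
        if idx % k == 0:      # sample every k-th frame
            preds.append(classify(frame))
        idx += 1
    cap.release()
    return preds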

Page 29

Lab Test Summary (Stationary)

Target classes: Stop -> Speed Limit 45; Right Turn -> Stop

Success rates (the numbers shown at the bottom of the images):

● Subtle Poster: 100%
● Subtle Poster: 73.33%
● Camo Graffiti: 66.67%
● Camo Art: 100%
● Camo Art: 80%

Video (camo graffiti): https://youtu.be/1mJMPqi2bSQ
Video (subtle poster): https://youtu.be/xwKpX-5Q98o

Page 30

Field Test (Drive-By)

Target classes: Stop -> Speed Limit 45; Right Turn -> Stop

The top classification result is indicated at the bottom of the images.

Left: “adversarial” stop sign. Right: clean stop sign.

Page 31

Attacks on Inception-v3


Coffee Mug -> Cash Machine, 81% success rate

Page 32

Open Questions and Future Work

• Have we successfully hidden the perturbations from casual observers?

• Are systems deployed in practice truly vulnerable?

• How can we defend against these threats?

Page 33

Classification: What’s the dominant object in this image?

Object Detection: What are the objects in this scene, and where are they?

Semantic Segmentation: What are the precise shapes and locations of objects?

We know that physical adversarial examples exist for classifiers. Do they exist for richer classes of vision algorithms?

Page 34

Challenges in Attacking Detectors

The location of the target object within the scene can vary widely.

Detectors process the entire scene, which allows them to use contextual information.

A detector is not limited to producing a single label; instead it labels all objects in the scene.

Page 35

Translational Invariance


Page 36

Designing the Adversarial Loss Function

[Figure: structure of the YOLO output for an input scene. The output is a 19 x 19 x 425 tensor: S x S = 19 x 19 grid cells, 5 bounding boxes per cell, and for each box the values P(object), Cx, Cy, w, h plus an 80 x 1 vector of class scores (P(stop sign), P(person), P(cat), …, P(vase)), i.e., the probability of the object being class ‘y’.]

Minimize the probability of “Stop” sign among all predictions
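A sketch of what such a loss could look like over the tensor described above; the value layout and the stop-sign class index follow the labels on the slide and are assumptions for illustration:

import torch

STOP_CLASS = 11          # placeholder index of "stop sign" among the 80 classes

def stop_sign_loss(yolo_output):
    # yolo_output: tensor of shape (19, 19, 425)
    boxes = yolo_output.view(19, 19, 5, 85)           # 5 boxes per cell, 85 values each (layout assumed)
    p_object = torch.sigmoid(boxes[..., 0])           # objectness per box
    class_probs = torch.softmax(boxes[..., 5:], dim=-1)
    p_stop = p_object * class_probs[..., STOP_CLASS]  # stop-sign confidence per box
    return p_stop.max()                               # minimize the highest confidence anywhere in the scene

loss = stop_sign_loss(torch.randn(19, 19, 425))
print(loss)   # in a real attack, backpropagate this into the perturbation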

Page 37

Poster and Sticker Attack

Page 38


Poster Attack on YOLO v2

Page 39

Sticker Attack on YOLO v2

Page 40


Project website: https://iotsecurity.eecs.umich.edu/#roadsigns

Robust Physical-World Attacks on Deep Learning Models

Collaborators: Earlence Fernandes, Kevin Eykholt, Chaowei Xiao, Amir Rahmati, Florian Tramer, Bo Li, Atul Prakash, Tadayoshi Kohno, Dawn Song

Page 41

Structure of Classifiers

LISA-CNN
Accuracy: 91%

17 classes of U.S. road signs from the LISA classification dataset

GTSRB*-CNN
Accuracy: 95%

43 classes of German road signs* from the GTSRB classification dataset.

*The stop sign images were replaced with U.S. stop sign images both in training and in evaluation.

Page 42

How Might One Choose A Mask?

We had very good success with the octagonal mask.

Hypothesis: Mask surface area should be large or should be focused on “sensitive” regions.

Use the L-1 norm to find those regions

Page 43

Process of Creating a Useful Sticker Attack

L-1 perturbation result -> mask -> sticker attack!
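A schematic sketch of that process, under the assumption that the mask comes from thresholding an initial L-1-regularized perturbation (the L-1 result concentrates changes in a few regions, which then become the sticker mask); run_l1_attack and run_masked_attack are placeholders for the two optimizations:

import numpy as np

def make_sticker_mask(l1_perturbation, threshold=0.1):
    # Keep only the pixels where the L-1 perturbation is large.
    magnitude = np.abs(l1_perturbation).max(axis=-1, keepdims=True)
    return (magnitude > threshold).astype(np.float32)

def sticker_attack(sign, run_l1_attack, run_masked_attack):
    delta_l1 = run_l1_attack(sign)          # step 1: unconstrained L-1 attack
    mask = make_sticker_mask(delta_l1)      # step 2: threshold into a sticker mask
    return run_masked_attack(sign, mask)    # step 3: optimize only inside the mask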

Page 44

Handling Fabrication/Perception Errors


P is a set of printable RGB triplets

Color Space

Sampled Set of RGB Triplets

NPS based on Sharif et al., “Accessorize to a crime,” CCS 2016
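A minimal sketch of the non-printability score (NPS) idea from that paper: for each perturbation pixel, multiply its distances to every printable color in P, so the penalty is small whenever the pixel lies close to some printable color. The printable set below is a made-up sample:

import numpy as np

P = np.array([[0.1, 0.1, 0.1],      # sampled printable RGB triplets (placeholder values)
              [0.9, 0.1, 0.1],
              [0.1, 0.8, 0.2],
              [0.9, 0.9, 0.9]])

def nps(perturbation_pixels):
    # perturbation_pixels: array of shape (N, 3), RGB in [0, 1]
    score = 0.0
    for pixel in perturbation_pixels:
        dists = np.linalg.norm(P - pixel, axis=1)   # distance to each printable color
        score += np.prod(dists)                     # near zero if any distance is near zero
    return score

print(nps(np.array([[0.1, 0.1, 0.1], [0.5, 0.5, 0.5]])))  # the first pixel contributes ~0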