understanding deep networks through properties of the input … · 2019-03-29 · understanding...
TRANSCRIPT
![Page 1: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/1.jpg)
German Research Center for Artificial Intelligence (DFKI)
ALL RIGHTS RESERVED. No part of this work may be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system without expressed written permission from the authors.
Understanding Deep Networks through Properties of the Input Space
GTC 2019
By: Sebastian Palacio
![Page 2: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/2.jpg)
Neural Network
Deep Neural Networks Work. DUH!
![Page 3: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/3.jpg)
Neural Network
...yet they can be easily tricked
![Page 4: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/4.jpg)
Neural Network
Filter
Harden
Flag
Safeguarding becomes a “thing”
![Page 5: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/5.jpg)
Neural Network
Filter
Harden
Flag
Cat and Mouse Chase
![Page 6: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/6.jpg)
How do Attacks Work?
input → features → features → features → output
Modify the Network
![Page 7: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/7.jpg)
How do Attacks Work?
input → features → features → features → output
Modify the Network
Modify the Input
![Page 8: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/8.jpg)
How do Attacks Work?
input → features → features → features → output
Modify the Input
![Page 9: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/9.jpg)
How do Attacks Work?
input → features → features → features
1. Pass input through the network: f(x)
![Page 10: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/10.jpg)
How do Attacks Work?
input → features → features → features
1. Pass input through the network: f(x)
2. Compute sensitivity: f’(x)
![Page 11: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/11.jpg)
How do Attacks Work?
input → features → features → features → output
Modify the Input
1. Pass input through the network: f(x)
2. Compute sensitivity: f’(x)
3. Modify input according to sensitivity.
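The three steps above can be sketched in a few lines. This is a minimal illustration, not the setup used in the talk: the "network" is a hypothetical logistic-regression model with made-up weights `w` and `b`, the sensitivity is taken as the gradient of the cross-entropy loss with respect to the input, and the step size `eps` is an arbitrary choice.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical stand-in for a trained network: fixed logistic-regression weights.
w = np.array([2.0, -1.0, 0.5])
b = 0.1

def f(x):
    """Step 1: forward pass, returns P(class = 1 | x)."""
    return sigmoid(w @ x + b)

def loss_grad_wrt_input(x, y):
    """Step 2: sensitivity of the cross-entropy loss with respect to the input."""
    # For logistic regression, dL/dx = (f(x) - y) * w.
    return (f(x) - y) * w

def perturb(x, y, eps):
    """Step 3: move each input dimension by eps in the direction that raises the loss."""
    return x + eps * np.sign(loss_grad_wrt_input(x, y))

x = np.array([0.2, 0.1, 0.0])  # clean input with true label 1
y = 1.0
x_adv = perturb(x, y, eps=0.3)

print("clean prediction is class 1:", f(x) > 0.5)
print("perturbed prediction is class 1:", f(x_adv) > 0.5)
```

For a deep network the gradient in step 2 would come from backpropagation instead of a closed-form expression; the structure of the attack stays the same.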
![Page 12: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/12.jpg)
Gradients are good estimators of the input space’s distribution
Figure panels: Input, Gradient, Perturbation
![Page 13: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/13.jpg)
How do Attacks Work?
1. Reconstruction:
![Page 14: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/14.jpg)
How do Attacks Work?
2. Classification:
![Page 15: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/15.jpg)
Idea against attacks!
Give me Gradients!
![Page 16: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/16.jpg)
Reconstruction Gradients
Classification Gradients
AVOID THIS
![Page 17: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/17.jpg)
Hypothesis: bigger problems are better
Figure panels: Reconstruction Gradients and Classification Gradients, for MNIST and ImageNet
![Page 18: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/18.jpg)
...so we tried
Figure labels: SegNet + YFCC100M, ImageNet, 69x
![Page 19: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/19.jpg)
Perceptually similar!
![Page 20: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/20.jpg)
How to Compare:
Figure labels: SegNet, ResNet-50
Plot: Model Accuracy vs. Noise Level
![Page 21: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/21.jpg)
Targeted vs. Untargeted Attacks:
Figure: output scores 0.3, 0.5, 0.2; 𝚫y
Untargeted: Push the true class down until any other wins.
Targeted: Push a randomly selected target up until it wins.
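The difference between the two attack modes can be illustrated on a toy model. The sketch below assumes a hypothetical 3-class softmax-linear classifier `W` in place of a deep network, plain gradient steps on the log-probability, and arbitrary values for `step` and `max_iter`; none of these come from the talk.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical 3-class linear classifier standing in for a deep network.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])

def predict(x):
    return softmax(W @ x)

def grad_log_prob(x, k):
    """Gradient of log p_k(x) with respect to x for the softmax-linear model."""
    p = predict(x)
    return W[k] - p @ W

def untargeted_attack(x, true_class, step=0.1, max_iter=100):
    # Push the true class down until any other class wins.
    x = x.copy()
    for _ in range(max_iter):
        if predict(x).argmax() != true_class:
            break
        x -= step * grad_log_prob(x, true_class)
    return x

def targeted_attack(x, target, step=0.1, max_iter=100):
    # Push the chosen target class up until it wins.
    x = x.copy()
    for _ in range(max_iter):
        if predict(x).argmax() == target:
            break
        x += step * grad_log_prob(x, target)
    return x

x = np.array([0.2, 0.5])  # classified as class 1
true_class = 1
rng = np.random.default_rng(0)
target = rng.choice([k for k in range(3) if k != true_class])  # random target

x_u = untargeted_attack(x, true_class)
x_t = targeted_attack(x, target)
print("untargeted result:", predict(x_u).argmax())
print("targeted result:", predict(x_t).argmax(), "(target was", target, ")")
```

The untargeted loop stops as soon as the prediction changes at all, while the targeted loop keeps going until the specific target class is on top.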
![Page 22: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/22.jpg)
Quick, pick one at random!
![Page 23: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/23.jpg)
HYPOTHESIS
Adversarial <-> Non-adversarial
Figure panels (both cases): Input, Input Gradients, Perturbation
![Page 24: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/24.jpg)
Adversaries fighting an attack-agnostic Autoencoder on ImageNet
Legend: Baseline (no attack); Classifier only (no defense); Classifier with Autoencoder; ALP for targeted PGD (Kannan et al. 2018); ALP for untargeted PGD (Engstrom et al. 2018)
Plot annotations:
● Simple attack
● Loop with clipping
● Amount of noise
● Same but in a loop
● Fancy optimization
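The annotations "Simple attack", "Same but in a loop" and "Loop with clipping" can be read as three variants of one signed-gradient step. The sketch below is an interpretation under assumptions: the transcript does not say which attack each label refers to, the model is a hypothetical logistic regression with made-up weights, and `eps` plays the role of the "amount of noise". In the literature these variants resemble FGSM, its iterative version, and PGD-style projection.

```python
import numpy as np

# Hypothetical differentiable model: logistic regression with fixed weights.
w = np.array([1.5, -2.0, 0.5, 1.0])

def loss_grad(x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    return (p - y) * w

def simple_attack(x, y, eps):
    # One signed-gradient step of size eps.
    return x + eps * np.sign(loss_grad(x, y))

def looped_attack(x, y, step, n_steps):
    # The same step repeated; the total perturbation grows with n_steps.
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(loss_grad(x_adv, y))
    return x_adv

def looped_attack_with_clipping(x, y, eps, step, n_steps):
    # Repeated steps, clipped back into the max-norm ball of radius eps around x.
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(loss_grad(x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

x = np.array([0.5, -0.5, 0.2, 0.1])
y = 1.0
eps = 0.1  # "amount of noise": maximum change allowed per input dimension

x_simple = simple_attack(x, y, eps)
x_loop = looped_attack(x, y, step=0.05, n_steps=10)
x_clip = looped_attack_with_clipping(x, y, eps, step=0.05, n_steps=10)

for name, adv in [("simple", x_simple), ("loop", x_loop), ("clipped loop", x_clip)]:
    print(name, "max change per dimension:", np.abs(adv - x).max())
```

Clipping is what keeps an iterative attack comparable to a single-step one at the same noise level: without it, the perturbation budget silently grows with the number of iterations.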
![Page 25: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/25.jpg)
Adversaries fighting an attack-agnostic Autoencoder on ImageNet
Legend: Baseline (no attack); Classifier only (no defense); Classifier with Autoencoder; ALP for targeted PGD (Kannan et al. 2018); ALP for untargeted PGD (Engstrom et al. 2018)
Plot annotations:
● Simple attack
● Loop with clipping
● Amount of noise
● Same but in a loop
● Fancy optimization
Accuracy without attack: 74.02 (No AE), 71.19 (With AE)
![Page 26: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/26.jpg)
Structural Gradients Obstruct Gradient-Based Attacks*
Figure panels: Reconstruction Gradients, Classification Gradients
*as long as structure is not tightly related to semantics
![Page 27: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/27.jpg)
A closer look at adversarial noise
MNIST
Expectation: Structural change
Reality: Non-structural changes
Uninformative dimensions!
![Page 28: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/28.jpg)
Effects of extra dimensions
1D:
2D:
𝚫x
![Page 29: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/29.jpg)
From 2D... ...to 3D
Semantic information on x,y
z-axis is uninformative
![Page 30: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/30.jpg)
Decision Boundaries
![Page 31: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/31.jpg)
Expected Boundary
● Z-axis does not interfere
● Perturbations need to go in the direction of the training samples
![Page 32: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/32.jpg)
Vulnerable Boundary
● Small perturbations along the “extra” dimension change the predicted class!
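The contrast between the expected and the vulnerable boundary can be reproduced with a toy example. The sketch below is an assumption-laden illustration: both classifiers are hand-picked linear models (`w_expected`, `w_vulnerable`), and the data is a made-up set of points whose z coordinate is always 0.

```python
import numpy as np

# Training data lives on the x,y plane: the z coordinate is always 0.
X = np.array([[ 1.0,  1.0, 0.0],
              [ 2.0,  1.5, 0.0],
              [-1.0, -1.0, 0.0],
              [-2.0, -1.5, 0.0]])
labels = np.array([1, 1, 0, 0])

def predict(points, w):
    """Linear decision rule: class 1 if w . p > 0, else class 0."""
    return (np.atleast_2d(points) @ w > 0).astype(int)

# Two hypothetical classifiers that agree on every point with z = 0.
w_expected = np.array([1.0, 1.0, 0.0])     # ignores the uninformative z-axis
w_vulnerable = np.array([1.0, 1.0, 20.0])  # weight on z is never constrained by the data

sample = X[0]
delta = np.array([0.0, 0.0, -0.15])        # small nudge along z only

print("both fit the training data:",
      (predict(X, w_expected) == labels).all(),
      (predict(X, w_vulnerable) == labels).all())
print("expected boundary after nudge:", predict(sample + delta, w_expected)[0])
print("vulnerable boundary after nudge:", predict(sample + delta, w_vulnerable)[0])
```

Both classifiers are indistinguishable on the training set, so accuracy alone cannot tell them apart; only the behavior off the data manifold differs.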
![Page 33: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/33.jpg)
Vulnerable Boundary
● Class boundary extends over the domain of other classes
![Page 34: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/34.jpg)
Extrapolating...
1D, 2D, 3D, ... 784D, ...
![Page 35: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/35.jpg)
Preserve only the information that is useful for classification
![Page 36: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/36.jpg)
Step 1: Train a classifier (ImageNet)
Step 2: Train an autoencoder (YFCC100M)
Step 3: Fine-tune the decoder with gradients from the classifier
Palacio, Sebastian et al. "What do Deep Networks Like to See?" CVPR (2018)
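Step 3 can be sketched with linear stand-ins. Everything in this example is an assumption made for illustration: the classifier and autoencoder are random linear maps rather than trained ResNet-50 and SegNet models, steps 1 and 2 are replaced by fixed weights, and the data, learning rate and iteration count are arbitrary. The one point it is meant to show is that only the decoder is updated, using the gradient of the classifier's loss on the reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n_classes, n = 8, 4, 3, 64

# Step 1 (stand-in): a "trained" classifier, here a frozen softmax-linear model.
W_clf = rng.normal(size=(n_classes, d))

# Step 2 (stand-in): a "trained" autoencoder with linear encoder and decoder.
W_enc = rng.normal(size=(h, d)) / np.sqrt(d)
W_dec = rng.normal(size=(d, h)) / np.sqrt(h)

# Toy data labeled by the classifier itself, so it is correct on clean inputs.
X = rng.normal(size=(n, d))
y = (X @ W_clf.T).argmax(axis=1)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def classification_loss(decoder):
    recon = X @ W_enc.T @ decoder.T   # autoencoder output
    p = softmax(recon @ W_clf.T)      # classifier applied to the reconstruction
    return -np.log(p[np.arange(n), y]).mean()

# Step 3: fine-tune only the decoder with gradients flowing back from the classifier.
loss_before = classification_loss(W_dec)
W_clf_frozen, W_enc_frozen = W_clf.copy(), W_enc.copy()
lr = 0.01
for _ in range(500):
    code = X @ W_enc.T                      # (n, h), encoder stays frozen
    p = softmax(code @ W_dec.T @ W_clf.T)   # (n, n_classes)
    p[np.arange(n), y] -= 1.0               # dL/dlogits for cross-entropy
    grad_recon = p @ W_clf / n              # (n, d), gradient w.r.t. the reconstruction
    W_dec -= lr * grad_recon.T @ code       # (d, h), gradient w.r.t. decoder weights
loss_after = classification_loss(W_dec)

print("classification loss before fine-tuning:", loss_before)
print("classification loss after fine-tuning:", loss_after)
```

Keeping the classifier and encoder frozen means the decoder has to learn which parts of the signal the classifier actually uses, which is the idea behind preserving only the information that is useful for classification.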
![Page 37: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/37.jpg)
Accuracy on ResNet-50
No AE: 74.02
With AE: 71.19 (-2.83pp)
With Fine-tuned AE: 74.94 (+0.92pp)
Palacio, Sebastian et al. "What do Deep Networks Like to See?" CVPR (2018)
![Page 38: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/38.jpg)
Looking up Reconstructions
Figure rows: Original, ResNet50 Reconstructions
![Page 39: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/39.jpg)
Experiments with S2SNets (on ImageNet)
Legend: Baseline (no attack); Classifier only (no defense); Classifier with Autoencoder; Classifier with S2SNet
● Consistent offset (projection of unnecessary input signal)
● Not bound to any specific adversarial attack.
● Zero compromise for clean images (no attack)
With S2SNet: 74.94
![Page 40: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/40.jpg)
So, did we solve adversarial attacks?
● Function is a proof of concept for a defense principle:
○ Gradients are stable but convey information that is less effective for adversarial attacks.
○ No gradient obfuscation :)
● Content dependent.
● Still vulnerable under some specific but common threat conditions.
![Page 41: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/41.jpg)
Summary
Manifold exploration is possible through input gradients. They express different things depending on the task.
If structural info != semantic info, autoencoders can help with adversarial attacks.
Projection of redundant dimensions can be achieved via S2SNets.
High dimensionality of the input space induces (exploitable) irregularities for decision boundaries.
It’s a sound design principle against gradient-based attacks.
Enhancing robustness against adversarial attacks!
![Page 42: Understanding Deep Networks through Properties of the Input … · 2019-03-29 · Understanding Deep Networks through Properties of the Input SpaceGTC 2019 By: Sebastian Palacio 1](https://reader033.vdocuments.net/reader033/viewer/2022042402/5f13a2328c35a3266d506ee4/html5/thumbnails/42.jpg)
Thank you!
Deep Learning
Sebastian [email protected]
@spalaciob
“Adversarial Defense using Structure-to-Signal Autoencoders”
https://arxiv.org/abs/1803.07994
In collaboration with:
● Joachim Folz (equal contribution)
● Jörn Hees
● Federico Raue
Supervisor:
● Andreas Dengel
DFKI Kaiserslautern
Some images have been taken from www.pexels.com and www.openclipart.org