deep learning with differential privacy * open ai · our datasets: “fruit flies of machine...

53
DEEP LEARNING WITH DIFFERENTIAL PRIVACY Martin Abadi, Andy Chu, Ian Goodfellow*, Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang Google * Open AI

Upload: others

Post on 03-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

DEEP LEARNING WITH DIFFERENTIAL PRIVACY

Martin Abadi, Andy Chu, Ian Goodfellow*, Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

Google* Open AI

Page 2: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

2

Page 3: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

3

Page 4: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning

● Cognitive tasks: speech, text, image recognition● Natural language processing: sentiment analysis, translation● Planning: games, autonomous driving

Self-driving cars

Fashion

Translation Gaming

Page 5: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Training Data Utility

Page 6: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Privacy of Training Data

Data encryption in transit and at rest

Data retention and deletion policies

ACLs, monitoring, auditing

What do models reveal about training data?

Page 7: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML Pipeline and Threat Model

ML Training InferenceEngine

Training Data

Model

Live Data

Prediction

Page 8: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML Pipeline and Threat Model

ML Training InferenceEngine

Training Data

Model

Live Data

Prediction

Page 9: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML Pipeline and Threat Model

ML Training InferenceEngine

Training Data

Model

Live Data

Prediction

Page 10: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML Pipeline and Threat Model

ML Training InferenceEngine

Training Data

Model

Live Data

Prediction

Page 11: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML Pipeline and Threat Model

ML TrainingTraining Data Model

Page 12: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Machine Learning Privacy Fallacy

Since our ML system is good, it automatically protects privacy of training data.

Page 13: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Machine Learning Privacy Fallacy

● Examples when it just ain’t so:○ Person-to-person similarities○ Support Vector Machines

● Models can be very large○ Millions of parameters

● Empirical evidence to the contrary:○ M. Fredrikson, S. Jha, T. Ristenpart, “Model Inversion Attacks that Exploit

Confidence Information and Basic Countermeasures”, CCS 2015○ R. Shokri, M. Stronati, V. Shmatikov, “Membership Inference Attacks

against Machine Learning Models”, https://arxiv.org/abs/1610.05820

Page 14: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Machine Learning Privacy Fallacy

● Examples when it just ain’t so:○ Person-to-person similarities○ Support Vector Machines

● Models can be very large○ Millions of parameters

Page 15: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Model Inversion Attack

● M. Fredrikson, S. Jha, T. Ristenpart, “Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures”, CCS 2015

● R. Shokri, M. Stronati, V. Shmatikov, “Membership Inference Attacks against Machine Learning Models”, https://arxiv.org/abs/1610.05820

Page 16: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

ML TrainingTraining Data Model

Page 17: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function2. Training / Test data3. Topology4. Training algorithm5. Hyperparameters

Page 18: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology4. Training algorithm5. Hyperparameters

Page 19: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function2. Training / Test data3. Topology4. Training algorithm5. Hyperparameters

TOPOLOGY

LOSS FUNCTION

HYPERPARAMETERS

DATA

http://playground.tensorflow.org/

Page 20: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Layered Neural Network

Page 21: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology neural network4. Training algorithm5. Hyperparameters

Page 22: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology neural network4. Training algorithm SGD5. Hyperparameters

Page 23: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Gradient Descent

Loss function

worse

better -∇L( )

Page 24: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Gradient Descent

Compute ∇L( 1) 2:= 1− ∇L( 1) Compute ∇L( 2) 3:= 2− ∇L( 2)

Page 25: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Stochastic Gradient Descent

Compute ∇L( 1)on random sample 2:= 1− ∇L( 1)

Compute ∇L( 2) on random sample 3:= 2− ∇L( 2)

Page 26: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning Recipe

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology neural network4. Training algorithm SGD5. Hyperparameters tune experimentally

Page 27: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Training Data

SGD Model

Page 28: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Differential Privacy

Page 29: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Differential Privacy

(ε, δ)-Differential Privacy: The distribution of the output M(D) on database D is (nearly) the same as M(D′):

∀S: Pr[M(D)∊S] ≤ exp(ε) ∙ Pr[M(D′)∊S]+δ.

quantifies information leakage

allows for a small probability of failure

Page 30: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Interpreting Differential Privacy

DD′

Training Data ModelSGD

Page 31: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Differential Privacy: Gaussian Mechanism

If ℓ2-sensitivity of f:D→ℝn:

maxD,D′ ||f(D) − f(D′)||2 < 1,

then the Gaussian mechanism

f(D) + Nn(0, σ2)

offers (ε, δ)-differential privacy, where δ ≈ exp(-(εσ)2/2).

Dwork, Kenthapadi, McSherry, Mironov, Naor, “Our Data, Ourselves”, Eurocrypt 2006

Page 32: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Simple Recipe

To compute f with differential privacy

1. Bound sensitivity of f2. Apply the Gaussian mechanism

Page 33: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Basic Composition Theorem

If f is (ε1, δ1)-DP and g is (ε2, δ2)-DP, then

f(D), g(D) is (ε1+ε2, δ1+δ2)-DP

Page 34: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Simple Recipe for Composite Functions

To compute composite f with differential privacy

1. Bound sensitivity of f’s components2. Apply the Gaussian mechanism to each component3. Compute total privacy via the composition theorem

Page 35: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning with Differential Privacy

Page 36: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Deep Learning

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology neural network4. Training algorithm SGD5. Hyperparameters tune experimentally

Page 37: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Our Datasets: “Fruit Flies of Machine Learning”

MNIST dataset: 70,000 images28⨉28 pixels each

CIFAR-10 dataset: 60,000 color images32⨉32 pixels each

Page 38: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Differentially Private Deep Learning

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology PCA + neural network4. Training algorithm SGD5. Hyperparameters tune experimentally

Page 39: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Stochastic Gradient Descent with Differential Privacy

Compute ∇L( 1)on random sample 2:= 1− ∇L( 1)

Compute ∇L( 2) on random sample 3:= 2− ∇L( 2)

ClipAdd noise

ClipAdd noise

Page 40: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Differentially Private Deep Learning

1. Loss function softmax loss2. Training / Test data MNIST and CIFAR-103. Topology PCA + neural network4. Training algorithm Differentially private SGD5. Hyperparameters tune experimentally

Page 41: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Naïve Privacy Analysis

1. Choose

2. Each step is (ε, δ)-DP

3. Number of steps T

4. Composition: (Tε, Tδ)-DP

= 4

(1.2, 10-5)-DP

10,000

(12,000, .1)-DP

Page 42: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Advanced Composition Theorems

Page 43: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Composition theorem

+ε for Blue

+.2ε for Blue

+ ε for Red

Page 44: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

“Heads, heads, heads”

Rosenkrantz: 78 in a row. A new record, I imagine.

Page 45: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Strong Composition Theorem

1. Choose

2. Each step is (ε, δ)-DP

3. Number of steps T

4. Strong comp: ( , Tδ)-DP

= 4

(1.2, 10-5)-DP

10,000

(360, .1)-DP

Dwork, Rothblum, Vadhan, “Boosting and Differential Privacy”, FOCS 2010Dwork, Rothblum, “Concentrated Differential Privacy”, https://arxiv.org/abs/1603.0188

Page 46: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Amplification by Sampling

1. Choose

2. Each batch is q fraction of data

3. Each step is (2qε, qδ)-DP

4. Number of steps T

5. Strong comp: ( , qTδ)-DP

= 4

1%

(.024, 10-7)-DP

10,000

(10, .001)-DP

S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, A. Smith, “What Can We Learn Privately?”, SIAM J. Comp, 2011

Page 47: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Privacy Loss Random Variable

log(privacy loss)

Page 48: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Moments Accountant

1. Choose

2. Each batch is q fraction of data

3. Keeping track of privacy loss’s moments

4. Number of steps T

5. Moments: ( , δ)-DP

= 4

1%

10,000

(1.25, 10-5)-DP

Page 49: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Results

Page 50: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Summary of Results

Baseline

no privacy

MNIST 98.3%

CIFAR-10 80%

Page 51: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Summary of Results

Baseline [SS15] [WKC+16]

no privacy reports ε per parameter ε = 2

MNIST 98.3% 98% 80%

CIFAR-10 80%

Page 52: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Baseline [SS15] [WKC+16] this work

no privacy reports ε per parameter ε = 2 ε = 8

δ = 10-5ε = 2δ = 10-5

ε = 0.5δ = 10-5

MNIST 98.3% 98% 80% 97% 95% 90%

CIFAR-10 80% 73% 67%

Summary of Results

Page 53: DEEP LEARNING WITH DIFFERENTIAL PRIVACY * Open AI · Our Datasets: “Fruit Flies of Machine Learning” MNIST dataset: 70,000 images 28⨉28 pixels each CIFAR-10 dataset: 60,000

Contributions

● Differentially private deep learning applied to publicly available datasets and implemented in TensorFlow○ https://github.com/tensorflow/models

● Innovations○ Bounding sensitivity of updates○ Moments accountant to keep tracking of privacy loss

● Lessons○ Recommendations for selection of hyperparameters

● Full version: https://arxiv.org/abs/1607.00133