why privacy matters · 2019. 10. 12. · what is a threat model? • wikipedia: “threat modeling...
TRANSCRIPT
![Page 1: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/1.jpg)
Overview➔ Why Privacy Matters
Threat models for non-private ML
➔ ML+DP
Recap: DP setting, basic mechanisms
Private logistic regression
Private neural net weights with DPSGD
![Page 2: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/2.jpg)
Learning ObjectivesWe want you to understand the following:
● DP+ML: Why? (security motivations)○ Exploiting language models
○ Model inversion
● Basic tradeoffs and design choices at play● DP+ML: How? (applying basic mechanisms)● The Gaussian Mechanism and its applications:
○ Stochastic Gradient Descent
● Composition: how to analyze a sequence of mechanisms● Private neural nets:
○ Implementation difficulties
○ Recent advances
![Page 3: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/3.jpg)
Part 1: Security concerns
Slides in this section due to David Madras
![Page 4: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/4.jpg)
MakingPrivacyConcrete
![Page 5: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/5.jpg)
WhatisaThreatModel?
• Wikipedia:“Threatmodelingisaprocessbywhichpotentialthreatscanbeidentified…allfromahypotheticalattacker’spointofview.”
• Bestwaytotalkaboutthesecurity ofourmodelistospecifyathreat
model– howmightwebevulnerable?
• Inthistalk,I’lltrytoconvinceyouprivacyisarealsecurity threatbypresentingaconcretethreatmodel
![Page 6: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/6.jpg)
Let’sTalkAboutModelInversion!
• AtrainedMLmodelwithparameterswisreleasedtothepublic• W=training_procedure(X)• TrainingdataXishidden
• CanwerecoversomeofXjustthroughaccesstow?• X’=training_procedure-1 (w)<--- notationalabuse• Thatwouldbebad
• Intersectionofsecurityandprivacy
![Page 7: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/7.jpg)
WhatModelInversionLooksLike
![Page 8: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/8.jpg)
TwoExamplesWe’llDiscuss
• “TheSecretSharer:MeasuringUnintendedNeuralNetwork
Memorization&ExtractingSecrets”,Carlini etal.,2018
• “ModelInversionAttacksthatExploitConfidenceInformationand
BasicCountermeasures”,Fredrikson etal.,2015
![Page 9: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/9.jpg)
Example1.TheSecretSharer(Carlini etal.)
• Step1:Findsometrainingtextcontainingsensitiveinformation(e.g.
creditcardnumbers)
• “Mycreditcardnumberis3141-9265-3587-4001”∈ Trainingtext
• Step2:Trainyourlanguagemodelwithoutreallythinkingtoohard
• State-of-the-artlog-likelihood!
• Step3:Profit…forhackers• Sounds bad
![Page 10: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/10.jpg)
ExtractingSecrets
• “MycreditcardnumberisX”∈ Trainingtext
• Hackerisgivenblack-boxaccesstothemodel
• Promptthemodelwith‘Mycreditcardnumberis’andgenerate!
• $$$$$$$$$$
• Note:Thisisn’texactlyhowtheydoitinthepaper• Manyannoyingdetailsinimplementation
![Page 11: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/11.jpg)
ThisAttackKindofWorks
![Page 12: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/12.jpg)
ThisAttackKindofWorks(PartII)
![Page 13: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/13.jpg)
HowtoDefend?
• Mayberegularization?
• Memorizationrelatestogeneralization
• Authorstryweightdecay,dropout,andquantization– nonework• Theproblemseemsdistinctfromoverfitting
• Maybesanitization?
• Thismakessense:ifyouknowwhatthesecretlookslike,justremoveitbefore
training
• Butyoumaynotknowallpossiblesecretformats– thisisheuristic
![Page 14: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/14.jpg)
HowtoDefend?
• DifferentialPrivacy!• Eachtokeninthetrainingtext=“arecordinthedatabase”
![Page 15: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/15.jpg)
Example2:TargetedModelInversioninClassifiers(Fredrikson etal.)• Step1.TrainclassifierparameterswonsomesecretdatasetX
• Step2.Releasewtothepublic(white-box)
• Step3.Ahackercanrecoverpartsofyourtrainingsetbytargetingspecificindividuals
• Thatwouldbebad
![Page 16: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/16.jpg)
AttackingaCNN• Target:specificoutput(e.g.person’sidentity,sensitivefeature)• Startwithsomerandominputvector
• Usegradientdescentininputspace tomaximizemodel’sconfidence
inthetargetprediction
![Page 17: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/17.jpg)
AttackingaDecisionTreeusingAuxiliaryInformation
• GivenatraineddecisiontreewhereweknowpersonXwasinthethetrainingset
• Assumeweknowx2 …xd forX,andwanttofindthevalueofx1 (sensitive)
![Page 18: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/18.jpg)
DecisionTreeExperiments
• Tryingtouncoverthevaluesofsensitiveanswerslike:
![Page 19: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/19.jpg)
DecisionTreeResults
![Page 20: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/20.jpg)
DefendingAgainstModelInversion
• Decisiontrees:splitonsensitivefeatureslowerdown• CNNs:noconcretesuggestions• ButthispapercameoutbeforeDP-SGD
• Ithinkdifferentialprivacywouldprotectagainstboththeseattacks• Asalways,considertradeoffswithdatasetsizeandaccuracy
![Page 21: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/21.jpg)
Conclusion
• Ifyoursoftwareworks,great!J• Ifyoursoftwareworksbutcanbehacked–• Thenyoursoftwaredoesn’twork!L
• Hopefully,thispresentationconvincedyouthatprivacyisarealisticsecurityissuebyprovidingaconcretethreatmodel
• Noteveryoneneedstothinkaboutprivacyallthetime
• Butsomepeopleneedtothinkaboutitsomeofthetime
• Orbadthingswillhappen!JJJ
![Page 22: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/22.jpg)
Part 2: Making ML Private
![Page 23: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/23.jpg)
Tradeoffs in DP+ML
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 24: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/24.jpg)
Tradeoffs in DP+ML
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 25: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/25.jpg)
Private Empirical Risk Min. Minimization
![Page 26: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/26.jpg)
Private ERM
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 27: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/27.jpg)
Kernel Approaches Even Worse
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 28: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/28.jpg)
Privacy & Learning Compatible
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 29: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/29.jpg)
Revisiting ERM
[Chaudhuri & Sarwate 2017 NIPS tutorial]
We will discuss this one -->
![Page 30: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/30.jpg)
Typical Empirical Results
[Chaudhuri & Sarwate 2017 NIPS tutorial]
![Page 31: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/31.jpg)
Output Perturbation
![Page 32: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/32.jpg)
Private Logistic Regression
https://stackoverflow.com/questions/28256058/plotting-decision-boundary-of-logistic-regression
![Page 33: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/33.jpg)
Private Logistic Regression
![Page 34: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/34.jpg)
Private Logistic Regression
![Page 35: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/35.jpg)
Private Logistic Regression
![Page 36: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/36.jpg)
Private Logistic Regression
![Page 37: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/37.jpg)
Private Non-Convex Learning
For non-convex losses:
● We don’t know whether our optimizer will (asymptotically) find the global optimum
● We can not bound the loss function at the optimum or anywhere else
Instead we will make every step of the iterative optimization algorithm private, and somehow account for the overall privacy loss at the end
![Page 38: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/38.jpg)
Private Non-Convex Learning
To discuss DPSGD algorithm, we need to understand
● Basic composition of mechanisms● “Advanced” composition in an adaptive setup● Privacy amplification by subsampling● Per-example gradient computation
![Page 39: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/39.jpg)
Recall Composition Theorem
![Page 40: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/40.jpg)
Advanced Composition
What is composition?
● Repeated use of DP algorithms on the same database● Repeated use of DP algorithms on the different
databases that nevertheless may contain shared information about individuals
We can show that the privacy loss over all possible outcomes has a Markov structure, which hints at a better composition
![Page 41: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/41.jpg)
Advanced Composition
![Page 42: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/42.jpg)
Privacy by Subsampling
https://adamdsmith.wordpress.com/2009/09/02/sample-secrecy/
![Page 43: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/43.jpg)
Differentially Private SGD
Guarantees final parameters don’t
depend too much on individual
training examples
Gaussian noise added to the
parameter update at every iteration
Privacy loss accumulates over time
The “moments accountant” provides
better empirical bounds on (ε,δ)
[Abadi et al. 2016]
MNIST epoch vs accuracy/privacy
Moments accountant improves bounds
CIFAR-10 epoch vs accuracy/privacy
![Page 44: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/44.jpg)
Differentially Private SGD
[Abadi et al. 2016]
← when can we efficiently compute per-example gradients?
![Page 45: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/45.jpg)
PATE
Private Aggregation of Teacher
Ensembles [Papernot et al 2017,
Papernot et al 2018]
Key idea: instead of adding noise to
gradients, add noise to labels
![Page 46: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/46.jpg)
PATE
Start by partitioning private data into
disjoint sets
Each teacher trains (non-privately)
on its corresponding subset
![Page 47: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/47.jpg)
PATE
Private predictions can now be
generated via the exponential
mechanism, where the “score” is
computed with an election amongst
teachers - output the noisy winner
We now have private inference, but
we lose privacy every time we
predict. We would like the privacy
loss to be constant at test time.
![Page 48: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/48.jpg)
PATE
We can instead use the noisy labels
provided by the teachers to train a
student
We leak privacy during training but at
test time we lose no further privacy
(due to post-processing thm)
Because the student should use as
few labels as possible, unlabeled
public data is leveraged in a
semi-supervised setup.
![Page 49: Why Privacy Matters · 2019. 10. 12. · What is a Threat Model? • Wikipedia: “Threat modeling is a process by which potential threats can be identified … all from a hypothetical](https://reader034.vdocuments.net/reader034/viewer/2022052012/602970c44ae7f970e707aca3/html5/thumbnails/49.jpg)
PATE
https://arxiv.org/pdf/1802.08908.pdf