the promise of differential privacy
DESCRIPTION
The Promise of Differential Privacy. Cynthia Dwork, Microsoft Research. TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A A A A A A A. NOT A History Lesson. Developments presented out of historical order; key results omitted. NOT Encyclopedic. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/1.jpg)
The Promise of Differential Privacy
Cynthia Dwork, Microsoft Research
![Page 2: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/2.jpg)
NOT A History Lesson
Developments presented out of historical order; key results omitted
![Page 3: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/3.jpg)
NOT Encyclopedic
Whole sub-areas omitted
![Page 4: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/4.jpg)
Part 1: Basics Smoking causes cancer Definition Laplace mechanism Simple composition Histogram example Advanced composition
Part 2: Many Queries Sparse Vector Multiplicative Weights Boosting for queries
Part 3: Techniques Exponential mechanism and application Subsample-and-Aggregate Propose-Test-Release Application of S&A and PTR combined
Future Directions
Outline
![Page 5: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/5.jpg)
BasicsModel, definition, one mechanism, two examples, composition theorem
![Page 6: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/6.jpg)
Model for This Tutorial
Database is a collection of rows One per person in the database
Adversary/User and curator computationally unbounded All users are part of one giant adversary
“Curator against the world”
?C
![Page 7: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/7.jpg)
Databases that Teach Database teaches that smoking causes cancer.
Smoker S’s insurance premiums rise. This is true even if S not in database!
Learning that smoking causes cancer is the whole point. Smoker S enrolls in a smoking cessation program.
Differential privacy: limit harms to the teachings, not participation The outcome of any analysis is essentially equally likely, independent of
whether any individual joins, or refrains from joining, the dataset. Automatically immune to linkage attacks
![Page 8: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/8.jpg)
Differential Privacy [D., McSherry, Nissim, Smith 06]
Bad Responses: Z ZZ
Pr [response]
ratio bounded
M gives (ε, 0) - differential privacy if for all adjacent x and x’, and all
C µ range(M): Pr[ M (x) 2 C] ≤ e Pr[ M (x’) 2 C]Neutralizes all linkage attacks.Composes unconditionally and automatically: Σi ε i
![Page 9: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/9.jpg)
(, d ) - Differential Privacy
Bad Responses: Z ZZ
Pr [response]
M gives (ε, d) - differential privacy if for all adjacent x and x’, and all
C µ range(M ): Pr[ M (D) 2 C] ≤ e Pr[ M (D’) 2 C] + dNeutralizes all linkage attacks.Composes unconditionally and automatically: (Σi i , Σi di )
ratio bounded
This talk: negligible
![Page 10: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/10.jpg)
Useful Lemma [D., Rothblum, Vadhan’10]:
Privacy loss bounded by expected loss bounded by 2
Range
Equivalently,
“Privacy Loss”
![Page 11: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/11.jpg)
Sensitivity of a Function
Adjacent databases differ in at most one row.Counting queries have sensitivity 1.Sensitivity captures how much one person’s data can affect output
Df = maxadjacent x,x’ |f(x) – f(x’)|
![Page 12: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/12.jpg)
13
Laplace Distribution Lap(b)
p(z) = exp(-|z|/b)/2b variance = 2b2
¾ = √2 bIncreasing b flattens curve
![Page 13: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/13.jpg)
Calibrate Noise to Sensitivityf = maxadj x,x’ |f(x) – f(x’)|
Theorem [DMNS06]: On query f, to achieve -differential privacy, use scaled symmetric noise [Lap(b)] with b = f/.
Noise depends on f and , not on the databaseSmaller sensitivity (f) means less distortion
0 b 2b 3b 4b 5b-b-2b-3b-4b
![Page 14: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/14.jpg)
Example: Counting Queries How many people in the database satisfy property ?
Sensitivity = 1 Sufficient to add noise
What about multiple counting queries? It depends.
![Page 15: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/15.jpg)
Vector-Valued Queriesf = maxadj x, x’ ||f(x) – f(x’)||1
Theorem [DMNS06]: On query f, to achieve -differential privacy, use scaled symmetric noise [Lap(f/)]d .
Noise depends on f and , not on the databaseSmaller sensitivity (f) means less distortion
0 b 2b 3b 4b 5b-b-2b-3b-4b
![Page 16: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/16.jpg)
17
Example: Histogramsf = maxadj x, x’ ||f(x) – f(x’)||1
Theorem: To achieve -differential privacy, use scaled symmetric noise [Lap(f/)]d .
![Page 17: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/17.jpg)
18
Why Does it Work ? f = maxx, Me ||f(x+Me) – f(x-Me)||1
0 b 2b 3b 4b 5b-b-2b-3b-4b
Pr[ M (f, x – Me) = t]
Pr[ M (f, x + Me) = t]= exp(-(||t- f-||-||t- f+||)/R) ≤ exp(f/R)
Theorem: To achieve -differential privacy, addscaled symmetric noise [Lap(f/)].
![Page 18: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/18.jpg)
“Simple” Composition k-fold composition of (,±)-differentially private mechanisms
is (k, k±)-differentially private.
![Page 19: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/19.jpg)
Composition [D., Rothblum,Vadhan’10]
Qualitively: Formalize Composition Multiple, adaptively and adversarially generated databases and
mechanisms What is Bob’s lifetime exposure risk?
Eg, for a 1-dp lifetime in 10,000 ²-dp or (,±)-dp databases What should be the value of ²?
Quantitatively : the -fold composition of -dp mechanisms is -dp rather than
![Page 20: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/20.jpg)
Adversary’s Goal: Guess b
(x1,0, x1,1), M1
(x2,0, x2,1), M2
(xk,0, xk,1), Mk
M1(x1,b)
M2(x2,b)
Mk(xk,b)
…
Choose b ² {0,1}Adversary
b = 0 is real worldb=1 is world in which Bob’s data replaced with junk
![Page 21: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/21.jpg)
Recall “Useful Lemma”:Privacy loss bounded by expected loss bounded by
Model cumulative privacy loss as a Martingale [Dinur,D.,Nissim’03] Bound on max loss () Bound on expected loss (22)PrM1,…,Mk[ | i loss from Mi | > z A + kB] < exp(-z2/2)
Flavor of Privacy Proof
AB
![Page 22: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/22.jpg)
Reduce to previous case via “dense model theorem” [MPRV09]
z
Extension to (,d)-dp mechanisms
Y
(, d )-dp
(, d )-dp
Y’
d - close(,0)-dp
![Page 23: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/23.jpg)
Composition Theorem : the -fold composition of -dp mechanisms is -dp
What is Bob’s lifetime exposure risk? Eg, 10,000 -dp or -dp databases, for lifetime cost of -dp What should be the value of ? 1/801 OMG, that is small! Can we do better?
Can answer low-sensitivity queries with distortion o) Tight [Dinur-Nissim’03 &ff.]
Can answer low-sensitivity queries with distortion o(n) Tight? No. And Yes.
![Page 24: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/24.jpg)
Part 1: Basics Smoking causes cancer Definition Laplace mechanism Simple composition Histogram example Advanced composition
Part 2: Many Queries Sparse Vector Multiplicative Weights Boosting for queries
Part 3: Techniques Exponential mechanism and application Subsample-and-Aggregate Propose-Test-Release Application of S&A and PTR combined
Future Directions
Outline
![Page 25: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/25.jpg)
Many Queries
Sparse Vector; Private Multiplicative Weights, Boosting for Queries
![Page 26: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/26.jpg)
Counting Queries Arbitrary Low-Sensitivity Queries
Offline
Error [Blum-Ligett-Roth’08]
Runtime Exponential in |U|-dp
Error [D.-Rothblum-Vadhan‘10]
Runtime Exp(|U|)
Online
Error [Hardt-Rothblum’10]
Runtime Polynomial in |U|
Caveat: Omitting polylog(various things, some of them big) terms
Error [Hardt-Rothblum]Runtime Exp(|U|)
![Page 27: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/27.jpg)
Sparse Vector Database size # Queries , eg, super-polynomial in # “Significant” Queries
For now: Counting queries only Significant: count exceeds publicly known threshold
Goal: Find, and optionally release, counts for significant queries, paying only for significant queries
insignificant insig insignificant insig insig
![Page 28: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/28.jpg)
Algorithm and Privacy AnalysisAlgorithm:When given query :• If : [insignificant]
– Output • Otherwise [significant]
– Output
• First attempt: It’s obvious, right?– Number of significant queries invocations of
Laplace mechanism– Can choose so as to get error
Caution: Conditional branch
leaks private information!
Need noisy threshold
[Hardt-Rothblum]
![Page 29: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/29.jpg)
Algorithm:When given query :• If : [insignificant]
– Output • Otherwise [significant]
– Output
• Intuition: counts far below T leak nothing– Only charge for noisy counts in this range:
𝑇 ∞
Caution: Conditional branch
leaks private information!
Algorithm and Privacy Analysis
![Page 30: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/30.jpg)
Let • denote adjacent databases• denote distribution on transcripts on input • denote distribution on transcripts on input
1. Sample 2. Consider 3. Show
Fact: (3) implies -differential privacy
![Page 31: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/31.jpg)
Write as
“privacy loss in round ”
Define borderline event on noiseas “a potential query release on ”
Analyze privacy loss inside and outside of
![Page 32: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/32.jpg)
Borderline eventCase
𝑓 𝑡 (𝑥)+𝐿𝑎𝑝 (𝜎 )
𝑓 𝑡 (𝑥)
Borderline event
Properties1. Conditioned on round t is a release with prob 2. Conditioned on we have 3. Conditioned on we have
Release condition:
𝐿𝑎𝑝 (𝜎 )>𝑎
𝑇
Definition of Mass to the left of = Mass to the right of
![Page 33: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/33.jpg)
Borderline eventCase
𝑓 𝑡 (𝑥)+𝐿𝑎𝑝 (𝜎 )
𝑓 𝑡 (𝑥)
Borderline event
Properties1. Conditioned on round t is a release with prob 2. Conditioned on we have 3. Conditioned on we have
Release condition:
𝐿𝑎𝑝 (𝜎 )>𝑎
𝑇
Definition of Mass to the left of = Mass to the right of
Think about x’ s.t.
![Page 34: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/34.jpg)
Borderline eventCase
𝑓 𝑡 (𝑥)+𝐿𝑎𝑝 (𝜎 )
𝑓 𝑡 (𝑥)
Borderline event
Properties1. Conditioned on round t is a release with prob 2. Conditioned on we have 3. (vacuous: Conditioned on we have )
Release condition:
𝑇
![Page 35: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/35.jpg)
Properties1. Conditioned on round t is a release with prob 2. Conditioned on we have 3. Conditioned on we have
𝑃 (𝑣𝑡|𝑣¿𝑡 ¿=Pr [𝐵𝑡|𝑣¿𝑡 ] 𝑃 (𝑣𝑡|𝐵𝑡 ,𝑣¿ 𝑡 ¿+Pr [𝐵𝑡|𝑣¿ 𝑡 ] 𝑃 (𝑣𝑡|𝐵𝑡 ,𝑣¿𝑡 ¿
By (2,3),UL,+Lemma, )
By (1), E[#borderline rounds] = #releases
𝑄 (𝑣𝑡|𝑣¿ 𝑡 ¿=Pr [𝐵𝑡|𝑣¿ 𝑡 ]𝑄 (𝑣𝑡|𝐵𝑡 ,𝑣¿ 𝑡 ¿+Pr [𝐵𝑡|𝑣¿ 𝑡 ]𝑄(𝑣𝑡∨𝐵𝑡 ,𝑣¿𝑡 )
![Page 36: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/36.jpg)
Wrapping Up: Sparse Vector Analysis
Probability of (significantly) exceeding expected number of borderline events is negligible (Chernoff)
Assuming not exceeded: Use Azuma to argue that whp actual total loss does not significantly exceed expected total loss
Utility: With probability at least all errors are bounded by . Choose
Expected total privacy loss
![Page 37: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/37.jpg)
Theorem (Main). There is an -differentially private mechanism answering linear online queries over a universe U and database of size in time per query error .
Private Multiplicative Weights [Hardt-Rothblum’10]
Represent database as (normalized) histogram on U
![Page 38: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/38.jpg)
Recipe (Delicious privacy-preserving mechanism):Maintain public histogram (with uniform)For each Receive query Output if it’s already accurate answer Otherwise, output
and “improve” histogram
How to improve ?
Multiplicative Weights
![Page 39: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/39.jpg)
. . .1 2 3 N4 5
0
1
Input
Estimate
Query
Before updateSuppose
![Page 40: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/40.jpg)
. . .1 2 3 N4 5
0
1
After update
x 1.3 x 1.3 x 0.7 x 0.7 x 1.3 x 0.7
Input
Estimate
Query
![Page 41: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/41.jpg)
Algorithm:• Input histogram with • Maintain histogram with being uniform• Parameters
• When given query :– If : [insignificant]
• Output
– Otherwise [significant; update]
• Output
– where )
• Renormalize
![Page 42: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/42.jpg)
Analysis Utility Analysis
Few update rounds
Allows us to choose
Potential argument [Littlestone-Warmuth’94]
Uses linearity
Privacy Analysis
Same as in Sparse Vector!
![Page 43: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/43.jpg)
Counting Queries Arbitrary Low-Sensitivity Queries
Offline
Error [Blum-Ligett-Roth’08]
Runtime Exponential in |U|-dp
Error [D.-Rothblum-Vadhan‘10]
Runtime Exp(|U|)
Online
Error [Hardt-Rothblum’10]
Runtime Polynomial in |U|
Caveat: Omitting polylog(various things, some of them big) terms
Error [Hardt-Rothblum]Runtime Exp(|U|)
![Page 44: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/44.jpg)
Boosting [Schapire, 1989] General method for improving accuracy of any given learning
algorithm
Example: Learning to recognize spam e-mail “Base learner” receives labeled examples, outputs heuristic Run many times; combine the resulting heuristics
![Page 45: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/45.jpg)
Base Learner
S: Labeled examples from D
A1, A2, …
Update D
A
Combine A1, A2, …
Does well on ½ + ´ of D
Terminate?
![Page 46: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/46.jpg)
Base Learner
S: Labeled examples from D
A1, A2, …
Update D
A
Combine A1, A2, …
Does well on ½ + ´ of D
How? Terminate?
Base learner only sees
samples, not
all of D
![Page 47: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/47.jpg)
Boosting for Queries? Goal: Given database x and a set Q of low-sensitivity queries,
produce an object O such that 8 q 2 Q : can extract from O an approximation of q(x).
Assume existence of (²0, ±0)-dp Base Learner producing an
object O that does well on more than half of D Pr q » D [ |q(O) – q(DB)| < ¸ ] > (1/2 + ´)
![Page 48: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/48.jpg)
Base Learner
S: Labeled examples from D
A1, A2, …
Update D
A
Combine A1, A2, …
Initially:D uniform on Q
Does well on ½ + ´ of D
![Page 49: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/49.jpg)
. . .1 2 3 |Q |4 5
0
1
(truth)
D𝑡
Before update
![Page 50: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/50.jpg)
. . .1 2 3 4 5
0
1
After update
x 0.7 x 1.3 x 0.7 x 0.7 x 1.3 x 0.7
(truth)
D𝑡+1
increased where disparity is large, decreased elsewhere
|Q |
![Page 51: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/51.jpg)
-1/+1renormalize
Base Learner
S: Labeled examples from D
A1, A2, …
Update D
A
Combine A1, A2, …median
Initially:D uniform on Q
Does well on ½ + ´ of D
Privacy?
Individual can affectmany queries at once!
Terminate?
![Page 52: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/52.jpg)
Privacy is Problematic In boosting for queries, an individual can affect the quality of
q(At) simultaneously for many q As time progresses, distributions on neighboring databases
could evolve completely differently, yielding very different distributions (and hypotheses At) Must keep secret! Ameliorated by sampling – outputs don’t reflect “too much” of the
distribution Still problematic: one individual can affect quality of all sampled
queries
![Page 53: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/53.jpg)
Privacy?
λ
Error ofSt on q
Queries q∈Q
Error xError x’
“Good enough”
![Page 54: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/54.jpg)
Privacy???
λ
Weightof q
Queries q∈Q
by xD 𝒕
by x’
![Page 55: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/55.jpg)
Private Boosting for Queries [Variant of AdaBoost] Initial distribution D is uniform on queries in Q S is always a set of k elements drawn from Qk
Combiner is median [viz. Freund92] Attenuated Re-Weighting
If very well approximated by At, decrease weight by factor of e (“-1”)
If very poorly approximated by At, increase weight by factor of e (“+1”) In between, scale with distance of midpoint (down or up):
2 ( |q(DB) – q(At)| - (¸ + ¹/2) ) / ¹ (sensitivity: 2½/¹)
+
(log |Q |3/2½ √k) / ²´4
Error increasing
![Page 56: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/56.jpg)
Private Boosting for Queries [Variant of AdaBoost] Initial distribution D is uniform on queries in Q S is always a set of k elements drawn from Qk
Combiner is median [viz. Freund92] Attenuated Re-Weighting
If very well approximated by At, decrease weight by factor of e (“-1”)
If very poorly approximated by At, increase weight by factor of e (“+1”) In between, scale with distance of midpoint (down or up):
2 ( |q(DB) – q(At)| - (¸ + ¹/2) ) / ¹ (sensitivity: 2½/¹)
+
(log |Q |3/2½ √k) / ²´4
Error increasing
Reweighting similar under x and x’Need lots of samples to detect difference
Adversary never gets hands on lots of samples
![Page 57: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/57.jpg)
Privacy???
λ
Pr of q
Queries q ∈ Q by xD 𝒕
by x’
![Page 58: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/58.jpg)
Agnostic as to Type Signature of Q Base Generator for Counting Queries [D.-Naor-Reingold-Rothblum-Vadhan’09]
Use noise for all queries An -dp process for collecting a set S of responses
Fit an n log|U|/log|Q| bit database to the set S; poly(|U|) -dp
Using a Base Generator for Arbitrary Real-Valued Queries Use noise for all queries
An -dp process for collecting a set S of responses Fit an n-element database; exponential in |U| -dp
![Page 59: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/59.jpg)
Analyzing Privacy Loss Know that “the epsilons and deltas add up”
T invocations of Base Generator (base , ±base) Tk samples from distributions (sample, ±sample)
k each from D1, … , DT
Fair because samples in iteration t are mutually independent and distribution being sampled depends only on A1, …, A t-1 (public)
Improve on (T base + Tk sample , T±base +Tk ±sample)-dp via composition theorem
![Page 60: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/60.jpg)
Counting Queries Arbitrary Low-Sensitivity Queries
Offline
Error [Blum-Ligett-Roth’08]
Runtime Exponential in |U|-dp
Error [D.-Rothblum-Vadhan‘10]
Runtime Exp(|U|)
Online
Error [Hardt-Rothblum’10]
Runtime Polynomial in |U|
Caveat: Omitting polylog(various things, some of them big) terms
Error [Hardt-Rothblum]Runtime Exp(|U|)
![Page 61: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/61.jpg)
Non-Trivial Accuracy with -DP
Stateless Mechanism Statefull Mechanism
Barrier at [D.,Naor,Vadhan]
![Page 62: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/62.jpg)
Non-Trivial Accuracy with -DP
Independent Mechanisms Statefull Mechanism
Barrier at [D.,Naor,Vadhan]
![Page 63: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/63.jpg)
Non-Trivial Accuracy with -DP
Independent Mechanisms Statefull Mechanism
Moral: To handle many databases must relax adversary assumptions or introduce coordination
![Page 64: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/64.jpg)
Part 1: Basics Smoking causes cancer Definition Laplace mechanism Simple composition Histogram example Advanced composition
Part 2: Many Queries Sparse Vector Multiplicative Weights Boosting for queries
Part 3: Techniques Exponential mechanism and application Subsample-and-Aggregate Propose-Test-Release Application of S&A and PTR combined
Future Directions
Outline
![Page 65: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/65.jpg)
Discrete-Valued Functions
Strings, experts, small databases, … Each has a utility for , denoted
Exponential Mechanism [McSherry-Talwar’07] Output with probability
![Page 66: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/66.jpg)
Exponential Mechanism AppliedMany (fractional) counting queries [Blum, Ligett, Roth’08]: Given -row database , set of properties, produce a synthetic database giving good approx to “What fraction of rows of satisfy property ?” .
is set of all databases of size |
-1/3
-1/310
-7/286-62/4589
-1/100000
![Page 67: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/67.jpg)
Counting Queries Arbitrary Low-Sensitivity Queries
Offline
Error [Blum-Ligett-Roth’08]
Runtime Exponential in |U|-dp
Error [D.-Rothblum-Vadhan‘10]
Runtime Exp(|U|)
Online
Error [Hardt-Rothblum’10]
Runtime Polynomial in |U|
What happened in 2009?
Error [Hardt-Rothblum]Runtime Exp(|U|)
![Page 68: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/68.jpg)
High/Unknown Sensitivity Functions Subsample-and-Aggregate [Nissim, Raskhodnikova, Smith’07]
![Page 69: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/69.jpg)
𝑥1 , 𝑥2 ,…,𝑥𝑛 /2
Functions “Expected” to Behave Well Propose-Test-Release [D.-Lei’09]
Privacy-preserving test for “goodness” of data set Eg, low local sensitivity [Nissim-Raskhodnikova-Smith07]
Big gap
……
Robust statistics theory: Lack of density at median is the only thing that can go wrong
… …𝑥1+𝑛 /2 ,…, 𝑥𝑛
PTR: Dp test for low sensitivity median (equivalently, for high density)if good, then release median with low noiseelse output (or use a sophisticated dp median algorithm)
⏟
![Page 70: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/70.jpg)
𝑥1 , 𝑥2 ,…….Application: Feature Selection
𝐵1 𝐵2 𝐵3 𝐵𝑇
𝑆1 𝑆2 𝑆3 𝑆𝑇
If “far” from collection with no large majority value, then output most common value.Else quit.
![Page 71: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/71.jpg)
Future Directions Realistic Adversaries(?)
Related: better understanding of the guarantee What does it mean to fail to have -dp? Large values of can make sense!
Coordination among curators? Efficiency
Time complexity: Connection to Tracing Traitors Sample complexity / database size
Is there an alternative to dp? Axiomatic approach [Kifer-Lin’10]
Focus on a specific application Collaborative effort with domain experts
![Page 72: The Promise of Differential Privacy](https://reader036.vdocuments.net/reader036/viewer/2022081512/568136df550346895d9e7aa5/html5/thumbnails/72.jpg)
0 R 2R 3R 4R 5R-R-2R-3R-4R
Thank You!