machine learning for data science (cs4786) lecture 26

91
Machine Learning for Data Science (CS4786) Lecture 26 Differential Privacy and Machine Learning Machine Learning for Data Science (CS4786) Lecture 23

Upload: others

Post on 25-Apr-2022

2 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Machine Learning for Data Science (CS4786) Lecture 26

Machine Learning for Data Science (CS4786)

Lecture 26

Differential Privacy and Machine Learning

Course Webpage :

http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Machine Learning for Data Science (CS4786)Lecture 23

Approximate Inference

Course Webpage :http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Page 2: Machine Learning for Data Science (CS4786) Lecture 26

Competition• Clustering Wikipedia articles: (11039 articles)

• Data provided:

• Text of articles: Both raw data and feature extracted data provided

• Graph linking articles to one another

• Competition is hosted on Vocareum

• Vocareum submission + 5 page report due at end of exam week

Page 3: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data

Page 4: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

Page 5: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

Page 6: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

Page 7: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

• User search logs (Eg. learning personalized query retrieval for searches)

Page 8: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

• User search logs (Eg. learning personalized query retrieval for searches)

• Genetic information (Eg. to learn genetic predispositions)

Page 9: Machine Learning for Data Science (CS4786) Lecture 26

AOL Data Release• AOL released internet query dataset (after anonymizing)

Page 10: Machine Learning for Data Science (CS4786) Lecture 26

AOL Data Release• AOL released internet query dataset (after anonymizing)

Page 11: Machine Learning for Data Science (CS4786) Lecture 26
Page 12: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

Page 13: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

Page 14: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1

Page 15: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 16: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

• Though users were anonymized, some users on dataset were identified

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 17: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

• Though users were anonymized, some users on dataset were identified

•How?!!!

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 18: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 19: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 20: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

• Only a very small overlap with IMDB was required

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 21: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

• Only a very small overlap with IMDB was required

• You pretty much get the persons viewing record from /Netflix without consent

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 22: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML

Page 23: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

Page 24: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

Page 25: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

Page 26: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

Page 27: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

• Only releases general statistics?

Page 28: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

• Only releases general statistics?

• Or classifier learnt from data?

Page 29: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment

Page 30: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

Page 31: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

Page 32: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

Page 33: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

Page 34: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Page 35: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

What is the problem?

Page 36: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Page 37: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Say “Fill Nates’’ from WA was in the dataset, and is very very rich.

Page 38: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

Page 39: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

Page 40: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

Page 41: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

Page 42: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

• Say we use regression for learning the classifier

Page 43: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

• Say we use regression for learning the classifier

• By looking at weight put on income column of dataset, we can infer if “Fill Nates” was part of study and which hospital

Page 44: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Page 45: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset +

Page 46: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Page 47: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Distribution of outcome

Page 48: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Distribution of outcome

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Similar

Page 49: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 50: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

Page 51: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

Page 52: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

Page 53: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

• Algorithm A is (𝜖,ẟ)- differentially private if for all samples S and S’ that only differ by one data point and any set C

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

Page 54: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

• Algorithm A is (𝜖,ẟ)- differentially private if for all samples S and S’ that only differ by one data point and any set C

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

• ẟ=0 is called pure differential privacy

Page 55: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 56: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

Page 57: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

Page 58: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

1 (for small 𝜖)

Page 59: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 60: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

Differential Privacy

Page 61: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

Differential Privacy

Page 62: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

Differential Privacy

Page 63: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

• But under S probability is 1

Differential Privacy

Page 64: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

• But under S probability is 1

• Hence cannot be differentially private

Differential Privacy

Page 65: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

Page 66: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

• Typical mechanism: Add noise to outcome or inside algorithm

Page 67: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

• Typical mechanism: Add noise to outcome or inside algorithm

• More privacy we want the more noise we add

Page 68: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

Page 69: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

Page 70: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

Page 71: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

• First compute mean M =1

n

nX

t=1

It

Page 72: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

• First compute mean

• Add noise to it M + max_income Laplace(0,1)/ n 𝜖

M =1

n

nX

t=1

It

Page 73: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

Page 74: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

Page 75: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 76: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

• A(S) = f(S) + B Laplace(0,1)/𝜖

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 77: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

• A(S) = f(S) + B Laplace(0,1)/𝜖

• A is (𝜖,0)-differentially private

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 78: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 79: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 80: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 81: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 82: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 83: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

Page 84: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

• For the classification/regression problem we can of course use the Laplace mechanism

Page 85: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

• For the classification/regression problem we can of course use the Laplace mechanism

• Can we do better?

Page 86: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

Page 87: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

Page 88: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

Page 89: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

• It can be shown that if ||x||’s < 1:

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

kf(S)� f(S0)k 1

�n

Page 90: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

• It can be shown that if ||x||’s < 1:

• Add vector version of noise to f, only scale now is of order O(1/𝜖ƛn)

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

kf(S)� f(S0)k 1

�n

Page 91: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy in ML

• Differential private versions of PCA, clustering algorithms, deep learning etc. have been explored

• Nice properties of Differential Privacy

• post processing is ok

• compostability lemma

• Recently Differential Privacy was used as tool to allow statistically safe reuse of data