privacy leak in egonets - sharad · kumar sharad, computer lab, university of cambridge george...

Privacy Leak in Egonets Kumar Sharad, Computer Lab, University of Cambridge George Danezis, Microsoft Research Cambridge http://www.cl.cam.ac.uk/ ~ ks640/ [email protected] Abstract A major Telecom released anonymized communication sub-graphs of a few thousand randomly selected subscribers for scientific analysis. To obfuscate the interactions between the sub-graphs the common nodes were renamed. However, this is not enough to preserve privacy. Extracted Sub-graphs The sub-graphs tend to have nodes in common which can be re-identified by analysing the graph topology. Recovering the full communication graph could lead to severe privacy breaches. Re-identification By observing the neighbourhood of nodes we can assign a match probability to pairs of nodes across ego-nets. The nodes at a distance of 1-hop from the ego are easier to re-identify. After we have a match score between all possible pairs across two ego-nets, an optimal matching can be found. Results I Almost all the common 1-hop nodes were re-identified with over 98% success probability. I A significant portion of the 2-hop nodes were re-identified with over 75% (often over 90%) success probability. Conclusion I It is hard to release high dimensional data whilst preserving privacy. I Anonymizing real world data does not work in general.

Upload: others

Post on 21-Sep-2020

1 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Privacy Leak in EgonetsKumar Sharad, Computer Lab, University of Cambridge

George Danezis, Microsoft Research Cambridge

http://www.cl.cam.ac.uk/~ks640/ [email protected]

Abstract

A major Telecom released anonymizedcommunication sub-graphs of a fewthousand randomly selectedsubscribers for scientific analysis. Toobfuscate the interactions between thesub-graphs the common nodes wererenamed. However, this is not enoughto preserve privacy.

Extracted Sub-graphs

The sub-graphs tend to have nodes incommon which can be re-identified byanalysing the graph topology.

Recovering the full communicationgraph could lead to severe privacybreaches.

Re-identification

By observing the neighbourhood ofnodes we can assign a matchprobability to pairs of nodes acrossego-nets. The nodes at a distance of1-hop from the ego are easier tore-identify.

After we have a match score betweenall possible pairs across two ego-nets,an optimal matching can be found.

Results

I Almost all the common 1-hop nodeswere re-identified with over 98%success probability.

I A significant portion of the 2-hopnodes were re-identified with over 75%(often over 90%) success probability.

Conclusion

I It is hard to release high dimensionaldata whilst preserving privacy.

I Anonymizing real world data does notwork in general.

http://www.cl.cam.ac.uk/~ks640/

Vedanta Adhyayana Mandali Sharad Navaratri …...Vedanta Adhyayana Mandali Sharad Navaratri Sessions The inaugural class of the Sharad Navaratri Sessions of the Vedanta Course organized

Sharad Sir Leader

Stress & Rehabilitation Dr. Sharad

Sharad v Alwaleed