privacy leak in egonets - sharad · kumar sharad, computer lab, university of cambridge george...
TRANSCRIPT
Privacy Leak in EgonetsKumar Sharad, Computer Lab, University of Cambridge
George Danezis, Microsoft Research Cambridge
http://www.cl.cam.ac.uk/~ks640/ [email protected]
Abstract
A major Telecom released anonymizedcommunication sub-graphs of a fewthousand randomly selectedsubscribers for scientific analysis. Toobfuscate the interactions between thesub-graphs the common nodes wererenamed. However, this is not enoughto preserve privacy.
Extracted Sub-graphs
The sub-graphs tend to have nodes incommon which can be re-identified byanalysing the graph topology.
Recovering the full communicationgraph could lead to severe privacybreaches.
Re-identification
By observing the neighbourhood ofnodes we can assign a matchprobability to pairs of nodes acrossego-nets. The nodes at a distance of1-hop from the ego are easier tore-identify.
After we have a match score betweenall possible pairs across two ego-nets,an optimal matching can be found.
Results
I Almost all the common 1-hop nodeswere re-identified with over 98%success probability.
I A significant portion of the 2-hopnodes were re-identified with over 75%(often over 90%) success probability.
Conclusion
I It is hard to release high dimensionaldata whilst preserving privacy.
I Anonymizing real world data does notwork in general.