private analysis of graph structure

20
Private Analysis of Graph Structure With Vishesh Karwa, Sofya Raskhodnikova and Adam Smith Pennsylvania State University Grigory Yaroslavtsev http://grigory.us 1

Upload: ryo

Post on 05-Jan-2016

64 views

Category:

Documents


1 download

DESCRIPTION

Private Analysis of Graph Structure. Grigory Yaroslavtsev http://grigory.us. With Vishesh Karwa , Sofya Raskhodnikova and Adam Smith Pennsylvania State University. Publishing network data. Many data sets can be represented as a graph : Friendship in online social network - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Private Analysis of Graph Structure

1

Private Analysis of Graph Structure

With Vishesh Karwa, Sofya Raskhodnikova and Adam Smith

Pennsylvania State University

Grigory Yaroslavtsev http://grigory.us

Page 2: Private Analysis of Graph Structure

2

• Publish information about a graph

• Preserve privacy of relationships

Publishing network dataMany data sets can be represented as a graph:• Friendship in online social network• Financial transactions • Romantic relationships

American J. Sociology,  Bearman, Moody, Stovel

Naïve approach: anonymization

Page 3: Private Analysis of Graph Structure

3

Goal: Publish structural information about a graph

Publishing network data

DatabaseRelationships Users

Aqueries

answers)(

Government,researchers,businesses

(or) Maliciousadversary

• Anonymization not sufficient [Backström, Dwork, Kleinberg ’07, Narayanan, Shmatikov ’09, Narayanan, Shi, Rubinstein ’11]• Ideal: Algorithms with rigorous privacy guarantee, no assumptions about attacker’s prior information/algorithm

Page 4: Private Analysis of Graph Structure

4

• Limits incremental information by hiding presence/absence of an individual relationship

DatabaseRelationships Users

Aqueries

answers)(

Government,researchers,businesses

(or) Maliciousadversary

Differential privacy [Dwork, McSherry, Nissim, Smith ’06]

• Neighbors: Graphs G and G’ that differ in one edge• Answers on neighboring graphs should be similar

Page 5: Private Analysis of Graph Structure

5

-differential privacy (edge privacy)For all pairs of neighbors and all events S:

Differential privacy for relationships

𝑃𝑟 [ 𝐴 (𝐺 )∈𝑺 ]≤𝑒𝜖Pr [𝐴 (𝐺′ )∈𝑺 ]

A(G) A(G’)

• Probability is over the randomness of A• Definition requires that the distributions are close:

Page 6: Private Analysis of Graph Structure

6

For graphs G and H: # of occurrences of H in GSubgraph counts

Example:

Total: 40

Total: 2

Total: 1

Triangle:

2-star:

2-triangle:

k-star

…k

k-triangle

k…

Page 7: Private Analysis of Graph Structure

7

• Subgraph counts are used in:– Exponential random graph models– Descriptive graph statistics, e.g.:

Clustering coefficient =

• Our focus: efficient differentially private algorithms for releasing subgraph counts

Subgraph counts

##

Page 8: Private Analysis of Graph Structure

8

Previous work• Smooth Sensitivity [Nissim, Raskhodnikova, Smith ‘07]– Differentially private algorithm for triangles– Open: private algorithms for other subgraphs?

• Private queries with joins [Rastogi, Hay, Miklau, Suciu ‘09]– Works for a wide range of subgraphs– Weaker privacy guarantee, applies only for specific

class of adversaries• Private degree sequence [Hay, Li, Miklau, Jensen ’09]– Guarantees differential privacy– Works for k-stars, but not for other subgraphs

Page 9: Private Analysis of Graph Structure

9

Laplace Mechanism and Sensitivity[Dwork, McSherry, Nissim, Smith ‘06]

• Add noise: mean = 0, standard deviation , where is sensitivity => -differential privacy:

• Local sensitivity ([NRS’07], not differentially private!):

• Previous work (mostly): Global sensitivity

𝑓 ′ (𝐺 )= 𝑓 (𝐺 )+𝐿𝑎𝑝 (𝑺 𝒇 /𝝐)

𝐿𝑆𝑓 (𝐺 )= max𝐺′ :𝐍𝐞𝐢𝐠𝐡𝐛𝐨𝐫 𝑜𝑓 𝐺

|𝑓 (𝐺 )− 𝑓 (𝐺′ )|

differentially private!

Page 10: Private Analysis of Graph Structure

10

𝑆 𝑓∗( G )

𝐿𝑆 𝑓 ( G )

Instance-Specific Noise = set of all graphs on n vertices. d(G,G’) = # edges in which G and G’ differ.

• Add Cauchy noise: median = 0, median absolute value (where ) => -differential privacy:

• Naïve computation requires exponential time• [NRS’07]: Compute smooth sensitivity for triangles

Smooth Sensitivity [Nissim, Raskhodnikova, Smith ’07]: =

𝑓 ′ (𝐺 )= 𝑓 (𝐺 )+ h𝐶𝑎𝑢𝑐 𝑦 (𝑺 𝒇 ,𝜷∗ / 𝜷)

Page 11: Private Analysis of Graph Structure

11

Our contributions

• Differentially private algorithms for k-stars and k-triangles– Efficiently compute smooth sensitivity for k-stars– NP-hardness for k-triangles and k-cycles– Different approach for k-triangles

• Average-case analysis in G(n,p) • Theoretical comparison with previous work• Experimental evaluation

Page 12: Private Analysis of Graph Structure

12

Smooth Sensitivity for k-stars ( )

This paper: near-linear time algorithm for smooth sensitivity• Algorithm also reveals structural results, e.g.:– Proposition:

If ) and (maximum degree > then (smooth sensitivity) = (local sensitivity)

• Algorithm optimal for large class of graphs– Proposition: error > (local sensitivity)

• Compared to [HLMJ’09] (private degree sequence):– Our error never worse by more than a constant factor– For 2-stars, our error can be better by factor

Page 13: Private Analysis of Graph Structure

13

Private Approximation to Local Sensitivity: k-triangles ( )

Approximate differential privacy, -privacy [Dwork, Kenthapadi, McSherry, Mironov, Naor ‘06]:

Pr [ 𝐴 (𝐺 )∈𝑺 ]≤𝑒𝜖Pr [𝐴 (𝐺′ )∈𝑺 ]+𝛿

Idea: Private upper bound on local sensitivity ().Release: , ).

If• is -differentially private and • Then A is ()-differentially private.

Page 14: Private Analysis of Graph Structure

14

Evaluating our algorithms

• Theoretical evaluation in G(n,p) model– All of our algorithms have relative error -> 0

when the average degree = grows• Empirical evaluation– Synthetic graphs from G(n,p) model– Variety of real data sets

Page 15: Private Analysis of Graph Structure

Experimental results for G(n,p)• Comparison with previous work for

0 100 200 300 400 500 600 700 800 90010000.001

0.01

0.1

1

10 2-StarsLS Barrier

Our algorithms

HLMJ

RHMS Lower

RHMS upper

Relative Error = 1

5 % Relative Error

Nodes

Rela

tive

Med

ian

Erro

r

Page 16: Private Analysis of Graph Structure

Experimental results for G(n,p)• Comparison with previous work for

0 100 200 300 400 500 600 700 800 900 10000.1

13-Stars

LS Barrier

Our algorithms

HLMJ

RHMS Lower

RHMS upper

Relative Error = 1

20 % Relative Error

Nodes

Rela

tive

Med

ian

Erro

r

Page 17: Private Analysis of Graph Structure

Experimental results for G(n,p)

0 200 400 600 800 1000

10

2 Triangles

0 200 400 600 800 10001

100 Triangles

Rela

tive

Med

ian

Erro

r

01000

10

LS Barrier Our algorithms RHMS Lower

RHMS upper Relative Error = 1

• Comparison with [RHMS’09] for

Page 18: Private Analysis of Graph Structure

Experimental results (SNAP)

ca-G

rQc

ca-H

epTh

ca-C

ondM

at

ca-H

epPh

Emai

l-Enr

on

ca-A

stro

Ph

n=5K n=10K n=23K n=12K n=37K n=19Km=29K m=52K m=187K m=237K m=368K m=396K

0.0001

0.001

0.01

0.1

1

102-triangles triangles

Rela

tive

Med

ian

Erro

r

Page 19: Private Analysis of Graph Structure

Experimental results (SNAP)

ca-G

rQc

ca-H

epTh

ca-C

ondM

at

ca-H

epPh

Emai

l-Enr

on

ca-A

stro

Ph

n=5K n=10K n=23K n=12K n=37K n=19Km=29K m=52K m=187K m=237K m=368K m=396K

0.0001

0.001

0.01

0.1 2-stars HLMJ - 2-stars

Rela

tive

Med

ian

Erro

r

Page 20: Private Analysis of Graph Structure

20

Summary

• Private algorithms for subgraph counts– Rigorous privacy guarantee (differential privacy)– Running time close to best algorithms for computing

the subgraph counts• Improvement in accuracy and (for some graph

counts) in privacy over previous work• Techniques:– Fast computation of smooth sensitivity– Differentially private upper bound on local sensitivity