lecture 15 private information retrieval stefan dziembowski mim uw 25.01.13ver 1.0

37
Lecture 15 Private Information Retrieval Stefan Dziembowski www.dziembowski.net MIM UW 25.01.13 ver 1.0

Upload: ilene-mills

Post on 16-Dec-2015

225 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Lecture 15 Private Information Retrieval

Stefan Dziembowskiwww.dziembowski.net

MIM UW

25.01.13 ver 1.0

Page 2: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Plan

1. Motivation and definition2. Information-theoretic impossibility3. A construction of Kushilevitz and

Ostrovsky4. Overview of some other related

results

Page 3: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

AOL search data scandal (2006)

#4417749:• clothes for age 60 • 60 single men • best retirement city • jarrett arnold • jack t. arnold • jaylene and jarrett arnold• gwinnett county yellow pages • rescue of older dogs • movies for dogs

• sinus infection

Thelma Arnold62-year-old widowLilburn, Georgia

Page 4: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

ObservationThe owners of databases know a lot about the users!

This poses a risk to users’ privacy.

E.g. consider database with stock prices…

Can we do something about it?

We can:• trust them that they will protect our secrecy, or• use cryptography!

problematic

problematic!

Page 5: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

How can crypto help?

Note: this problem has nothing to do with secure communication!

user U database D

Page 6: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Our settings

user U database D

A new primitive:Private Information Retrieval (PIR)

Page 7: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Plan

1. Definition of PIR2. An ideal PIR doesn’t exist3. Construction of a computational PIR 4. Open problems

Literature:

• B. Chor, E. Kushilevitz, O. Goldreich and M. Sudan, Private Information Retrieval, Journal of ACM, 1998

• E. Kushilevitz and R. Ostrovsky Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997

Page 8: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Question

How to protect privacy of queries?

user U database D

wants to retrieve somedata from D

shouldn’t learn what Uretrieved

Page 9: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Let’s make things simple!

B1 B2 … Bwindex i = 1,…,w

the user should learn Bi

Bi

?

each Bi є {0,1}

database B:

(he may also learn other Bi’s)

Page 10: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Trivial solution

B1 B2 … Bw

The database simply sends everything to the user!

Page 11: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Non-triviality

The previous solution has a drawback:the communication complexity is huge!

Therefore we introduce the following requirement:

“Non-triviality”:the number of bits communicated between U and D

has to be smaller than w.

Page 12: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

input:

Private Information Retrieval (PIR)

B1 B2 … Bwinput:index i = 1,…,w

• at the end the user learns Bi

• the database does not learn i

• the total communication is < w

Note: secrecy of the database is not required

correctness

secrecy (of the user)

non-triviality

This property needs to be defined more formally

polynomial time randomized interactive algorithms

Page 13: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

How to define secrecy of the user [1/2]?

i B

Def. T(i,B) – transcript of the conversation.

query Q(i)

reply A(Q(i),B)

For fixed i and BT(i,B)

is a random variable(since the parties are

randomized)

Page 14: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

multi-round case:

it is impossible to distinguish betweenT(i,B) and T(j,B)

even if the adversary is malicious

How to define secrecy of the user [2/2]?

Secrecy of the user: for every i,j є {0,1}

single-round case:

it is impossible to distinguish between Q(i) and Q(j)

?

For simplicity say that for any i and j

the distributions of T(i,B) and T(j,B)

have to be identical

Page 15: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Plan

1. Motivation and definition2. Information-theoretic impossibility3. A construction of Kushilevitz and

Ostrovsky4. Overview of some other related

results

Page 16: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

PIR doesn’t exists [1/4]

We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously.

Def: A transcript T is possible for (i,B) if P(T(i,B) = T) > 0

Take some T’, and look where it is possible:

T’ T’

T’ T’

indices i

data

base

s B

Page 17: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

PIR doesn’t exists [2/4]Observation: secrecy → if

T’ is possible for some B and i then

it is possible for B and all the other i’s

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

indices i

data

base

s B

T’ T’

T’ T’

Page 18: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

PIR doesn’t exists [3/4]

non-triviality → length(transcript) < length(database)↓

# transcripts < #databases↓

there has to exist T’ that is possible for two databases B0 and B1

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

data

base

s B

← B0

← B1

indices i

Page 19: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

PIR doesn’t exists [4/4]

B0 and B1 differ on at least one index i’So, if i’ is the input of the user then

correctness → contradiction

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’ T’

data

base

s B

← B0

← B1

i’↓

indices i

Page 20: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

So PIR doesn’t exist!

• How to bypass the impossibility result?• Two ideas:

– limit the computing power of a cheating database

– use a larger number of “independent” databases

we show this

Page 21: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Plan

1. Motivation and definition2. Information-theoretic impossibility3. A construction of Kushilevitz and

Ostrovsky4. Overview of some other related

results

Page 22: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Computationally-secure PIR

secrecy:

For every i,j є {0,1}

it is impossible to distinguish efficiently

between T(i,B) and T(j,B)

?

computational-secrecy:

Formally: for every polynomial-time probabilistic algorithm A the value:|P(A(T(i,B)) = 0) – P(A(T(j,B))=0)|

should be negligible.

Page 23: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Hardness assumptions?

Kushilevitz and R. Ostrovsky Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997

construct PIR based on theQuadratic Residuosity Assumption

Page 24: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Quadratic Residuosity Assumption (QRA)

QR(p)

QR(q)

ZN+:

Quadratic Residuosity Assumption (QRA):

For a random a є ZN+ it is computationally hard to determine if a є QR(N).

Formally: for every polynomial-time probabilistic algorithm G the value:|P(G(a) = Q(a)) – 0.5|

(where a is random) is negligible.

QNR(q)

QNR(p)? a є ZN

+ ↓

QR(N)

N=pq

Page 25: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Homomorphism of QR(pq)

Q(N,a) :=

Homomorphism: for all a,b є ZN+

Q(N,ab) = Q(N,a) xor Q(N,b)

1 if a є QR(N)

0 otherwise

Page 26: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

We are ready to construct PIR!

Our PIR will work in the group ZN+, where N=pq.

What’s so good about this group?:

testing membership in ZN+ is easy,

testing membership in QR(N) is hard for random elements on ZN

+, unless one knows p and q.

homomorphism of Q!

Page 27: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

First (wrong) ideai

QRX1

QRX2

...QRXi-1

NQRXi

QRXi+1

...QRXw-1

QRXw

B1 B2 ... Bi-1 Bi Bi+1 ... Bw-1 Bw

for every j = 1,...,w the database sets

Yj = Xj

2 if Bj = 0Xj otherwise{

QRY1

QRY2

...QRYi-1 Yi

QRYi+1

...QRYw-1

QRYw

i↓

Yi is a QR iff Bi=0

Set M = Y1 · Y2 · ... · YwM

M is a QR iff Bi=0

the user checksif M is a QR

Page 28: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Problems!

PIR from the previous slide:• correctness √• security? To learn i the database would need to distinguish NQR

from QR. √

QRX1

QRX2

...QRXi-1

NQRXi

QRXi+1

...QRXw-1

QRXw

• non-triviality? doesn’t hold!

communication: user → database: |B| · |Z*n|database → user: |Z*n|

Call it:(|B|, 1) - PIR

Page 29: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

How to fix it?

Idea:Given:

construct

Suppose that |B| = v2 and present B as a v×v-matrix:

B13 B14 B15 B16B9 B10 B11 B12B5 B6 B7 B8B1 B2 B3 B4

consider each row as a

separate database

Page 30: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

An improved idea

B13 B14 B15 B16

B9 B10 B11 B12

B5 B6 B7 B8

B1 B2 B3 B4

v

v

Let j be the column where Bi is.

In every “row” the user asks for the jth element

So, instead of sending v queries the user can send one!

Observe: in this way the user learnsall the elements in the jth column!

Bi

j↓

execute v(v,1) - PIRsin parallel

Looks even worse:communication:

user → database: v2 · |Z*n|database → user: v · |Z*n|

The method

Page 31: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Putting things together

B1 ... Bj-1 Bj Bj+1 ... Bv

Bi

... ... Bvv

i

QRX1

...QRXj-1

NQRXj

QRXj+1

...QRXv X1 ... Xj-1 Xj Xj+1 ... Xv

X1 ... Xj-1 Xj Xj+1 ... Xv

Y1 ... Yj-1 Yj Yj+1 ... Yv

... Yvv

M1

...

Mv

multiplyelements

in each row

kth row

M1

Mk

Mv

Bj=0 iff Mk is QR

for every j = 1,...,v set

Yj = Xj2 if Bj = 0

Xj otherwise{

jth column

here the same row is copied v times:

only thiscounts

Page 32: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

So we are done!PIR from the previous slide:• correctness √• non-triviality:

communication complexity = 2√|B| · |Zn| √• security? The to learn i the database would need to

distinguish NQR from QR.Formally:

fromany adversary that breaks our scheme

we can construct an algorithm that breaks QRA

Page 33: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Improvements

user U database D

(X1,…,Xv)

(M1,…,Mv)

the user is interestedjust in one Mi.

Idea: apply PIR recursively!

Page 34: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Plan

1. Motivation and definition2. Information-theoretic impossibility3. A construction of Kushilevitz and

Ostrovsky4. Overview of some other related

results

Page 35: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Complexity of PIRs – overview of the resultsCommunication:

• “recursive” PIR of [KO97]:for every c: O(|B|c)

• [Cachin, Micali, Stadler, 1999]:poly-logarithmic in |B|

• [Lipmaa, 2005]:O(log2|B|)

For practical analysis see:• [Sion, Carbunar]

On the Computational Practicality of Private Information Retrieval.

their conclusion:It is the time-complexity that matters.

In real-life:it is still more practical

to transmit the entire database.

Page 36: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

Extensions

• Symmetric PIR (also protect privacy of the database).[Gertner, Ishai, Kushilevitz, Malkin. 1998]

• Searching by key-words[Chor, Gilboa, Naor, 1997]

• Public-key encryption with key-word search[Boneh, Di Crescenzo, Ostrovsky, Persiano]

Page 37: Lecture 15 Private Information Retrieval Stefan Dziembowski  MIM UW 25.01.13ver 1.0

©2013 by Stefan Dziembowski. Permission to make digital or hard copies of part or all of this material is currently granted without fee provided that copies are made only for personal or classroom use, are not distributed for profit or commercial advantage, and that new copies bear this notice and the full citation.