
Private Information Retrieval

Yuval Ishai
Computer Science Department
Technion

Talk Overview

• Intro to PIR
  – Motivation and problem definition
  – Toy examples
  – State of the art

• Relation with other primitives
  – Locally Decodable Codes
  – (Oblivious Transfer, CRHF)

• Constructions

• Open problems

Private Information Retrieval (PIR) [CGKS95]

• Goal: allow a user to access a database while hiding what she is after.

• Motivation: patent databases, web searches, etc.

• Paradox(?): imagine buying in a store without the seller knowing what you buy.

• Note: encrypting requests is useful against third parties, but not against the server holding the data.

Modeling

[Figure: the server holds a database x ∈ {0,1}^n; the user holds an index i ∈ {1,…,n}. The user learns x_i while the server learns nothing about i.]

Some “solutions”

1. User downloads the entire database.
   Drawback: n communication bits (vs. log n + 1 without privacy).
   Main research goal: minimize communication complexity.

2. User masks i with additional random indices.
   Drawback: gives a lot of information about i.

3. Enable anonymous access to the database.
   Addresses a different concern: hides the identity of the user, not the fact that x_i is retrieved.

Fact: PIR as described so far requires Ω(n) communication bits.

Two Approaches

• Information-Theoretic PIR [CGKS95, Amb97, ...]
  Replicate the database among k servers.
  Unconditional privacy against any t colluding servers (default: t=1).

• Computational PIR [KO97, CMS99, ...]
  Computational privacy, based on cryptographic assumptions.

Model for I.T. PIR

[Figure: the database X is replicated among servers S1,…,Sk; the user with input i interacts with all of them and learns x_i, while each server learns nothing about i.]

Information-Theoretic PIR for Dummies

[Figure: the database is viewed as an n^{1/2} × n^{1/2} bit matrix X; the index i corresponds to a cell of X. The user picks random q1, q2 ∈ {0,1}^{n^{1/2}} with q1 + q2 = e_i (the unit vector selecting the column of i) and sends q_j to S_j. Each server replies with a_j = X·q_j; since a1 + a2 = X·e_i is the column containing x_i, the user reads off x_i. Example database bits: 0110 1110 1100 0001.]

2-server PIR with O(n^{1/2}) communication.
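The following is a minimal sketch of this toy protocol (my own Python code, not part of the talk; all function and variable names are mine). The 4×4 database matches the example bits in the figure; each server sees a single uniformly random query vector, so it learns nothing about the target cell.

```python
# Toy 2-server information-theoretic PIR with O(sqrt(n)) communication.
# The database is an s-by-s bit matrix X (n = s*s); the target bit sits at (row, col).
import secrets

def pir_2server(X, row, col):
    s = len(X)
    q1 = [secrets.randbelow(2) for _ in range(s)]   # uniformly random query to S1
    q2 = q1.copy()
    q2[col] ^= 1                                    # q1 XOR q2 = e_col, so q2 is also uniform

    def answer(q):
        # Server side: reply with the GF(2) matrix-vector product X*q (s bits).
        return [sum(X[r][c] & q[c] for c in range(s)) % 2 for r in range(s)]

    a1, a2 = answer(q1), answer(q2)
    # a1 XOR a2 = X * e_col is the col-th column of X; read off the desired row.
    return a1[row] ^ a2[row]

# The 4x4 example database from the figure above.
X = [[0, 1, 1, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
assert all(pir_2server(X, r, c) == X[r][c] for r in range(4) for c in range(4))
```

Each server sends and receives √n bits, for O(√n) total communication.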

Computational PIR for Dummies

Tool: homomorphic encryption: E(a) ⊞ E(b) = E(a + b).

Protocol (X is viewed as an n^{1/2} × n^{1/2} matrix, here 4 × 4):
• User sends E(e_i), e.g. (c1, c2, c3, c4) = (E(0), E(0), E(1), E(0)) for the third column.
• Server replies with E(X·e_i): for each row it homomorphically combines the ciphertexts of the columns holding a 1 (here c2⊞c3, c1⊞c2⊞c3, c1⊞c2, c4).
• User decrypts and recovers the i-th column of X.

PIR with ~O(n^{1/2}) communication.
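Below is a minimal sketch of this single-server protocol (my own toy code, with deliberately tiny and insecure parameters). The slide leaves the homomorphic scheme abstract; here I assume a Goldwasser-Micali-style instantiation, which is homomorphic over GF(2) (multiplying ciphertexts XORs the plaintexts) and therefore suffices for a bit matrix. All function names and parameters are mine.

```python
# Toy single-server computational PIR from an XOR-homomorphic scheme
# (Goldwasser-Micali style; the primes below are far too small to be secure).
import secrets

p, q = 499, 547
N = p * q

def is_qr(a, prime):
    return pow(a, (prime - 1) // 2, prime) == 1

# y: a non-residue mod both p and q (so ciphertexts of 0 and 1 look alike).
y = next(a for a in range(2, N) if not is_qr(a, p) and not is_qr(a, q))

def encrypt(bit):
    r = 0
    while r == 0 or r % p == 0 or r % q == 0:
        r = secrets.randbelow(N)
    return (r * r * pow(y, bit, N)) % N            # E(b) = r^2 * y^b mod N

def decrypt(c):
    return 0 if is_qr(c % p, p) else 1             # quadratic residue mod p <=> plaintext 0

def cpir_query(col, s):
    """User: encrypt the unit vector e_col, one ciphertext per column."""
    return [encrypt(1 if j == col else 0) for j in range(s)]

def cpir_answer(X, cts):
    """Server: for each row, multiply the ciphertexts of the columns holding a 1;
    by the XOR homomorphism this encrypts <row, e_col> = that row's bit in column col."""
    out = []
    for row in X:
        acc = encrypt(0)
        for bit, c in zip(row, cts):
            if bit:
                acc = acc * c % N
        out.append(acc)
    return out

# Usage: ~sqrt(n) ciphertexts in each direction.
X = [[0, 1, 1, 0], [1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
row, col = 2, 1
cts = cpir_query(col, len(X))
column = [decrypt(c) for c in cpir_answer(X, cts)]  # the user decrypts the col-th column
assert column[row] == X[row][col]
```

Communication is roughly √n ciphertexts in each direction; [KO97] improves this to O(n^ε) by recursing on the same idea.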

Bounds for Computational PIR

          servers   communication   assumption
[CG97]    2         O(n^ε)          one-way function
[KO97]    1         O(n^ε)          QRA / homomorphic encryption
[CMS99]   1         polylog(n)      Φ-hiding
…         1         polylog(n)      DCRA [Lipmaa]
[KO00]    1         n - o(n)        trapdoor permutation

Bounds for I.T. PIR

Upper bounds:
– O(log n / loglog n) servers, polylog(n) communication [BF90, BFKR91, CGKS95]
– 2 servers, O(n^{1/3}); k servers, O(n^{1/k}) [CGKS95]
– k servers, O(n^{1/(2k-1)}) [Amb97, Ito99, IK99, BI01, WY05]
  • t-private, O(n^{t/(2k-1)}) [BI01, WY05]
– k servers, O(n^{c loglog k / (k log k)}) [BIKR02]

Lower bounds:
– log n + 1 (no privacy)
– 2 servers, ~5 log n; k servers, c_k log n [Man98, WdW04]
– Better bounds for restricted 2-server protocols [CGKS95, GKST02, BFG02, KdW03, WdW04]

[Figure: a schematic comparison of the known protocols (CGKS, Amb, IK/BI, WY, BIKR) along two informal axes: efficient vs. inefficient, and "clean" vs. "dirty".]

Why Information-Theoretic PIR?

Cons:
• Requires multiple servers
• Privacy against limited collusions only
• Worse asymptotic complexity (with constant k): O(n^c) vs. polylog(n)

Pros:
• Neat question
• Unconditional privacy
• Better "real-life" efficiency
• Allows very short queries or very short answers (+ applications [DIO98, BIM99])
• Closely related to a natural coding problem [KT00]

Locally Decodable Codes [KT00]

A code C: {0,1}^n → Σ^m with two requirements:
• High fault-tolerance: recover x_i from y even after δm of the entries of y have been corrupted …
• Local decoding: … by reading only k entries of y, with success probability ≥ ½ + ε.

[Figure: x ∈ {0,1}^n is encoded into y ∈ Σ^m; the decoder probes a few positions of y to recover x_i.]

Question: how large should m(n) be in a k-query LDC?
• k=2: 2^{Θ(n)}
• k=3: upper bound 2^{O(n^{1/2})}, lower bound Ω(n^2)

From I.T. PIR to LDC

k-server PIR with α-bit queries and β-bit answers
⇒ k-query LDC of length 2^α over Σ = {0,1}^β,
   defined by y[q] = Answer(x, q).

• Converse relation also holds.
• Binary LDC ⇔ PIR with one answer bit per server.
• Best known LDCs are obtained from PIR protocols.
  – constant q: m = exp(n^{c loglog q / (q log q)})
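To make the transformation concrete, here is a toy sketch (my own code, all names mine) that applies it to the 2-server O(√n) protocol from earlier: queries are α = √n-bit strings, so the codeword has 2^α positions, each storing a β = √n-bit answer, and x_i is decoded by probing the two positions a PIR execution would query.

```python
# Toy PIR -> LDC transformation, applied to the 2-server O(sqrt(n)) protocol:
# queries are alpha = sqrt(n)-bit strings, so the code has 2^alpha positions,
# and position q stores the PIR answer y[q] = Answer(x, q) = X*q (beta = sqrt(n) bits).
import random

S = 4  # X is S x S, n = S*S = 16; codeword length 2^S = 16

def answer(X, q):
    return tuple(sum(X[r][c] & q[c] for c in range(S)) % 2 for r in range(S))

def encode(X):
    queries = [tuple((mask >> c) & 1 for c in range(S)) for mask in range(2 ** S)]
    return {q: answer(X, q) for q in queries}

def decode(y, row, col):
    # 2-query local decoding: probe two positions whose XOR is e_col;
    # each probe is individually uniform, just like a PIR query.
    q1 = tuple(random.randrange(2) for _ in range(S))
    q2 = tuple(b ^ (1 if c == col else 0) for c, b in enumerate(q1))
    return y[q1][row] ^ y[q2][row]

random.seed(0)
X = [[0, 1, 1, 0], [1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
y = encode(X)
for q in random.sample(list(y), 1):        # corrupt 1 of the 16 positions (delta = 1/16)
    y[q] = tuple(b ^ 1 for b in y[q])
ok = sum(decode(y, 2, 1) == X[2][1] for _ in range(1000))
print(f"correct decodings: {ok}/1000")     # error probability at most 2*delta per decoding
```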

A Question about MPC

Ben-Or, Goldwasser, Wigderson, 1988; Chaum, Crépeau, Damgård, 1988:
k ≥ 3 players can compute any function f of their inputs with total work = poly(circuit-size).
Information-theoretic MPC is feasible!

Can this be done using a constant number of rounds?

Open question (Beaver, Micali, Rogaway, 1990; Beaver, Feigenbaum, Kilian, Rogaway, 1990):
Can k computationally unbounded players compute an arbitrary f with communication = poly(input-length)?
… or with work = poly(formula-size) and constant rounds [BB89, …]?

Question Reformulated

Is the communication complexity of MPC strongly correlated with the computational complexity of the function being computed?

[Figure: the set of all functions vs. the subset of efficiently computable functions; which of them admit communication-efficient MPC and which do not?]

Connecting MPC and LDC

[Figure: a 1990–2000 timeline relating the MPC question, I.T. PIR, and LDC, with connections established in [KT00] and [IK04].]

• The three problems are "essentially equivalent"
  – up to considerable deterioration of parameters

cPIR and the Crypto World

[Figure: known implications relating cPIR to other cryptographic primitives: homomorphic encryption, trapdoor permutations*, OT, CRHF*, non-interactive unconditionally hiding commitment, key agreement (KA), and secure computation.]

PIR as a Building Block

• Private storage [OS98]

• Sublinear-communication secure computation
  – 1-out-of-n Oblivious Transfer (SPIR) [GIKM98, NP99, …]

– Keyword search [CGN99,FIPR05]

– Statistical queries [CIKRRW01]

– Approximate distance [FIMNSW01, IW04]

– Communication-preserving secure function evaluation [NN01]

Time Complexity of PIR

Focus so far: communication complexity.

Obstacle: time complexity; the server(s) must spend at least linear time.

Workarounds:
• Preprocessing [BIM00]
• Amortization [BIM00, IKOS04]

[Figure: a table of settings (single user vs. multiple users, adaptive vs. non-adaptive queries), with several entries still open.]

Protocols

• High-level structure of all known protocols:
  – User maps i into a point z ∈ F^m.
  – User secret-shares z between the servers using some t-threshold LSSS over F.
  – Server j responds with a linear function of x determined by its share of z.

• Two types of protocols:

  – Polynomial-based [BF90, BFKR91, CGKS95, …, WY05]
    • LSSS = Shamir
    • Scale well with k, t

  – Replication-based [IK99, BI01, BIKR02]
    • LSSS = CNF
    • Do not scale well with k, t: involve a (k choose t) replication overhead
    • However, dominate the polynomial-based protocols up to a (k choose t) factor [CDI05]
    • Give the best known protocols for constant k

Polynomial-Based Protocols

• Step 1: Arithmetization
  – Fix a degree parameter d (will be determined by k).
    • Goal: communication = O(n^{1/d})
  – User maps i ∈ [n] into a weight-d vector z of length m = O(n^{1/d}):
    • 1 ↦ 11100…0,  2 ↦ 11010…0,  …,  n ↦ 00…0111
  – Servers view x as a degree-d m-variate polynomial
    P(Z_1,…,Z_m) = x_1·Z_1Z_2Z_3 + x_2·Z_1Z_2Z_4 + … + x_n·Z_{m-2}Z_{m-1}Z_m
  – Privately retrieving the i-th bit of x ⇔ privately evaluating P on z.

Basic Protocol: t=1

• Goal: user learns P(z) without revealing z.

• Step 2: Secret sharing of z (t=1):
  – Pick a random "direction" λ ∈ F^m.
  – z_j = z + j·λ goes to S_j.
  [Figure: the shares z+λ, z+2λ, z+3λ, z+4λ lie on a random line through z in F^m.]

• Step 3: S_j responds with P(z_j).
  – User can extrapolate P(z) from P(z_1),…,P(z_k) if k > d:
    • Define the degree-d univariate polynomial Q(W) = P(z + Wλ).
    • Q(0) = P(z) can be extrapolated from the d+1 distinct values Q(j) = P(z_j).

Query length m = O(n^{1/d}), answer length 1.
Using k servers: O(n^{1/(k-1)}) communication.
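Here is a minimal sketch of this basic t=1 protocol (my own toy code; the choices F = GF(257), d = 2, k = d+1 = 3 and all names are mine). Each share z_j is individually uniform over F^m, so a single server learns nothing about i, and the user Lagrange-interpolates Q(0) = P(z) = x_i from the k answers.

```python
# Toy polynomial-based PIR for t=1: degree d=2, k=d+1=3 servers, over F=GF(257).
import itertools
import random
from math import comb

F = 257  # any prime larger than k works

def arithmetize(n, d):
    """Pick m with C(m,d) >= n and fix the list of weight-d index sets (one per database bit)."""
    m = d
    while comb(m, d) < n:
        m += 1
    return m, list(itertools.combinations(range(m), d))[:n]

def eval_P(x, subsets, point):
    """P(Z) = sum_j x_j * prod_{l in subset_j} Z_l, evaluated at `point` mod F."""
    total = 0
    for xj, sub in zip(x, subsets):
        term = xj
        for l in sub:
            term = term * point[l] % F
        total = (total + term) % F
    return total

def user_query(i, m, subsets, k):
    """Map i to the weight-d vector z, pick a random direction lam, send z_j = z + j*lam to S_j.
    Each z_j on its own is uniform over F^m, so a single server learns nothing about i."""
    z = [0] * m
    for l in subsets[i]:
        z[l] = 1
    lam = [random.randrange(F) for _ in range(m)]
    return [[(z[l] + j * lam[l]) % F for l in range(m)] for j in range(1, k + 1)]

def reconstruct(answers, k):
    """Lagrange interpolation at 0: Q(0) = P(z), given Q(j) = P(z_j) for j = 1..k."""
    total = 0
    for j in range(1, k + 1):
        coef = 1
        for jj in range(1, k + 1):
            if jj != j:
                coef = coef * (-jj) % F * pow((j - jj) % F, -1, F) % F
        total = (total + answers[j - 1] * coef) % F
    return total

random.seed(1)
n, d = 20, 2
k = d + 1
x = [random.randrange(2) for _ in range(n)]
m, subsets = arithmetize(n, d)
for i in range(n):
    queries = user_query(i, m, subsets, k)
    answers = [eval_P(x, subsets, qz) for qz in queries]  # one field element per server
    assert reconstruct(answers, k) == x[i]
print("recovered all", n, "bits; query length", m, "field elements per server")
```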

Basic Protocol: General t

• Goal: user learns P(z) without revealing z.

• Step 2: Secret sharing of z (general t):
  – Pick random λ_1,…,λ_t ∈ F^m.
  – z_j = z + jλ_1 + j²λ_2 + … + j^t λ_t goes to S_j.

• Step 3: S_j responds with P(z_j).
  – User can extrapolate P(z) from P(z_1),…,P(z_k) if k > dt:
    • Define the degree-dt univariate polynomial Q(W) = P(z + Wλ_1 + W²λ_2 + … + W^t λ_t).
    • P(z) = Q(0) can be extrapolated from dt+1 distinct values Q(j).

O(n^{1/d}) communication using k = dt+1 servers.
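The sharing in Step 2 is coordinate-wise Shamir sharing of the vector z. The sketch below (my own illustration; the field GF(101) and all names are mine) produces shares z_j = z + jλ_1 + … + j^t λ_t and reconstructs z from any t+1 of them by Lagrange interpolation at 0. In the protocol itself the user never reconstructs z; it interpolates Q(0) = P(z) from the servers' answers in exactly the same way, which is why k = dt+1 servers suffice.

```python
# Coordinate-wise Shamir sharing of a vector z in F^m with threshold t:
# z_j = z + j*lam_1 + j^2*lam_2 + ... + j^t*lam_t (mod F), for servers j = 1..k.
import random

F = 101  # a small prime field

def share(z, t, k):
    m = len(z)
    lam = [[random.randrange(F) for _ in range(m)] for _ in range(t)]  # t random vectors
    return [[(z[l] + sum(pow(j, e + 1, F) * lam[e][l] for e in range(t))) % F
             for l in range(m)]
            for j in range(1, k + 1)]

def reconstruct(shares, js):
    """Recover z by Lagrange interpolation at 0 from the shares of servers js (needs t+1 of them)."""
    m = len(shares[0])
    z = [0] * m
    for share_j, j in zip(shares, js):
        coef = 1
        for jj in js:
            if jj != j:
                coef = coef * (-jj) % F * pow((j - jj) % F, -1, F) % F
        for l in range(m):
            z[l] = (z[l] + coef * share_j[l]) % F
    return z

random.seed(0)
z = [1, 0, 0, 1, 0]                     # e.g. a weight-d index vector
shares = share(z, t=2, k=5)
# Any t = 2 shares are jointly uniform (they reveal nothing about z);
# any t + 1 = 3 shares determine z.
assert reconstruct([shares[0], shares[2], shares[4]], [1, 3, 5]) == z
```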

Improved Variant [WY05]

• Goal: user learns P(z) without revealing z.

• Step 2: Secret sharing of z (general t): z_j = z + jλ_1 + j²λ_2 + … + j^t λ_t.

• Step 3: S_j responds with P(z_j), along with all m partial derivatives of P evaluated at z_j.
  – User can extrapolate P(z) if k > dt/2:
    • Define the degree-dt univariate polynomial Q(W) = P(z + Wλ_1 + W²λ_2 + … + W^t λ_t).
    • P(z) = Q(0) can be extrapolated from the 2k > dt values Q(j), Q′(j).

• Complexity: O(m) communication in each direction.

Same communication using half as many servers!

Breaking the O(n^{1/(2k-1)}) Barrier [BIKR02]

        communication in k-server PIR       length of k-query binary LDC
        previous        new                 previous           new
k = 2   O(n^{1/3})      -                   2^{O(n)}           -
k = 3   O(n^{1/5})      O(n^{1/5.25})       2^{O(n^{1/2})}     -
k = 4   O(n^{1/7})      O(n^{1/7.87})       2^{O(n^{1/3})}     2^{O(n^{3/10})}
k = 5   O(n^{1/9})      O(n^{1/10.83})      2^{O(n^{1/4})}     2^{O(n^{1/5})}
k = 6   O(n^{1/11})     O(n^{1/13.78})      2^{O(n^{1/5})}     2^{O(n^{1/7})}

Arithmetization

• As before, except that now F = GF(2).
  – Fix a degree parameter d (will be determined by k).
    • Goal: communication = O(n^{1/d})
  – User maps i ∈ [n] into a weight-d vector z of length m = O(n^{1/d}):
    • 1 ↦ 11100…0,  2 ↦ 11010…0,  …,  n ↦ 00…0111
  – Servers view x as a degree-d m-variate polynomial
    P(Z_1,…,Z_m) = x_1·Z_1Z_2Z_3 + x_2·Z_1Z_2Z_4 + … + x_n·Z_{m-2}Z_{m-1}Z_m
  – Privately retrieving the i-th bit of x ⇔ privately evaluating P on z.

Effect of Degree Reduction

degree d, m variables, size n  →  degree d/c, m variables, size O(n^{1/c})

Degree Reduction Using Partial Information [BKL95,BI01]

[Figure: two settings. Left: each server S_j holds P, the user holds z, and z is hidden from the servers; the user wants P(z). Right: each server holds Q, and each entry of the point y is known to all but one server; the user wants Q(y).]

Example (k=3, d=6): each variable of a monomial is unknown to exactly one server, so some server misses at most d/k = 2 of its variables. Assigning every monomial to such a server gives

  Q(y) = Q1(y) + Q2(y) + Q3(y),  with deg Q_j ≤ d/k = 2.

Back to PIR

[Figure: the same two settings as before. Left: z is hidden from the servers; n communication bits. Right: each entry of y is known to all but one server; O(n^{1/k}) communication bits.]

The reduction: let z = y_1 + … + y_k, where the y_j are otherwise random, and define Q(Y_1,…,Y_k) = P(Y_1 + … + Y_k).

Initial Protocol

• User picks random y_1,…,y_k s.t. y_1 + … + y_k = z, and sends to S_j all the y's except y_j.   [query length O(m) = O(n^{1/d})]

• Servers define the mk-variate degree-d polynomial Q(Y_1,…,Y_k) = P(Y_1 + … + Y_k).

• Each S_j computes a degree-(d/k) polynomial Q_j such that Q(y) = Q_1(y) + … + Q_k(y); this is possible since S_j is missing at most d/k of the variables.   [answer length O(n^{1/k})]

• S_j sends a description of Q_j to the User.

• User computes Σ_j Q_j(y) = x_i.

A concrete instantiation for k=3, d=2 (single-bit answers) is sketched below.
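Here is a minimal sketch of this protocol for the parameters k=3, d=k-1=2 over GF(2) (my own toy code; all names are mine). With these parameters ⌊d/k⌋ = 0, so each Q_j is a constant and every answer is a single bit, matching the d = k-1 setting on the next slide.

```python
# Toy replication-based 3-server PIR over GF(2) with d = k-1 = 2:
# queries are O(sqrt(n)) bits per server, answers are a single bit each.
import itertools
import random
from math import comb

def arithmetize(n):
    """n = C(m,2): each index i corresponds to a pair of variables (a_i, b_i), m = O(sqrt(n))."""
    m = 2
    while comb(m, 2) < n:
        m += 1
    return m, list(itertools.combinations(range(m), 2))[:n]

def user_query(i, m, pairs):
    """Pick random y1, y2, y3 with y1 + y2 + y3 = z (the weight-2 indicator of pairs[i], over GF(2)).
    Server s receives every y_j except y_s, i.e. two uniformly random m-bit vectors."""
    z = [0] * m
    for l in pairs[i]:
        z[l] = 1
    y = {1: [random.randrange(2) for _ in range(m)],
         2: [random.randrange(2) for _ in range(m)]}
    y[3] = [z[l] ^ y[1][l] ^ y[2][l] for l in range(m)]
    return {s: {j: y[j] for j in (1, 2, 3) if j != s} for s in (1, 2, 3)}

def server_answer(s, x, pairs, shares):
    """Q(Y1,Y2,Y3) = P(Y1+Y2+Y3) expands into terms Yu[a]*Yv[b]; a fixed public rule assigns each
    term to a server outside {u,v}, which knows both factors. With d=2, k=3 every assigned term
    is a fully known constant, so the whole answer is one bit."""
    ans = 0
    for xj, (a, b) in zip(x, pairs):
        if not xj:
            continue
        for u, v in itertools.product((1, 2, 3), repeat=2):
            if min({1, 2, 3} - {u, v}) == s:
                ans ^= shares[u][a] & shares[v][b]
    return ans

random.seed(2)
n = 15
x = [random.randrange(2) for _ in range(n)]
m, pairs = arithmetize(n)
for i in range(n):
    queries = user_query(i, m, pairs)
    bits = [server_answer(s, x, pairs, queries[s]) for s in (1, 2, 3)]
    assert bits[0] ^ bits[1] ^ bits[2] == x[i]     # user XORs the three 1-bit answers
print("3-server PIR with single-bit answers: all", n, "bits recovered")
```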

A Closer Look

Useful parameters (deg Q_j ≤ ⌊d/k⌋):

• d = k-1:  query length O(n^{1/(k-1)}), ⌊d/k⌋ = 0, answer length 1
  (best previous binary PIR)

• d = 2k-1: query length O(n^{1/(2k-1)}), ⌊d/k⌋ = 1, answer length O(n^{1/(2k-1)})
  (best previous PIR)

Boosting the Integer Truncation Effect

• Idea: apply multiple "partial" degree reduction steps.

• Generalized degree reduction: assign each monomial to the set V of k' servers which jointly miss the least number of its variables, so that

    Q(y) = Σ_{|V|=k'} Q_V(y)

  [Figure: example with k=3, d=6, k'=2; the monomials of Q are assigned to pairs of servers such as S1S2, S2S3, S1S3.]

Each step transforms the parameters (replication, degree, size):

              replication   degree   size
  start       k             d        n
  reduction   k'            d'       n'  = O(n^{d'/d})
  reduction   k''           d''      n'' = O(n'^{d''/d'})
  …           …             …        …
  reduction   1             d/k      n^{(d/k)/d}

The missing operator, "conversion": reduce the degree from d to d' while keeping the replication k and the size n, at the price of increasing the number of variables from m to m' = O(m^{d/d'}).

Additional cost: re-distribute the new point y'.

Example: k=3

              replication   degree   #vars         size
  start       3             7        O(n^{1/7})    n
  reduction   2             4        O(n^{1/7})    O(n^{4/7})
  conversion  2             3        O(n^{4/21})   O(n^{4/7})
  reduction   1             1        O(n^{4/21})   O(n^{4/21})

Queries and answers: O(n^{4/21}) communication.

In Search of the Missing Operator

Must have m’=(md/d’).Question: For which d’<d can get m’=O(md/d’)?•Possible when d’|d .•Open otherwise. Positive result Better PIR •Simplest open case: d=3, d’=2, m’=O(m3/2)

d, m d’, m’P(y)=P’(y’)

P P’

y y’

A Workaround

• Can't solve the polynomial conversion problem in general.
• … but it is easy to solve given the promise weight(y) = const.
• Stronger degree reduction: instead of

    Q(y_1,…,y_k) = Σ_{|V|=k'} Q_V(y_1,…,y_k)

  require

    Q(y_1,…,y_k) = Σ_{|V|=k'} Q_V(y_1 + … + y_k).

• Main technical lemma: good parameters for strong degree reduction.

Open Problems

• Better upper bounds
  – Known: O(n^{c loglog k / (k log k)})
  – What is the true limit of our technique?

• Generalize the best upper bound to t > 1

• Tight bounds for polynomial conversion

• Lower bounds
  – Known: c·log n
  – Simplest cases:
    • k = 2
    • k = 3, single answer bit per server