Post on 21-Dec-2015
Private Information Retrieval
Contents
• What is Private Information Retrieval (PIR)?
• Reduction from Private Information Retrieval (PIR) to Smooth Codes
• Constructions (Achieving the $O(n^{1/(2k-1)})$ Barrier)
• Construction (Breaking the $O(n^{1/(2k-1)})$ Barrier)
Private Information Retrieval (PIR)
• Query a public database, without revealing the queried record.
• Example: A broker needs to query the NASDAQ database about a stock, but does not want anyone to learn which stock he is interested in.
PIR
• The single-server case
• Chor et al. showed in their 1995 paper that with a single server (and information-theoretic privacy) it is necessary to send the entire contents of the database.
PIR
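• A k-server, one-round PIR scheme for a database of length n consists of:
– Query functions $Q_1,\dots,Q_k : [n] \times \{0,1\}^{\ell_{rnd}} \to \{0,1\}^{\ell_q}$
– Answer functions $A_1,\dots,A_k : \{0,1\}^{\ell_q} \to \{0,1\}^{\ell_a}$ (each $A_j$ also depends on the database $x$)
– Reconstruction function $R : [n] \times \{0,1\}^{\ell_{rnd}} \times (\{0,1\}^{\ell_a})^k \to \{0,1\}$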
PIR – definition
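• These functions should satisfy:
– Correctness: for every $x \in \{0,1\}^n$, $i \in [n]$ and $r \in \{0,1\}^{\ell_{rnd}}$: $R(i, r, A_1(Q_1(i,r)), \dots, A_k(Q_k(i,r))) = x_i$
– Privacy: for every $i, j \in [n]$, $s \in [k]$ and $q \in \{0,1\}^{\ell_q}$: $\Pr_r[Q_s(i,r) = q] = \Pr_r[Q_s(j,r) = q]$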
Simple Construction of PIR
• 2 servers, one round
• Each server holds the bits x1,…,xn.
• To request bit i, choose a uniformly random subset A of [n].
• Send A to the first server.
• Send the second server A+{i} (add i to A if it is not there, remove it if it is there).
• Each server returns the XOR of the bits at the indices in its request.
• XOR the two answers to obtain xi.
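The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `query`, `answer` and `retrieve` are made up for this sketch, indices are 0-based, and both "servers" are simulated in one process.

```python
import secrets

def query(n, i):
    """Return the index sets sent to the first and second server for bit i."""
    a = {j for j in range(n) if secrets.randbelow(2)}  # uniformly random subset of range(n)
    b = a ^ {i}  # symmetric difference: add i if absent, remove it if present
    return a, b

def answer(x, subset):
    """A server's reply: the XOR of the database bits at the requested indices."""
    bit = 0
    for j in subset:
        bit ^= x[j]
    return bit

def retrieve(x, i):
    """Simulate both servers (each holding x) and XOR their one-bit answers."""
    a, b = query(len(x), i)
    return answer(x, a) ^ answer(x, b)
```

Every index other than i appears in both requests or in neither, so its bit cancels in the final XOR. Each request on its own is a uniformly random subset, so neither server alone learns anything about i.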
Smoothly Decodable Code
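A code $C$ mapping $\{0,1\}^n$ to codewords of length $m$ is a $(q,c,\varepsilon)$ smoothly decodable code if there exists a probabilistic algorithm A such that:
For all $x \in \{0,1\}^n$ and $i \in \{1,\dots,n\}$: $\Pr[A(C(x),i) = x_i] > \tfrac{1}{2} + \varepsilon$
A reads at most q indices of the codeword y = C(x) (of its choice)
The probability is over the coin tosses of A
Queries are not allowed to be adaptive
A has access to a non-corrupted codeword
For all $i \in \{1,\dots,n\}$ and $j \in \{1,\dots,m\}$: $\Pr[A(\cdot,i) \text{ reads } j] \le c/m$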
LDC is Smooth
• Claim: Every (q,δ,ε) LDC is a (q,q/δ,ε) smooth code.
• Intuition: if the code is resilient against a linear number of errors, then no position of the codeword can be queried too often (or else the adversary would corrupt it).
Smooth Code is LDC
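• With no errors, a bit can be reconstructed using q uniformly distributed queries, with advantage ε.
• If at most a δ fraction of the codeword is corrupted, each query hits a corrupted index with probability at most δ, so by a union bound all q queries are to non-corrupted indices with probability at least 1-qδ.
Remember: the adversary does not know the decoding procedure's random coins.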
Reduction from PIR to SDC [Gol,Ka,Sch,Tr 02]
• A codeword is the concatenation of all possible answers from the servers.
• A query procedure consists of k queries to the codeword, corresponding to the answers of the k servers on the requested bit (for queries generated as in the PIR).
• From the PIR privacy property it follows that the distribution of each query to the indices of the codeword is independent of the requested bit.
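As an illustration of this reduction, the sketch below builds the codeword for the simple 2-server XOR scheme on a tiny database. The layout (server 0's answers to all 2^n subsets, followed by server 1's) and the names `encode` and `decode` are assumptions made for this example only.

```python
import secrets

def encode(x):
    """Concatenate every possible answer of both servers in the 2-server XOR PIR.

    Position s * 2**n + mask holds server s's answer to the subset encoded by
    the bitmask `mask`. Both servers use the same answer function, so the two
    halves coincide.
    """
    n = len(x)
    table = []
    for mask in range(2 ** n):
        bit = 0
        for j in range(n):
            if (mask >> j) & 1:
                bit ^= x[j]
        table.append(bit)
    return table + table

def decode(codeword, n, i):
    """Read k = 2 positions, chosen exactly as the PIR user would choose its queries."""
    mask = secrets.randbelow(2 ** n)                 # random subset A
    first = codeword[mask]                           # server 0's answer to A
    second = codeword[2 ** n + (mask ^ (1 << i))]    # server 1's answer to A + {i}
    return first ^ second
```

Each of the two positions read is uniformly distributed within its half of the codeword regardless of i, which is the smoothness property the reduction relies on.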
Reduction from PIR to SDC
• Let a be the length of an answer from a server, k the number of servers and q the length of a query.
• Let $l = a k 2^q$ be the length of a codeword.
• Let $P_j$ be the probability of querying bit j. Note that $\sum_j P_j \le k$.
• Set $N_j = \lceil l P_j \rceil$ and duplicate bit j $N_j$ times. When querying for bit j, choose one of the $N_j$ copies at random.
Reduction from PIR to SDC
• The probability of accessing each bit is now at most 1/l.
• The new length of the encoding is less than (k+1)l.
• We obtain a (ka,k+1,1/2) smooth code, and hence an LDC.
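The duplication step can be checked on a toy example. The sketch below assumes the number of copies is N_j = ceil(l * P_j); the probabilities are invented for illustration (they sum to k = 2) and do not come from any particular PIR scheme.

```python
from fractions import Fraction
from math import ceil

def duplicate_counts(probs, l):
    """Number of copies N_j = ceil(l * P_j) for each codeword position j."""
    return [ceil(l * p) for p in probs]

def per_copy_probabilities(probs, counts):
    """Probability of reading one specific copy, when a copy is chosen uniformly."""
    return [p / c if c else Fraction(0) for p, c in zip(probs, counts)]

# Hypothetical query distribution over l = 8 positions with k = 2 queries.
k, l = 2, 8
probs = [Fraction(3, 5), Fraction(2, 5), Fraction(1, 3), Fraction(1, 3),
         Fraction(1, 3), Fraction(0), Fraction(0), Fraction(0)]
counts = duplicate_counts(probs, l)
new_length = sum(counts)
worst = max(per_copy_probabilities(probs, counts))
```

Positions that are never read receive no copies in this sketch. The new length stays below (k+1)l because each ceiling adds less than one copy per position.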
Achieving the $O(n^{1/(2k-1)})$ Barrier
• Ingredients:
– x – the database string
– $E : [n] \to \{0,1\}^m$
– $P_x(Z_1,\dots,Z_m)$ – a polynomial of degree d in $m = O(n^{1/d})$ variables s.t. $P_x(E(i)) = x_i$
– $Y_1,\dots,Y_k \in \{0,1\}^m$ s.t. $\sum_{j=1}^{k} Y_j = E(i)$
Achieving the $O(n^{1/(2k-1)})$ Barrier
• The user generates the $Y_j$ and sends all $Y_q$, $q \ne j$, to server j.
• We can view $P_x$ as a polynomial in the km variables $Y_{j,l}$, where $\sum_{j} Y_{j,l} = Z_l$.
• Each server knows the values of (k-1)m variables.
• Let d=k-1; hence each monomial of $P_x$ has at most k-1 different variables.
Achieving the $O(n^{1/(2k-1)})$ Barrier
• Each variable is known to k-1 servers (all but one), and a monomial has at most k-1 variables, so at most k-1 servers miss one of them; hence there exists a server who knows the values of all the variables in the monomial.
• Assign each monomial to one of the servers who know all its variables.
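Achieving the $O(n^{1/(2k-1)})$ Barrier
• Each server computes the XOR of the monomials assigned to it and sends the result to the user.
• The user computes the XOR of all the answers. Writing $M_j(Y)$ for the answer of server j:
$\sum_{j=1}^{k} M_j(Y) = P_x\big(\sum_{j=1}^{k} Y_j\big) = P_x(Z) = P_x(E(i)) = x_i$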
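The d = k-1 construction can be sketched end to end for small parameters. The sketch makes several assumptions that the slides leave open at this point: E(i) is taken to be the i-th weight-d subset of the m coordinates, $P_x$ is the sum over i of $x_i$ times the product of the $Z_l$ with l in that subset, arithmetic is over GF(2), and each expanded monomial is assigned to the smallest-numbered server that knows all its variables. All function names are hypothetical.

```python
import itertools
import secrets

def weight_d_supports(n, m, d):
    """Assumed encoding E: index i maps to the i-th d-subset of range(m)."""
    supports = list(itertools.islice(itertools.combinations(range(m), d), n))
    if len(supports) < n:
        raise ValueError("need C(m, d) >= n")
    return supports

def share(point, k):
    """Split a 0/1 vector into k random vectors whose XOR equals it."""
    shares = [[secrets.randbelow(2) for _ in point] for _ in range(k - 1)]
    last = list(point)
    for s in shares:
        last = [a ^ b for a, b in zip(last, s)]
    return shares + [last]

def server_answer(j, x, supports, known, k):
    """XOR of the expanded monomials assigned to server j (it never reads Y_j)."""
    acc = 0
    for xi, support in zip(x, supports):
        if not xi:
            continue
        # Expanding prod_l (sum_q Y[q][l]) gives one monomial per choice of owners.
        for owners in itertools.product(range(k), repeat=len(support)):
            # At most k-1 distinct owners, so some server is absent; pick the smallest.
            if min(set(range(k)) - set(owners)) != j:
                continue
            term = 1
            for q, l in zip(owners, support):
                term &= known[q][l]
            acc ^= term
    return acc

def retrieve(x, i, k, m):
    supports = weight_d_supports(len(x), m, k - 1)
    point = [1 if l in supports[i] else 0 for l in range(m)]
    Y = share(point, k)
    result = 0
    for j in range(k):
        known = {q: Y[q] for q in range(k) if q != j}  # server j's view
        result ^= server_answer(j, x, supports, known, k)
    return result
```

For example, `retrieve([1, 0, 1, 1, 0, 1], 2, k=3, m=4)` uses degree d = 2 and the six 2-subsets of four coordinates. Each server returns a single bit, and the servers' running time here is exponential in d, which is fine only because d is a small constant.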
Achieving the $O(n^{1/(2k-1)})$ Barrier
• Security: each server receives k-1 vectors, which are independent, uniformly random strings of length m, so its view does not depend on i.
• Communication complexity: each server receives k-1 vectors, each of length $m = O(n^{1/d}) = O(n^{1/(k-1)})$ by the choice of m and d.
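To get a feel for how m scales, the sketch below computes the smallest m for which there are at least n strings of length m and weight d. It assumes the weight-d encoding that the slides introduce later; the helper name `smallest_m` is made up for this example.

```python
from math import comb

def smallest_m(n, d):
    """Smallest m with C(m, d) >= n, i.e. enough weight-d strings to encode n indices."""
    m = d
    while comb(m, d) < n:
        m += 1
    return m

# With n = 10**6 records and d = 2 (so k = 3 servers), each of the k - 1 vectors
# a server receives has this many bits:
bits_per_vector = smallest_m(10 ** 6, 2)
```

For fixed d the result grows like $n^{1/d}$, up to a constant factor that depends on d.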
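Achieving the $O(n^{1/(2k-1)})$ Barrier
• Now take d=2k-1.
• Each monomial has a server who misses at most 1 of its variables (otherwise the k servers would miss at least 2k > d variables in total); assign the monomial to that server.
• Each server sends the 1-bit coefficients of the polynomial which is the sum of all monomials assigned to it, viewed as a polynomial of degree at most 1 in the variables that server does not know.
• The user evaluates these polynomials on the variables Y.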
Achieving the $O(n^{1/(2k-1)})$ Barrier
• The query complexity is the same, $O(n^{1/d})$.
• The answer complexity is $k^2 m = O(n^{1/d})$.
• Total complexity: $O(n^{1/d}) = O(n^{1/(2k-1)})$ by the choice of d.
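A small sketch of the d = 2k-1 variant follows. It makes the same assumptions as a direct GF(2) implementation would need: E(i) is the i-th weight-d subset of the m coordinates, $P_x$ sums $x_i$ times the product of the corresponding $Z_l$, and each expanded monomial goes to the smallest-numbered server that misses at most one of its variables. The names are hypothetical.

```python
import itertools
import secrets

def share(point, k):
    """Split a 0/1 vector into k random vectors whose XOR equals it."""
    shares = [[secrets.randbelow(2) for _ in point] for _ in range(k - 1)]
    last = list(point)
    for s in shares:
        last = [a ^ b for a, b in zip(last, s)]
    return shares + [last]

def server_polynomial(j, x, supports, known, k, m):
    """Coefficients (constant, linear) of server j's polynomial in its unknowns Y_j."""
    const, coeffs = 0, [0] * m
    for xi, support in zip(x, supports):
        if not xi:
            continue
        for owners in itertools.product(range(k), repeat=len(support)):
            # With 2k-1 variables, some server owns at most one of them (pigeonhole).
            if min(s for s in range(k) if owners.count(s) <= 1) != j:
                continue
            term, missing = 1, None
            for q, l in zip(owners, support):
                if q == j:
                    missing = l          # the single variable server j does not know
                else:
                    term &= known[q][l]
            if missing is None:
                const ^= term
            else:
                coeffs[missing] ^= term
    return const, coeffs

def retrieve(x, i, k, m):
    d = 2 * k - 1
    supports = list(itertools.islice(itertools.combinations(range(m), d), len(x)))
    if len(supports) < len(x):
        raise ValueError("need C(m, d) >= n")
    point = [1 if l in supports[i] else 0 for l in range(m)]
    Y = share(point, k)
    result = 0
    for j in range(k):
        known = {q: Y[q] for q in range(k) if q != j}
        const, coeffs = server_polynomial(j, x, supports, known, k, m)
        value = const
        for l in range(m):               # the user knows Y_j and evaluates locally
            value ^= coeffs[l] & Y[j][l]
        result ^= value
    return result
```

With k = 2 this is a two-server scheme of degree 3: each server receives one vector of m bits and returns m + 1 coefficient bits in this sketch.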
Breaking the $O(n^{1/(2k-1)})$ Barrier
• The first idea that comes to mind is to increase the degree d even further.
• Unfortunately this does not work, because the polynomials the servers return become too large.
• The novelty of the paper is in how it gets around this difficulty.
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Assume that each polynomial is known not to one server but to a group of servers.
• Now we do not need to receive the polynomials themselves, but can use a PIR scheme (on those servers) to evaluate them on the required input.
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Suppose that we could write $P_x$ as a sum of polynomials $P_V$, where V ranges over subsets of the servers. The problem of evaluating $P_x$ then reduces to evaluating each $P_V$, which (we hope) is of lower degree.
• On the other hand, the number of servers in V is also smaller, which is a disadvantage.
• The paper shows how to find such $P_V$ with good properties.
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Define k' to be a lower bound on the size of the sets V, and λ to be the maximum number of variables a server in V misses in $P_V$.
• Altogether, V misses at most λ|V| variables in $P_V$.
Breaking the $O(n^{1/(2k-1)})$ Barrier
• We will choose an encoding E such that the Hamming weight of E(i) is bounded by d; consequently the number of monomials is bounded by $2^d$.
• If we had $P_V$ as specified, then we could apply the PIR recursively on all sets of size at least k', with communication complexity:
$C_{P'}(n,k) = O(k\,n^{1/d}) + \sum_{l=k'}^{k} \binom{k}{l}\, C_P(n^{\lambda l/d}, l)$
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Let E be an encoding to strings of length m and weight d.
• We can encode $\binom{m}{d}$ different values, thus $m = O(n^{1/d})$ is sufficient to encode n values.
• Define $P_x(Z_1,\dots,Z_m) = \sum_{i=1}^{n} x_i \prod_{l : E(i)_l = 1} Z_l$; it holds that $P_x(E(i)) = x_i$.
• Define V(M) to be all servers who miss at most λ variables in M.
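The encoding and the polynomial can be checked directly on a small database. In this sketch the order in which weight-d strings are assigned to indices (lexicographic order of supports) is an arbitrary choice, the sum is taken over GF(2), and the function names are invented for the example.

```python
from itertools import combinations, islice

def encoding(n, m, d):
    """E(i) for i in range(n): distinct 0/1 vectors of length m and weight d."""
    supports = list(islice(combinations(range(m), d), n))
    if len(supports) < n:
        raise ValueError("need C(m, d) >= n")
    return [[1 if l in s else 0 for l in range(m)] for s in supports]

def evaluate_P(x, E, z):
    """P_x(z): XOR over i of x_i times the product of z_l over the support of E(i)."""
    total = 0
    for xi, e in zip(x, E):
        term = xi
        for el, zl in zip(e, z):
            if el:
                term &= zl
        total ^= term
    return total
```

At the point z = E(i) only the i-th product survives: any other codeword has the same weight but a different support, so it contains a coordinate where E(i) is 0.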
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Lemma: for λ, $k' \le k$ and $d \le (\lambda+1)k - (\lambda-1)k' + (\lambda-2)$, and M a monomial of degree d in the variables $Y_{j,h}$, either there is a server who misses at most one variable of M or $|V(M)| \ge k'$.
• Proof: Counting argument.
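The counting argument can be sanity-checked by brute force for small parameters. The sketch below assumes the bound reads d ≤ (λ+1)k − (λ−1)k' + (λ−2), with V(M) the set of servers missing at most λ variables of M. Only the number of variables each server misses matters, so it enumerates multisets of owners. The names `lemma_holds` and `bound` are made up.

```python
from itertools import combinations_with_replacement

def bound(k, lam, k_prime):
    """Largest degree covered by the lemma, as reconstructed above."""
    return (lam + 1) * k - (lam - 1) * k_prime + (lam - 2)

def lemma_holds(k, lam, k_prime, d):
    """Check every degree-d monomial: some server misses <= 1 variable, or |V(M)| >= k'."""
    for owners in combinations_with_replacement(range(k), d):
        missed = [owners.count(s) for s in range(k)]    # variables each server misses
        v_size = sum(1 for c in missed if c <= lam)     # |V(M)|
        if min(missed) > 1 and v_size < k_prime:
            return False
    return True
```

If every server missed at least 2 variables and fewer than k' servers missed at most λ, the degree would be at least (λ+1)(k−k'+1) + 2(k'−1), which exceeds the bound by one. The case k = 3, λ = 2, k' = 2, d = 8 with miss counts (3, 3, 2) shows that the bound cannot be raised.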
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Claim: Let k, λ, k' be as before. Then there are polynomials $P_V$, $P_j$ for every $V \subseteq [k]$ s.t. $|V| \ge k'$ and every $j \in [k]$ s.t.
– $P_V$ is of degree λ|V| and can be computed from $P_x$ and $\{Y_j\}_{j \notin V}$
– $P_j$ is of degree 1 and can be computed from $P_x$ and $\{Y_h\}_{h \ne j}$
– $P_x(z) = \sum_{V \subseteq [k],\, |V| \ge k'} P_V(z) + \sum_{j \in [k]} P_j(z)$
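Breaking the $O(n^{1/(2k-1)})$ Barrier
• Proof: It is sufficient to prove the claim for P consisting of a single monomial; the general case follows by summing over all monomials.
• For a monomial $M = Y_{j_1,1} \cdots Y_{j_d,d}$ denote
$T(M) = \prod_{q : j_q \notin V(M)} Y_{j_q,q} \cdot \prod_{q : j_q \in V(M)} \big( Z_q - \sum_{j \notin V(M)} Y_{j,q} \big)$
• Define $\mu(M)$ to be the number of variables $Y_{j_q,q}$ in M for which $j_q \in V(M)$.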
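Breaking the $O(n^{1/(2k-1)})$ Barrier
• WLOG take $P_x(Z) = Z_1 Z_2 \cdots Z_d$.
• Define $Q(Y) = P_x\big(\sum_{j=1}^{k} Y_{j,1}, \dots, \sum_{j=1}^{k} Y_{j,m}\big)$, a polynomial in the mk variables $Y_{j,h}$, $j \in [k]$, $h \in [m]$.
• Q has $k^d$ monomials, each of the form $M = Y_{j_1,1} Y_{j_2,2} \cdots Y_{j_d,d}$.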
Breaking the $O(n^{1/(2k-1)})$ Barrier
1. Set Q'=Q; for all V, set $P_V$=0.
2. Find V=V(M) for some monomial M in Q' s.t. V is of maximal size; if |V|<k', stop.
3. While there is M' in Q' s.t. V(M')=V:
• Pick M' from Q' which maximizes $\mu(M')$
• $P_V$=$P_V$+T(M'), Q'=Q'-T(M')
4. Go to 2.
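The procedure can be simulated for the single monomial $Z_1 \cdots Z_d$. The sketch below rests on assumptions: a monomial is a tuple $(j_1,\dots,j_d)$ of owners, arithmetic is over GF(2) so that adding and subtracting T(M') are both a symmetric difference of monomial sets, and T(M) is taken to be all monomials that agree with M wherever $j_q \notin V(M)$ and use any owner from V(M) elsewhere. That form of T is a reconstruction, not something the transcript states legibly. The name `decompose` is made up.

```python
from functools import lru_cache
from itertools import product

def decompose(k, lam, k_prime, d):
    """Return ({V: monomials of P_V}, leftover monomials Q') for Q = all owner tuples."""
    @lru_cache(maxsize=None)
    def V(M):
        # Servers that miss at most lam variables of M.
        return frozenset(s for s in range(k) if M.count(s) <= lam)

    def mu(M):
        # Number of variables of M owned by servers in V(M).
        return sum(1 for j in M if j in V(M))

    def T(M):
        # Assumed form of T(M): keep owners outside V(M), range over V(M) elsewhere.
        inside = sorted(V(M))
        return set(product(*[inside if j in V(M) else [j] for j in M]))

    remaining = set(product(range(k), repeat=d))        # step 1: Q' = Q
    parts = {}                                          # step 1: all P_V = 0
    while remaining:
        target = V(max(remaining, key=lambda M: len(V(M))))   # step 2
        if len(target) < k_prime:
            break
        while True:                                     # step 3
            candidates = [M for M in remaining if V(M) == target]
            if not candidates:
                break
            t = T(max(candidates, key=mu))
            parts[target] = parts.get(target, set()) ^ t
            remaining ^= t
    return parts, remaining
```

For k = 3, λ = 2, k' = 2 and d = 7 (the largest degree the lemma allows for these parameters), `decompose(3, 2, 2, 7)` terminates, and every leftover monomial has a server that misses at most one of its variables, so it can be assigned to that server's $P_j$.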
Breaking the $O(n^{1/(2k-1)})$ Barrier
• If the algorithm halts, then the $P_V$ are of the desired degree and their sum is equal to P-Q', for Q' at the end of the execution.
• Likewise, for each M in Q' there exists a server j who misses at most one variable; add M to $P_j$.
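Breaking the $O(n^{1/(2k-1)})$ Barrier
• For $M = Y_{j_1,1} Y_{j_2,2} \cdots Y_{j_d,d}$ and $M' = Y_{j'_1,1} Y_{j'_2,2} \cdots Y_{j'_d,d}$, define $M \sim M'$ if V(M)=V(M') and for all $q \le d$ either $j_q, j'_q \in V(M)$ or $j_q = j'_q$.
• If M' is a monomial in T(M) then:
1. $V(M') \subseteq V(M)$
2. $\mu(M') \le \mu(M)$
3. Equality in 1 and 2 implies $M \sim M'$
4. $M_1 \sim M_2$ implies that either both are in T(M) or both aren't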
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Each time step 3 is applied, every monomial M' of T(M) falls into one of two cases.
• Either M' has a smaller V(M') or a smaller $\mu(M')$; it is added to Q' and will be dealt with later.
• Or $M' \sim M$, so it already exists in Q' and is removed.
Breaking the $O(n^{1/(2k-1)})$ Barrier
• Lemma: For all i>0 and k>(i-1)! there exists a PIR protocol $P_i$ with communication complexity $O(n^{2/(ik)})$.
• Corollary: there exists a PIR protocol with communication complexity $O(n^{c \log\log k/(k \log k)})$.
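A little arithmetic shows how the lemma's exponent compares with the earlier 1/(2k-1). The sketch assumes the exponent in the lemma reads 2/(ik) and picks, for each k, the largest i with (i-1)! < k; constants hidden in the O-notation are ignored, and the function names are made up.

```python
from fractions import Fraction
from math import factorial

def largest_i(k):
    """Largest i > 0 satisfying k > (i-1)!, for k >= 2."""
    i = 1
    while factorial(i) < k:   # i! < k means i + 1 is still allowed
        i += 1
    return i

def lemma_exponent(k):
    """Exponent 2/(ik) from the lemma, with the best admissible i."""
    return Fraction(2, largest_i(k) * k)

def barrier_exponent(k):
    """Exponent 1/(2k-1) of the earlier construction."""
    return Fraction(1, 2 * k - 1)
```

Under these assumptions the lemma's exponent is not an improvement for very small k (2/9 versus 1/5 at k = 3), but it becomes smaller once k is large enough (1/14 versus 1/13 at k = 7), in line with the asymptotic form of the corollary.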
Summary
• For every PIR scheme we have a related smooth code.
• The upper bound for PIR is improved to $O(n^{c \log\log k/(k \log k)})$.
• Likewise, the upper bound for smooth codes is improved to $O(2^{n^{c \log\log k/(k \log k)}})$.
Related Topics
• T-collusion PIR: the protocol must maintain security against collusions of T servers. General results appear in “Information-Theoretic Private Information Retrieval: A Unified Construction” [Beimel, Ishai].
• CPIR: computational PIR, in which the security definition is relaxed to a computational one.
• There exist single-server CPIR protocols with polylogarithmic communication [Cachin, Micali, Stadler].