Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud Bing Wang, Shucheng Yu, Wenjing Lou, and Y. Thomas Hou IEEE 33rd International Conference on Computer Communications INFOCOM 2014 Toronto - Canada - 2014 SWIM Seminar April 28, 2016 Mateus Cruz

Upload: mateus-s-h-cruz

Post on 27-Jan-2017


Page 1: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

Bing Wang, Shucheng Yu, Wenjing Lou, and Y. Thomas Hou

IEEE 33rd International Conference on Computer Communications (INFOCOM 2014)

Toronto, Canada, 2014

SWIM Seminar, April 28, 2016
Mateus Cruz

Page 2: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

Introduction Preliminaries Proposal Experiments Conclusion

OUTLINE

1 Introduction

2 Preliminaries

3 Proposal

4 Experiments

5 Conclusion

Page 4: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

MOTIVATION

Compute over encrypted data
Searchable encryption
  - Single keyword exact search
  - Multi-keyword exact search
  - Single keyword fuzzy search

Page 5: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

PROPOSAL

Multi-keyword fuzzy search
Contribution
  - No need for predefined keyword dictionary
  - Use of LSH and Bloom filters
  - Experiments using real datasets

Page 6: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ARCHITECTURE

Page 7: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

SECURITY MODEL

Honest-but-curious server
Trusted users
Threat models
  - Known Ciphertext Model
      - Limited server knowledge
  - Known Background Model
      - Background information available

Page 8: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

PRIVACY REQUIREMENTS

File content privacy
Index privacy
User query privacy
Keyword privacy
Trapdoor unlinkability
  - Unable to link two trapdoors
  - Even if they are for the same query

Page 10: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

BLOOM FILTER

m-bit array
Given...
  - $S = \{a_1, a_2, \ldots, a_n\}$
  - $H = \{h_i \mid h_i : S \to [1, m],\ 1 \le i \le \ell\}$ ($\ell$ independent hash functions)
Inserts $a \in S$ into the filter
  - Set bits at all the $h_i(a)$-th positions to 1
Can generate false positives
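
The insertion and membership test described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the class name `BloomFilter`, the salted SHA-256 trick used to derive the $\ell$ hash functions, and the toy sizes are all assumptions made for the example. Positions are 0-indexed here, whereas the slide uses $[1, m]$.

```python
import hashlib


class BloomFilter:
    """m-bit Bloom filter with ell hash functions (illustrative sketch)."""

    def __init__(self, m: int, ell: int):
        self.m = m
        self.ell = ell
        self.bits = [0] * m

    def _positions(self, item: str):
        # Assumption: emulate ell hash functions by salting SHA-256 with i.
        for i in range(self.ell):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        # Set the bit at every h_i(item) position to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=64, ell=3)
for word in ("cloud", "search", "privacy"):
    bf.insert(word)
print(bf.might_contain("cloud"))  # True: inserted items are always found
```

A lookup for a word that was never inserted usually returns False, but it returns True whenever all of its positions happen to have been set by other words, which is the false-positive case named on the slide.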

Page 11: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

LOCALITY SENSITIVE HASHING (LSH)

Hash close items to the same value
  - With high probability
$(r_1, r_2, p_1, p_2)$-sensitive family
  - If $\mathrm{dist}(x, y) \le r_1$, $\Pr[h(x) = h(y)] \ge p_1$
  - If $\mathrm{dist}(x, y) \ge r_2$, $\Pr[h(x) = h(y)] \le p_2$
Use of p-stable LSH family
  - $h_{a,b}(v) = \lfloor (a \cdot v + b) / w \rfloor$
      - $a$ and $v$ are vectors
      - $b$ and $w$ are real numbers
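
A small sketch of one p-stable hash function $h_{a,b}$ may help make the formula concrete. It assumes the common 2-stable instantiation, where the entries of $a$ are drawn from a standard Gaussian and $b$ is drawn uniformly between 0 and $w$; the slide itself only says that $b$ and $w$ are real numbers. The helper name `make_pstable_hash` is hypothetical.

```python
import math
import random


def make_pstable_hash(dim: int, w: float, rng: random.Random):
    """Return h(v) = floor((a . v + b) / w) for a freshly drawn a and b."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # assumed Gaussian (2-stable)
    b = rng.uniform(0.0, w)                        # assumed uniform offset

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / w)

    return h


rng = random.Random(0)
h = make_pstable_hash(dim=4, w=4.0, rng=rng)
x = [1.0, 0.0, 1.0, 0.0]
print(h(x) == h(list(x)))  # True: identical inputs always share a bucket
```

A larger $w$ widens the buckets, so nearby vectors collide more often, at the cost of more collisions between distant ones.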

Page 13: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

BASIC MULTI-KEYWORD FUZZY SEARCH

One index per file ($I_D$)
  - Each index is a Bloom filter
  - Contains all keywords in $D$
Scheme steps
  1 Bigram vector representation of keyword
  2 Bloom filter representation of index/query
  3 Inner product based matching

Page 15: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

BIGRAM VECTOR REPRESENTATION

Enables use of LSH functions
Transform keyword into a bigram set
  - "network": {ne, et, tw, wo, or, rk}
  - $26^2$-bit vector for each keyword
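
The bigram transformation can be sketched as follows. The function names and the row-major slot layout (first letter times 26 plus second letter) are assumptions for illustration; the slide only fixes the vector length at $26^2 = 676$ bits. The sketch ignores characters outside a-z.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 letters, so 26 * 26 = 676 bigram slots


def bigrams(keyword: str) -> list[str]:
    """Contiguous letter pairs of a keyword."""
    letters = [c for c in keyword.lower() if c in ALPHABET]
    return [letters[i] + letters[i + 1] for i in range(len(letters) - 1)]


def bigram_vector(keyword: str) -> list[int]:
    """676-element 0/1 vector with a 1 in the slot of every bigram present."""
    vec = [0] * (26 * 26)
    for bg in bigrams(keyword):
        idx = ALPHABET.index(bg[0]) * 26 + ALPHABET.index(bg[1])
        vec[idx] = 1
    return vec


print(bigrams("network"))             # ['ne', 'et', 'tw', 'wo', 'or', 'rk']
print(sum(bigram_vector("network")))  # 6

# A transposition typo keeps half of the bigrams, so the vectors stay close.
shared = sum(a & b for a, b in zip(bigram_vector("network"),
                                   bigram_vector("netwrok")))
print(shared)                         # 3 (ne, et, tw)
```

Because a misspelling changes only a few bigrams, the two vectors are a short distance apart, which is what lets an LSH function map them to the same value with high probability.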

Page 17: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

BLOOM FILTER REPRESENTATION

Represent queries and keywords
Regular Bloom filter
  - Hash to independent values
Bloom filter with LSH functions
  - Hash similar inputs to the same output
  - Enables fuzzy search
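
A rough sketch of a Bloom filter whose hash functions are LSH functions is given here. Everything beyond the idea on the slide is an assumption: the Gaussian p-stable hash, the reduction of the bucket number modulo $m$ to obtain a filter position, the parameter values, and the helper names. Keywords are assumed to be lowercase a-z.

```python
import math
import random
import string

A = string.ascii_lowercase


def bigram_vector(word):
    # 676-slot 0/1 vector; assumes word contains only lowercase a-z.
    vec = [0] * 676
    for x, y in zip(word, word[1:]):
        vec[A.index(x) * 26 + A.index(y)] = 1
    return vec


def make_lsh(m, w, rng):
    a = [rng.gauss(0, 1) for _ in range(676)]
    b = rng.uniform(0, w)
    # Assumption: fold the (possibly negative) bucket number into [0, m).
    return lambda v: math.floor((sum(p * q for p, q in zip(a, v)) + b) / w) % m


def insert(bloom, word, lsh_funcs):
    v = bigram_vector(word)
    for h in lsh_funcs:
        bloom[h(v)] = 1


rng = random.Random(1)
m, ell = 64, 2
funcs = [make_lsh(m, w=8.0, rng=rng) for _ in range(ell)]

index = [0] * m
insert(index, "network", funcs)

query = [0] * m
insert(query, "network", funcs)
print(query == index)  # True: the same keyword sets the same bits
```

For a misspelled query such as "netwrok", each LSH function lands on the same position as "network" only with some probability that depends on $w$, so the number of matching bits varies between 0 and $\ell$.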

Page 19: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

INNER PRODUCT BASED MATCHING

Transforms the query into a Bloom filter
  - Represent indices and queries in the same way
Use inner product to search
  - Between index vector and query vector
  - High result means that keywords are in $D$
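
As a toy illustration of the matching step, the inner product of two 0/1 Bloom filter vectors counts the positions that are set in both. The 8-bit vectors and document labels are made up for the example.

```python
def inner_product(index, query):
    """Number of positions set in both 0/1 vectors."""
    return sum(i * q for i, q in zip(index, query))


index_d1 = [1, 0, 1, 1, 0, 1, 0, 0]
index_d2 = [0, 1, 0, 1, 0, 0, 1, 0]
query = [1, 0, 0, 1, 0, 1, 0, 0]

scores = {"D1": inner_product(index_d1, query),
          "D2": inner_product(index_d2, query)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {'D1': 3, 'D2': 1}
print(ranked)  # ['D1', 'D2']
```

D1 covers every bit of the query, so it reaches the maximum possible score (the number of 1s in the query), while D2 shares a single bit.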

Page 21: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ALGORITHMS

Key generation
Index generation
  - Index encryption
Trapdoor generation
  - Query encryption
Search

Page 22: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

KEY GENERATION

KeyGen(m)
  - Security parameter $m$
Output
  - Secret key $SK(M_1, M_2, S)$
      - $M_1, M_2 \in \mathbb{R}^{m \times m}$
      - $S \in \{0,1\}^m$ is a bit vector
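
A minimal sketch of KeyGen follows. Since the trapdoor later uses $M_1^{-1}$ and $M_2^{-1}$, the sketch assumes both matrices must be invertible and rejects singular draws. The small integer entries, the helper names, and the use of Python's non-cryptographic `random` module are simplifications for illustration only.

```python
import random
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det


def random_invertible(m, rng):
    # Redraw until the matrix is non-singular.
    while True:
        mat = [[rng.randint(-9, 9) for _ in range(m)] for _ in range(m)]
        if determinant(mat) != 0:
            return mat


def keygen(m, seed=None):
    rng = random.Random(seed)  # illustration only; not a cryptographic RNG
    m1 = random_invertible(m, rng)
    m2 = random_invertible(m, rng)
    s = [rng.randint(0, 1) for _ in range(m)]
    return m1, m2, s


M1, M2, S = keygen(4, seed=7)
print(len(M1), len(M2), len(S))  # 4 4 4
```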

Page 23: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

INDEX GENERATION

BuildIndex($D$, $SK$, $\ell$)

Choose $\ell$ LSH functions from the family $H$
  - $H = \{h : \{0,1\}^{26^2} \to \{0,1\}^m\}$
Construct an m-bit Bloom filter $I_D$
  1 Extract keywords $W_D = \{w_1, w_2, \ldots\}$, $w_i \in \{0,1\}^{26^2}$
  2 Insert $w_i$ into $I_D$ using $h_j \in H$, $1 \le j \le \ell$
  3 Encrypt $I_D$ using Index Enc($SK$, $I_D$)
Output
  - $Enc_{SK}(I_D)$

Page 24: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

INDEX ENCRYPTION

Index Enc($SK$, $I$)
  - Secret key $SK$
  - Index $I$
Split $I$ into $I'$ and $I''$
  - Given a random $r$, for each $i_j \in I$
      - If $s_j \in S$ is 1, $i'_j = i''_j = i_j$
      - If $s_j \in S$ is 0, $i'_j = \frac{1}{2} i_j + r$, and $i''_j = \frac{1}{2} i_j - r$
Output
  - Secure index $Enc_{SK}(I) = \{M_1^T \cdot I', M_2^T \cdot I''\}$
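
The split-then-multiply procedure can be sketched like this. The helper names and the tiny 3x3 key matrices are hypothetical, and floats are used for brevity; the point is only to show where $r$ enters and that the two shares still sum to $i_j$ at the positions where $s_j = 0$.

```python
import random


def transpose(mat):
    return [list(col) for col in zip(*mat)]


def mat_vec(mat, vec):
    return [sum(x * y for x, y in zip(row, vec)) for row in mat]


def split_index(index, s, r):
    """Split I into (I', I'') following the rule on the slide."""
    i1, i2 = [], []
    for i_j, s_j in zip(index, s):
        if s_j == 1:
            i1.append(i_j)            # copied unchanged into both shares
            i2.append(i_j)
        else:
            i1.append(0.5 * i_j + r)  # randomized shares that sum to i_j
            i2.append(0.5 * i_j - r)
    return i1, i2


def index_enc(m1, m2, s, index, rng):
    r = rng.random()
    i1, i2 = split_index(index, s, r)
    return mat_vec(transpose(m1), i1), mat_vec(transpose(m2), i2)


# Toy key material, chosen by hand for the example.
M1 = [[2, 1, 0], [1, 1, 0], [0, 1, 1]]
M2 = [[1, 0, 2], [0, 1, 0], [1, 0, 1]]
S = [1, 0, 0]

print(split_index([1, 0, 1], S, r=0.25))
# ([1, 0.25, 0.75], [1, -0.25, 0.25])
secure_index = index_enc(M1, M2, S, [1, 0, 1], random.Random(0))
```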

Page 25: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

TRAPDOOR GENERATION

Trapdoor($Q$, $SK$)
  - Query $Q$
  - Secret key $SK$
For each keyword $q_i \in Q$
  - Insert $q_i$ into the Bloom filter
      - Using the same $\ell$ LSH functions
  - Encrypt $Q$ using Query Enc($SK$, $Q$)
Output
  - $Enc_{SK}(Q)$

Page 26: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

QUERY ENCRYPTION

Query Enc($SK$, $Q$)
Split $Q$ into $Q'$ and $Q''$
  - Given another random $r'$, for each $q_j \in Q$
      - If $s_j \in S$ is 0, $q'_j = q''_j = q_j$
      - If $s_j \in S$ is 1, $q'_j = \frac{1}{2} q_j + r'$, and $q''_j = \frac{1}{2} q_j - r'$
Output
  - Trapdoor $Enc_{SK}(Q) = \{M_1^{-1} \cdot Q', M_2^{-1} \cdot Q''\}$
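
The query split is the mirror image of the index split: the index is randomized where $s_j = 0$, the query where $s_j = 1$. A short worked check, using only the two split rules from the slides, shows that each position still contributes $i_j q_j$ to the sum of the two share products, whatever $r$ and $r'$ are:

```latex
\begin{align*}
s_j = 1:&\quad i'_j q'_j + i''_j q''_j
  = i_j\left(\tfrac{1}{2} q_j + r'\right) + i_j\left(\tfrac{1}{2} q_j - r'\right) = i_j q_j \\
s_j = 0:&\quad i'_j q'_j + i''_j q''_j
  = \left(\tfrac{1}{2} i_j + r\right) q_j + \left(\tfrac{1}{2} i_j - r\right) q_j = i_j q_j \\
\text{hence}&\quad I'^{T} Q' + I''^{T} Q'' = \sum_{j=1}^{m} i_j q_j = I^{T} Q
\end{align*}
```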

Page 27: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

SEARCH

Search($Enc_{SK}(Q)$, $Enc_{SK}(I_D)$)
Calculation using inner product
  - Between index vector and query vector
  - Shows matching bit in the Bloom filter
Output
  - $M_1^T I' \cdot M_1^{-1} Q' + M_2^T I'' \cdot M_2^{-1} Q''$
      - Equivalent to $I'^T \cdot Q' + I''^T \cdot Q'' = I^T \cdot Q$
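
The equivalence can be checked numerically with a toy end-to-end run. The sketch uses exact `Fraction` arithmetic so the comparison is not blurred by rounding; the 3x3 keys, the split vector, the sample index and query, and all helper names are made up for the example.

```python
import random
from fractions import Fraction


def transpose(a):
    return [list(c) for c in zip(*a)]


def mat_vec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def inverse(a):
    """Gauss-Jordan inverse over Fractions (assumes a is invertible)."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def split(vec, s, r, split_when):
    """Two shares of vec; positions where s_j == split_when are randomized."""
    a, b = [], []
    for x, s_j in zip(vec, s):
        if s_j == split_when:
            a.append(Fraction(x, 2) + r)
            b.append(Fraction(x, 2) - r)
        else:
            a.append(Fraction(x))
            b.append(Fraction(x))
    return a, b


# Toy secret key (hand-picked invertible matrices) and toy Bloom filters.
M1 = [[2, 1, 0], [1, 1, 0], [0, 1, 1]]
M2 = [[1, 0, 2], [0, 1, 0], [1, 0, 1]]
S = [1, 0, 1]
index, query = [1, 0, 1], [1, 1, 1]

rng = random.Random(0)
r = Fraction(rng.randint(1, 9), 10)
r_prime = Fraction(rng.randint(1, 9), 10)

I1, I2 = split(index, S, r, split_when=0)        # index: split where s_j = 0
Q1, Q2 = split(query, S, r_prime, split_when=1)  # query: split where s_j = 1

enc_index = (mat_vec(transpose(M1), I1), mat_vec(transpose(M2), I2))
trapdoor = (mat_vec(inverse(M1), Q1), mat_vec(inverse(M2), Q2))

# The server only needs enc_index and trapdoor to compute the score.
score = dot(enc_index[0], trapdoor[0]) + dot(enc_index[1], trapdoor[1])
print(score == dot(index, query))  # True
```

The check works because $(M^T I')^T (M^{-1} Q') = I'^T M M^{-1} Q' = I'^T Q'$, so the matrices cancel and only the plain inner product remains.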

Page 28: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ANALYSIS OF THE BASIC SCHEME

Correctness
  - Exact search
      - Maximum value of inner product
  - Fuzzy search
      - High inner product for similar items
False positives and false negatives
  - Trade-off tuned using parameters $m$, $\ell$
Vulnerability
  - Under Known Background Model
      - Keyword frequency and distribution
  - Association between $w$ and trapdoor $Enc_{SK}(w)$
  - The index might be recovered

Page 29: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ENHANCED FUZZY SEARCH SCHEME

Introduction of a new security layer
  - Pseudo-random function $f$
Modified algorithms
  - Key generation
  - Index generation
  - Trapdoor generation
  - Search

Page 30: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ENHANCED KEY GENERATION

KeyGen($m$, $s$)
Generate secret key
  - $SK(M_1, M_2, S)$
Generate hash key pool
  - $HK = \{k_i \mid k_i \xleftarrow{R} \{0,1\}^s,\ 1 \le i \le \ell\}$
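
Drawing the hash key pool $HK$ is a one-liner with Python's `secrets` module. The function name and the restriction to whole-byte key lengths are assumptions of this sketch.

```python
import secrets


def gen_hash_keys(ell: int, s_bits: int) -> list[bytes]:
    """Draw ell independent s-bit keys from the operating system's CSPRNG."""
    if s_bits % 8 != 0:
        raise ValueError("this sketch only handles whole-byte key lengths")
    return [secrets.token_bytes(s_bits // 8) for _ in range(ell)]


HK = gen_hash_keys(ell=3, s_bits=128)
print(len(HK), len(HK[0]))  # 3 16
```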

Page 31: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ENHANCED INDEX GENERATION

BuildIndex($D$, $SK$, $\ell$)
Pseudo-random function $f$ for each $D$
  - $f : \{0,1\}^* \times \{0,1\}^s \to \{0,1\}^*$
For each $D$
  - Extract $W = \{w_1, w_2, \ldots\}$ from $D$
  - Insert $W$ into $I_D$
      - $\{g_i \mid g_i = f_{k_i} \circ h_i,\ h_i \in H,\ 1 \le i \le \ell\}$
  - Encrypt $I_D$ with $SK$
Output
  - $Enc_{SK}(I_D)$
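
The composition $g_i = f_{k_i} \circ h_i$ might be sketched as below, with HMAC-SHA1 standing in for $f$ (the example named later in the deck). How the output of $f$ is turned into a Bloom filter position is not stated on the slides; reducing it modulo $m$ is an assumption here, as are the helper name `make_g` and the toy stand-in for $h$, which is not a real LSH function.

```python
import hashlib
import hmac


def make_g(key: bytes, h, m: int):
    """Compose g = f_key o h and fold the HMAC output into [0, m)."""
    def g(v):
        bucket = h(v)  # LSH output h_i(v)
        tag = hmac.new(key, str(bucket).encode(), hashlib.sha1).digest()
        return int.from_bytes(tag, "big") % m  # assumed mapping to a position
    return g


def toy_h(v):
    # Stand-in for an LSH function: vectors with close sums share a bucket.
    return sum(v) // 2


g = make_g(b"example-key-0001", toy_h, m=64)
print(g([1, 1, 0]) == g([0, 1, 1]))  # True: same bucket, same keyed position
```

Inputs that collide under $h$ still collide under $g$, so fuzzy matching is preserved, while the position itself now depends on a key the server does not hold.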

Page 32: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ENHANCED TRAPDOOR GENERATION

Trapdoor($Q$, $SK$)

Generate an m-bit Bloom filter
Insert $Q$ using the same functions $g_i$
  - $g_i = f_{k_i} \circ h_i$, $h_i \in H$, $1 \le i \le \ell$
Encrypt $Q$ with $SK$
Output
  - $Enc_{SK}(Q)$

Page 33: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ENHANCED SEARCH

Search($Enc_{SK}(Q)$, $Enc_{SK}(I_D)$)
Output
  - Inner product $\langle Enc_{SK}(Q), Enc_{SK}(I_D) \rangle$
The result is not affected by $f$
  - Collision free (e.g.: HMAC-SHA1)

Page 35: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

SETUP

Dataset
  - INFOCOM publications
  - 3600 files
  - 5734 keywords
Environment
  - Intel i3 3.3GHz 4GB RAM
Performance evaluation
  - Index and trapdoor generation
  - Search over the encrypted index
Accuracy Evaluation
  - Precision
  - Recall
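
For reference, the two accuracy measures can be computed as in this small sketch, using the standard definitions (precision is the share of retrieved files that are relevant; recall is the share of relevant files that were retrieved). The document identifiers are made up and are not results from the experiments.

```python
def precision_recall(retrieved: set, relevant: set):
    tp = len(retrieved & relevant)  # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall


p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
print(p, r)  # 0.5 0.6666666666666666
```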

Page 36: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

PERFORMANCE (1/2)

Computation from hash functions
Encryption involves matrix multiplications

Page 37: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

PERFORMANCE (2/2)

Inner product computation
  - Related to the file length
  - Not the number of keywords

Page 38: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

ACCURACY

False positives accumulate

Page 40: Privacy-Preserving Multi-Keyword Fuzzy Search over Encrypted Data in the Cloud

SUMMARY

Multi-keyword fuzzy search
  - Using encrypted data
Bloom filter as index
  - Built using LSH functions
Inner product for similarity calculation
Enhanced search scheme
  - Secure against background knowledge