Introduction to Searchable Encryption Prof. Ja-Ling Wu Dept. CSIE & GINM National Taiwan University


Post on 02-Jun-2020


Page 1: Introduction to Searchable Encryption - 國立臺灣大學cmlab.csie.ntu.edu.tw/~ipr/mmsec2013/data/lecture...• Documents are encrypted; SI is encrypted — “two-layer” searches

Introduction to Searchable

Encryption

Prof. Ja-Ling Wu

Dept. CSIE & GINM

National Taiwan University


What kind of Cryptography?

• Privacy-Preserving Computation (PPC)

– Function (including inputs and outputs) does

not reveal private information :

– Medical records

– Financial Data

– Sensitive Data

– Data Owner

– One who is served

– …


MPC vs. PPC

• MPC (Multi-Party Computation) is general – it captures all applications.

• Regarding privacy, MPC aims for the following:

– A secure protocol must reveal no more information than the output of the function itself

– That is, the process of protocol computation reveals nothing.

• MPC does not deal with the question of whether or not the function itself reveals much information – which is the focus of Privacy-Preserving Computation (PPC)


Privacy-Preserving Computation

[Diagram: a client sends a search query to a data repository.]

• Client wants to preserve search privacy: Private Information Retrieval

• Data repository is huge and valuable: Privacy-preserving data mining

• Data are encrypted: Search and/or computation over encrypted data


Untrusted Remote Storage

• Remote storage is ubiquitous:

– E-mail, backups

– Department servers, Yahoo Mail, Gmail

– Cloud storage (Amazon, iCloud, Dropbox, Google)

– Social websites (Facebook, Flickr, YouTube), …


Untrusted Remote Storage

• Google’s Search Across Computers feature:

– “In order to share your indexed files between your computers, we first copy this content to Google Desktop servers located at Google. This is necessary, for example, if one of your computers is turned off or otherwise offline when new or updated items are indexed on another of your machines. We store this data temporarily on Google Desktop servers and automatically delete older files, and your data is never accessible by anyone doing a Google search.”

• Do you trust this? (the PRISM project)


Searchable Encryption

• Store data externally

– encrypted

– want to search data easily

– avoid downloading everything and then decrypting it

– allow others to search data without having access to plaintext


Searchable Encryption - Factors

• When searching, what must be protected?

– retrieved data

– search query

– search query outcome (was anything found?)

• Scenario

– single query vs. multiple queries

– non-adaptive: series of queries, each independent of the others

– adaptive: form next query based on previous results

• Number of participants

– single user (owner of data) can query data

– multiple users can query the data, possibly with access rights defined by the owner


Single User Symmetric Searchable

Encryption (SSE)

Non-Adaptive Adaptive


Search Over Encrypted Data

• Applications: storage outsourcing, mail gateways, Google Desktop (“search across computers”), …

• Untrusted servers and/or sensitive data → data has to be encrypted

• Encryption hides all information about the data → server cannot search!

• Client must download the entire document collection


Search Over Encrypted Data (cont’d)

• Searchable Symmetric Key Encryption, where the client performs encryption before storing data

– Recall that public key algorithms might be too slow for encrypting large data

• Secure index (SI): auxiliary data structure that allows the remote server to perform searches efficiently, while keeping queries and data confidential

• Documents are encrypted and the SI is encrypted: “two-layer” searches are performed using trapdoors.


Build Lists

Austin

Baltimore

Washington

Determine the words in each document D to create the sets D(w)

Build linked lists, where D(w_i) is the set of identifiers of documents containing the word w_i, ordered in lexicographic order – Dictionary!
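The list-building step can be sketched in a few lines of Python. This is only an illustration under assumptions: the three-document collection, the identifiers `d1` to `d3`, and the helper name `build_lists` are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_lists(documents):
    """Map each word w to D(w): the sorted identifiers of documents containing w."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    # Sort both the words and the identifiers lexicographically (the "dictionary").
    return {w: sorted(ids) for w, ids in sorted(index.items())}

# Hypothetical collection used only for illustration.
docs = {
    "d1": "Austin Baltimore",
    "d2": "Baltimore Washington",
    "d3": "Austin Washington Baltimore",
}
D = build_lists(docs)
print(D["baltimore"])  # ['d1', 'd2', 'd3']
```

Each value of `D` plays the role of one linked list; the later slides encrypt these lists and scramble them into one array.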


Create Lists

Austin

Baltimore

Washington

Encrypt linked lists: establish keys, pointers, encrypt


Build Index Table

Austin

Baltimore

Washington

[Figure: f( ) and g( ) shown once per keyword]

Build lookup table T to manage the keys for the first node of each keyword’s list


Create Array

Merge, scramble linked lists to form A


Query

Baltimore


Extensions

•Can I share my document collection?

•Malicious servers

•Updates


Multi-User SSE Encryption


Multi-User SSE (cont’d)

• Similar security notions to single-user SSE

– Secure indexes and trapdoors

• Revocation: owner can revoke searching privileges

– Robust against user collusions

• Anonymity: server should not know who initiated the search

• Secure Buyer-Seller Protocol might be useful!


EXAMPLES OF SEARCHING OVER ENCRYPTED DATA

• Reza Curtmola, Juan Garay, Seny Kamara, and Rafail Ostrovsky, “Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions,” CCS’06, Proceedings of the Conference on Computer and Communications Security.

• Ning Cao, Cong Wang, Ming Li, Kui Ren, and Wenjing Lou, “Privacy-Preserving Multi-keyword Ranked Search over Encrypted Cloud Data,” INFOCOM’11, Shanghai, China, April 10-15, 2011.


Cloud service is convenient!


But…


Two approaches

• Cryptography-based approach: secure

• Other approach: efficient


Threat model

• The cloud server is considered “honest-but-curious”:

– honest: the server follows the designed protocol

– curious: the server may do extra inference or analysis

• Known ciphertext model:

– The server knows only the encrypted data and the searchable index (both of which are outsourced by the data owner)

• Known background model:

– The server possesses more information, such as the correlation among given search requests or statistical information about the data set


Privacy description

• Data privacy: use traditional cryptography to encrypt the data

• Index privacy: prevent the server from learning anything from the search index

• Search privacy:

– Keyword privacy: the database is in plaintext form while the query is encrypted. Challenge: statistical (frequency) analysis attacks

– Trapdoor privacy: the server can NOT generate a valid trapdoor function (the one-way function used for encryption/decryption)

– Search pattern privacy: are two queries about the same keywords? Revealing this reduces the space an attacker needs to analyze

– Access pattern privacy: do not reveal the sequence of search results to the server


Traditional Approach:

• Setting: Client (X) --- Internet --- Cloud Server (Y)

• Encrypt(X) is sent over the Internet and stored in Y

• If a search is issued, there are two possible approaches:

– (1) Download Encrypt(X) from Y, decrypt it back to X and then search at the client: huge computation and storage loads on the client

– (2) Send the decryption key to Y, decrypt Encrypt(X) at Y, search at Y and send the result back to X: both the Internet and the service provider (server) learn your secret key


Searchable Symmetric Encryption(SSE)

• In 2006, Curtmola et al. proposed Searchable Symmetric Encryption with improved definitions and efficient constructions.

• Curtmola’s search scheme can be executed in constant time, O(1), per returned document.

• The construction combines a look-up table (T) and an array (A):

– A: stores the set D(w) in encrypted form for each word w

– T: stores information that enables one to locate and decrypt the appropriate elements of A for each w


Searchable Symmetric Encryption (SSE)

• More efficient search schemes can be achieved by revealing a certain amount of information: there is a tradeoff between search efficiency and the amount of revealed information!

• Strict security definition

– A secure SSE scheme should not leak anything beyond the outcome of a search

• Revised security definition

– A secure SSE scheme should not leak anything beyond the outcome and the pattern of a search


SSE needs 4 Algorithms

• Keygen: run by the user, probabilistic

• BuildIndex: run by the user, possibly probabilistic

– takes (K, D) as input and outputs the index I

• Trapdoor: run by the user, deterministic

– generates a trapdoor T_w = Trapdoor(K, w) for a given word w

• Search: run by the server, searches for the documents in D that contain w

– takes an index I for a collection D and a trapdoor T_w for the word w as input, and returns D(w), the set of identifiers of documents containing w

• Each of the above is a polynomial-time algorithm!


Searchable Symmetric Encryption(SSE)

• Keygen(k)

• K = (s, y, z)

• BuildIndex(K, D)

• D(w_i) is the set of identifiers of documents containing the word w_i, ordered in lexicographic order – Dictionary!

• Assume the dictionary of the document collection includes three different words: ‘Where’, ‘What’, ‘Why’. Three linked lists are constructed!

[Figure: three linked lists, one per word; each node holds a document identifier and each list ends in NULL.]


Searchable Symmetric Encryption(SSE)

• Each node generates a key used to encrypt the next node, and stores the index in array A of the next element in the list.

• This index is generated using a pseudo-random function (hash) that takes ctr + 1 as its input (initially ctr = 0).

• Node format: $\langle \mathrm{id}(D_{ij}), \mathrm{key}_{ij}, \mathrm{address}(ctr + 1) \rangle$; the last node of each list points to NULL.

• Linked lists will be ‘scrambled’ into array A.


Searchable Symmetric Encryption(SSE)

• This tuple is encrypted and then inserted into the array A at index address(ctr), before ctr is incremented.

• If the array A contains any empty cells, they should be set to random (padding) values, generated using a pseudo-random function.

• The key for encrypting the first element of each list is stored in the look-up table:

$T[F_z(w_i)] = \langle \mathrm{address}(A[N_0]), \mathrm{key} \rangle \oplus f_y(w_i)$


Searchable Symmetric Encryption(SSE)

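• Trapdoor(w)

– $T_w = (F_z(w_i), f_y(w_i))$

• Search(I, $T_w$)

– value = $T[F_z(w_i)]$

– $\langle \mathrm{address}(A[N_0]), \mathrm{key} \rangle = \mathrm{value} \oplus f_y(w_i)$

– decrypt $\langle \mathrm{id}(D), \mathrm{key}, \mathrm{address}(ctr + 1) \rangle$ for every node, following the list until NULL

– return $D(w_i)$, the identifiers of the documents containing $w_i$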
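The table-and-array construction from the last few slides can be illustrated with a toy Python sketch. It is not the authors' implementation and is not secure. The assumptions made here are: HMAC-SHA256 stands in for the pseudo-random functions F_z and f_y, a seeded shuffle stands in for the address permutation keyed by s, a one-time XOR pad stands in for the node cipher, document identifiers are small integers, every D(w) is non-empty, and the padding of empty cells is omitted. All function names are hypothetical.

```python
import hashlib
import hmac
import os
import random

NODE = 4 + 16 + 4      # id (4 bytes) | key of next node (16) | address of next node (4)
NULL = 0xFFFFFFFF      # marks the end of a list

def prf(key, msg):
    """HMAC-SHA256 as a stand-in pseudo-random function (assumption)."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def enc_node(key, data):
    """Toy cipher: XOR with a pad derived from a key that is used only once."""
    return xor(data, prf(key, b"node")[:NODE])

def keygen():
    # K = (s, y, z): s scrambles array positions, y masks table entries, z labels them.
    return os.urandom(16), os.urandom(16), os.urandom(16)

def build_index(K, D):
    s, y, z = K
    total = sum(len(ids) for ids in D.values())
    addr = list(range(total))
    random.Random(s).shuffle(addr)          # address(ctr), secret permutation
    A, T, ctr = [None] * total, {}, 0
    for w, ids in D.items():
        key = os.urandom(16)
        # T[F_z(w)] = <address of first node, key> XOR f_y(w)
        head = addr[ctr].to_bytes(4, "big") + key
        T[prf(z, w.encode())] = xor(head, prf(y, w.encode())[:20])
        for j, doc_id in enumerate(ids):
            next_key = os.urandom(16)
            next_addr = NULL if j == len(ids) - 1 else addr[ctr + 1]
            node = doc_id.to_bytes(4, "big") + next_key + next_addr.to_bytes(4, "big")
            A[addr[ctr]] = enc_node(key, node)
            key, ctr = next_key, ctr + 1
    return A, T

def trapdoor(K, w):
    _, y, z = K
    return prf(z, w.encode()), prf(y, w.encode())[:20]

def search(index, Tw):
    A, T = index
    label, mask = Tw
    if label not in T:
        return []
    head = xor(T[label], mask)
    address, key = int.from_bytes(head[:4], "big"), head[4:]
    result = []
    while address != NULL:
        node = enc_node(key, A[address])    # XOR pad: decrypting equals encrypting
        result.append(int.from_bytes(node[:4], "big"))
        key, address = node[4:20], int.from_bytes(node[20:], "big")
    return result

K = keygen()
index = build_index(K, {"where": [1, 3], "what": [2], "why": [1, 2, 3]})
print(search(index, trapdoor(K, "why")))    # [1, 2, 3]
```

The server only needs `index` and the trapdoor: one table lookup finds the head of the list, and each further step decrypts exactly one node, so the work is proportional to the number of matching documents.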


However, the system is…

• Not suitable for large scale Cloud data utilization system.

• We need more!

How to do Multi-user Symmetric Searchable Encryption?

6 polynomial-time algorithms are required:

M-Keygen

M-BuildIndex

AddUser

RevokeUser

M-Trapdoor

M-Search


What would we need in the system?

• Ranked search: eliminates unnecessary network traffic, fitting the “pay-as-you-use” Cloud paradigm

• Multi-keyword search: improves search result accuracy and enhances the user search experience

How to design an efficient encrypted data search mechanism that supports multi-keyword semantics without privacy breaches remains a challenging open problem!


So, the problem is…

• Multi-keyword Ranked Search over Encrypted Cloud Data (MRSE): Coordinate Matching Principle, i.e. as many matches as possible!

• Documents and queries are described as binary vectors. Each bit in the vector indicates whether the corresponding keyword is contained in the document or the query!

• Use “inner product similarity” to evaluate the similarity between documents in the database and the query. Basic idea: Secure Inner Product Computation / Secure k-nearest neighbor (KNN) technique

[Figure: document vector 1001010…… next to query vector 1001000……, captioned “Wow! Similar!”]


System framework

• Basic Architecture:


System framework

[Figure: documents reach the database as ciphertexts plus an index; keywords reach the database as a trapdoor.]


We need 4 algorithms again!

• Keygen: taking a security parameter k as input, the data owner outputs a symmetric key SK.

• BuildIndex: based on the dataset F, the data owner builds a searchable index I, which is encrypted with SK and then outsourced to the Cloud server. After the index construction, the document collection can be independently encrypted and outsourced.

• Trapdoor: with t keywords of interest in $W$ as input, a corresponding trapdoor (token) $T_W$ is generated.

• Search: when the Cloud server receives a query request $(T_W, L)$, it performs the ranked search on the index I with the help of the trapdoor $T_W$ and finally returns $F_W$, the ranked ID list of the top-L documents sorted by their similarity with $W$.


System framework

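[Figure: BuildIndex turns each document $F_i$ into a ciphertext $C_i$ and an index $I_i$ stored in the database; Trapdoor turns the keywords $W$ into $T_W$; Search applies $T_W$ to the database to produce $C_W$ and $F_W$.]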


Basic idea

• $D_i$: binary data vector for document $F_i$

– Each bit represents the existence of the corresponding keyword in the document.

• Q: query vector, in the same form as the data vector.

• $D_i \cdot Q$ (inner product) is the similarity score of document i to the query.

• In this way, we can achieve “multi-keyword” and “ranked” search.

[Figure: document vector 1001010…… and query vector 1001000……; keyword 1 exists in both document and query.]
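A small plaintext sketch of coordinate matching follows: the inner product of two binary vectors counts the shared keywords, and sorting by that count gives the ranked top-L result. The document names `F1` to `F3`, their vectors, and the helper names are hypothetical, and no encryption is involved at this stage.

```python
def score(doc_vec, query_vec):
    """Inner product of two binary vectors = number of matching keywords."""
    return sum(d * q for d, q in zip(doc_vec, query_vec))

def top_l(doc_vectors, query_vec, L):
    """Return the ids of the L best-scoring documents, best first."""
    ranked = sorted(doc_vectors,
                    key=lambda doc_id: score(doc_vectors[doc_id], query_vec),
                    reverse=True)
    return ranked[:L]

# Hypothetical 7-keyword dictionary and three documents.
doc_vectors = {
    "F1": [1, 0, 0, 1, 0, 1, 0],
    "F2": [0, 1, 0, 0, 0, 0, 1],
    "F3": [1, 0, 0, 0, 0, 0, 0],
}
Q = [1, 0, 0, 1, 0, 0, 0]

print(score(doc_vectors["F1"], Q))   # 2
print(top_l(doc_vectors, Q, 2))      # ['F1', 'F3']
```

The MRSE schemes aim to let the server compute a masked version of exactly this score without seeing the vectors.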


Recall---Definition of Privacy

• Keyword Privacy: hide what the user is searching for, i.e., the keywords indicated by the corresponding trapdoor.

• Trapdoor Privacy (Unlinkability): the cloud server should not be able to link any given trapdoors, e.g., to determine whether two trapdoors are formed by the same search request.

• Access Pattern Privacy: the sequence of search results. The proposed scheme is not designed to protect the access pattern, for efficiency reasons.


MRSE_I Scheme

• Keygen

• BuildIndex

[Figure: the secret key consists of a (d+2)-bit vector S (e.g. 1001000……101) and two (d+2) × (d+2) matrices $M_1$ and $M_2$. The d-bit document vector (e.g. 1001010……10) is extended with $\varepsilon_i$ and 1 into the (d+2)-entry vector $D_i$, which S splits into $D_i'$ and $D_i''$. Where $S[j] = 1$, let $D_i'[j] + D_i''[j] = D_i[j]$ (in the figure, $-8 + 9 = 1$ at $j = 0$); where $S[j] = 0$, $D_i'[j] = D_i''[j] = D_i[j]$ (in the figure, both are 0 at $j = 1$). The index is $I_i = \{M_1^T D_i', M_2^T D_i''\}$.]


MRSE_I Scheme

• Keygen: generate the secret key $\{S, M_1, M_2\}$

– S: (d+2)-bit vector, where d is the number of keywords

– $M_1, M_2$: (d+2) × (d+2) invertible matrices

• BuildIndex: generate the index $I_i$ for document i

– Extend $D_i$ to $\{D_i, \varepsilon_i, 1\}$, where $\varepsilon_i$ is a random number (dummy keyword)

– Split the extended $D_i$ into two vectors $D_i'$ and $D_i''$:

if $S[j] = 0$, then $D_i'[j] = D_i''[j] = D_i[j]$

if $S[j] = 1$, then let $D_i'[j] + D_i''[j] = D_i[j]$

– $I_i = \{M_1^T D_i', M_2^T D_i''\}$: a subindex is built for every encrypted document $C_i$


MRSE_I Scheme

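• Trapdoor

[Figure: the d-entry query vector is scaled by r (e.g. r00r0r0……r0) and extended with r and t into the (d+2)-entry vector $Q$, which S (e.g. 1001000……101) splits into $Q'$ and $Q''$. Where $S[j] = 1$, $Q'[j] = Q''[j] = Q[j]$ (in the figure, both are r at $j = 0$); where $S[j] = 0$, let $Q'[j] + Q''[j] = Q[j]$ (in the figure, $-2 + 2 = 0$ at $j = 1$). The trapdoor is $T_W = \{M_1^{-1} Q', M_2^{-1} Q''\}$.]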


MRSE_I Scheme

• Search

$I_i = \{M_1^T D_i', M_2^T D_i''\}$ (on server)

$T_W = \{M_1^{-1} Q', M_2^{-1} Q''\}$ (client upload)

$I_i \cdot T_W = M_1^T D_i' \cdot M_1^{-1} Q' + M_2^T D_i'' \cdot M_2^{-1} Q'' = D_i' \cdot Q' + D_i'' \cdot Q''$

Where $S[j] = 0$: $Q'[j] + Q''[j] = Q[j]$ and $D_i'[j] = D_i''[j] = D_i[j]$


MRSE_I Scheme

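• Search

$I_i = \{M_1^T D_i', M_2^T D_i''\}$ (on server)

$T_W = \{M_1^{-1} Q', M_2^{-1} Q''\}$ (client upload)

$I_i \cdot T_W = M_1^T D_i' \cdot M_1^{-1} Q' + M_2^T D_i'' \cdot M_2^{-1} Q'' = D_i' \cdot Q' + D_i'' \cdot Q'' = D_i \cdot Q$ (extended vectors)

With the extended vectors $\{D_i, \varepsilon_i, 1\}$ (1001010……10, $\varepsilon_i$, 1) and $\{rQ, r, t\}$ (r00r0rr0……r0, r, t):

$I_i \cdot T_W = r(D_i \cdot Q + \varepsilon_i) + t$, where $D_i$ and $Q$ on the right-hand side are the original d-entry vectors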


MRSE_I Scheme

• Trapdoor

– Extend $Q$ to $\{rQ, r, t\}$, where r and t are random numbers

– Split the extended $Q$ into two vectors $Q'$ and $Q''$:

if $S[j] = 1$, then $Q'[j] = Q''[j] = Q[j]$

if $S[j] = 0$, then let $Q'[j] + Q''[j] = Q[j]$

– $T_W = \{M_1^{-1} Q', M_2^{-1} Q''\}$

• Search
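The four MRSE_I steps can be checked end to end with a small Python sketch. It uses exact fractions so the identity $I_i \cdot T_W = r(D_i \cdot Q + \varepsilon_i) + t$ holds without rounding. The assumptions made here are: small random integer matrices, a fixed $\varepsilon_i$ instead of a normally distributed one, and hypothetical helper names. It is an illustration of the algebra, not the authors' code.

```python
import random
from fractions import Fraction

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inverse(M):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            raise ValueError("singular matrix")
        A[c], A[p] = A[p], A[c]
        A[c] = [x / A[c][c] for x in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [x - A[r][c] * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def random_invertible(n, rng):
    while True:
        M = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        try:
            return M, inverse(M)
        except ValueError:
            pass  # singular draw, try again

def split(v, S, split_when, rng):
    """Copy v[j] into both shares, except where S[j] == split_when: split it."""
    a, b = [], []
    for x, s in zip(v, S):
        share = Fraction(rng.randint(-9, 9)) if s == split_when else None
        a.append(x if share is None else share)
        b.append(x if share is None else x - share)
    return a, b

def keygen(d, rng):
    n = d + 2
    S = [rng.randint(0, 1) for _ in range(n)]
    (M1, M1inv), (M2, M2inv) = random_invertible(n, rng), random_invertible(n, rng)
    return S, M1, M2, M1inv, M2inv

def build_index(key, D, eps, rng):
    S, M1, M2, _, _ = key
    Dp, Dpp = split(list(D) + [eps, 1], S, 1, rng)            # split where S[j] = 1
    return mat_vec(transpose(M1), Dp), mat_vec(transpose(M2), Dpp)

def trapdoor(key, Q, r, t, rng):
    S, _, _, M1inv, M2inv = key
    Qp, Qpp = split([r * q for q in Q] + [r, t], S, 0, rng)   # split where S[j] = 0
    return mat_vec(M1inv, Qp), mat_vec(M2inv, Qpp)

def search_score(I, T):
    return sum(a * b for a, b in zip(I[0], T[0])) + sum(a * b for a, b in zip(I[1], T[1]))

rng = random.Random(0)
key = keygen(7, rng)
D, Q = [1, 0, 0, 1, 0, 1, 0], [1, 0, 0, 1, 0, 0, 0]
I = build_index(key, D, Fraction(1, 4), rng)
T = trapdoor(key, Q, 3, 5, rng)
print(search_score(I, T))   # 47/4, i.e. 3 * (2 + 1/4) + 5
```

Calling `trapdoor` twice with the same query gives the same score but, whenever S contains a 0, generally different vectors, which is the source of the unlinkability discussed later.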


Properties

• $r(D_i \cdot Q + \varepsilon_i) + t$ is “nearly” a linear function of $D_i$

• $\varepsilon$ follows a normal distribution

• A larger variance of $\varepsilon$ may decrease the precision, but increases the security.


Why Privacy is Preserved?

• With the randomness introduced by the splitting process and the random numbers r and t, the proposed scheme can generate two totally different trapdoors for the same query $W$. This nondeterministic trapdoor generation can guarantee trapdoor unlinkability, which is an unsolved privacy leakage problem in related symmetric-key based searchable encryption schemes because of their deterministic trapdoor generation.

• With properly selected parameters, even the final score results can be obfuscated very well, preventing the cloud server from learning the relationships between given trapdoors and the corresponding keywords.


MRSE_II Scheme

• Keygen: extend S, $M_1$, $M_2$ to (d+U+1) dimensions

• BuildIndex: append U random numbers instead of the single random number used in MRSE_I.

• Trapdoor: randomly select V out of the U dummy entries in Q, and set these positions to 1.

• Search: the similarity score is $r(D_i \cdot Q + \sum \varepsilon_i^{(v)}) + t$, where the sum runs over the V selected dummy entries
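The effect of the U dummy entries can be seen on the plaintext vectors alone. The sketch below omits the splitting and matrix steps, which are assumed to work as in MRSE_I, and only checks the score formula. The helper names and the parameter values (U = 6, V = 3, sigma = 0.5) are hypothetical.

```python
import math
import random

def extend_document(D, U, sigma, rng):
    """Append U normally distributed dummy values and a final 1 (d+U+1 entries)."""
    return list(D) + [rng.gauss(0, sigma) for _ in range(U)] + [1]

def extend_query(Q, U, V, r, t, rng):
    """Scale Q by r, switch on V of the U dummy positions, and append t."""
    chosen = set(rng.sample(range(U), V))
    dummies = [r if j in chosen else 0 for j in range(U)]
    return [r * q for q in Q] + dummies + [t], sorted(chosen)

rng = random.Random(0)
D, Q = [1, 0, 0, 1, 0, 1, 0], [1, 0, 0, 1, 0, 0, 0]
r, t = 3, 5

Dx = extend_document(D, U=6, sigma=0.5, rng=rng)
Qx, chosen = extend_query(Q, U=6, V=3, r=r, t=t, rng=rng)

score = sum(a * b for a, b in zip(Dx, Qx))
noise = sum(Dx[len(D) + j] for j in chosen)   # sum of the selected dummy values

# The score equals r * (D.Q + noise) + t up to floating-point rounding.
print(math.isclose(score, r * (2 + noise) + t, abs_tol=1e-9))
```

Because a fresh subset of dummy positions is chosen for every trapdoor, the same query yields different noise terms against the same document.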


Conclusion

• To meet the challenging “encrypted data search” problem, two approaches, SSE and MRSE, have been introduced.

• What’s next:

– multi-keyword semantics over encrypted data

– extending the idea to multimedia

Important references:

• W. K. Wong, D. W. Cheung, B. Kao, and N. Mamoulis, “Secure KNN Computation on Encrypted Database,” SIGMOD International Conference on Management of Data, 2009, pp. 139-152.

• Privacy Preserving Search on Multimedia (Prof. Min Wu’s work)


Fig. 1: System model for secure image retrieval