
Upload: impulsetechnology

Post on 08-May-2015


Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data

Abstract:

Cloud computing economically enables the paradigm of data service outsourcing. However, to protect data privacy, sensitive cloud data have to be encrypted before being outsourced to the commercial public cloud, which makes effective data utilization a very challenging task. Although traditional searchable encryption techniques allow users to securely search over encrypted data through keywords, they support only Boolean search and are not yet sufficient to meet the effective data utilization need inherently demanded by the large number of users and the huge amount of data files in the cloud. In this paper, we define and solve the problem of secure ranked keyword search over encrypted cloud data. Ranked search greatly enhances system usability by ranking search results by relevance instead of sending undifferentiated results, and further ensures file retrieval accuracy. Specifically, we explore the statistical measure approach, i.e., relevance score, from information retrieval to build a secure searchable index, and develop a one-to-many order-preserving mapping technique to properly protect the sensitive score information. The resulting design facilitates efficient server-side ranking without losing keyword privacy. Thorough analysis shows that our proposed solution enjoys an “as-strong-as-possible” security guarantee compared to previous searchable encryption schemes, while correctly realizing the goal of ranked keyword search.

Reasons for the Proposal:

As Cloud Computing becomes prevalent, more and more sensitive information is being centralized into the cloud, such as e-mails, personal health records, company finance data, and government documents. The fact that data owners and the cloud server are no longer in the same trusted domain may put the outsourced unencrypted data at risk [4], [33]: the cloud server may leak data information to unauthorized entities [5] or even be hacked [6]. It follows that sensitive data have to be encrypted prior to outsourcing, both for data privacy and to combat unsolicited accesses. However, data encryption makes effective data utilization a very challenging task, given that there could be a large amount of outsourced data files. Besides, in Cloud Computing, data owners may share their outsourced data with a large number of users, who might want to retrieve only the specific data files they are interested in during a given session. One of the most popular ways to do so is through keyword-based search. Such a keyword search technique allows users to selectively retrieve files of interest and has been widely applied in plaintext search scenarios [7]. Unfortunately, data encryption, which restricts the user’s ability to perform keyword search and further demands the protection of keyword privacy, makes the traditional plaintext search methods fail for encrypted cloud data.

Existing system & demerits:

Although traditional searchable encryption schemes (e.g., [8], [9], [10], [11], [12], to list a few) allow a user to securely search over encrypted data through keywords without first decrypting it, these techniques support only conventional Boolean keyword search, without capturing any relevance of the files in the search result. When directly applied in a large collaborative data outsourcing cloud environment, they may suffer from the following two main drawbacks. On the one hand, for each search request, users without pre-knowledge of the encrypted cloud data have to go through every retrieved file in order to find the ones that best match their interest, which demands a possibly large amount of postprocessing overhead. On the other hand, invariably sending back all files solely based on the presence or absence of the keyword further incurs large unnecessary network traffic, which is absolutely undesirable in today’s pay-as-you-use cloud paradigm. In short, the lack of effective mechanisms to ensure file retrieval accuracy is a significant drawback of existing searchable encryption schemes in the context of Cloud Computing. Nonetheless, the information retrieval (IR) community has already been utilizing various scoring mechanisms [13] to quantify and rank-order the relevance of files in response to any given search query. Although the importance of ranked search has long received attention in the context of plaintext searching by the IR community, surprisingly, it is still being overlooked and remains to be addressed in the context of encrypted data search.
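To make the contrast between Boolean and ranked search concrete, the following is a minimal plaintext sketch. It assumes a TF-IDF-style scoring formula of the kind commonly used in information retrieval; the text above does not fix a particular formula, and the names relevance_score and ranked_search are hypothetical.

```python
import math
from collections import Counter


def relevance_score(term: str, doc: list[str], docs: list[list[str]]) -> float:
    """TF-IDF-style score of `term` for `doc` (assumed formula); 0.0 if absent."""
    tf = Counter(doc)[term]  # term frequency in this file
    if tf == 0:
        return 0.0
    df = sum(1 for d in docs if term in d)  # number of files containing the term
    # Dampened term frequency, times inverse document frequency,
    # normalized by file length.
    return (1 + math.log(tf)) * math.log(1 + len(docs) / df) / len(doc)


def ranked_search(term: str, docs: list[list[str]], k: int) -> list[int]:
    """Return indices of the top-k matching files, most relevant first."""
    scored = [(relevance_score(term, d, docs), i) for i, d in enumerate(docs)]
    hits = [(s, i) for s, i in scored if s > 0]
    hits.sort(key=lambda p: (-p[0], p[1]))  # highest score first, ties by index
    return [i for _, i in hits[:k]]


docs = [
    ["cloud", "cloud", "data", "key"],
    ["cloud", "data", "data", "key"],
    ["data", "key", "key", "key"],
]
# A Boolean search for "cloud" would return files 0 and 1 with no ordering;
# the ranked search orders them and can return only the top-k.
print(ranked_search("cloud", docs, 1))  # [0]
```

Returning only the top-k files is what avoids the postprocessing and network overhead described above.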

Proposed system:

Therefore, how to enable a searchable encryption system with support for secure ranked search is the problem tackled in this paper. Our work is among the first few to explore ranked search over encrypted data in Cloud Computing. Ranked search greatly enhances system usability by returning the matching files in a ranked order according to certain relevance criteria (e.g., keyword frequency), thus moving one step closer toward practical deployment of privacy-preserving data hosting services in the context of Cloud Computing. To achieve our design goals on both system security and usability, we propose to bring together advances from both the crypto and IR communities to design the ranked searchable symmetric encryption (RSSE) scheme, in the spirit of an “as-strong-as-possible” security guarantee. Specifically, we explore the statistical measure approach from IR and text mining to embed weight information (i.e., relevance score) of each file during the establishment of the searchable index before outsourcing the encrypted file collection. As directly outsourcing relevance scores would leak a lot of sensitive frequency information and thus compromise keyword privacy, we then integrate a recent crypto primitive [14], order-preserving symmetric encryption (OPSE), and properly modify it to develop a one-to-many order-preserving mapping technique that protects this sensitive weight information while providing efficient ranked search functionality.
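The idea of a one-to-many order-preserving mapping can be illustrated with a toy sketch. This is not the OPSE-based construction of [14] and is not meant to be secure; it only assumes that each score owns a key-dependent, non-overlapping bucket of the output range and that a file identifier selects a point inside that bucket. The function names, the bucket layout, and the use of HMAC-SHA256 as a seed source are all assumptions made for illustration.

```python
import hashlib
import hmac
import random


def build_buckets(key: bytes, max_score: int, range_size: int) -> list[tuple[int, int]]:
    """Split [1, range_size] into max_score consecutive non-empty buckets.

    The cut points are derived from the key, so the layout is secret.
    """
    rng = random.Random(hmac.new(key, b"buckets", hashlib.sha256).digest())
    cuts = sorted(rng.sample(range(1, range_size), max_score - 1))
    bounds = [0] + cuts + [range_size]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(max_score)]


def map_score(key: bytes, buckets: list[tuple[int, int]], score: int, file_id: str) -> int:
    """Map a score (1..max_score) to a point in its bucket, chosen per file."""
    lo, hi = buckets[score - 1]
    seed = hmac.new(key, f"{score}|{file_id}".encode(), hashlib.sha256).digest()
    return random.Random(seed).randint(lo, hi)


def unmap_score(buckets: list[tuple[int, int]], value: int) -> int:
    """Recover the original score from a mapped value (key holder only)."""
    for score, (lo, hi) in enumerate(buckets, start=1):
        if lo <= value <= hi:
            return score
    raise ValueError("value outside the mapped range")


key = b"demo-key"
buckets = build_buckets(key, max_score=10, range_size=2**20)
low = map_score(key, buckets, 3, "fileA")
high = map_score(key, buckets, 7, "fileB")
print(low < high)  # True: order of the underlying scores is preserved
```

Because buckets are disjoint and ordered, a larger score always maps to a larger value, while two files with the same score may map to different values inside the same bucket, which is what hides the score frequency distribution.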

Our contributions can be summarized as follows:

1. For the first time, we define the problem of secure ranked keyword search over encrypted cloud data, and provide an effective protocol that fulfills the secure ranked search functionality with little relevance score information leakage against keyword privacy.

2. Thorough security analysis shows that our ranked searchable symmetric encryption scheme indeed enjoys an “as-strong-as-possible” security guarantee compared to previous searchable symmetric encryption (SSE) schemes.

3. We investigate the practical considerations and enhancements of our ranked search mechanism, including the efficient support of relevance score dynamics, the authentication of ranked search results, and the reversibility of our proposed one-to-many order-preserving mapping technique.

4. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed solution.
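The server-side ranking mentioned in the contributions can be sketched as follows, under simplifying assumptions: the index is a dictionary keyed by a keyword trapdoor (here an HMAC of the keyword), each posting carries a file identifier and an already order-preservingly mapped score, and the server simply picks the k largest mapped values. The names trapdoor, build_index, and server_top_k are hypothetical, and a real scheme would also protect the posting entries themselves.

```python
import hashlib
import heapq
import hmac


def trapdoor(key: bytes, keyword: str) -> str:
    """Keyed hash of the keyword; the server never sees the keyword itself."""
    return hmac.new(key, keyword.encode(), hashlib.sha256).hexdigest()


def build_index(key: bytes, postings: dict[str, list[tuple[str, int]]]) -> dict[str, list[tuple[str, int]]]:
    """Owner side: postings maps keyword -> [(file_id, mapped_score), ...]."""
    return {trapdoor(key, w): list(entries) for w, entries in postings.items()}


def server_top_k(index: dict[str, list[tuple[str, int]]], td: str, k: int) -> list[str]:
    """Server side: rank by mapped score without learning the true scores."""
    entries = index.get(td, [])
    return [fid for fid, _ in heapq.nlargest(k, entries, key=lambda e: e[1])]


key = b"demo-key"
index = build_index(key, {"cloud": [("f1", 120), ("f2", 987), ("f3", 455)]})
print(server_top_k(index, trapdoor(key, "cloud"), 2))  # ['f2', 'f3']
```

Since the mapping preserves order, sorting on mapped values yields the same ranking as sorting on the hidden relevance scores.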