[DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

PerfectDedup: Secure Data Deduplication. Pasquale PUZIO (SecludIT & EURECOM), Refik Molva (EURECOM), Melek Önen (EURECOM), Sergio Loureiro (SecludIT). 10th DPM International Workshop on Data Privacy Management, Vienna, Austria, September 21st 2015.

Page 1: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

PerfectDedup

Secure Data Deduplication

Pasquale PUZIO

SecludIT & EURECOM

Refik Molva (EURECOM), Melek Önen (EURECOM), Sergio Loureiro (SecludIT)

10th DPM International Workshop on Data Privacy Management

Vienna, Austria, September 21st 2015

Page 2: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Agenda

• Problem Statement

– Data Deduplication for Cloud Storage

– Convergent Encryption

• Our solution

– Data Popularity

– Perfect Hashing

– PerfectDedup: Secure Popularity Detection

– Security

– Performance Evaluation

Page 3: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Deduplication

• Storing duplicate data only once

• Cross-user + Client-side + Block-level
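The three properties above can be illustrated with a minimal sketch. It assumes SHA-256 block IDs, a fixed block size, and an in-memory store; `DedupStore`, `upload` and `BLOCK_SIZE` are hypothetical names chosen for this illustration and do not come from the slides.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small so the demo is easy to follow


class DedupStore:
    """Hypothetical cloud-side store that keeps each distinct block once."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}  # block ID -> block content

    def has(self, block_id: str) -> bool:
        return block_id in self.blocks

    def put(self, block_id: str, block: bytes) -> None:
        self.blocks[block_id] = block


def upload(store: DedupStore, data: bytes) -> tuple[list[str], int]:
    """Upload `data` block by block; return (list of block IDs, bytes sent)."""
    recipe: list[str] = []
    sent = 0
    for offset in range(0, len(data), BLOCK_SIZE):       # block-level
        block = data[offset:offset + BLOCK_SIZE]
        block_id = hashlib.sha256(block).hexdigest()
        if not store.has(block_id):                      # client-side check
            store.put(block_id, block)
            sent += len(block)
        recipe.append(block_id)
    return recipe, sent


store = DedupStore()                                     # shared: cross-user
_, sent_by_alice = upload(store, b"AAAABBBBCCCC")
_, sent_by_bob = upload(store, b"AAAABBBBDDDD")
print(sent_by_alice, sent_by_bob, len(store.blocks))     # 12 4 4
```

The second user only transfers the one block the store has not seen yet, which is where the storage and bandwidth savings come from.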

Page 4: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Deduplication vs Encryption

… but it does not work on encrypted data!

[Figure: the same data D = "Hello World" is encrypted once with K1 and once with K2; the two ciphertexts differ.]

Page 5: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Convergent Encryption

• Data encryption key derived from the data: K = hash(Data)

• Deterministic & Symmetric Encryption
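As a rough illustration of K = hash(Data), the sketch below contrasts per-user random keys with a convergent key. The cipher is a toy stand-in (XOR with a SHA-256 counter keystream) used only because it is deterministic and needs no third-party library; it is an assumption of this example, not the cipher used by the authors, and it should not be used to protect real data.

```python
import hashlib
import os


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy deterministic symmetric cipher (illustration only).

    XORs `data` with a keystream made of SHA-256(key || offset) blocks.
    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


def convergent_encrypt(data: bytes) -> tuple[bytes, bytes]:
    """Derive the key from the data itself: K = hash(Data)."""
    key = hashlib.sha256(data).digest()
    return key, keystream_xor(key, data)


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return keystream_xor(key, ciphertext)


data = b"Hello World"

# Two users with independent random keys: ciphertexts differ, no deduplication.
c_k1 = keystream_xor(os.urandom(32), data)
c_k2 = keystream_xor(os.urandom(32), data)

# Two users with convergent keys: ciphertexts are identical and deduplicate.
key_a, c_a = convergent_encrypt(data)
key_b, c_b = convergent_encrypt(data)

print(c_k1 == c_k2, c_a == c_b)
```

Because the key depends only on the content, any user holding the same data can derive it, and the storage provider sees identical ciphertexts without ever learning the key.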

[Figure: two copies of D = "Hello World" are each encrypted with H(D); both produce the same ciphertext.]

Douceur, John R., et al. "Reclaiming space from duplicate files in a serverless distributed file system." Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on. IEEE, 2002.

Page 6: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Convergent Encryption

MISSING INFORMATION

How to achieve safe Convergent Encryption in the Cloud?

Drew Perttula, Brian Warner, and Zooko Wilcox-O'Hearn, 2008-03-20, https://tahoe-lafs.org/hacktahoelafs/drew_perttula.html

Page 7: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Data Popularity

• Different protection based on data-segment popularity

• Popular data → Not confidential → To be deduplicated → Convergent Encryption

• Unpopular data → Confidential → To be protected → Semantically-Secure Encryption

Stanek, Jan, et al. "A secure data deduplication scheme for cloud storage." Financial Cryptography and Data Security. Springer Berlin Heidelberg, 2014. 99-118.

Page 8: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

How to securely detect popularity?

[Diagram: the CLIENT asks the CSP "Is block B popular?" and the CSP answers YES / NO.]

• Block B must not be disclosed if it is unpopular (sensitive)

Page 9: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

PHF-based Lookup

Belazzougui, Djamal, Fabiano C. Botelho, and Martin Dietzfelbinger. "Hash, displace, and compress." Algorithms-ESA 2009. Springer Berlin Heidelberg, 2009. 682-693.

Page 10: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

PerfectDedup

• Based on «Secure» Perfect Hashing

– One-wayness

• Popular block IDs → Collision-free hash function (PHF)

• BENEFITS:

– Efficient (linear) generation of a new PHF (outsourced to the Cloud)

– Compact representation of PHF

– Very efficient (constant) evaluation on a block ID
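To make the idea of a collision-free function over the popular block IDs concrete, here is a simplified hash-and-displace construction in the spirit of the Belazzougui et al. paper cited on the previous slide. It is a sketch under assumptions, not the CMPH implementation or the authors' code: it omits the compression step, uses SHA-256 as the seeded hash, assumes the keys are distinct, and the names `build_phf` and `eval_phf` are hypothetical.

```python
import hashlib


def _h(seed: int, key: bytes, mod: int) -> int:
    """Seeded hash of `key`, reduced to the range [0, mod)."""
    digest = hashlib.sha256(seed.to_bytes(4, "big") + key).digest()
    return int.from_bytes(digest[:8], "big") % mod


def build_phf(keys: list[bytes], load: float = 0.8) -> tuple[list[int], int]:
    """Return (displacements, table_size); `keys` must be distinct."""
    m = max(1, int(len(keys) / load) + 1)      # number of table slots
    r = max(1, len(keys) // 4)                 # number of buckets
    buckets: list[list[bytes]] = [[] for _ in range(r)]
    for key in keys:
        buckets[_h(0, key, r)].append(key)     # seed 0 selects the bucket

    occupied = [False] * m
    disp = [0] * r
    # Place the largest buckets first, while the table is still mostly empty.
    for b in sorted(range(r), key=lambda i: -len(buckets[i])):
        if not buckets[b]:
            continue
        d = 1
        while True:                            # search for a displacement
            slots = [_h(d, key, m) for key in buckets[b]]
            if len(set(slots)) == len(slots) and not any(occupied[s] for s in slots):
                break
            d += 1
        disp[b] = d
        for s in slots:
            occupied[s] = True
    return disp, m


def eval_phf(disp: list[int], m: int, key: bytes) -> int:
    """Two hash computations and one array read per evaluation."""
    return _h(disp[_h(0, key, len(disp))], key, m)


keys = [hashlib.sha256(b"popular block %d" % i).digest() for i in range(50)]
disp, m = build_phf(keys)
slots = {eval_phf(disp, m, k) for k in keys}
print(len(slots) == len(keys))                 # every popular ID has its own slot
```

The function is fully described by the displacement list and the table size, which is what makes it compact enough to ship to clients, while the search for displacements is the part that can be left to the Cloud.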

Page 11: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Security

[Diagram: the UNPOPULAR and POPULAR sets and the CSP. For a block in the POPULAR set, PHF(ID) = i, and the entry at index i is ID.]

Block is popular

1-to-1 mapping

No confidentiality issue

Page 12: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Security

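[Diagram: the UNPOPULAR and POPULAR sets and the CSP. For a block in the UNPOPULAR set, PHF(ID) = i, but the entry at index i is a different identifier ID’.]

Block is unpopular

Collisions are well-distributed

One-wayness property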
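The two security slides can be read as a lookup exchange in which the client reveals only the index i. The sketch below follows that reading, which is an interpretation of the diagrams. The `CSP` class, `is_popular`, and the naive seed-search PHF are hypothetical and only practical for a handful of IDs.

```python
import hashlib


def block_id(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()


def phf(seed: int, ident: bytes, m: int) -> int:
    """Seeded one-way hash reduced to a table index in [0, m)."""
    digest = hashlib.sha256(seed.to_bytes(4, "big") + ident).digest()
    return int.from_bytes(digest[:8], "big") % m


class CSP:
    """Hypothetical provider holding a hash table of popular block IDs."""

    def __init__(self, popular_ids: list[bytes]) -> None:
        self.m = len(popular_ids)
        self.seed = 0
        # Naive PHF generation: try seeds until no two popular IDs collide.
        while len({phf(self.seed, i, self.m) for i in popular_ids}) < self.m:
            self.seed += 1
        self.table: list[bytes] = [b""] * self.m
        for ident in popular_ids:
            self.table[phf(self.seed, ident, self.m)] = ident

    def lookup(self, index: int) -> bytes:
        """The CSP sees only an index, never the client's block ID."""
        return self.table[index]


def is_popular(csp: CSP, block: bytes) -> bool:
    ident = block_id(block)
    i = phf(csp.seed, ident, csp.m)        # evaluated locally by the client
    return csp.lookup(i) == ident          # compared locally by the client


popular_blocks = [b"popular-%d" % n for n in range(5)]
csp = CSP([block_id(b) for b in popular_blocks])

print(all(is_popular(csp, b) for b in popular_blocks))   # 1-to-1 mapping
print(is_popular(csp, b"private diary entry"))           # collides with some ID'
```

For an unpopular block the index still falls inside the table, so the provider cannot tell the query apart from one for the popular ID stored there, and the one-way hash keeps it from inverting i back to the block ID.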

Page 13: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

PerfectDedup

[Diagram: the CLIENT asks the CSP "Is block B popular?" and gets YES / NO. If NO, the CLIENT contacts the INDEX SERVICE, which answers "POPULARITY TRANSITION?" with YES / NO.]
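One way to picture the flow in the diagram is a small simulation in which the index service counts distinct uploaders per block and reports the moment a block becomes popular. This is a sketch under assumptions: the threshold value, the per-user counting rule, and the names `IndexService`, `report` and `handle_block` are hypothetical, and a plain set lookup stands in for the PHF-based popularity check.

```python
import hashlib


class IndexService:
    """Hypothetical index service tracking how many users hold each block."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.owners: dict[str, set[str]] = {}   # block ID -> user IDs

    def report(self, user: str, block_id: str) -> bool:
        """Record an upload; return True only on the popularity transition."""
        owners = self.owners.setdefault(block_id, set())
        before = len(owners)
        owners.add(user)
        return before < self.threshold <= len(owners)


def handle_block(user: str, block: bytes, popular_ids: set[str],
                 index: IndexService) -> str:
    """Decide how a client treats one block, following the slide's flow."""
    block_id = hashlib.sha256(block).hexdigest()
    if block_id in popular_ids:                  # "Is block B popular?" YES
        return "popular: convergent encryption, deduplicated"
    if index.report(user, block_id):             # "Popularity transition?" YES
        popular_ids.add(block_id)
        return "transition: block becomes popular"
    return "unpopular: semantically-secure encryption"


index = IndexService(threshold=3)
popular_ids: set[str] = set()
for user in ["alice", "bob", "carol", "dave"]:
    print(user, "->", handle_block(user, b"shared movie chunk", popular_ids, index))
```

With a threshold of 3, the first two uploads are treated as unpopular, the third triggers the transition, and the fourth takes the popular path.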

Page 14: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Prototype Implementation

[Diagram: prototype components CLIENT, INDEX SERVICE and CSP; the CMPH label appears on two of them.]

Page 15: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Performance Evaluation

[Chart: time (in seconds, axis from 0 to 10) per scenario. Scenarios: UNPOPULAR FILE, POPULARITY TRANSITION, POPULAR FILE. Legend: Client File Split, Client Convergent Encryption, Client Popularity Check, Client Symmetric Encryption, Idx Service Update, Cloud Generate PHF, Cloud Store Hash Table, Cloud Popularity Check, Cloud Upload Processing.]

Page 16: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Conclusions

• Popularity-based Deduplication

• Secure Perfect Hashing

• Secure & Lightweight for the client

• Costly tasks outsourced to the Cloud

• Low overhead

Page 17: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

Future Work

• Optimization of PHF generation

• Deployment in real production environments

Page 18: [DPM 2015] PerfectDedup - Secure Data Deduplication for Cloud Storage

THANK YOU

Questions?

Don't be shy!