On Compression of Data Encrypted with Block Ciphers

Demijan Klinc*, Carmit Hazay†, Ashish Jagmohan**, Hugo Krawczyk**, Tal Rabin** (* Georgia Institute of Technology, ** IBM T.J. Watson Research Labs, † Weizmann Institute and IDC)

Post on 24-Feb-2016

TRANSCRIPT

Page 1: On Compression of Data Encrypted with Block Ciphers

On Compression of Data Encrypted with Block Ciphers

Demijan Klinc*, Carmit Hazay†, Ashish Jagmohan**, Hugo Krawczyk**, Tal Rabin**

* Georgia Institute of Technology
** IBM T.J. Watson Research Labs
† Weizmann Institute and IDC

Page 2: On Compression of Data Encrypted with Block Ciphers

Traditional Model

Transmitting redundant data over an insecure and bandwidth-constrained channel
• Traditionally, data is first compressed and then encrypted

[Diagram: source X → compress → C(X) → encrypt with key k → E_k(C(X)); the compress and encrypt stages together form the encoder]

Page 3: On Compression of Data Encrypted with Block Ciphers

Traditional Model

What if the encryptor and the compressor are two entities with different goals?
• E.g., a storage provider wants to compress data to minimize storage space but does not have access to the key

Can we reverse the order of these steps?

Page 4: On Compression of Data Encrypted with Block Ciphers

Compression and Encryption in Reverse Order

[Diagram: source X → encrypt with key k → E_k(X) → compress → C(E_k(X)); the compressor does not know k]

Can we encrypt first and only then compress, without knowing the key?

Page 5: On Compression of Data Encrypted with Block Ciphers

Compression and Encryption in Reverse Order

For a fixed key, an encryption scheme is a bijection, so entropy is preserved
• It follows that it is theoretically possible to compress the source to the same level as before encryption

In practice, encrypted data appears to be random
• Conventional compression techniques do not yield desirable results
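The practical obstacle can be seen in a small experiment. The sketch below is an illustration only and is not from the talk: it assumes a redundant four-symbol source, uses Python's `random` module as a stand-in for a one-time pad key (not cryptographically secure), and uses `zlib` as the conventional compressor.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Redundant source: 4096 bytes drawn from a 4-symbol alphabet (assumed for illustration).
plaintext = bytes(rng.choice(b"ABCD") for _ in range(4096))

# Stand-in for a one-time pad: XOR with a pseudo-random key of the same length.
# random.Random is NOT a secure key source; it only makes the demo repeatable.
key = bytes(rng.getrandbits(8) for _ in range(len(plaintext)))
ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))

plain_size = len(zlib.compress(plaintext, 9))
cipher_size = len(zlib.compress(ciphertext, 9))

# The plaintext shrinks substantially; the ciphertext does not shrink at all.
print(len(plaintext), plain_size, cipher_size)
```

The ciphertext carries exactly the same information as the plaintext for a fixed key, yet a compressor that ignores the key sees no exploitable structure.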

Page 6: On Compression of Data Encrypted with Block Ciphers

Compression and Encryption in Reverse Order

Fully homomorphic encryption shows that one can compress optimally without decrypting
• Simply run the compression algorithm on the plaintext

Fully homomorphic encryption supports addition and multiplication:
E(m1), E(m2) → E(m1 + m2)
E(m1), E(m2) → E(m1 ∙ m2)

Stated differently:
C, E(m) → E(C(m))

Page 7: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 8: On Compression of Data Encrypted with Block Ciphers

Private Key Encryption

Triple of algorithms: (Gen, Enc, Dec)
• The same key is used for encryption and decryption

Security: CPA security (informally)
• It should be infeasible to distinguish an encryption of m from an encryption of m’

Page 9: On Compression of Data Encrypted with Block Ciphers

Private Key Encryption

Two categories:
• Stream ciphers: plaintext is encrypted one symbol at a time, typically by summing it with a key (XOR operation for binary alphabets), e.g., the one-time pad
• Block ciphers: encryption is accomplished by means of nonlinear mappings on input blocks of fixed length, e.g., AES, DES

Page 10: On Compression of Data Encrypted with Block Ciphers

Binary Symmetric Channel

[Diagram: input X ∈ {0,1}, output Y ∈ {0,1}; each bit passes unchanged with probability 1−p and is flipped with probability p]

Communication model where each sent bit is flipped with probability p

Entropy: H(p) = −(p log p + (1−p) log(1−p))

Pr(Y = 0 | X = 0) = 1−p
Pr(Y = 0 | X = 1) = p
Pr(Y = 1 | X = 0) = p
Pr(Y = 1 | X = 1) = 1−p
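The binary entropy function is easy to evaluate numerically. The helper below is a minimal sketch; the function name `binary_entropy` and the choice of base-2 logarithms (entropy in bits) are assumptions, since the slide does not state the base.

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) in bits for a Bernoulli(p) source; H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# A fair coin carries one bit per symbol; a heavily biased one carries far less.
for p in (0.5, 0.11, 0.026):
    print(p, round(binary_entropy(p), 4))
```

The function is symmetric around p = 0.5, so a channel that flips bits with probability p and one that flips with probability 1−p are equally informative.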

Page 11: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 12: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

[Diagram: source X → compress → C(X) → decompress → X; the side information Y is supplied to the decompressor]

X, Y: random variables over a finite alphabet with joint probability distribution P_XY

Goal: losslessly compress X with Y known only to the decoder

Page 13: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

For sufficiently large block length, this can be done at rates arbitrarily close to H(X|Y) [SlepianWolf73]
• The theorem is non-constructive
• Practical coding schemes use constructions based on good linear error-correcting codes, e.g., LDPC codes [RichardsonUrbanke08]

Page 14: On Compression of Data Encrypted with Block Ciphers

Linear Error Correcting Codes

Error correcting codes:
• Communication is over a noisy channel
• Add redundancy to the source to correct errors

A linear code of length m and dimension r is an r-dimensional linear subspace of the vector space (F_2)^m
• Encoding: using the generator matrix
• Decoding: using the parity-check matrix

Page 15: On Compression of Data Encrypted with Block Ciphers

Linear Error Correcting Codes

Minimum distance:
• The weight of the lowest-weight nonzero codeword

To correct i errors, the minimum distance must be at least 2i+1

Page 16: On Compression of Data Encrypted with Block Ciphers

Linear Error Correcting Codes

Cosets:

Suppose that C is an [m, r] linear code over F_2 and that a is any vector in (F_2)^m
• Then the set a + C = {a + x | x ∈ C} is called a coset of C
• Every vector of (F_2)^m is in some coset of C
• Every coset contains exactly 2^r vectors
• Two cosets are either disjoint or equal

Page 17: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

Example:

Assume Y is known to both the encoder and the decoder, and Ham(X,Y) ≤ 1

[Diagram: source X → compress → C(X) → decompress → X; Y is available as side information]

Page 18: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

Let X = 010, then Y ∈ {010, 011, 000, 110}

Goal: encode X ⊕ Y using fewer than 3 bits

How? Let e = X ⊕ Y, then e ∈ {000, 001, 010, 100}; the encoder sends the index of the coset in which e occurs

Page 19: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

Let C = {000, 111} be a linear code with distance 3 that can correct one error

The space is partitioned into 4 cosets:
• Coset 1 = {000, 111}
• Coset 2 = {001, 110}
• Coset 3 = {010, 101}
• Coset 4 = {100, 011}

Recall: e ∈ {000, 001, 010, 100}, and these are exactly the coset leaders (one per coset)

Each index requires 2 bits

Decoding: output Y ⊕ e’, where e’ is the leader of the indexed coset
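The coset partition and the two-bit encoding can be reproduced in a few lines. This is a minimal sketch of the example, assuming the code is the length-3 repetition code {000, 111} (as the coset list indicates); the helper names `xor`, `cosets`, and `index_of` are made up for illustration.

```python
from itertools import product

def xor(a, b):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))

def cosets(code, m):
    """Partition (F_2)^m into cosets of a linear code, in order of first appearance."""
    seen, parts = set(), []
    for a in product((0, 1), repeat=m):
        if a in seen:
            continue
        coset = sorted(xor(a, c) for c in code)
        seen.update(coset)
        parts.append(coset)
    return parts

code = [(0, 0, 0), (1, 1, 1)]
parts = cosets(code, 3)
# The coset leader is the minimum-weight word in each coset.
leaders = [min(coset, key=sum) for coset in parts]

def index_of(v):
    """Index (0..3) of the coset containing v; two bits suffice."""
    return next(i for i, coset in enumerate(parts) if v in coset)

# Encoder and decoder both know Y, so the encoder can form e = X xor Y.
x, y = (0, 1, 0), (0, 1, 1)
idx = index_of(xor(x, y))
recovered = xor(y, leaders[idx])  # decoder: Y xor leader
print(parts, leaders, idx, recovered)
```

Because every possible error pattern e of weight at most one is itself a coset leader, the index identifies e uniquely and the decoder recovers X exactly.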

Page 20: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

[Diagram: source X → compress → C(X) → decompress → X; Y is supplied only to the decompressor]

Without Y the encoder cannot compute e!
• e = X ⊕ Y

Page 21: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

Still possible:
• Encode the coset in which X occurs
• Coset 1 = {000, 111}
• Coset 2 = {001, 110}
• Coset 3 = {010, 101}
• Coset 4 = {100, 011}

Each index requires 2 bits

Decoding: output the element e’ of the indexed coset whose Hamming distance to Y is smallest

Slepian-Wolf codes over finite block lengths have nonzero error, which implies that the decoder will sometimes fail
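The decoder-only variant can be sketched the same way. The code below hard-codes the four cosets from the slide; the function names `compress` and `decompress` are illustrative, not taken from the talk. The encoder never sees Y, and the decoder picks the coset member nearest to Y.

```python
# Cosets of the repetition code {000, 111}, in the order given on the slide.
COSETS = [
    [(0, 0, 0), (1, 1, 1)],
    [(0, 0, 1), (1, 1, 0)],
    [(0, 1, 0), (1, 0, 1)],
    [(1, 0, 0), (0, 1, 1)],
]

def hamming(a, b):
    """Number of positions in which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def compress(x):
    """Encoder: 2-bit index of the coset containing x. Y is not needed."""
    return next(i for i, coset in enumerate(COSETS) if x in coset)

def decompress(index, y):
    """Decoder: member of the indexed coset closest to the side information y."""
    return min(COSETS[index], key=lambda v: hamming(v, y))

x, y = (0, 1, 0), (1, 1, 0)             # Ham(x, y) = 1
print(decompress(compress(x), y) == x)  # decoding succeeds

x_bad, y_bad = (0, 0, 0), (0, 1, 1)     # Ham = 2 violates the assumption
print(decompress(compress(x_bad), y_bad) == x_bad)  # decoding fails
```

The two members of each coset differ in all three positions, so a Y within distance one of X is always strictly closer to X than to the other member. When the correlation assumption is violated, as in the second call, the decoder returns the wrong word, which is the finite-block-length failure the slide mentions.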

Page 22: On Compression of Data Encrypted with Block Ciphers

Source Coding with Side Information

In practice:
1. Fix p and determine the compression rate of a Slepian-Wolf code that satisfies the target error
2. Pick a Slepian-Wolf code and determine the maximum p for which the target error is satisfied

Need to know the source statistics!

Page 23: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 24: On Compression of Data Encrypted with Block Ciphers

Compressing Stream Ciphers

This problem can be formulated as a Slepian-Wolf coding problem [JohnsonWagnerRamchandran04]

[Diagram: source X → encrypt with key k → E_k(X) → compress → C(E_k(X))]

• The ciphertext is cast as the source
• The shared key k is cast as the decoder-only side information

Page 25: On Compression of Data Encrypted with Block Ciphers

Compressing Stream Ciphers

• Compression is achievable due to the correlation between the key K and the ciphertext C = X ⊕ K
• The joint distribution of the source and the side information can be determined from the statistics of the source

[Diagram: source X → encrypt with key k → E_k(X) → compress → C(E_k(X))]

Page 26: On Compression of Data Encrypted with Block Ciphers

Compressing Stream Ciphers

[Diagram: C(E_k(X)) and key k → decoder (joint decryption and decompression) → X]

The decoder knows k and the source statistics

Compression rate H(E_k(X) | K) = H(X ⊕ K | K) = H(X) is asymptotically achievable

Page 27: On Compression of Data Encrypted with Block Ciphers

Efficiency

Encoding: finding the coset of E_k(X) can be done by multiplying E_k(X) with the parity-check matrix
• I.e., E_k(X)∙H^T is the syndrome of E_k(X)

Decoding: exhaustive search through the coset of E_k(X)
• This is improved using LDPC codes, for which decoding is polynomial in the block length
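Syndrome-based compression of a one-time-pad ciphertext can be sketched end to end. The example below is an illustration under stated assumptions, not the scheme evaluated in the talk: it uses the [7,4] Hamming code instead of an LDPC code, assumes each plaintext block contains at most one 1, and decodes by the exhaustive coset search described above.

```python
from itertools import product

# Parity-check matrix of the [7,4] Hamming code: column j is j written in binary.
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def xor(a, b):
    return tuple(x ^ y for x, y in zip(a, b))

def syndrome(v):
    """v * H^T over F_2: three parity bits identifying the coset of v."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def compress(ciphertext):
    """Key-less compressor: 7 ciphertext bits become a 3-bit syndrome."""
    return syndrome(ciphertext)

def decode(s, key):
    """Joint decompression and decryption with the key as side information."""
    coset = [v for v in product((0, 1), repeat=7) if syndrome(v) == s]
    # The ciphertext is the coset member closest to the key.
    c = min(coset, key=lambda v: sum(a != b for a, b in zip(v, key)))
    return xor(c, key)

plaintext = (0, 0, 0, 0, 1, 0, 0)  # sparse block (weight <= 1), an assumption
key = (1, 0, 1, 1, 0, 0, 1)        # one-time pad key
ciphertext = xor(plaintext, key)
print(decode(compress(ciphertext), key) == plaintext)
```

The search works because distinct members of a coset differ by a nonzero codeword of weight at least three, while the true ciphertext differs from the key in at most one position. The exhaustive scan over all 2^7 words is exactly the cost that LDPC decoding avoids at realistic block lengths.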

Page 28: On Compression of Data Encrypted with Block Ciphers

Security

Compression that operates on top of a one-time pad does not compromise the security of the encryption scheme
• The compressor does not know K

Page 29: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 30: On Compression of Data Encrypted with Block Ciphers

Compressing Block Ciphers

Block ciphers are widely used in practice

The correlation between the key and the ciphertext is more complex
• The previous approach is not directly applicable

Can data encrypted with block ciphers be compressed without access to the key?

Page 31: On Compression of Data Encrypted with Block Ciphers

Electronic Code Book (ECB) Mode

The simplest mode of operation, where each block is evaluated separately

Compression in this mode is theoretically possible, but is it also practical?

[Diagram: each plaintext block X_1, X_2, …, X_n is fed independently to the block cipher under key k, producing E_k(X_1), E_k(X_2), …, E_k(X_n)]

The compression schemes that we present rely on the specifics of chaining operations

Page 32: On Compression of Data Encrypted with Block Ciphers

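Cipher Block Chaining (CBC) Mode

[Diagram: each plaintext block X_i is XORed with the previous ciphertext block (the IV for i = 1) to form the cipher input X̃_i; the block cipher encrypts X̃_i under key k to produce the ciphertext block E_k(X̃_i)]

The correlation between E_k(X̃_i) and X̃_(i+1) is easier to characterize and can be exploited for compression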

Page 33: On Compression of Data Encrypted with Block Ciphers

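Compressing Block Ciphers

[Diagram: IV, E_k(X̃_1), …, E_k(X̃_n) → compressor → C(IV), C(E_k(X̃_1)), …, C(E_k(X̃_(n-1))), E_k(X̃_n)]

The last block is left uncompressed, while the IV is compressed

Here X̃_i denotes the input to the block cipher for block i. Recalling that X̃_(i+1) = E_k(X̃_i) ⊕ X_(i+1), E_k(X̃_i) is cast as the source and X̃_(i+1) is cast as the side information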

Page 34: On Compression of Data Encrypted with Block Ciphers

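Decoding

[Diagram: decoding proceeds from the last block backwards, with X̃_i denoting the cipher input for block i.
1. Decrypt the uncompressed last block E_k(X̃_n) with key K to obtain X̃_n.
2. The Slepian-Wolf decoder takes C(E_k(X̃_(n-1))) with X̃_n as side information and outputs E_k(X̃_(n-1)); XORing it with X̃_n yields X_n.
3. Decrypt E_k(X̃_(n-1)) with key K to obtain X̃_(n-1), and repeat for the preceding block.]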
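The backward decoding chain can be exercised on a toy example. Everything concrete below is an assumption made for illustration: 7-bit blocks, a keyed random permutation standing in for the block cipher (not secure), plaintext blocks with at most one 1, and the [7,4] Hamming code with exhaustive coset search in place of an LDPC-based Slepian-Wolf code.

```python
import random

M = 7  # toy block length in bits (real block ciphers use 64 or 128)

def syndrome(v):
    """3-bit Hamming-code syndrome: XOR of the 1-based positions of set bits."""
    s = 0
    for j in range(1, M + 1):
        if (v >> (j - 1)) & 1:
            s ^= j
    return s

def make_cipher(key):
    """Toy block cipher: a keyed random permutation of {0,...,2^M - 1} and its inverse."""
    table = list(range(2 ** M))
    random.Random(key).shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return table, inverse

def cbc_encrypt(blocks, enc, iv):
    out, prev = [], iv
    for x in blocks:
        prev = enc[x ^ prev]  # encrypt X_i XOR previous ciphertext block
        out.append(prev)
    return out

def compress(iv, cts):
    """Key-less compressor: syndromes of IV and all but the last block; last block kept raw."""
    return [syndrome(b) for b in [iv] + cts[:-1]], cts[-1]

def sw_decode(s, side):
    """Coset member with syndrome s closest in Hamming distance to the side information."""
    coset = [v for v in range(2 ** M) if syndrome(v) == s]
    return min(coset, key=lambda v: bin(v ^ side).count("1"))

def decompress(syndromes, last, dec):
    plain, ct = [], last
    for s in reversed(syndromes):
        x_tilde = dec[ct]             # cipher input, recovered with the key
        prev = sw_decode(s, x_tilde)  # previous ciphertext block (or the IV)
        plain.append(x_tilde ^ prev)  # plaintext block
        ct = prev
    return plain[::-1]

rng = random.Random(1)
enc, dec = make_cipher(key=2024)
blocks = [rng.choice([0] + [1 << j for j in range(M)]) for _ in range(5)]
iv = rng.randrange(2 ** M)
syndromes, last = compress(iv, cbc_encrypt(blocks, enc, iv))
print(decompress(syndromes, last, dec) == blocks)
```

Here six 7-bit values (the IV and five ciphertext blocks) are reduced to five 3-bit syndromes plus one raw block. Each decoding step needs the block after it to be fully recovered first, which is why the last block is the one left uncompressed.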

Page 35: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 36: On Compression of Data Encrypted with Block Ciphers

Compression Factor

Let {C_(m,R), D_(m,R)} denote an order-m Slepian-Wolf code with compression rate R
• Compressor C_(m,R): {0,1}^m → {0,1}^(mR)
• Decompressor D_(m,R): {0,1}^(mR) × {0,1}^m → {0,1}^m

Compression factor: (n+1)∙m / (n∙m∙R + m), which approaches 1/R for large n
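The compression factor is simple to tabulate. The helper below is a sketch that reads the formula as input bits (the IV plus n ciphertext blocks of m bits each) divided by output bits (n blocks compressed to m∙R bits plus one uncompressed block); the function name is an assumption.

```python
def compression_factor(n: int, m: int, rate: float) -> float:
    """(n+1)*m input bits over n*m*rate + m output bits."""
    return (n + 1) * m / (n * m * rate + m)

# With rate R = 0.5 the factor climbs toward 1/R = 2 as the message grows.
for n in (1, 10, 100, 10_000):
    print(n, round(compression_factor(n, 128, 0.5), 4))
```

The block length m cancels out of the ratio, so for a fixed rate the factor depends only on the number of blocks. The overhead of the single uncompressed block matters only for short messages.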

Page 37: On Compression of Data Encrypted with Block Ciphers

Compression Results

Irregular LDPC codes were used in our performance evaluation

Table: Attainable compression rates for m = 128 bits

Source Entropy   Compression Rate   Target Error   p
0.1739           0.50               10^-3          0.026
0.1301           0.50               10^-4          0.018
0.3584           0.75               10^-3          0.068
0.3032           0.75               10^-4          0.054

Page 38: On Compression of Data Encrypted with Block Ciphers

Compression Results

Irregular LDPC codes were used in our performance evaluation

Table: Attainable compression rates for m = 1024 bits

Source Entropy   Compression Rate   Target Error   p
0.3195           0.50               10^-3          0.058
0.2778           0.50               10^-4          0.048
0.5710           0.75               10^-3          0.134
0.5464           0.75               10^-4          0.126

Page 39: On Compression of Data Encrypted with Block Ciphers

Outline

• Preliminaries
• Source Coding with Side Information
• Compressing Stream Ciphers
• Compressing Block Ciphers
• Simulation Results
• Impossibility Result

Page 40: On Compression of Data Encrypted with Block Ciphers

Recall: ECB Mode

[Diagram: each message block m_1, m_2, …, m_n is fed independently to the block cipher under key K, producing E_k(m_1), E_k(m_2), …, E_k(m_n)]

Page 41: On Compression of Data Encrypted with Block Ciphers

Notable Observations

Exhaustive strategies are infeasible in most cases
• Except for very low-entropy plaintext distributions or very low compression ratios
• They work by truncating the ciphertext

For example, consider a plaintext distribution consisting of 1,000 uniformly distributed 128-bit values
• One can compress the output of a 128-bit block cipher by truncating the 128-bit ciphertext to 40 bits

Can we construct a better strategy?
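A rough calculation shows why 40 bits are enough in this example. The sketch below is a back-of-the-envelope estimate, not a figure from the talk: it assumes the truncated ciphertexts behave like independent uniform 40-bit strings and applies a union (birthday) bound to the chance that two of the 1,000 candidates collide.

```python
def collision_bound(num_values: int, kept_bits: int) -> float:
    """Union bound on the probability that any two truncated ciphertexts coincide."""
    pairs = num_values * (num_values - 1) // 2
    return pairs / 2 ** kept_bits

bound = collision_bound(1000, 40)
ratio = 40 / 128  # fraction of the ciphertext that is kept
print(bound, ratio)
```

Under that assumption, a decoder holding the key can encrypt all 1,000 candidate plaintexts, truncate each result, and match against the stored 40 bits with a very small chance of ambiguity. The cost is one encryption per candidate, which is what makes the strategy exhaustive and impractical for larger plaintext sets.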

Page 42: On Compression of Data Encrypted with Block Ciphers

Impossibility Result

There does not exist a generic (C,D) for block ciphers unless (C,D) is
• either exhaustive, or
• computationally infeasible

There does not exist an efficient (C,D) for ECB mode!

Page 43: On Compression of Data Encrypted with Block Ciphers

The Public-Key Setting

Hybrid encryption
• Use a public-key scheme to encrypt a symmetric key, and then encrypt the data with this key

ElGamal encryption
• A similar technique applies when XOR is used

Page 44: On Compression of Data Encrypted with Block Ciphers

Concluding Remarks

Data encrypted with block ciphers is practically compressible when chaining modes are employed

Notable compression factors were demonstrated with binary memoryless sources

Short block sizes limit the performance, but that could change in the future

Generic compression is impossible

Page 45: On Compression of Data Encrypted with Block Ciphers

Future Work

An interesting question is whether compression is possible without any prior knowledge of the data
• Can compression be achieved using algorithms that do not rely on the source statistics, i.e., universal algorithms?

The error:
• Can we consider a less restrictive setting where the errors are not independent?

Page 46: On Compression of Data Encrypted with Block Ciphers

Thank You!