1 undetectable stegosystem based on noisy channels v. korzhik, g. morales luna, k. loban singapor,...

1

Undetectable Stegosystem Based on Noisy Channels

V. Korzhik, G. Morales Luna, K. Loban

Singapor, NTU, 2010

2

1. Introduction

Steganography (SG) is the information hiding technique that embeds the hidden

information into an innocent cover message (CM) under the conditions that the CM is not

corrupted significantly and that the presence of the additional information into the CM may

not be detected.

Basic principle

of undetectability:

The CM and the SG signal have to be indistinguishable even with

the use of the best statistical methods.

3

Designer’s problem: He (or She) should know the full statistics of the CM, but it is a

rather very hard to study completely the probability distribution of

such CM as video or audio signals.

In order to be successful within this risky situation (which is, indeed, a “bottleneck” of

any SG system) we propose to move into another concept of SG system setting, namely to

SG based on noisy channels.

This setting can be justified if there exists in a natural manner a noisy channel and the

attacker (“stegobreaker”) is able to receive the stegosignal just over this channel and

nothing else.

4

Attacker’s problem: To distinguish statistically the CM after its passing over the noisy

channel and SG signal passing over the same noisy channel.

Reducing of the steganalysis problem: Recognition of channel noise and the sum of the

channel noise and the embedded signal.

Since the channel noise distribution is, as a rule, known much better than the CM

distribution, the problem to design SG systems which are resistant to their detection is

simplified.

New assumption: CM can be publicized; such assumption is impossible for

conventional SG, because otherwise they can be detected trivially – if

SG ≠ CM, then something has been embedded.

5

Model of SG design for a Gaussian noisy channel:

1,2,..., ( ) ( ) ( 1) ( ),bw wn N C n C n n

1

1

where ( ) are the samples of CM,

( ) is a zero mean Gaussian pseudorandom i.i.d. reference sequence with variance 1,

is the lenght of both sequences,

is the depth of embedding

N

n

N

n

w

C C n

n

N

After a passing of the watermarked signal through the Gaussian channel we get:1,2,..., ( ) ( ) ( ),w wn N C n C n n

2

1where ( ) is a zero mean Gaussian i.i.d. noise sequence with variance

N

nn

(1)

(2)

6

More early results [1] for the case when an attacker knows CM:

110.72 ln 1 1 w

w

D N

Where D is the relative entropy (introduced in [2] as a measure of SG system security, 2 2

w w

Approximation of D for large :w

It is well known from Information Theory [2] the following inequality for any hypothesis testing:

1log (1 )log ,

1fa fa

fa famm

P PP P D

PP

20.36

w

ND

1 2

where is the probability of signal false alarm,

is the probability of signal missing system is secure (or ideal) if 0 and then that is equivalenty to random

guessing of SG system p

fa

m

fa m

P SG

PSG D P P

resence or absence.If we let then by (5) we get

fa mPP P

2 1 ln1

PP DP

From eq. (4) it follows that0.6w

N

D

(4)

(3)

(5)

(6)

(7)

7

1

0 if 0( ( ) ( )) ( )

1 otherwise

N

wn

C n C n n b

The bit error probability [1]:

( ( ) ):Legal informed decoder C n is known

exp

2 2 2ew w

N NP Q

2

21where : ( )

2

t

x

Q x Q x e dt

The general relation that comprises both security and reliability [1]:

1 4 1 2

1 2

( ) ( )1.29 exp 0.83e

ND NDP Q

mm

0where is the number of secure embedded bits into samples.m N N N

(8)

(10)

(9)

8

Optimistic draw:

Our proposal: To use so called spread-time stegosystem (STS) – see next Section.

For any security level D and the number of secure bits m there can

be chosen an appropriated N such that the SG system provides any

given reliability .eP

Pessimistic draw: The more is N for given D the less (see eq.(7)) should be “signal-to-

noise” ratio and this results in a problem for practical

implementation (especially for digital processing).

1 2 2w w

Example.Let D=0.1 (that provides an acceptable level of security) and let m=10 be

the number of the secure embedded bits

If we choose then by (10) which is acceptable. But

. by (7) and if the CM signal-to-noise ratio .

where then which is indeed

unacceptable.

42.5 10eP 510 ,N

600w

2

1var ( ) ,

N

c nC n

24

26 10

w

22

210c

9

2. Description of STS and its performance evolution.

2.1. Uncoded system.

0

0

Pr[ ( ) ( ) ( 1) ( )]

Pr[ ( ) ( )] 1 , 1,2,...

bw w

w

C n C n n P

C n C n P n N

Practical implementation: Let be an increasing sequence of indexes, generated by a secret stegokey K, determining the samples in which the WM’s are to be embedded (see Fig.1.) Then for a large value of N we may assume that

Fig.1. STS system with embedding into pseudorandom samples.

0( Here 8, 38, 8 38).sN N P

In order to embed one secret bit b are used consecutive chosen samples.

Hence the total number of secret bits embedded into samples is

0N

SN 0t SN N N

1

SN

m mn

SN N

0 .SP N N

(11)

10

Legal user knows the stegokey K, hence he knows exactly the samples with embedding and

can execute the decision rule (8).

The error probability can be found by (9).

Optimal STS detecting by an attacker:

The two hypothesis have to be tested:

20

20

1 20

1

2 2 2

: [ (0, ) and it is an i.i.d.]

with the probability :[ (0, ) and it is an i.i.d.]:

with the probability 1 :[ (0, ) and it is an i.i.d.]

where ( ) ( ) ( ) ,

s

N

w n

s w

H

PH

P

n C n C n

(12)

11

Let us assume that (for a good security guarantee), then we get from (14) after simple normalization.

The optimal hypothesis testing based on maximum livelihood ratio [3] is

1 0 1 1 0 0

2 221

1 0 0 02 2 210

( ) ; ( )

( )where ( ) exp ( ) (1 )

( ) 2

Nw

i s s

H H

P HP n P

P H

2 22

1 0 0 02 2 21

( ) log exp ( ) (1 )2

Nw

Ln s s

P n P

By changing the threshold in (13) it is possible to pass to the logarithmic livelihood ratio:

2 2w

21 0 0 02 2

1

1 1( ) log exp ( ) (1 )

2

N

Ln w

P n PN

(13)

(14)

(15)

12

The series expansion of up to its linear term and next the series expansion of . up to its linear term renders the following decision rule:

exp( )x xexp( )x x

1 0

2

1

;

1where , is some new threshold

N

nn

H H

N

Let us estimate (using the Central Limit Theorem) the missing and false alarm probabilities, and , respectively:m faP P

21

2200

20

2200

2

( )1exp

22

( )1exp

22

where [ ], var( ) for 0,1

m

fa

j j j j

x mP dx

x mP dx

m H H j

(16)

(17)

(18)

13

Let us select (for simplicity) the threshold in such away to be

Then after simple transforms of eq’s (17)-(18) we get

= = ,m faP P P

1 0

0

2 2 2 2 40 1 0 0

2

where , , 2w

m mP Q

m m P N

Substituting (20) into (19) we obtain:

0

2 2 2 2S

w w

N P NP Q Q

N

0 0

3

If , then 1 2 and an undetectable stegosystem results.

In Table 1 we show the calculation results for some values of parameters: , , ,

providing 10 , 0.4 given the same values of and

S

S

e w

N N P

N N m P

P P N .

(20)

(21)

(19)

14

20 210 1431 6 0.1431

50 496 3578 7 0.3578

100 973 7156 7 0.7156

20 210 4526 21 0.04526

50 496 11310 22 0.1131

100 973 22630 23 0.2263

20 210 14310 68 0.01431

50 496 35780 72 0.03578

100 973 71560 73 0.07156

20 210 45260 215 0.004526

50 496 113100 228 0.01131

100 973 226300 232 0.02263

3Table 1. Sets of parameters for STS providing 10 , 0.4

given different values and .e

w

P P

N

We can see that for large enough N it is possible to provide a good undetectability ( ) and reliability ( ) of STS and embed up to 232 secure bits.

0 0.4P 310eP

N w0N SN m 0 SP N N

410

510

610

710

15

2.2. Coded system. We restrict our attention to binary linear systematic 0( , , ) codes.N k d

Embedding:

Pr[ ( ) ( ) ( 1) ( )],

where is the sample index with embedding.

ijb

w j j w j

j

C n C n n

n

Decoding:

0

1 2 1

0

arg max ( ) ( ) ( 1) ( )

where is the -th bit in the -th codeword of length with a positive integer value.

Total number of secure embedded bits

in

k

Nb

wi n

in S

i C n C n n

b n i N N l l

0

0

is .

The block-error probability based on union bound [5]:

2 1 exp ln 22 2 2

where the code rate.

kbe

w w

m k l

d dP Q RN

R k N

(24)

(23)

(22)

16

1Since signal-to-noise ratio is typically small, we will restrict our consideration only to two

classes of linear error correcting codes:w

The simplex codes (SC):1

0 02 1, , 2 , , where is some integerN k d R N

The Reed-Muller codes (RMC): 01

2 , , 2 , where 3 and -is integer, the so

called order of the RMC.

rr

i

vN k d r

i

Example: 710 , 0.4, 20, 45257.w SN P N

3

0

9

the optimal parameters: 10, 10, 10 , 442

the optimal parameters: 14, 2, 105, 10 , 290.

Sbe

be

Nk P m k

N

r k P m

SC :

RMC :

17

3. Optimal SG system detecting rule.

Let us verify if the use of the optimal decision rule (15) can provide an appreciable

improvement of STS detecting in comparison with the suboptimal decision rule (16)?

Since N is sufficiently large, we can apply the Central Limit Theorem [4] to the sum in (15).

Similar to the proof of (19) we get for such a choice of threshold , which provides = = :m faP P P

1 0

02

m mP Q

2

0 02

1

22 2

2 20 0 0 0 0 0 02 2

( )log exp (1 )

2

1 1( ) ( )Var log exp (1 ) log exp (1 )

2 2

N

j jw n

jw w

nm E P P H

n nP P H E P P H m

N N

where random values have the probability distributions by (12)

where, for 0,1:j

( )n

18

0 1 0Since it is very hard to find analytically the values , and , we estimate them by the simulation:m m

We can see that the use of the optimal decision rule does not break undetectability of STS.

0.4014610.0002283970.0001577970.0001576860.07156100

0.4017110.000231440.0001585980.0001584790.0357850

0.4017520.0002431870.0001624350.0001622880.0143120

5

0.4015480.0002283970.0001578080.0001576860.07156100

0.4018620.0002314400.0001585850.0001584790.0357850

0.4018350.0002433410.0001623930.0001622880.0143120

1

0.4017450.0007184830.0004977950.0004966720.2263100

0.4018060.0007277690.0005002770.0004990620.113150

0.4017720.0007670010.0005138540.0005126180.0452620

5

0.4017370.0007184830.0004978300.0004966720.2263100

0.4018090.0007297120.0005014490.0005003320.113150

0.4017410.0007670010.0005137370.0005126180.0452620

1

0.4017260.002254450.001575830.001564620.7156100

0.4017450.002290170.001588210.001576750.357850

0.4017540.002403980.001625900.001614140.143120

5

0.4017370.002254450.001575850.001564620.7156100

0.4017590.002290170.001588010.001576740.357850

0.4017530.002403980.001626670.001614140.143120

1

N2

0 SP N Nw1m 0 1 0

02

m mP Q

0m

410

510

610

19

10 dBc

4. Simulation of STS for audio cover messages.

We use audio music file with duration about 29 sec in format wav where the sampling

frequency is 44.1 kHz.

The CM signal-to-noise ratio , whereas watermark-to-noise ratio (WNR)

The embedding rule was taken by (11), where .

In Fig. 2 the wave forms of original audio signal, audio signal after passing over a noisy channel

and after secret message embedding at the same time interval are presented.

One can see that noise corrupts slightly the audio signal and this fact can also be appreciated by

human ear, whilist, at the same time, the embedding procedure is not observable by human ear.

1 20 dBw

0 0.1P

20

Fig.2. The waveforms of audio signal (a), audio signal after its passing over noisy channel with

CM signal-to-noise ratio (b), and after embedding by STS algorithm with (c)10c dB 20WNR dB

21

In Fig.3 the waveforms of channel noise are shown, as well as this noise after embedding.

Fig.3. The waveform of channel noise (a) and

the same channel noise after embedding (b).

0

In Table 3 we present the results of simulation for the error probability versus

the parameters and . is the theorical error probability.e

w e

P

N P

20 210 0.001

50 496 0.001

100 973 0.001

w 0N ePeP

45 1046 10

45.5 10

22

5. Conclusion.

1. Some modification of the stegosystem based on noisy channel called spread-time

stegosystem (STS) has been proposed.

2. Both STS security and reliability can be provided by an appropriate selection of the system

parameters.

3. The main defect of STS is its low embedding rate. The use of error correcting codes improve

this situation but only slightly.

4. The suboptimal system detection by (16) is practically as much efficient as the optimal

by (15).

5. Simulation of the STS with audio CM shows that its detection by ear and eye is impossible,

whereas the embedded bits can be extracted reliably.

23

Open problems.

1. To specify security of STS for digital CM and after a saving the stegosignal in digital

formats.

2. Consider applications of STS in real noisy channels (including optical fiber channels).

3. An extraction of secret bits by a “blind” decoder (see [6]) while keeping a good

undetectability of STS.

4. Investigation of attack to prevent an extraction of secure embedded bits (additive noise,

compression/decompression, resynchronization ).

5. Improvement of coded STS (farther optimization of codes and decoding algorithms).

24

References.

[1] Valery Korzhik, Moon Ho Lee, Gullermo Morales Luna, “Stegosystem Based on Noisy

Channels”, Trans. VII Spanish Meeting on Cryptography and Information Security, 2006;

[2] Cachin C., “An information theoretic model for stegonography”. In: International workshop

on IH, 1998, pp. 306-318.

[3] Van Der Warden, “Mathematische Statistik”, Springer-Verlag, 1957.

[4] Papoulis, A. “Pobability, Random Variables, and Stochastic Processes”, McGraw-Hill, New-

York, 1984.

[5] MacWilliams, F. Sloan, N. “The Theory of Error Correcting Codes”. Bell Labs, 1991.

[6] Malvar, H.S. Florencio, D. “Improved spread spectrum: A new modulation technique for

robust watermarking”. IEEE Transaction on Signal Processing 51 (2003), pp. 898-905/

[7] Valery Korzhik, Guillermo Morales-Luna, Ksenia Loban, “Undetectable Spread-time

Stegosystem Based on Noisy Channels”. Submitted for journal “Information Hiding”,

Lecture Notes in Computer Science, 2009

1 undetectable stegosystem based on noisy channels v. korzhik, g. morales luna, k. loban singapor,...

Documents