Undetectable Stegosystem Based on Noisy Channels
V. Korzhik, G. Morales Luna, K. Loban
Singapore, NTU, 2010
1. Introduction
Steganography (SG) is the information hiding technique that embeds hidden information into an innocent cover message (CM) under the conditions that the CM is not corrupted significantly and that the presence of the additional information in the CM may not be detected.
Basic principle of undetectability: The CM and the SG signal have to be indistinguishable even with the use of the best statistical methods.
Designer’s problem: He (or she) should know the full statistics of the CM, but it is very hard to study completely the probability distribution of such CMs as video or audio signals.
In order to be successful in this risky situation (which is, indeed, a “bottleneck” of any SG system) we propose to move to another concept of SG system setting, namely SG based on noisy channels.
This setting can be justified if a noisy channel exists in a natural manner and the attacker (“stegobreaker”) is able to receive the stegosignal only over this channel and nothing else.
Attacker’s problem: To distinguish statistically the CM after its passing over the noisy channel from the SG signal passing over the same noisy channel.
Reduction of the steganalysis problem: Distinguishing the channel noise from the sum of the channel noise and the embedded signal.
Since the channel noise distribution is, as a rule, known much better than the CM distribution, the problem of designing SG systems which are resistant to detection is simplified.
New assumption: The CM can be publicized; such an assumption is impossible for conventional SG, because otherwise it can be detected trivially – if SG ≠ CM, then something has been embedded.
Model of SG design for a Gaussian noisy channel:
1,2,..., ( ) ( ) ( 1) ( ),bw wn N C n C n n
1
1
where ( ) are the samples of CM,
( ) is a zero mean Gaussian pseudorandom i.i.d. reference sequence with variance 1,
is the length of both sequences,
is the depth of embedding
N
n
N
n
w
C C n
n
N
After a passing of the watermarked signal through the Gaussian channel we get:1,2,..., ( ) ( ) ( ),w wn N C n C n n
2
1where ( ) is a zero mean Gaussian i.i.d. noise sequence with variance
N
nn
(1)
(2)
Earlier results [1] for the case when an attacker knows the CM:
110.72 ln 1 1 w
w
D N
Where D is the relative entropy (introduced in [2] as a measure of SG system security, 2 2
w w
Approximation of D for large :w
The following inequality, valid for any hypothesis testing, is well known from information theory [2]:
$$P_{fa}\log\frac{P_{fa}}{1-P_m} + (1-P_{fa})\log\frac{1-P_{fa}}{P_m} \le D \qquad (5)$$
20.36
w
ND
where $P_{fa}$ is the probability of SG signal false alarm and $P_m$ is the probability of SG signal missing. The SG system is secure (or ideal) if $D = 0$, and then $P_{fa} = P_m = 1/2$, which is equivalent to random guessing of SG system presence or absence.
If we let $P_{fa} = P_m = P$, then by (5) we get
$$(1-2P)\ln\frac{1-P}{P} \le D \qquad (6)$$
From eq. (4) it follows that0.6w
N
D
(4)
(3)
(5)
(6)
(7)
1
0 if 0( ( ) ( )) ( )
1 otherwise
N
wn
C n C n n b
The bit error probability [1]:
Legal informed decoder ($C(n)$ is known):
exp
2 2 2ew w
N NP Q
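where $Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt$
The general relation that comprises both security and reliability [1]:
$$P_e = Q\!\left(1.29\,\frac{(ND)^{1/4}}{m^{1/2}}\right) \le \exp\!\left(-0.83\,\frac{(ND)^{1/2}}{m}\right) \qquad (10)$$
where $m = N/N_0$ is the number of secure embedded bits into $N$ samples.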
(8)
(10)
(9)
Optimistic draw: For any security level $D$ and number of secure bits $m$, an appropriate $N$ can be chosen such that the SG system provides any given reliability $P_e$.
Our proposal: To use the so-called spread-time stegosystem (STS) – see next Section.
Pessimistic draw: The larger N is for a given D, the smaller (see eq. (7)) the “signal-to-noise” ratio must be, and this results in a problem for practical implementation (especially for digital processing).
1 2 2w w
Example. Let D = 0.1 (which provides an acceptable level of security) and let m = 10 be the number of secure embedded bits.
If we choose then by (10) which is acceptable. But
. by (7) and if the CM signal-to-noise ratio .
where then which is indeed
unacceptable.
42.5 10eP 510 ,N
600w
2
1var ( ) ,
N
c nC n
24
26 10
w
22
210c
2. Description of STS and its performance evolution.
2.1. Uncoded system.
0
0
Pr[ ( ) ( ) ( 1) ( )]
Pr[ ( ) ( )] 1 , 1,2,...
bw w
w
C n C n n P
C n C n P n N
Practical implementation: Let $\{n_m\}_{m=1}^{N_S}$, $N_S \le N$, be an increasing sequence of indexes, generated by a secret stegokey $K$, determining the samples in which the WMs are to be embedded (see Fig. 1). Then for a large value of $N$ we may assume that $P_0 = N_S/N$.
Fig. 1. STS system with embedding into pseudorandom samples. (Here $N_S = 8$, $N = 38$, $P_0 = 8/38$.)
In order to embed one secret bit $b$, $N_0$ consecutive chosen samples are used.
Hence the total number of secret bits embedded into $N_S$ samples is $N_t = N_S/N_0$.
(11)
The legal user knows the stegokey K; hence he knows exactly which samples carry the embedding and can execute the decision rule (8).
The error probability can be found by (9).
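The embedding, channel and legal informed decoding steps described above, as an added simulation that is not code from the original slides. It assumes the reading of (11) and (8) given by the surrounding text (a unit-variance Gaussian reference scaled by the embedding depth, added with sign (−1)^b on key-chosen samples, and a correlation decoder that knows C(n)); the parameter `nwr` (noise-to-watermark power ratio, the inverse of the WNR), the function names, the demo key, the sine-wave cover and Python's `random` as the key-driven generator are assumptions.

```python
import math
import random

def key_indexes(key, n, n_s):
    """Increasing sequence of N_S sample indexes derived from the stegokey K.
    random.Random stands in for a keyed generator; it is not a CSPRNG."""
    return sorted(random.Random(f"{key}/idx").sample(range(n), n_s))

def embed(cover, bits, key, n0, depth):
    """Add (-1)^b * depth * ref(n) on N_0 consecutive key-chosen samples per bit."""
    ref = random.Random(f"{key}/ref")          # unit-variance reference sequence
    stego = list(cover)
    for j, n in enumerate(key_indexes(key, len(cover), n0 * len(bits))):
        stego[n] += (-1) ** bits[j // n0] * depth * ref.gauss(0.0, 1.0)
    return stego

def informed_decode(received, cover, n_bits, key, n0):
    """Correlate (received - cover) with the reference on the keyed samples;
    a positive sum decodes as bit 0, otherwise bit 1."""
    ref = random.Random(f"{key}/ref")
    sums = [0.0] * n_bits
    for j, n in enumerate(key_indexes(key, len(cover), n0 * n_bits)):
        sums[j // n0] += (received[n] - cover[n]) * ref.gauss(0.0, 1.0)
    return [0 if s > 0 else 1 for s in sums]

def predicted_bit_error(n0, nwr):
    """Gaussian approximation of this decoder: Q(sqrt(N_0 / (2 + nwr)))."""
    return 0.5 * math.erfc(math.sqrt(n0 / (2.0 + nwr)) / math.sqrt(2.0))

def simulate(bits, n, n0, nwr, seed, noise_var=1.0, key="constructed-demo-key"):
    """One pass: constructed cover -> embed -> Gaussian channel -> informed decode."""
    rng = random.Random(seed)
    # Constructed cover: sine wave with variance 10 * noise_var (10 dB above the noise).
    cover = [math.sqrt(20.0 * noise_var) * math.sin(0.05 * i) for i in range(n)]
    stego = embed(cover, bits, key, n0, math.sqrt(noise_var / nwr))
    received = [s + rng.gauss(0.0, math.sqrt(noise_var)) for s in stego]
    decoded = informed_decode(received, cover, len(bits), key, n0)
    return decoded, sum(a != b for a, b in zip(bits, decoded))

if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0]               # constructed payload
    print(simulate(message, n=10_000, n0=210, nwr=20.0, seed=1))
    print(predicted_bit_error(210, 20.0))
```

For N_0 = 210 and a ratio of 20, the values of the first rows of Table 1, the predicted per-bit error is about 10^-3.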
Optimal STS detecting by an attacker:
Two hypotheses have to be tested:
20
20
1 20
1
2 2 2
: [ (0, ) and it is an i.i.d.]
with the probability :[ (0, ) and it is an i.i.d.]:
with the probability 1 :[ (0, ) and it is an i.i.d.]
where ( ) ( ) ( ) ,
s
N
w n
s w
H
PH
P
n C n C n
(12)
Let us assume that (for a good security guarantee), then we get from (14) after simple normalization.
The optimal hypothesis testing based on the maximum likelihood ratio [3] is
1 0 1 1 0 0
2 221
1 0 0 02 2 210
( ) ; ( )
( )where ( ) exp ( ) (1 )
( ) 2
Nw
i s s
H H
P HP n P
P H
2 22
1 0 0 02 2 21
( ) log exp ( ) (1 )2
Nw
Ln s s
P n P
By changing the threshold in (13) it is possible to pass to the logarithmic likelihood ratio:
2 2w
21 0 0 02 2
1
1 1( ) log exp ( ) (1 )
2
N
Ln w
P n PN
(13)
(14)
(15)
The series expansion of up to its linear term and next the series expansion of . up to its linear term renders the following decision rule:
exp( )x xexp( )x x
1 0
2
1
;
1where , is some new threshold
N
nn
H H
N
Let us estimate (using the Central Limit Theorem) the missing and false alarm probabilities, $P_m$ and $P_{fa}$, respectively:
21
2200
20
2200
2
( )1exp
22
( )1exp
22
where [ ], var( ) for 0,1
m
fa
j j j j
x mP dx
x mP dx
m H H j
(16)
(17)
(18)
Let us select (for simplicity) the threshold in such a way that $P_m = P_{fa} = P$.
Then after simple transforms of eqs. (17)-(18) we get
1 0
0
2 2 2 2 40 1 0 0
2
where , , 2w
m mP Q
m m P N
Substituting (20) into (19) we obtain:
0
2 2 2 2S
w w
N P NP Q Q
N
0 0
3
If , then 1 2 and an undetectable stegosystem results.
In Table 1 we show the calculation results for some values of parameters: , , ,
providing 10 , 0.4 given the same values of and
S
S
e w
N N P
N N m P
P P N .
(20)
(21)
(19)
N       1/WNR   N_0    N_S       m      P_0 = N_S/N
10^4    20      210    1431      6      0.1431
10^4    50      496    3578      7      0.3578
10^4    100     973    7156      7      0.7156
10^5    20      210    4526      21     0.04526
10^5    50      496    11310     22     0.1131
10^5    100     973    22630     23     0.2263
10^6    20      210    14310     68     0.01431
10^6    50      496    35780     72     0.03578
10^6    100     973    71560     73     0.07156
10^7    20      210    45260     215    0.004526
10^7    50      496    113100    228    0.01131
10^7    100     973    226300    232    0.02263
3Table 1. Sets of parameters for STS providing 10 , 0.4
given different values and .e
w
P P
N
We can see that for large enough N it is possible to provide a good undetectability ( ) and reliability ( ) of STS and embed up to 232 secure bits.
0 0.4P 310eP
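The attacker's side of the first row of Table 1 (N = 10^4, N_S = 1431, ratio 20), as an added Monte Carlo check that is not code from the original slides. It assumes the linearized rule (16) is a threshold test on the mean square of the residual C'_w(n) − C(n), with the threshold halfway between the two hypothesis means; `nwr` (noise-to-watermark power ratio, the inverse of the WNR) and all function names are assumptions.

```python
import math
import random

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def predicted_error(n, n_s, nwr):
    """CLT value of P_fa = P_m for the midpoint threshold.
    Means: v and v*(1 + n_s/(n*nwr)); standard deviation under H0: v*sqrt(2/n)."""
    return q_func(n_s / (2.0 * math.sqrt(2.0 * n) * nwr))

def mean_square(n_plain, n_marked, noise_var, nwr, rng):
    """Mean of the squared residual over one block. The sum of k squared
    N(0, v) samples is v*chi2_k, drawn exactly as Gamma(shape=k/2, scale=2v)."""
    total = 0.0
    if n_plain:
        total += rng.gammavariate(n_plain / 2.0, 2.0 * noise_var)
    if n_marked:   # marked samples: noise plus watermark, variance v*(1 + 1/nwr)
        total += rng.gammavariate(n_marked / 2.0,
                                  2.0 * noise_var * (1.0 + 1.0 / nwr))
    return total / (n_plain + n_marked)

def detector_error_rates(n, n_s, nwr, noise_var=1.0, trials=20_000, seed=1):
    """Monte Carlo false-alarm and miss rates of the mean-square test."""
    rng = random.Random(seed)
    threshold = noise_var * (1.0 + n_s / (2.0 * n * nwr))  # halfway between means
    false_alarm = sum(mean_square(n, 0, noise_var, nwr, rng) > threshold
                      for _ in range(trials))
    miss = sum(mean_square(n - n_s, n_s, noise_var, nwr, rng) <= threshold
               for _ in range(trials))
    return false_alarm / trials, miss / trials

if __name__ == "__main__":
    print(predicted_error(10_000, 1431, 20.0))        # closed form, Table 1 row
    print(detector_error_rates(10_000, 1431, 20.0))   # simulated (P_fa, P_m)
    print(predicted_error(10_000, 10_000, 20.0))      # every sample marked
```

Both simulated error rates stay close to 0.4 for the Table 1 row, while with every sample marked (N_S = N) the same closed form drops to about 0.04, which is what spreading the embedding in time avoids.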
2.2. Coded system. We restrict our attention to binary linear systematic $(N_0, k, d)$ codes.
Embedding:
Pr[ ( ) ( ) ( 1) ( )],
where is the sample index with embedding.
ijb
w j j w j
j
C n C n n
n
Decoding:
0
1 2 1
0
arg max ( ) ( ) ( 1) ( )
where is the -th bit in the -th codeword of length with a positive integer value.
Total number of secure embedded bits
in
k
Nb
wi n
in S
i C n C n n
b n i N N l l
0
0
is .
The block-error probability based on union bound [5]:
2 1 exp ln 22 2 2
where the code rate.
kbe
w w
m k l
d dP Q RN
R k N
(24)
(23)
(22)
1Since signal-to-noise ratio is typically small, we will restrict our consideration only to two
classes of linear error correcting codes:w
The simplex codes (SC):1
0 02 1, , 2 , , where is some integerN k d R N
The Reed-Muller codes (RMC): 01
2 , , 2 , where 3 and -is integer, the so
called order of the RMC.
rr
i
vN k d r
i
Example: 710 , 0.4, 20, 45257.w SN P N
3
0
9
the optimal parameters: 10, 10, 10 , 442
the optimal parameters: 14, 2, 105, 10 , 290.
Sbe
be
Nk P m k
N
r k P m
SC :
RMC :
3. Optimal SG system detecting rule.
Let us verify whether the use of the optimal decision rule (15) can provide an appreciable improvement of STS detecting in comparison with the suboptimal decision rule (16).
Since N is sufficiently large, we can apply the Central Limit Theorem [4] to the sum in (15).
Similar to the proof of (19), for a choice of threshold which provides $P_m = P_{fa} = P$ we get:
1 0
02
m mP Q
2
0 02
1
22 2
2 20 0 0 0 0 0 02 2
( )log exp (1 )
2
1 1( ) ( )Var log exp (1 ) log exp (1 )
2 2
N
j jw n
jw w
nm E P P H
n nP P H E P P H m
N N
where random values have the probability distributions by (12)
where, for 0,1:j
( )n
0 1 0Since it is very hard to find analytically the values , and , we estimate them by the simulation:m m
We can see that the use of the optimal decision rule does not break undetectability of STS.
0.401461  0.000228397  0.000157797  0.000157686  0.07156  100
0.401711  0.00023144   0.000158598  0.000158479  0.03578  50
0.401752  0.000243187  0.000162435  0.000162288  0.01431  20
5
0.401548  0.000228397  0.000157808  0.000157686  0.07156  100
0.401862  0.000231440  0.000158585  0.000158479  0.03578  50
0.401835  0.000243341  0.000162393  0.000162288  0.01431  20
1
0.401745  0.000718483  0.000497795  0.000496672  0.2263   100
0.401806  0.000727769  0.000500277  0.000499062  0.1131   50
0.401772  0.000767001  0.000513854  0.000512618  0.04526  20
5
0.401737  0.000718483  0.000497830  0.000496672  0.2263   100
0.401809  0.000729712  0.000501449  0.000500332  0.1131   50
0.401741  0.000767001  0.000513737  0.000512618  0.04526  20
1
0.401726  0.00225445   0.00157583   0.00156462   0.7156   100
0.401745  0.00229017   0.00158821   0.00157675   0.3578   50
0.401754  0.00240398   0.00162590   0.00161414   0.1431   20
5
0.401737  0.00225445   0.00157585   0.00156462   0.7156   100
0.401759  0.00229017   0.00158801   0.00157674   0.3578   50
0.401753  0.00240398   0.00162667   0.00161414   0.1431   20
1
N2
0 SP N Nw1m 0 1 0
02
m mP Q
0m
410
510
610
4. Simulation of STS for audio cover messages.
We use an audio music file of about 29 s duration in WAV format, with a sampling frequency of 44.1 kHz.
The CM signal-to-noise ratio is 10 dB, whereas the watermark-to-noise ratio (WNR) is −20 dB.
The embedding rule was taken by (11), where $P_0 = 0.1$.
In Fig. 2 the waveforms of the original audio signal, the audio signal after passing over a noisy channel, and the signal after secret message embedding are presented for the same time interval.
One can see that the noise slightly corrupts the audio signal, and this can also be perceived by the human ear, whilst the embedding procedure is not observable by the human ear.
Fig. 2. The waveforms of the audio signal (a), the audio signal after its passing over a noisy channel with CM signal-to-noise ratio of 10 dB (b), and after embedding by the STS algorithm with WNR = −20 dB (c).
In Fig.3 the waveforms of channel noise are shown, as well as this noise after embedding.
Fig.3. The waveform of channel noise (a) and
the same channel noise after embedding (b).
0
In Table 3 we present the results of simulation for the error probability versus
the parameters and . is the theoretical error probability.e
w e
P
N P
20 210 0.001
50 496 0.001
100 973 0.001
w 0N ePeP
45 1046 10
45.5 10
5. Conclusion.
1. A modification of the stegosystem based on noisy channels, called the spread-time stegosystem (STS), has been proposed.
2. Both STS security and reliability can be provided by an appropriate selection of the system parameters.
3. The main defect of STS is its low embedding rate. The use of error correcting codes improves this situation, but only slightly.
4. The suboptimal system detection by (16) is practically as efficient as the optimal one by (15).
5. Simulation of the STS with an audio CM shows that its detection by ear and eye is impossible, whereas the embedded bits can be extracted reliably.
Open problems.
1. To specify the security of STS for digital CMs and after saving the stegosignal in digital formats.
2. To consider applications of STS in real noisy channels (including optical fiber channels).
3. An extraction of secret bits by a “blind” decoder (see [6]) while keeping good undetectability of STS.
4. Investigation of attacks to prevent extraction of the secure embedded bits (additive noise, compression/decompression, resynchronization).
5. Improvement of coded STS (further optimization of codes and decoding algorithms).
References.
[1] Valery Korzhik, Moon Ho Lee, Guillermo Morales Luna, “Stegosystem Based on Noisy Channels”, Trans. VII Spanish Meeting on Cryptography and Information Security, 2006.
[2] Cachin, C., “An information theoretic model for steganography”. In: International workshop on IH, 1998, pp. 306-318.
[3] Van der Waerden, “Mathematische Statistik”, Springer-Verlag, 1957.
[4] Papoulis, A., “Probability, Random Variables, and Stochastic Processes”, McGraw-Hill, New York, 1984.
[5] MacWilliams, F., Sloane, N., “The Theory of Error Correcting Codes”. Bell Labs, 1991.
[6] Malvar, H.S., Florencio, D., “Improved spread spectrum: A new modulation technique for robust watermarking”. IEEE Transactions on Signal Processing 51 (2003), pp. 898-905.
[7] Valery Korzhik, Guillermo Morales-Luna, Ksenia Loban, “Undetectable Spread-time Stegosystem Based on Noisy Channels”. Submitted for journal “Information Hiding”, Lecture Notes in Computer Science, 2009.