
Information Theory and Codes

Vittorio Giovara, Alberto Grand, Marina Lemessom

19/10/2007


Chapter 1

Entropy and Quantity of Information

1.1 Handout

Two fair six-sided dice are thrown by Alice and only their sum is communicated to Bob. What is the amount of information conveyed, compared with communicating the distinct outcomes of the two dice?

1.2 Resolution

First we calculate the entropy of the sum of the two dice by filling in the following table, A being the possible symbols to be sent and p_a the probability of each symbol being sent.

A     p_a
2     1/36
3     2/36
4     3/36
5     4/36
6     5/36
7     6/36
8     5/36
9     4/36
10    3/36
11    2/36
12    1/36

So we compute the entropy with this formula

H(A) = Σ_{i=1}^{M} p_i log(1/p_i)

where M is the number of different symbols and p_i the probability associated with the i-th symbol. The base of the logarithm is 2 because we want the entropy to be expressed in bits.

H(A) = Σ_{i=1}^{11} p_i log_2(1/p_i) =
     = (1/36) log(36) + (2/36) log(36/2) + (3/36) log(36/3) + (4/36) log(36/4) + (5/36) log(36/5) + (6/36) log(36/6) +
       + (5/36) log(36/5) + (4/36) log(36/4) + (3/36) log(36/3) + (2/36) log(36/2) + (1/36) log(36) =
     = 2 · (1/36) log(36) + 2 · (2/36) log(18) + 2 · (3/36) log(12) + 2 · (4/36) log(9) + 2 · (5/36) log(36/5) + (6/36) log(6) =
     = (1/18) log(36) + (1/9) log(18) + (1/6) log(12) + (2/9) log(9) + (5/18) log(36/5) + (1/6) log(6) =
     = 3.27 bits

Then we compute the entropy of a roll of a single die, using the alphabet B1 below.

B     p_b
1     1/6
2     1/6
3     1/6
4     1/6
5     1/6
6     1/6

H(B1) = Σ_{i=1}^{6} p_i log_2(1/p_i) =
      = 6 · (1/6) log(6) =
      = log(6) =
      = 2.58 bits

Since the faces of the two dice have the same probabilities, the alphabet of the second die, B2, is equal to that of the first.

B1 = B2

Thus the entropy of the second die is also equal to that of the first.

H(B1) = H(B2)

The entropy of communicating the distinct results of the two dice, Btot, is the sum of the two entropies.

H(Btot) = H(B1) + H(B2) = 2 H(B1) = 2 · 2.58 = 5.16 bits

Finally we calculate the difference in the quantity of information between sending the sum of the dice and sending the distinct results of the rolls.

H(∆) = H(Btot) − H(A) = 5.16 − 3.27 = 1.89 bits
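These values can also be checked numerically. The short C program below is an illustrative sketch (it is not part of the original resolution): it computes the entropy of the sum of two fair dice and of the two distinct rolls, and prints their difference, giving values close to 3.27, 5.16 and 1.89 bits. It needs only the standard math library (compile with -lm).

#include <stdio.h>
#include <math.h>

/* Entropy in bits of a discrete distribution p[0..n-1]. */
static double entropy(const double *p, int n) {
    double h = 0.0;
    for (int i = 0; i < n; i++)
        if (p[i] > 0.0)
            h += p[i] * log2(1.0 / p[i]);
    return h;
}

int main(void) {
    /* Distribution of the sum of two fair dice (values 2..12). */
    double psum[11];
    for (int s = 2; s <= 12; s++)
        psum[s - 2] = (6.0 - fabs(s - 7.0)) / 36.0;

    /* Uniform distribution of a single die (values 1..6). */
    double pdie[6] = {1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0};

    double hA = entropy(psum, 11);              /* entropy of the sum, ~3.27 bits */
    double hB = entropy(pdie, 6);               /* entropy of one die, ~2.58 bits */
    printf("H(A)     = %.2f bits\n", hA);
    printf("H(Btot)  = %.2f bits\n", 2 * hB);   /* two independent rolls          */
    printf("H(delta) = %.2f bits\n", 2 * hB - hA);
    return 0;
}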


Chapter 2

Binary Encoding

2.1 Handout

Construct the binary Huffman tree for a four-letter alphabet with probability distribution P = (0.8, 0.1, 0.05, 0.05). Compute the average code word length and the entropy.

2.2 Resolution

We can build the binary Huffman tree by associating the alphabet A = (a1, a2, a3, a4), which has probability distribution P = (0.8, 0.1, 0.05, 0.05), with the binary alphabet B = (0, 1).


So we obtain the following association:

a1 −→ 0
a2 −→ 10
a3 −→ 110
a4 −→ 111

from which we can calculate the average code word length l with the formula

l = Σ_{i=1}^{M} l_i p_i

l = Σ_{i=1}^{4} l_i p_i =
  = 1 · 0.8 + 2 · 0.1 + 3 · 0.05 + 3 · 0.05 =
  = 0.8 + 0.2 + 0.3 =
  = 1.3

As for the entropy, we use the previous formula

H(A) = Σ_{i=1}^{M} p_i log(1/p_i)

H(A) = 0.8 log(1/0.8) + 0.1 log(1/0.1) + 2 · 0.05 log(1/0.05) =
     = 1.022 bits

We observe that the first Shannon theorem is satisfied, since

H(A) ≤ l log D ≤ H(A) + log D

1.022 ≤ 1.3 ≤ 2.022
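The construction itself can be sketched in a few lines of C. The program below (an illustration, not part of the original resolution) builds a binary Huffman code for P = (0.8, 0.1, 0.05, 0.05) by repeatedly merging the two least probable nodes; the 0/1 branch labels may differ from the association above, but the code word lengths and the average length of 1.3 are the same.

#include <stdio.h>

#define NSYM 4          /* number of source symbols            */
#define MAXN (2*NSYM)   /* leaves + internal nodes of the tree */

int main(void) {
    /* Probability distribution P = (0.8, 0.1, 0.05, 0.05) from the handout. */
    double p[MAXN] = {0.8, 0.1, 0.05, 0.05};
    int parent[MAXN], bit[MAXN], used[MAXN] = {0};
    int n = NSYM;

    for (int i = 0; i < MAXN; i++) parent[i] = -1;

    /* Huffman construction: repeatedly merge the two least probable
       active nodes until only the root remains.                      */
    while (1) {
        int a = -1, b = -1;
        for (int i = 0; i < n; i++) {
            if (used[i]) continue;
            if (a < 0 || p[i] < p[a])      { b = a; a = i; }
            else if (b < 0 || p[i] < p[b]) { b = i; }
        }
        if (b < 0) break;                  /* only the root is left */
        used[a] = used[b] = 1;
        p[n] = p[a] + p[b];
        parent[a] = parent[b] = n;
        bit[a] = 0; bit[b] = 1;            /* label the two branches */
        n++;
    }

    /* Read each codeword by climbing from leaf to root, then reverse. */
    double avg = 0.0;
    for (int i = 0; i < NSYM; i++) {
        char code[16]; int len = 0;
        for (int j = i; parent[j] >= 0; j = parent[j]) code[len++] = '0' + bit[j];
        printf("a%d -> ", i + 1);
        for (int k = len - 1; k >= 0; k--) putchar(code[k]);
        printf("  (p = %.2f, length %d)\n", p[i], len);
        avg += p[i] * len;
    }
    printf("average code word length = %.2f\n", avg);   /* expected: 1.30 */
    return 0;
}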


Chapter 3

Coding Theory

3.1 Handout

Let C be a binary (11, 7, d)-code with parity-check matrix

H =

| 0 0 1 0 1 0 1 1 0 1 1 |
| 0 1 0 1 1 1 1 1 0 0 0 |
| 1 0 0 0 0 1 1 1 0 1 1 |
| 0 1 1 0 1 0 0 1 1 1 0 |

1. Find a generator matrix (echelon form) G for C;

2. Find the minimum distance d;

3. Determine the weight enumerator polynomial for C;

4. Decode the following received words:

R1 = (1000 0000 111) R2 = (0111 0100 011)

3.2 Resolution

3.2.1 Generator Matrix Computation

It is easy to obtain the generator matrix from the parity-check matrix if we are able to reduce the latter to an echelon form. In this computation we are allowed to exchange rows and to add one row to another.

So we swap r1 with r3 and r2 with r4, obtaining

H =

| 1 0 0 0 0 1 1 1 0 1 1 |
| 0 1 1 0 1 0 0 1 1 1 0 |
| 0 0 1 0 1 0 1 1 0 1 1 |
| 0 1 0 1 1 1 1 1 0 0 0 |


then we add r3 to r2 and this new r2 to r4, obtaining

H =

| 1 0 0 0 0 1 1 1 0 1 1 |
| 0 1 0 0 0 0 1 0 1 0 1 |
| 0 0 1 0 1 0 1 1 0 1 1 |
| 0 0 0 1 1 1 0 1 1 0 1 |

We have been able to write H in the form

H = (I_{n−k} | A)

As a consequence, the corresponding generator matrix will have the form

G = ( A  )
    ( Ik )

Having said that, it is trivial to compute the generator matrix

G =

| 0 1 1 1 0 1 1 |
| 0 0 1 0 1 0 1 |
| 1 0 1 1 0 1 1 |
| 1 1 0 1 1 0 1 |
| 1 0 0 0 0 0 0 |
| 0 1 0 0 0 0 0 |
| 0 0 1 0 0 0 0 |
| 0 0 0 1 0 0 0 |
| 0 0 0 0 1 0 0 |
| 0 0 0 0 0 1 0 |
| 0 0 0 0 0 0 1 |

We are used to the information symbols being the first k symbols of a codeword, with the following (n − k) symbols being parity-check symbols. However, this order is reversed in our case, the identity matrix being located in the lower part of the generator matrix. The first 4 symbols of every codeword are therefore parity-check symbols, while the last 7 symbols are information symbols.

c1 = x2 + x3 + x4 + x6 + x7

c2 = x3 + x5 + x7

c3 = x1 + x3 + x4 + x6 + x7

c4 = x1 + x2 + x4 + x5 + x7

c5 = x1

c6 = x2

c7 = x3

c8 = x4

c9 = x5

c10 = x6

c11 = x7
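As an illustration (a sketch, not part of the original resolution), the following C program encodes an information vector with these equations and verifies that the result satisfies all four parity checks of the reduced matrix H = (I4 | A); for the information word 0101011 it prints the codeword 01110101011 obtained again in section 3.2.3.

#include <stdio.h>

/* Right-hand block A of the reduced parity-check matrix H = (I4 | A),
   i.e. the top four rows of the generator matrix G.                   */
static const int A[4][7] = {
    {0, 1, 1, 1, 0, 1, 1},
    {0, 0, 1, 0, 1, 0, 1},
    {1, 0, 1, 1, 0, 1, 1},
    {1, 1, 0, 1, 1, 0, 1},
};

/* Encode the information vector x[0..6] into the codeword c[0..10]:
   the first 4 symbols are parity checks, the last 7 are x itself.     */
static void encode(const int x[7], int c[11]) {
    for (int i = 0; i < 4; i++) {
        c[i] = 0;
        for (int j = 0; j < 7; j++)
            c[i] ^= A[i][j] & x[j];       /* addition over GF(2) */
    }
    for (int j = 0; j < 7; j++)
        c[4 + j] = x[j];
}

int main(void) {
    /* Example: the information word 0101011 decoded in section 3.2.3. */
    int x[7] = {0, 1, 0, 1, 0, 1, 1};
    int c[11];
    encode(x, c);

    /* Verify the parity checks: each row of H = (I4 | A) must give 0. */
    for (int i = 0; i < 4; i++) {
        int s = c[i];
        for (int j = 0; j < 7; j++)
            s ^= A[i][j] & c[4 + j];
        printf("check %d: %d\n", i + 1, s);   /* all zeros expected */
    }

    for (int i = 0; i < 11; i++) putchar('0' + c[i]);
    putchar('\n');                            /* prints 01110101011 */
    return 0;
}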


3.2.2 Minimum Distance and Weight Enumerator Polynomial

In order to find the minimum distance and the weight enumerator polynomial of the given code, we have decided to write a simple algorithm in C. Its purpose is to compute all 2^7 possible codewords, surveying their Hamming weights and storing the results in a statistics vector. The i-th entry of this vector contains the number of codewords having weight i and therefore represents the i-th coefficient of the weight enumerator polynomial.

Since we are dealing with a linear code, it is easy to find the minimum distance between codewords. As a matter of fact

min_{x,y ∈ C, x ≠ y} d(x, y) = min_{x,y ∈ C, x ≠ y} W_H(x − y)

but since the code is linear, c = x − y still belongs to C, and the equation becomes

min_{x,y ∈ C, x ≠ y} d(x, y) = min_{c ∈ C, c ≠ 0} W_H(c)

Thus looking for the minimum distance between codewords boils down to looking for the minimum weight over all non-zero codewords. This allows us to simply inspect the coefficients of the weight enumerator polynomial of the given code: the smallest index with a non-zero coefficient (disregarding A0, which corresponds to the all-zero codeword) is the sought minimum distance d.
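The original C program is not reproduced here; a minimal sketch of such an exhaustive enumeration, using the generator matrix G derived above, could look as follows.

#include <stdio.h>

/* Generator matrix G (11 x 7) derived in section 3.2.1:
   the block A on top of the 7 x 7 identity matrix.       */
static const int G[11][7] = {
    {0,1,1,1,0,1,1},
    {0,0,1,0,1,0,1},
    {1,0,1,1,0,1,1},
    {1,1,0,1,1,0,1},
    {1,0,0,0,0,0,0},
    {0,1,0,0,0,0,0},
    {0,0,1,0,0,0,0},
    {0,0,0,1,0,0,0},
    {0,0,0,0,1,0,0},
    {0,0,0,0,0,1,0},
    {0,0,0,0,0,0,1},
};

int main(void) {
    int count[12] = {0};                 /* count[w] = #codewords of weight w */

    /* Enumerate all 2^7 information words. */
    for (int m = 0; m < (1 << 7); m++) {
        int weight = 0;
        for (int i = 0; i < 11; i++) {   /* i-th codeword symbol over GF(2) */
            int bit = 0;
            for (int j = 0; j < 7; j++)
                bit ^= G[i][j] & ((m >> j) & 1);
            weight += bit;
        }
        count[weight]++;
    }

    /* The w-th entry is the coefficient A_w of the weight enumerator. */
    int dmin = 0;
    for (int w = 0; w <= 11; w++) {
        printf("A_%d = %d\n", w, count[w]);
        if (dmin == 0 && w > 0 && count[w] > 0)
            dmin = w;                    /* smallest non-zero weight */
    }
    printf("minimum distance d = %d\n", dmin);
    return 0;
}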

The results provided by the algorithm are the following:

i     Ai
0     1
1     0
2     0
3     13
4     26
5     24
6     24
7     26
8     13
9     0
10    0
11    1

Thus the minimum distance d is 3 and the error correction capability of the given code is 1.


Finally the weight enumerator polynomial is

W(x, y) = Σ_{i=0}^{11} A_i x^i y^{11−i} =
        = y^11 + 13 x^3 y^8 + 26 x^4 y^7 + 24 x^5 y^6 + 24 x^6 y^5 + 26 x^7 y^4 + 13 x^8 y^3 + x^11

3.2.3 Words Decoding

In order to decode R1 and R2, we have to compute their respective syndromes, where

R1 = (1000 0000 111)^T        R2 = (0111 0100 011)^T

and E1 and E2 are their respective error patterns.

So we find the syndrome of the first word as

S1 = H R1 = H E1 = (1 0 0 0)^T

and we can notice that the computed syndrome is equal to the first column of the (reduced) parity-check matrix.

Since the syndrome is the sum of the columns of the parity-check matrix H selected by the error positions, it is clear that the minimum-weight error pattern is

E1 = (1000 0000 000)^T


Knowing that the error correction capability of the code is 1, we can correct this error.

The estimated codeword for the first received vector is

C1 = R1 + E1 = (0000 0000 111)^T

and consequently the sent message was

M1 = (0000111)

We now proceed to compute the syndrome of the second received vector:

S2 = H R2 = H E2 = (1 0 1 1)^T

The computed syndrome is equal to the 8th column of the (reduced) parity-check matrix; therefore the minimum-weight error pattern is

E2 = (0000 0001 000)^T


We can correct the error with the same method as before, obtaining the estimated codeword

C2 = R2 + E2 = (0111 0101 011)^T

and consequently the sent message was

M2 = (0101011)
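The same decoding procedure can be automated. The C sketch below (an illustration, not the original program) computes the syndrome of a received word with the reduced parity-check matrix, looks for the matching column of H, and flips the corresponding bit; applied to R1 and R2 it prints the messages 0000111 and 0101011.

#include <stdio.h>

/* Reduced parity-check matrix H = (I4 | A) from section 3.2.1. */
static const int H[4][11] = {
    {1,0,0,0, 0,1,1,1,0,1,1},
    {0,1,0,0, 0,0,1,0,1,0,1},
    {0,0,1,0, 1,0,1,1,0,1,1},
    {0,0,0,1, 1,1,0,1,1,0,1},
};

/* Syndrome decoding for a single bit error: compute S = H*r over GF(2),
   then flip the bit whose column of H equals the syndrome.             */
static void decode(int r[11]) {
    int s[4];
    for (int i = 0; i < 4; i++) {
        s[i] = 0;
        for (int j = 0; j < 11; j++)
            s[i] ^= H[i][j] & r[j];
    }
    for (int j = 0; j < 11; j++) {
        int match = 1;
        for (int i = 0; i < 4; i++)
            if (H[i][j] != s[i]) match = 0;
        if (match) { r[j] ^= 1; return; }   /* correct the single error */
    }
    /* zero syndrome (no error) or uncorrectable pattern: leave r as is */
}

int main(void) {
    int r1[11] = {1,0,0,0, 0,0,0,0, 1,1,1};   /* R1 from the handout */
    int r2[11] = {0,1,1,1, 0,1,0,0, 0,1,1};   /* R2 from the handout */

    decode(r1);
    decode(r2);

    /* Print the estimated messages: the last 7 symbols of each codeword. */
    for (int j = 4; j < 11; j++) putchar('0' + r1[j]);  /* expect 0000111 */
    putchar('\n');
    for (int j = 4; j < 11; j++) putchar('0' + r2[j]);  /* expect 0101011 */
    putchar('\n');
    return 0;
}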


Contents

1 Entropy and Quantity of Information
  1.1 Handout
  1.2 Resolution

2 Binary Encoding
  2.1 Handout
  2.2 Resolution

3 Coding Theory
  3.1 Handout
  3.2 Resolution
    3.2.1 Generator Matrix Computation
    3.2.2 Minimum Distance and Weight Enumerator Polynomial
    3.2.3 Words Decoding