exercise in the previous class


Upload: cheri

Post on 23-Feb-2016


TRANSCRIPT

Page 1: exercise in the previous class

exercise in the previous class

Q1: Compute P(X=Y1) and P(X=Y2): P(X=Y1) = 0.73 and P(X=Y2) = 0

Q2: Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit. To compute H(X|Y1), determine some probabilities:

joint counts:

  Y1 \ X   sunny   rain          Y2 \ X   sunny   rain
  sunny      45     15           sunny       0     43
  rain       12     28           rain       57      0

P(X|Y1):

  Y1 \ X   sunny   rain
  sunny     0.75   0.25
  rain      0.30   0.70

P(Y1=sunny) = 0.6, P(Y1=rain) = 0.4
H(X|Y1=sunny) = 0.811 bit, H(X|Y1=rain) = 0.881 bit
H(X|Y1) = 0.6×0.811 + 0.4×0.881 = 0.839 bit

I(X; Y1) = H(X) – H(X|Y1) = 0.986 – 0.839 = 0.147 bit

Page 2

exercise in the previous class (cnt’d)

Q2: Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit. To compute H(X|Y2), determine some probabilities:

joint counts:

  Y1 \ X   sunny   rain          Y2 \ X   sunny   rain
  sunny      45     15           sunny       0     43
  rain       12     28           rain       57      0

P(X|Y2):

  Y2 \ X   sunny   rain
  sunny       0      1
  rain        1      0

P(Y2=sunny) = 0.43, P(Y2=rain) = 0.57
H(X|Y2=sunny) = 0, H(X|Y2=rain) = 0
H(X|Y2) = 0.43×0 + 0.57×0 = 0

I(X; Y2) = H(X) – H(X|Y2) = 0.986 – 0 = 0.986 bit

Q3: Which is the better forecast? Y2, since it gives more information.
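As a sanity check of the answers above, here is a minimal sketch (not part of the original slides) that computes I(X; Y) = H(X) – H(X|Y) directly from the joint count tables; the `{(y, x): count}` dictionary layout is just an assumed representation.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X; Y) = H(X) - H(X|Y), from a table {(y, x): count}."""
    total = sum(joint.values())
    p_x, p_y = {}, {}
    for (y, x), n in joint.items():
        p_x[x] = p_x.get(x, 0) + n / total
        p_y[y] = p_y.get(y, 0) + n / total
    h_cond = 0.0  # H(X|Y) = sum over y of P(y) * H(X | Y = y)
    for y, py in p_y.items():
        h_cond += py * entropy(n / total / py
                               for (yy, _), n in joint.items() if yy == y)
    return entropy(p_x.values()) - h_cond

y1 = {('sunny', 'sunny'): 45, ('sunny', 'rain'): 15,
      ('rain', 'sunny'): 12, ('rain', 'rain'): 28}
y2 = {('sunny', 'sunny'): 0, ('sunny', 'rain'): 43,
      ('rain', 'sunny'): 57, ('rain', 'rain'): 0}
print(round(mutual_information(y1), 3))  # 0.147
print(round(mutual_information(y2), 3))  # 0.986
```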

Page 3

chapter 2: compact representation of information

Page 4

We learn how to encode symbols from an information source:
source coding = data compression.

the purpose of source encoding:
- to give representations which are good for communication
- to discard (捨てる) redundancy (冗長性)

We want a source coding scheme which gives...
- as precise (正確) an encoding as possible
- as compact an encoding as possible

...the purpose of chapter 2

[figure: source → encoder → 0101101]

Page 5

plan of the chapter

- basic properties needed for source coding:
  uniquely decodable / immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code
- theoretical limit of the "compression"
- related topics

(today: the first two topics)

Page 6

words and terms

For now, we consider symbol-by-symbol encodings only.

M ... the set of symbols generated by an information source.
For each symbol in M, associate a sequence (系列) over {0, 1}.

- codewords (符号語): the sequences associated to symbols in M
- code (符号): the set of codewords
- alphabet: {0, 1} in this case ... binary code

  M   sunny   cloudy   rainy
  C    00      010      101

three codewords: 00, 010 and 101; code C = {00, 010, 101}

011 is NOT a codeword, for example

Page 7

encoding and decoding

- encode ... to determine the codeword for a given symbol
- decode ... to determine the symbol for a given codeword

[figure: sunny/cloudy/rainy <-> 00/010/101, encode and decode directions]
encode = 符号化, decode = 復号(化)

NO separation symbols between codewords:
010 00 101 101 ... NG, 01000101101 ... OK

Why? {0, 1 and "space"} ... the alphabet would have three symbols, not two.

Page 8

uniquely decodable codes

A code must be uniquely decodable (一意復号可能):
different symbol sequences are encoded to different 0-1 sequences.

If a code is uniquely decodable, then its codewords are all different,
but the converse (逆) does not hold in general.

        a1   a2   a3    a4    uniquely decodable?
  C1    00   10   01    11    yes
  C2    0    01   011   111   yes
  C3    0    10   11    01    no
  C4    0    1    01    10    no

with the code C3...
  a1 a3 a1  ->  0·11·0  =  0110
  a4 a2     ->  01·10   =  0110
the same sequence 0110 has two decodings, so C3 is not uniquely decodable.
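The ambiguity can also be checked mechanically. A small sketch (not from the slides): enumerate every way a 0-1 sequence splits into codewords; the dictionary below encodes C3 as reconstructed above, with the symbol names a1–a4 from the table.

```python
def parses(seq, code):
    """All ways to split seq into codewords of `code` (dict symbol -> codeword)."""
    if seq == '':
        return [[]]  # the empty sequence has exactly one (empty) parse
    result = []
    for sym, cw in code.items():
        if seq.startswith(cw):  # try cw as the first codeword, recurse on the rest
            for rest in parses(seq[len(cw):], code):
                result.append([sym] + rest)
    return result

C3 = {'a1': '0', 'a2': '10', 'a3': '11', 'a4': '01'}
print(parses('0110', C3))  # [['a1', 'a3', 'a1'], ['a4', 'a2']]
```

Two distinct parses of the same sequence means C3 cannot be uniquely decodable.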

Page 9

more than uniqueness

consider a scenario of using C2...
- a1, a4, a4, a1 is encoded to 01111110.
- The 0-1 sequence is transmitted at 1 bit/sec.
- When does the receiver find that the first symbol is a1?

        a1   a2   a3    a4
  C1    00   10   01    11
  C2    0    01   011   111

seven seconds later, the receiver has obtained 0111111:
- if 0 comes next, then 0 - 111 - 111 - 0 ... a1, a4, a4, a1
- if 1 comes next, then 01 - 111 - 111 ... a2, a4, a4

We cannot finalize the first symbol even seven seconds later.
-> a buffer to save data, latency (遅延) of decoding...

Page 10

immediately decodable codes

A code must be uniquely decodable, and if possible,
it should be immediately decodable (瞬時復号可能).

- Decoding is possible without looking ahead in the sequence.
- If you find a codeword pattern, then decode it immediately.
- an important property from an engineering viewpoint

formally writing... if a sequence is written as c1·s1 with a codeword c1 and
a 0-1 sequence s1, then there is no codeword c2 and sequence s2 with c2 ≠ c1
such that c1·s1 = c2·s2.

Page 11

prefix condition

If a code is NOT immediately decodable, then there are codewords c1 ≠ c2 and
sequences s1, s2 such that c1·s1 = c2·s2.

Then the codeword c1 is a prefix (語頭) of the codeword c2
(c1 is the same as the beginning part of c2).

        a1   a2   a3    a4
  C2    0    01   011   111

"0" is a prefix of "01" and "011"; "01" is a prefix of "011".

Lemma: A code C is immediately decodable if and only if
no codeword in C is a prefix of another codeword.
(prefix condition, 語頭条件)
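The lemma's condition is easy to test directly. A tiny sketch (not part of the slides) that checks the prefix condition for a set of codewords, tried on C1 and C2 from the tables:

```python
def is_prefix_free(code):
    """True iff no codeword in `code` is a prefix of a different codeword."""
    for c1 in code:
        for c2 in code:
            if c1 != c2 and c2.startswith(c1):
                return False  # c1 is a prefix of c2: not immediately decodable
    return True

print(is_prefix_free({'00', '10', '01', '11'}))   # True  (C1)
print(is_prefix_free({'0', '01', '011', '111'}))  # False (C2: '0' is a prefix of '01')
```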

Page 12

break: prefix condition and user interface

The prefix condition is important in engineering design.

bad example: strokes for character writing on Palm PDA

graffiti (ver. 1) / graffiti (ver. 2)

- ver. 1: basically one stroke only
- ver. 2: some characters need two strokes, and the prefix condition is
  violated: "–" vs. "=", and "– 1" vs. "+"

Page 13

how to achieve the prefix condition

easy ways to construct codes with the prefix condition:
- let all codewords have the same length
- put a special pattern at the end of each codeword:
  C = {011, 1011, 01011, 10011} ... "comma code"
... too straightforward

select codewords by using a tree structure (code tree):
- for binary codes, we use binary trees
- for k-ary codes, we use trees with degree k

[figure: a code tree with degree 3]

Page 14

construction of codes (k-ary case)

how to construct a k-ary code with M codewords:

1. construct a k-ary tree T with M leaf nodes
2. for each branch (枝) of T, assign a label in {0, ..., k – 1};
   sibling (兄弟) branches cannot have the same label
3. for each leaf node of T, traverse T from the root to the leaf,
   concatenating (連接する) the labels on the branches;
   the obtained sequence is the codeword of that node
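The three steps above can be sketched in a few lines. This is an illustrative implementation, not the slides' own: the nested-list tree representation (a leaf is a string, an internal node is a list of children) and the leaf names are assumptions.

```python
def codewords(tree, prefix=''):
    """Map each leaf symbol to its codeword; branch i gets label str(i),
    so sibling branches automatically receive distinct labels."""
    if isinstance(tree, str):          # a leaf node: the path so far is its codeword
        return {tree: prefix}
    result = {}
    for label, child in enumerate(tree):
        result.update(codewords(child, prefix + str(label)))
    return result

# a binary tree with four leaves at depth 2
print(codewords([['a1', 'a2'], ['a3', 'a4']]))
# {'a1': '00', 'a2': '01', 'a3': '10', 'a4': '11'}
```

Because every codeword ends at a leaf, no codeword can be the prefix of another, so the prefix condition holds by construction.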

Page 15

example

construct a binary code with four codewords

[figure: Step 1 – draw a binary tree with four leaf nodes; Step 2 – label
sibling branches with 0 and 1; Step 3 – read the codewords off the leaves]

the constructed code is {00, 01, 10, 11}

Page 16

example (cnt’d)

other constructions: we can choose different trees, different labelings...

[figures: three code trees with differently labeled branches]

C1 = {0, 10, 110, 111}
C2 = {0, 11, 101, 100}
C3 = {01, 000, 1011, 1010}

The prefix condition is always guaranteed:
immediately decodable codes are constructed.

Page 17

C1 seems to give a more compact representation than C3:
codeword lengths [1, 2, 3, 3] vs. [2, 3, 4, 4].

Can we construct even more compact immediately decodable codes?
- codeword lengths = [1, 1, 1, 1]?
- codeword lengths = [1, 2, 2, 3]?
- codeword lengths = [2, 2, 2, 3]?

the "best" among immediately decodable codes ... what is the criterion (基準)?

C1 = {0, 10, 110, 111}, C3 = {01, 000, 1011, 1010}

Page 18

Kraft’s inequality

Theorem:
A) If a k-ary code {c1, ..., cM} with |ci| = li is immediately decodable,
   then k^(–l1) + k^(–l2) + ... + k^(–lM) ≤ 1 (Kraft's inequality) holds.
B) If k^(–l1) + k^(–l2) + ... + k^(–lM) ≤ 1, then we can construct a k-ary
   immediately decodable code {c1, ..., cM} with |ci| = li.

proof omitted in this class ... it uses results of graph theory

[trivia] The result was given in the Master's thesis of L. Kraft.
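Evaluating the left-hand side of Kraft's inequality takes one line. A sketch (not from the slides) applied to C1's lengths and to the two candidate length lists discussed around this theorem:

```python
def kraft_sum(lengths, k=2):
    """Left-hand side of Kraft's inequality for a k-ary code:
    sum over i of k^(-l_i)."""
    return sum(k ** (-l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1.0    <= 1 : constructible (C1)
print(kraft_sum([1, 2, 2, 3]))  # 1.125  >  1 : not constructible
print(kraft_sum([2, 2, 2, 3]))  # 0.875  <= 1 : constructible
```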

Page 19

back to the examples

Can we construct more compact immediately decodable codes?

codeword lengths = [1, 2, 2, 3]?
... 2^–1 + 2^–2 + 2^–2 + 2^–3 = 1.125 > 1:
we cannot construct an immediately decodable code.

codeword lengths = [2, 2, 2, 3]?
... 2^–2 + 2^–2 + 2^–2 + 2^–3 = 0.875 ≤ 1:
we can construct an immediately decodable code,
by simply constructing a code tree...

Page 20

to the next step

- basic properties needed for source coding:
  uniquely decodable / immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code
- theoretical limit of the "compression"
- related topics

(today: the first two topics)

Page 21

the measure of efficiency

We want to construct a good source coding scheme:
- easy to use ... immediately decodable
- efficient ... but what is "efficiency"?

We try to minimize the expected length of a codeword
for representing one symbol:

  symbol        a1   a2   ...   aM
  probability   p1   p2   ...   pM
  codeword      c1   c2   ...   cM
  length        l1   l2   ...   lM

average codeword length  L = ∑_{i=1}^{M} p_i l_i

Page 22

computing the average codeword length

  symbol        a1    a2    a3    a4
  probability   0.4   0.3   0.2   0.1
  C1            0     10    110   111
  C2            111   110   10    0
  C3            00    01    10    11

C1: 0.4×1 + 0.3×2 + 0.2×3 + 0.1×3 = 1.9
C2: 0.4×3 + 0.3×3 + 0.2×2 + 0.1×1 = 2.6
C3: 0.4×2 + 0.3×2 + 0.2×2 + 0.1×2 = 2.0

It is expected that C1 gives the most compact representation in typical cases.
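The three averages can be reproduced with a one-line helper, a sketch not taken from the slides:

```python
def average_length(probs, code):
    """Expected codeword length: sum over i of p_i * l_i."""
    return sum(p * len(c) for p, c in zip(probs, code))

p = [0.4, 0.3, 0.2, 0.1]
print(round(average_length(p, ['0', '10', '110', '111']), 2))  # 1.9 (C1)
print(round(average_length(p, ['111', '110', '10', '0']), 2))  # 2.6 (C2)
print(round(average_length(p, ['00', '01', '10', '11']), 2))   # 2.0 (C3)
```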

Page 23

Huffman code

The Huffman algorithm gives a clever way to construct
a code with small average codeword length:

1. prepare M isolated nodes, each attached with the probability
   of a symbol (node = size-one tree)
2. repeat the following operation until all trees are joined into one:
   a. select the two trees T1 and T2 having the smallest probabilities
   b. join T1 and T2 by introducing a new parent node
   c. the sum of the probabilities of T1 and T2 is given to the new tree

David Huffman, 1925-1999
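The two steps can be sketched compactly with a heap that keeps the partial trees ordered by probability. This is an illustrative implementation, not the slides' own code, and the 0/1 label assignment is one arbitrary choice among the valid ones:

```python
import heapq

def huffman(probs):
    """probs: dict symbol -> probability; returns dict symbol -> codeword."""
    # heap entries: (probability, tie-breaker, {symbol: partial codeword});
    # the integer tie-breaker avoids comparing the dicts when probabilities tie
    heap = [(p, i, {sym: ''}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)  # the two smallest-probability trees
        p2, _, t2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in t1.items()}        # label one subtree 0,
        merged.update({s: '1' + c for s, c in t2.items()})  # the other subtree 1
        heapq.heappush(heap, (p1 + p2, count, merged))      # "merged company"
        count += 1
    return heap[0][2]

code = huffman({'A': 0.6, 'B': 0.25, 'C': 0.1, 'D': 0.05})
print([len(code[s]) for s in 'ABCD'])  # [1, 2, 3, 3]
```

With the probabilities of the next page's example, the codeword lengths come out as 1, 2, 3 and 3 bits, for an average length of 0.6×1 + 0.25×2 + 0.1×3 + 0.05×3 = 1.55 bit.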

Page 24

example

[figure: Huffman construction for A: 0.6, B: 0.25, C: 0.1, D: 0.05 –
join D and C into a tree of probability 0.15, join it with B into 0.4,
join that with A into 1.0, then label the branches with 0 and 1]

“merger of small companies”

Page 25

exercise

compare the average length with that of the equal-length codes...

             A     B     C     D     E
  prob.      0.2   0.1   0.3   0.3   0.1
  codewords

Page 26

exercise

compare the average length with that of the equal-length codes...

             A     B     C     D     E     F
  prob.      0.3   0.2   0.2   0.1   0.1   0.1
  codewords

Page 27

different construction, same efficiency

We may have multiple options during the code construction:
- several nodes may have the same smallest probability
- labels can be assigned differently to branches

A different option results in a different Huffman code, but...
the average length does not depend on the chosen option.

[figure: two different Huffman trees for the source
a1: 0.4, a2: 0.2, a3: 0.2, a4: 0.1, a5: 0.1]

Page 28

summary of today’s class

- basic properties needed for source coding:
  uniquely decodable / immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code
- theoretical limit of the "compression"
- related topics

(today: the first two topics)

Page 29

exercise

Construct a binary Huffman code for the information source given in the table.

Compute the average codeword length of the constructed code.

Can you construct a 4-ary Huffman code for the source?

          A      B      C      D      E      F      G      H
  prob.   0.363  0.174  0.143  0.098  0.087  0.069  0.045  0.021