exercise in the previous class
Q1: Compute P(X=Y1) and P(X=Y2): P(X=Y1) = 0.73 and P(X=Y2) = 0
Q2: Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit; to compute H(X|Y1), determine some probabilities:
P(X, Y1):

| Y1 \ X | sunny | rain |
|--------|-------|------|
| sunny  | 0.45  | 0.15 |
| rain   | 0.12  | 0.28 |

P(X, Y2):

| Y2 \ X | sunny | rain |
|--------|-------|------|
| sunny  | 0     | 0.43 |
| rain   | 0.57  | 0    |

P(X | Y1):

| Y1 \ X | sunny | rain |
|--------|-------|------|
| sunny  | 0.75  | 0.25 |
| rain   | 0.30  | 0.70 |
P(Y1=sunny) = 0.6, P(Y1=rain) = 0.4
H(X|Y1=sunny) = 0.811 bit, H(X|Y1=rain) = 0.881 bit
H(X|Y1) = 0.6 × 0.811 + 0.4 × 0.881 = 0.839 bit
I(X; Y1) = H(X) – H(X|Y1) = 0.986 – 0.839 = 0.147 bit
exercise in the previous class (cont'd)
Q2 (continued): Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit; to compute H(X|Y2), determine some probabilities:
P(X | Y2):

| Y2 \ X | sunny | rain |
|--------|-------|------|
| sunny  | 0     | 1    |
| rain   | 1     | 0    |
P(Y2=sunny) = 0.43, P(Y2=rain) = 0.57
H(X|Y2=sunny) = 0, H(X|Y2=rain) = 0
H(X|Y2) = 0.43 × 0 + 0.57 × 0 = 0 bit
I(X; Y2) = H(X) – H(X|Y2) = 0.986 – 0 = 0.986 bit
Q3: Which is the better forecast? Y2, because it gives more information. (Y2 never agrees with X, since P(X=Y2) = 0, yet its prediction determines X exactly.)
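These numbers are easy to check mechanically. Below is a quick sketch in Python (not part of the original slides; the function names are ours) that recomputes I(X; Y1) and I(X; Y2) from the joint tables:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X; Y) = H(X) - H(X|Y) for a 2x2 joint table, rows indexed by Y."""
    p_x = [joint[0][x] + joint[1][x] for x in range(2)]   # marginal of X
    p_y = [sum(row) for row in joint]                     # marginal of Y
    h_x_given_y = sum(
        p_y[y] * entropy([joint[y][x] / p_y[y] for x in range(2)])
        for y in range(2) if p_y[y] > 0
    )
    return entropy(p_x) - h_x_given_y

joint_Y1 = [[0.45, 0.15], [0.12, 0.28]]   # P(X, Y1): rows Y1, columns X
joint_Y2 = [[0.00, 0.43], [0.57, 0.00]]   # P(X, Y2)
print(round(mutual_information(joint_Y1), 3))   # 0.147
print(round(mutual_information(joint_Y2), 3))   # 0.986
```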
chapter 2: compact representation of information
We learn how to encode symbols from an information source: source coding, data compression.
the purpose of source encoding: to give representations that are good for communication, and to discard redundancy
We want a source coding scheme that gives... encoding as precise as possible, and as compact as possible
the purpose of chapter 2
[diagram: source → encoder → 0101101]
plan of the chapter
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code, theoretical limit of the "compression", related topics
today: basic properties and Huffman code
words and terms
For the time being, we consider symbol-by-symbol encodings only.
M ... the set of symbols generated by an information source. For each symbol in M, we associate a sequence over {0, 1}.
codewords: the sequences associated to the symbols in M
code: the set of codewords
alphabet: {0, 1} in this case ... a binary code
| M | sunny | cloudy | rainy |
|---|-------|--------|-------|
| C | 00    | 010    | 101   |

three codewords: 00, 010 and 101; the code C = {00, 010, 101}
011 is NOT a codeword, for example
encoding and decoding
encode ... to determine the codeword for a given symbol
decode ... to determine the symbol for a given codeword
[diagram: sunny, cloudy, rainy are encoded to 00, 010, 101, and decoded back]
There are NO separation symbols between codewords: "010 00 101 101" ... NG; "01000101101" ... OK.
Why? With separators, the alphabet would be {0, 1, "space"}, which has three symbols, not two.
uniquely decodable codes
A code must be uniquely decodable: different symbol sequences are encoded to different 0-1 sequences.
uniquely decodable ⇒ the codewords are all different, but the converse does not hold in general.
| symbol | C1 | C2  | C3 | C4 |
|--------|----|-----|----|----|
| a1     | 00 | 0   | 0  | 0  |
| a2     | 10 | 01  | 10 | 1  |
| a3     | 01 | 011 | 11 | 01 |
| a4     | 11 | 111 | 01 | 10 |
| uniquely decodable? | yes | yes | no | no |
with the code C3, the sequence 0110 has two parsings:
0 | 11 | 0 → a1 a3 a1
01 | 10 → a4 a2
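A brute-force parser makes the ambiguity concrete. This sketch (ours, not from the slides) enumerates every way to split a sequence into codewords of C3 as reconstructed above:

```python
def parses(seq, code, acc=()):
    """Return all ways to split seq into a sequence of codewords."""
    if not seq:
        return [acc]
    results = []
    for word in code:
        if seq.startswith(word):
            results += parses(seq[len(word):], code, acc + (word,))
    return results

C3 = ["0", "10", "11", "01"]     # codewords of a1, a2, a3, a4
print(parses("0110", C3))
# [('0', '11', '0'), ('01', '10')] -- a1 a3 a1 or a4 a2: not uniquely decodable
```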
more than uniqueness
consider a scenario of using C2... a1, a4, a4, a1 is encoded to 01111110. The 0-1 sequence is transmitted at 1 bit/sec. When does the receiver find that the first symbol is a1?
| symbol | C1 | C2  |
|--------|----|-----|
| a1     | 00 | 0   |
| a2     | 10 | 01  |
| a3     | 01 | 011 |
| a4     | 11 | 111 |
seven seconds later, the receiver has obtained 0111111:
- if 0 comes next, then 0 | 111 | 111 | 0 → a1, a4, a4, a1
- if 1 comes next, then 01 | 111 | 111 → a2, a4, a4
We cannot finalize the first symbol even seven seconds later: we need a buffer to save the data, and decoding suffers latency...
immediately decodable codes
A code must be uniquely decodable, and if possible, it should also be immediately decodable.
Decoding is possible without looking ahead in the sequence: if you find a codeword pattern, decode it immediately.
This is an important property from an engineering viewpoint.
formally writing... if a sequence is written as $c_1 s_1$ with $c_1 \in C$ and $s_1 \in \{0, 1\}^*$, then there are no $c_2 \in C$ and $s_2 \in \{0, 1\}^*$ such that $c_1 s_1 = c_2 s_2$ and $c_1 \neq c_2$.
prefix condition
If a code is NOT immediately decodable, then there are sequences $s_1, s_2$ such that $c_1 s_1 = c_2 s_2$ with different codewords $c_1$ and $c_2$.
the codeword $c_1$ is then a prefix of $c_2$ ($c_1$ is the same as the beginning part of $c_2$)
| symbol | C2  |
|--------|-----|
| a1     | 0   |
| a2     | 01  |
| a3     | 011 |
| a4     | 111 |
"0" is a prefix of "01" and "011"; "01" is a prefix of "011"
Lemma: A code C is immediately decodable if and only if no codeword in C is a prefix of another codeword (the prefix condition).
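The lemma reduces immediate decodability to a pairwise test of codewords. A minimal sketch (ours, not from the slides):

```python
def satisfies_prefix_condition(code):
    """True iff no codeword is a prefix of another codeword."""
    return not any(u != v and v.startswith(u) for u in code for v in code)

print(satisfies_prefix_condition(["00", "010", "101"]))       # True: immediately decodable
print(satisfies_prefix_condition(["0", "01", "011", "111"]))  # False: "0" is a prefix of "01"
```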
break: prefix condition and user interface
The prefix condition is important in engineering design.
bad example: strokes for character writing on Palm PDA
[figures: Graffiti (ver. 1) and Graffiti (ver. 2) stroke alphabets]
Graffiti (ver. 1): basically one stroke per character
Graffiti (ver. 2): some characters need two strokes, so the prefix condition is violated: "– –" vs "=", and "– 1" vs "+"
how to achieve the prefix condition
easy ways to construct codes with the prefix condition:
- let all codewords have the same length
- put a special pattern at the end of each codeword: C = {011, 1011, 01011, 10011} ... a "comma code" ... too straightforward

select codewords by using a tree structure (code tree):
- for binary codes, we use binary trees
- for k-ary codes, we use trees with degree k
[figure: a code tree with degree 3]
construction of codes (k-ary case)
how to construct a k-ary code with M codewords
1. construct a k-ary tree T with M leaf nodes
2. for each branch of T, assign a label in {0, ..., k – 1}; sibling branches cannot have the same label
3. for each leaf node of T, traverse T from the root to the leaf, concatenating the labels on the branches; the obtained sequence is the codeword of that leaf (see the sketch below)
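The procedure maps directly onto a traversal. A minimal sketch (ours, not from the slides), assuming the code tree is given as nested dicts mapping branch labels to subtrees, with symbol names at the leaves:

```python
def codewords(tree, path=""):
    """Collect codewords by walking from the root to every leaf,
    concatenating branch labels along the way."""
    if isinstance(tree, str):            # leaf: the path so far is its codeword
        return {tree: path}
    words = {}
    for label, subtree in tree.items():  # sibling branches carry distinct labels
        words.update(codewords(subtree, path + label))
    return words

# binary code tree of C1 = {0, 10, 110, 111} from the later example
tree = {"0": "a1", "1": {"0": "a2", "1": {"0": "a3", "1": "a4"}}}
print(codewords(tree))   # {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}
```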
example
construct a binary code with four codewords
[figure: Step 1 — a binary tree with four leaves; Step 2 — branches labeled 0 and 1; Step 3 — codewords 00, 01, 10, 11 read off the root-to-leaf paths]
the constructed code is {00, 01, 10, 11}
example (cont'd)
other constructions: we can choose different trees and different labelings...
[figure: three code trees with different shapes and branch labelings]
C1={0, 10, 110, 111}
C2={0, 11, 101, 100}
C3={01, 000, 1011, 1010}
The prefix condition is always guaranteed. Immediately decodable codes are constructed.
C1 seems to give a more compact representation than C3: its codeword lengths are [1, 2, 3, 3].
Can we construct more compact immediately decodable codes?
- codeword lengths = [1, 1, 1, 1]?
- codeword lengths = [1, 2, 2, 3]?
- codeword lengths = [2, 2, 2, 3]?
the “best” among immediately decodable codes
[figure: code trees of C1 = {0, 10, 110, 111} and C3 = {01, 000, 1011, 1010}]
What is the criterion?
Kraft’s inequality
Theorem:
A) If a k-ary code {c1, ..., cM} with |ci| = li is immediately decodable, then $\sum_{i=1}^{M} k^{-l_i} \le 1$ (Kraft's inequality) holds.
B) If $\sum_{i=1}^{M} k^{-l_i} \le 1$, then we can construct a k-ary immediately decodable code {c1, ..., cM} with |ci| = li.
proof omitted in this class ... use results of graph theory
[trivia] The result is given in the Master's thesis of L. Kraft.
back to the examples
Can we construct more compact immediately decodable codes?
- codeword lengths = [1, 2, 2, 3]? ... We cannot construct an immediately decodable code: $2^{-1} + 2^{-2} + 2^{-2} + 2^{-3} = 1.125 > 1$ violates Kraft's inequality.
- codeword lengths = [2, 2, 2, 3]? ... We can construct an immediately decodable code, since $2^{-2} + 2^{-2} + 2^{-2} + 2^{-3} = 0.875 \le 1$, by simply constructing a code tree... (checked numerically below)
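These two answers follow from the Kraft sums; a quick numeric check (a sketch, with a helper name of our choosing):

```python
def kraft_sum(lengths, k=2):
    """Left-hand side of Kraft's inequality for the given codeword lengths."""
    return sum(k ** (-l) for l in lengths)

print(kraft_sum([1, 2, 2, 3]))   # 1.125 > 1: no immediately decodable code exists
print(kraft_sum([2, 2, 2, 3]))   # 0.875 <= 1: such a code can be constructed
```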
to the next step
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code, theoretical limit of the "compression", related topics
today: Huffman code
the measure of efficiency
We want to construct a good source coding scheme:
- easy to use ... immediately decodable
- efficient ... but what is "efficiency"?
We try to minimize the expected length of a codeword for representing one symbol:
| symbol | probability | codeword | length |
|--------|-------------|----------|--------|
| a1     | p1          | c1       | l1     |
| a2     | p2          | c2       | l2     |
| :      | :           | :        | :      |
| aM     | pM          | cM       | lM     |
average codeword length: $\sum_{i=1}^{M} p_i l_i$
computing the average codeword length
| symbol | probability | C1  | C2  | C3 |
|--------|-------------|-----|-----|----|
| a1     | 0.4         | 0   | 111 | 00 |
| a2     | 0.3         | 10  | 110 | 01 |
| a3     | 0.2         | 110 | 10  | 10 |
| a4     | 0.1         | 111 | 0   | 11 |
C1: 0.4×1 + 0.3×2 + 0.2×3 + 0.1×3 = 1.9
C2: 0.4×3 + 0.3×3 + 0.2×2 + 0.1×1 = 2.6
C3: 0.4×2 + 0.3×2 + 0.2×2 + 0.1×2 = 2.0
It is expected that C1 gives the most compact representation in typical cases.
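These averages can be verified directly; a quick sketch (ours, not from the slides), reading the probabilities and codewords off the table above:

```python
probs = [0.4, 0.3, 0.2, 0.1]
codes = {"C1": ["0", "10", "110", "111"],
         "C2": ["111", "110", "10", "0"],
         "C3": ["00", "01", "10", "11"]}
for name, words in codes.items():
    avg = sum(p * len(w) for p, w in zip(probs, words))
    print(name, round(avg, 2))   # C1 1.9, C2 2.6, C3 2.0
```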
Huffman code
The Huffman algorithm gives a clever way to construct a code with a small average codeword length.
1. prepare M isolated nodes, each attached with the probability of a symbol (node = size-one tree)
2. repeat the following operation until all trees are joined into one:
   a. select the two trees T1 and T2 having the smallest probabilities
   b. join T1 and T2 by introducing a new parent node
   c. the sum of the probabilities of T1 and T2 is given to the new tree
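The steps translate directly into code. A minimal sketch (ours, not the lecture's), using Python's standard heapq as the priority queue over trees:

```python
import heapq
from itertools import count

def huffman(probs):
    """Binary Huffman code for a dict mapping symbol -> probability."""
    order = count()   # tie-breaker so equal probabilities never compare trees
    heap = [(p, next(order), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # a. the two trees with the
        p2, _, t2 = heapq.heappop(heap)   #    smallest probabilities
        heapq.heappush(heap, (p1 + p2, next(order), (t1, t2)))  # b, c. join them
    code = {}
    def walk(tree, path):
        if isinstance(tree, tuple):       # inner node: label the two branches
            walk(tree[0], path + "0")
            walk(tree[1], path + "1")
        else:                             # leaf: path from the root = codeword
            code[tree] = path
    walk(heap[0][2], "")
    return code

print(huffman({"A": 0.6, "B": 0.25, "C": 0.1, "D": 0.05}))
# codeword lengths: A -> 1, B -> 2, C -> 3, D -> 3 (exact labels depend on ties)
```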
David Huffman (1925-1999)
example
[figure: Huffman construction for A: 0.6, B: 0.25, C: 0.1, D: 0.05 — D and C are joined into a tree of probability 0.15, then that tree and B into 0.4, then 0.4 and A into 1.0; branches are labeled 0 and 1]
“merger of small companies”
exercise
compare the average length with the equal-length codes...
| symbol    | A   | B   | C   | D   | E   |
|-----------|-----|-----|-----|-----|-----|
| prob.     | 0.2 | 0.1 | 0.3 | 0.3 | 0.1 |
| codewords |     |     |     |     |     |
exercise
compare the average length with the equal-length codes...

| symbol    | A   | B   | C   | D   | E   | F   |
|-----------|-----|-----|-----|-----|-----|-----|
| prob.     | 0.3 | 0.2 | 0.2 | 0.1 | 0.1 | 0.1 |
| codewords |     |     |     |     |     |     |
different construction, same efficiency
We may have multiple options during the code construction:
- several nodes may have the same smallest probability
- labels can be assigned differently to branches
A different option results in a different Huffman code, but... the average length does not depend on the chosen option.
[figure: two different Huffman trees for a1: 0.4, a2: 0.2, a3: 0.2, a4: 0.1, a5: 0.1, both attaining the same average length]
summary of today’s class
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code, theoretical limit of the "compression", related topics
exercise
- Construct a binary Huffman code for the information source given in the table.
- Compute the average codeword length of the constructed code.
- Can you construct a 4-ary Huffman code for the source?
| symbol | A     | B     | C     | D     | E     | F     | G     | H     |
|--------|-------|-------|-------|-------|-------|-------|-------|-------|
| prob.  | 0.363 | 0.174 | 0.143 | 0.098 | 0.087 | 0.069 | 0.045 | 0.021 |