exercise in the previous class
exercise in the previous class
Q1: Compute P(X=Y1) and P(X=Y2): P(X=Y1) = 0.73 and P(X=Y2) = 0
Q2: Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit; to compute H(X|Y1), determine some probabilities:
joint counts (out of 100):

            X=sunny   X=rain
Y1=sunny      45        15
Y1=rain       12        28

            X=sunny   X=rain
Y2=sunny       0        43
Y2=rain       57         0

P(X|Y1):
            X=sunny   X=rain
Y1=sunny     0.75      0.25
Y1=rain      0.30      0.70
P(Y1=sunny) = 0.6, P(Y1=rain) = 0.4
H(X|Y1=sunny) = 0.811, H(X|Y1=rain) = 0.881
H(X|Y1) = 0.6×0.811 + 0.4×0.881 = 0.839
I(X; Y1) = H(X) − H(X|Y1) = 0.986 − 0.839 = 0.147 bit
exercise in the previous class (cont'd)
Q2: Compute I(X; Y1) and I(X; Y2): H(X) = 0.986 bit; to compute H(X|Y2), determine some probabilities:
joint counts (out of 100):

            X=sunny   X=rain
Y1=sunny      45        15
Y1=rain       12        28

            X=sunny   X=rain
Y2=sunny       0        43
Y2=rain       57         0

P(X|Y2):
            X=sunny   X=rain
Y2=sunny       0         1
Y2=rain        1         0
P(Y2=sunny) = 0.43, P(Y2=rain) = 0.57
H(X|Y2=sunny) = 0, H(X|Y2=rain) = 0
H(X|Y2) = 0.43×0 + 0.57×0 = 0
I(X; Y2) = H(X) − H(X|Y2) = 0.986 − 0 = 0.986 bit
Q3: Which is the better forecast? Y2 gives more information: although Y2 is never correct (P(X = Y2) = 0), knowing Y2 determines the weather exactly.
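The whole computation can be checked mechanically. A minimal Python sketch (my own illustration, not part of the slides) that derives I(X; Y) = H(X) − H(X|Y) from the joint tables above:

```python
import math

# joint probabilities P(Y=y, X=x), i.e. the counts above divided by 100
joint_y1 = {('sunny', 'sunny'): 0.45, ('sunny', 'rain'): 0.15,
            ('rain',  'sunny'): 0.12, ('rain',  'rain'): 0.28}
joint_y2 = {('sunny', 'sunny'): 0.00, ('sunny', 'rain'): 0.43,
            ('rain',  'sunny'): 0.57, ('rain',  'rain'): 0.00}

def entropy(dist):
    """Shannon entropy in bits; the convention 0·log 0 = 0 is used."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X; Y) = H(X) - H(X|Y) from a joint distribution P(Y=y, X=x)."""
    p_x, p_y = {}, {}
    for (y, x), p in joint.items():          # marginals of X and Y
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    h_x_given_y = 0.0
    for y, py in p_y.items():
        if py > 0:                            # P(X=x | Y=y) for this value of y
            cond = {x: joint[(y, x)] / py for x in p_x}
            h_x_given_y += py * entropy(cond)
    return entropy(p_x) - h_x_given_y

print(mutual_information(joint_y1))           # ≈ 0.147 bit
print(mutual_information(joint_y2))           # ≈ 0.986 bit
```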
chapter 2: compact representation of information
We learn how to encode symbols from an information source:
- source coding
- data compression

the purpose of source encoding:
- to give representations which are good for communication
- to discard redundancy

We want a source coding scheme which gives...
- as precise an encoding as possible
- as compact an encoding as possible
the purpose of chapter 2
[diagram: information source → encoder → 0101101]
plan of the chapter
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code: theoretical limit of the “compression”, related topics
today
words and terms
For the time being, we consider symbol-by-symbol encodings only.
M ... the set of symbols generated by an information source.
For each symbol in M, associate a sequence over {0, 1}.
codewords: the sequences associated to symbols in M
code: the set of codewords
alphabet: {0, 1} in this case ... a binary code
M:  sunny   cloudy   rainy
C:  00      010      101

three codewords: 00, 010 and 101; the code C = {00, 010, 101}
011 is NOT a codeword, for example
encoding and decoding
encode ... to determine the codeword for a given symbol
decode ... to determine the symbol for a given codeword
[diagram: sunny/cloudy/rainy ⇄ 00/010/101, mapped by encode (symbols → codewords) and decode (codewords → symbols)]
NO separation symbols between codewords:
"010 00 101 101" ... NG (not allowed); "01000101101" ... OK
Why? With {0, 1, "space"}, the alphabet would have three symbols, not two.
uniquely decodable codes
A code must be uniquely decodable:
different symbol sequences are encoded to different 0-1 sequences.
uniquely decodable ⇒ the codewords are all different,
but the converse does not hold in general.
      a1   a2   a3    a4    uniquely decodable?
C1    0    01   001   11    no
C2    0    01   011   111   yes
C3    0    10   11    01    no
C4    0    10   110   …     yes
with the code C3, the sequence 0110 has two decodings:
a1 a3 a1 = 0|11|0 and a4 a2 = 01|10
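Such a collision can also be found by brute force. A sketch of my own (not the lecture's method): encode every symbol sequence up to a fixed length and report the first pair that maps to the same 0-1 sequence; finding none proves nothing, since only bounded lengths are checked.

```python
from itertools import product

def find_collision(code, max_symbols=4):
    """Return two different symbol-index sequences with equal encodings, or None."""
    seen = {}
    for n in range(1, max_symbols + 1):
        for seq in product(range(len(code)), repeat=n):
            enc = ''.join(code[i] for i in seq)
            if enc in seen and seen[enc] != seq:
                return seen[enc], seq, enc
            seen.setdefault(enc, seq)
    return None

C3 = ['0', '10', '11', '01']     # codewords of a1, a2, a3, a4
print(find_collision(C3))        # ((0, 1), (3, 0), '010'): a1 a2 and a4 a1 collide
```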
more than uniqueness
consider a scenario of using C2:
a1, a4, a4, a1 is encoded to 01111110.
The 0-1 sequence is transmitted at 1 bit/sec.
When does the receiver find that the first symbol is a1?
      a1   a2   a3    a4
C1    0    01   001   11
C2    0    01   011   111
seven seconds later, the receiver has obtained 0111111:
if 0 comes next, then 0 - 111 - 111 - 0 → a1, a4, a4, a1
if 1 comes next, then 01 - 111 - 111 → a2, a4, a4
We cannot finalize the first symbol even seven seconds later:
a buffer is needed to save the data, and the decoding suffers latency...
immediately decodable codes
A code must be uniquely decodable, and if possible, it should be immediately decodable.
Decoding is possible without looking ahead in the sequence:
if you find a codeword pattern, then decode it immediately.
This is an important property from an engineering viewpoint.
formally writing... if a sequence is written as c1 s1 with c1 ∈ C and s1 ∈ {0, 1}*, then there is no c2 ∈ C and s2 ∈ {0, 1}* such that c1 s1 = c2 s2 and c1 ≠ c2.
prefix condition
If a code is NOT immediately decodable, then there is a sequence c1 s1 = c2 s2 with different codewords c1 and c2:
the codeword c1 is a prefix of c2 (c1 is the same as the beginning part of c2).
      a1   a2   a3    a4
C2    0    01   011   111

"0" is a prefix of "01" and "011"; "01" is a prefix of "011".
Lemma: A code C is immediately decodable if and only if no codeword in C is a prefix of another codeword (the prefix condition).
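The lemma translates directly into a short test. A minimal sketch (the function name is mine, not the lecture's):

```python
def satisfies_prefix_condition(code):
    """True iff no codeword is a prefix of another codeword."""
    for c1 in code:
        for c2 in code:
            if c1 != c2 and c2.startswith(c1):
                return False        # c1 is a prefix of c2
    return True

print(satisfies_prefix_condition(['0', '01', '011', '111']))  # False: C2 above
print(satisfies_prefix_condition(['0', '10', '110', '111']))  # True: immediately decodable
```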
break: prefix condition and user interface
The prefix condition is important in engineering design.
bad example: strokes for character writing on Palm PDA
graffiti (ver. 1): basically one stroke only
graffiti (ver. 2): some characters need two strokes, and the prefix condition is violated:
"=" is written "– –" and "+" is written "– 1", so the one-stroke "–" is a prefix of both.
how to achieve the prefix condition
easy ways to construct codes with the prefix condition:
- let all codewords have the same length
- put a special pattern at the end of each codeword:
  C = {011, 1011, 01011, 10011} ... a "comma code" ... too straightforward
a better way: select codewords by using a tree structure (code tree):
- for binary codes, we use binary trees
- for k-ary codes, we use trees with degree k
[figure: a code tree with degree 3]
construction of codes (k-ary case)
how to construct a k-ary code with M codewords:
1. construct a k-ary tree T with M leaf nodes
2. for each branch of T, assign a label in {0, ..., k − 1}; sibling branches cannot have the same label
3. for each leaf node of T, traverse T from the root to the leaf, concatenating the labels on the branches; the obtained sequence is the codeword of that node (see the sketch below)
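A small sketch of step 3 (the nested-list tree representation and the function name are my own, not the slides'): a leaf is None, an internal node is a list of subtrees, and branch i carries label i.

```python
def codewords(tree, prefix=''):
    """Collect the codewords of a code tree by root-to-leaf traversal."""
    if tree is None:                        # leaf: the accumulated labels form a codeword
        return [prefix]
    words = []
    for label, subtree in enumerate(tree):  # sibling branches get distinct labels 0..k-1
        words += codewords(subtree, prefix + str(label))
    return words

print(codewords([[None, None], [None, None]]))   # ['00', '01', '10', '11']
print(codewords([None, [None, [None, None]]]))   # ['0', '10', '110', '111']
```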
example
construct a binary code with four codewords
[figure: Step 1 — a binary tree with four leaf nodes; Step 2 — sibling branches labeled 0 and 1; Step 3 — the root-to-leaf paths give 00, 01, 10, 11]
the constructed code is {00, 01, 10, 11}
example (cont'd)
other constructions: we can choose different trees, different labelings...
[figure: three different code trees and branch labelings]
C1={0, 10, 110, 111}
C2={0, 11, 101, 100}
C3={01, 000, 1011, 1010}
The prefix condition is always guaranteed, so immediately decodable codes are constructed.
C1 seems to give a more compact representation than C3: codeword lengths [1, 2, 3, 3] versus [2, 3, 4, 4].
Can we construct more compact immediately decodable codes?
- codeword lengths [1, 1, 1, 1]?
- codeword lengths [1, 2, 2, 3]?
- codeword lengths [2, 2, 2, 3]?
the “best” among immediately decodable codes
[figure: the code trees of C1 = {0, 10, 110, 111} and C3 = {01, 000, 1011, 1010}]
What is the criterion?
Kraft’s inequality
Theorem:
A) If a k-ary code {c1, ..., cM} with |ci| = li is immediately decodable, then Σ_{i=1}^{M} k^(−l_i) ≤ 1 (Kraft's inequality) holds.
B) If Σ_{i=1}^{M} k^(−l_i) ≤ 1, then we can construct a k-ary immediately decodable code {c1, ..., cM} with |ci| = li.
proof omitted in this class ... use results of graph theory
[trivia] The result is given in the Master's thesis of L. Kraft.
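Although the proof is omitted, the inequality itself is easy to evaluate. A small sketch (the helper name is mine), applied to the codeword-length lists discussed above:

```python
from fractions import Fraction

def kraft_sum(lengths, k=2):
    """Left-hand side of Kraft's inequality, as an exact fraction."""
    return sum(Fraction(1, k ** l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))   # 1    -> satisfiable (C1 is such a code)
print(kraft_sum([1, 2, 2, 3]))   # 9/8  -> violates the inequality
print(kraft_sum([2, 2, 2, 3]))   # 7/8  -> satisfiable
```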
back to the examples
Can we construct more compact immediately decodable codes?
codeword lengths [1, 2, 2, 3]?
... We cannot construct an immediately decodable code: 2^(−1) + 2^(−2) + 2^(−2) + 2^(−3) = 9/8 > 1 violates Kraft's inequality.
codeword lengths [2, 2, 2, 3]?
... We can construct an immediately decodable code, since 2^(−2) + 2^(−2) + 2^(−2) + 2^(−3) = 7/8 ≤ 1, by simply constructing a code tree...
to the next step
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code: theoretical limit of the “compression”, related topics
today
the measure of efficiency
We want to construct a good source coding scheme:
- easy to use ... immediately decodable
- efficient ... but what is "efficiency"?
We try to minimize the expected length of the codeword representing one symbol:

symbol       a1   a2   ...   aM
probability  p1   p2   ...   pM
codeword     c1   c2   ...   cM
length       l1   l2   ...   lM

average codeword length:  Σ_{i=1}^{M} p_i l_i
computing the average codeword length
symbol       a1    a2    a3    a4
probability  0.4   0.3   0.2   0.1
C1           0     10    110   111
C2           111   110   10    0
C3           00    01    10    11

C1: 0.4×1 + 0.3×2 + 0.2×3 + 0.1×3 = 1.9
C2: 0.4×3 + 0.3×3 + 0.2×2 + 0.1×1 = 2.6
C3: 0.4×2 + 0.3×2 + 0.2×2 + 0.1×2 = 2.0
It is expected that C1 gives the most compact representation in typical cases.
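The same arithmetic in a few lines of Python (a sketch; the dictionary layout is mine):

```python
probs = [0.4, 0.3, 0.2, 0.1]                 # p1..p4 for a1..a4
codes = {'C1': ['0', '10', '110', '111'],
         'C2': ['111', '110', '10', '0'],
         'C3': ['00', '01', '10', '11']}

for name, cws in codes.items():
    avg = sum(p * len(c) for p, c in zip(probs, cws))   # Σ p_i · l_i
    print(name, avg)                          # C1 1.9, C2 2.6, C3 2.0
```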
Huffman code
The Huffman algorithm gives a clever way to construct a code with small average codeword length (a runnable sketch follows below):
1. prepare M isolated nodes, each attached with the probability of one symbol (node = size-one tree)
2. repeat the following operation until all trees are joined into one:
   a. select the two trees T1 and T2 having the smallest probabilities
   b. join T1 and T2 by introducing a new parent node
   c. give the sum of the probabilities of T1 and T2 to the new tree
David Huffman (1925–1999)
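A compact sketch of the algorithm above in Python, using heapq (my own rendering, not code from the lecture; the counter only prevents heapq from comparing trees on probability ties):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Binary Huffman code for a dict {symbol: probability}."""
    tick = count()                        # tie-breaker so heapq never compares trees
    heap = [(p, next(tick), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # the two trees of smallest probability...
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (t1, t2)))  # ...get a new parent
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: label the branches 0 and 1
            walk(tree[0], prefix + '0')
            walk(tree[1], prefix + '1')
        else:                             # leaf: a source symbol
            code[tree] = prefix or '0'
    walk(heap[0][2], '')
    return code

print(huffman_code({'A': 0.6, 'B': 0.25, 'C': 0.1, 'D': 0.05}))
# e.g. {'D': '000', 'C': '001', 'B': '01', 'A': '1'}: lengths 3, 3, 2, 1;
# average length 0.6×1 + 0.25×2 + 0.1×3 + 0.05×3 = 1.55
```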
example
[figure: Huffman construction for A: 0.6, B: 0.25, C: 0.1, D: 0.05 — D and C are joined first (probability 0.15), the result is joined with B (0.4), and finally with A (1.0); labeling sibling branches with 0 and 1 yields codeword lengths 1, 2, 3, 3 for A, B, C, D]
“merger of small companies”
exercise
compare the average length with that of the equal-length code...

symbol     A     B     C     D     E
prob.      0.2   0.1   0.3   0.3   0.1
codewords

exercise
compare the average length with that of the equal-length code...

symbol     A     B     C     D     E     F
prob.      0.3   0.2   0.2   0.1   0.1   0.1
codewords
different construction, same efficiency
We may have multiple options during the code construction:
- several nodes may have the same smallest probability
- labels can be assigned differently to branches
A different option results in a different Huffman code, but the average length does not depend on the chosen option.
[figure: two different Huffman trees for the probabilities 0.4, 0.2, 0.2, 0.1, 0.1, obtained by breaking ties differently; both give the same average codeword length]
summary of today’s class
- basic properties needed for source coding: uniquely decodable, immediately decodable
- Huffman code: construction of Huffman code
- extensions of Huffman code: theoretical limit of the “compression”, related topics
today
exercise
Construct a binary Huffman code for the information source given in the table.
Compute the average codeword length of the constructed code.
Can you construct a 4-ary Huffman code for the source?
symbol  A      B      C      D      E      F      G      H
prob.   0.363  0.174  0.143  0.098  0.087  0.069  0.045  0.021
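For the 4-ary part, one standard approach (sketched below under my own naming; the slides do not give this code) is to join k trees at a time, after padding the source with zero-probability dummy symbols so that (M − 1) is a multiple of (k − 1) and the final join uses exactly k trees. With k = 4 and M = 8, two dummies are added.

```python
import heapq
from itertools import count

def huffman_lengths(probs, k=2):
    """Codeword lengths of a k-ary Huffman code (lengths suffice for the average)."""
    tick = count()                           # tie-breaker for heapq
    heap = [(p, next(tick), [sym]) for sym, p in probs.items()]
    while (len(heap) - 1) % (k - 1) != 0:
        heap.append((0.0, next(tick), []))   # zero-probability dummy symbol
    heapq.heapify(heap)
    depth = {sym: 0 for sym in probs}
    while len(heap) > 1:
        total, merged = 0.0, []
        for _ in range(k):                   # join the k smallest trees
            p, _, syms = heapq.heappop(heap)
            total += p
            merged += syms
        for sym in merged:                   # every joined leaf moves one level deeper
            depth[sym] += 1
        heapq.heappush(heap, (total, next(tick), merged))
    return depth

src = {'A': 0.363, 'B': 0.174, 'C': 0.143, 'D': 0.098,
       'E': 0.087, 'F': 0.069, 'G': 0.045, 'H': 0.021}
lengths = huffman_lengths(src, k=4)
print(lengths)                                        # 4-ary codeword lengths
print(sum(src[s] * l for s, l in lengths.items()))    # average codeword length
```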