RAIK 283 Data Structures & Algorithms
Huffman Coding
Dr. Ying Lu
Giving credit where credit is due: Most of the slides for this lecture are
based on slides created by Dr. Richard Anderson, University of Washington.
I have modified them and added new slides.
Coding theory
ASCII coding
Conversion, Encryption, Compression
Binary coding
A B C D E F
For fixed-length binary coding of a 6-character alphabet, how many bits are needed?
Coding theory (cont.)
ASCII coding
Conversion, Encryption, Compression
Binary coding

Character  Code
A          000
B          001
C          010
D          011
E          100
F          101
Coding theory (cont.)
ASCII coding
Conversion, Encryption, Compression
Binary coding
Variable length coding

Character  Binary  Variable length  Probability
A          000     00               0.3
B          001     010              0.1
C          010     011              0.1
D          011     100              0.1
E          100     11               0.3
F          101     101              0.1

Average bits/character = ?
Compression ratio = ?
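The two questions above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it weights each codeword length by its probability. The names `TABLE`, `average_bits`, and `compression_ratio` are hypothetical, and the compression ratio is assumed to mean the fraction of bits saved relative to the fixed-length code (other definitions exist).

```python
from fractions import Fraction

# symbol: (fixed-length codeword, variable-length codeword, probability)
# Values copied from the slide; Fraction keeps the arithmetic exact.
TABLE = {
    "A": ("000", "00", Fraction(3, 10)),
    "B": ("001", "010", Fraction(1, 10)),
    "C": ("010", "011", Fraction(1, 10)),
    "D": ("011", "100", Fraction(1, 10)),
    "E": ("100", "11", Fraction(3, 10)),
    "F": ("101", "101", Fraction(1, 10)),
}

def average_bits(column):
    """Expected codeword length; column 0 is the fixed code, 1 the variable code."""
    return sum(row[2] * len(row[column]) for row in TABLE.values())

def compression_ratio():
    """Assumed definition: bits saved divided by the fixed-length cost."""
    fixed, variable = average_bits(0), average_bits(1)
    return (fixed - variable) / fixed

if __name__ == "__main__":
    print("fixed-length average:   ", float(average_bits(0)))
    print("variable-length average:", float(average_bits(1)))
    print("compression ratio:      ", float(compression_ratio()))
```

Running the script prints the three values, so a hand calculation can be compared against it.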
Decode the following

E 0, T 11, N 100, I 1010, S 1011
11010010010101011

E 0, T 10, N 100, I 0111, S 1010
100100101010
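Decoding with the first table can be sketched as a left-to-right scan that emits a symbol as soon as the bits read so far match a codeword. This is an illustrative sketch, not the lecture's own code; `decode`, `encode`, and `FIRST_CODE` are hypothetical names. The scan is only valid when no codeword is a prefix of another, which holds for the first table but not for the second (there T = 10 is a prefix of N = 100).

```python
def encode(text, code):
    """Concatenate the codewords of the characters in text."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Decode a bit string, assuming code is prefix-free ({symbol: codeword})."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:      # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("leftover bits do not form a codeword: " + current)
    return "".join(out)

# First code table from the exercise.
FIRST_CODE = {"E": "0", "T": "11", "N": "100", "I": "1010", "S": "1011"}

if __name__ == "__main__":
    bits = encode("SET", FIRST_CODE)
    print(bits, "->", decode(bits, FIRST_CODE))
```

Passing the exercise's first bit string to `decode` checks a hand-decoded answer.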
Prefix(-free) codes
No prefix of a codeword is a codeword
Uniquely decodable

Character  Code 1  Code 2  Code 3
A          00      1       00
B          010     01      10
C          011     001     11
D          100     0001    0001
E          11      00001   11000
F          101     000001  101
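The prefix-free property can be tested directly. The sketch below is an illustration under stated assumptions, with `is_prefix_free` a hypothetical helper name. It relies on the fact that, after lexicographic sorting, a codeword that is a prefix of any other codeword is also a prefix of its immediate successor, so only neighbouring pairs need to be compared. The two inputs are the variable-length code from the earlier slide and the second table of the decoding exercise.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another (duplicates count as prefixes)."""
    words = sorted(codewords)
    # In sorted order, a prefix is immediately followed by a word it prefixes.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

# Variable-length code from the earlier slide (A..F).
VARIABLE_CODE = ["00", "010", "011", "100", "11", "101"]
# Second table of the decoding exercise (E, T, N, I, S).
SECOND_CODE = ["0", "10", "100", "0111", "1010"]

if __name__ == "__main__":
    print(is_prefix_free(VARIABLE_CODE))
    print(is_prefix_free(SECOND_CODE))
```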
Prefix codes and binary trees
Tree representation of prefix codes

Character  Code
A          00
B          010
C          0110
D          0111
E          10
F          11

[Figure: binary tree for this code, with edges labeled 0 and 1 and leaves A, B, C, D, E, F]
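The correspondence between a prefix code and a binary tree can be sketched in a few lines. The representation here (nested dictionaries keyed by "0" and "1", with symbols at the leaves) is an assumption chosen for brevity, and `build_tree` and `decode_with_tree` are hypothetical names. Each codeword spells out a root-to-leaf path; decoding walks the tree and restarts at the root whenever a leaf is reached.

```python
# Code table from the slide.
CODE = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}

def build_tree(code):
    """Build a nested-dict tree: internal nodes map '0'/'1' to children, leaves are symbols."""
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})   # create internal nodes along the path
        node[word[-1]] = sym                  # the last bit leads to the leaf
    return root

def decode_with_tree(bits, root):
    """Follow one edge per bit; emit a symbol and return to the root at each leaf."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):             # reached a leaf
            out.append(node)
            node = root
    return "".join(out)

if __name__ == "__main__":
    tree = build_tree(CODE)
    print(decode_with_tree("00100111", tree))
```

Because the code is prefix-free, no symbol ever sits at an internal node, which is what makes the leaf test sufficient.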
Minimum length code

Character  Probability
A          1/4
B          1/8
C          1/16
D          1/16
E          1/2

How to code so that average bits/character is minimized?
Minimum length code (cont.)
Huffman tree – prefix code tree with minimum weighted path length
C(T) – weighted path length of tree T (sum over all leaves of weight × depth)
Huffman code algorithm
Derivation
Two rarest items will have the longest codewords
Codewords for rarest items differ only in the last bit
Idea: suppose the weights are w1, w2, ..., wn, with w1 and w2 the smallest weights
Start with an optimal code for w3, ..., wn and w1 + w2
Extend the codeword for w1 + w2 to get codewords for w1 and w2
Huffman code

H = new minHeap()
for each wi
    T = new Tree(wi)
    H.Insert(T)
while H.Size() > 1
    T1 = H.DeleteMin()
    T2 = H.DeleteMin()
    T3 = Merge(T1, T2)
    H.Insert(T3)
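The pseudocode above can be sketched in Python with the standard `heapq` module standing in for the min-heap. This is one possible rendering, not the lecture's implementation: `huffman_codes` is a hypothetical name, trees are represented as nested tuples, symbols are assumed to be strings, a one-symbol alphabet is assumed to receive the codeword "0", and the counter used to break ties between equal weights is an added detail that the pseudocode leaves open.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_codes(weights):
    """weights: {symbol: weight}. Returns {symbol: codeword} for one Huffman tree."""
    tiebreak = count()  # unique sequence numbers so trees are never compared directly
    heap = [(w, next(tiebreak), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:  # assumption: a single symbol gets the codeword "0"
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # T1 = H.DeleteMin()
        w2, _, t2 = heapq.heappop(heap)   # T2 = H.DeleteMin()
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))  # Merge, Insert
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: a symbol
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes

if __name__ == "__main__":
    # Probabilities from the "Minimum length code" slide.
    probs = {"A": Fraction(1, 4), "B": Fraction(1, 8), "C": Fraction(1, 16),
             "D": Fraction(1, 16), "E": Fraction(1, 2)}
    codes = huffman_codes(probs)
    print(codes)
    print("average bits/character:",
          float(sum(probs[s] * len(codes[s]) for s in probs)))
```

Different tie-breaking rules can yield different codewords, but every Huffman tree for the same weights has the same weighted path length.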
Example

character    A     B    C    D    _
probability  0.35  0.1  0.2  0.2  0.15
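A hand-built tree for this example can be checked without constructing the tree in code. The sketch below uses the fact that the weighted path length of a Huffman tree equals the sum of the weights produced by the merges, since each merge adds one bit to every leaf beneath it. The name `huffman_cost` is hypothetical, and the probabilities are scaled by 100 (an assumption made only to keep the arithmetic in exact integers).

```python
import heapq

def huffman_cost(weights):
    """Weighted path length of a Huffman tree, computed from the merge weights alone."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)  # two smallest weights
        total += merged   # every leaf under this merge gains one bit
        heapq.heappush(heap, merged)
    return total

# Probabilities from the example, scaled by 100.
EXAMPLE = {"A": 35, "B": 10, "C": 20, "D": 20, "_": 15}

if __name__ == "__main__":
    print("average bits/character:", huffman_cost(EXAMPLE.values()) / 100)
```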
In-class exercises
P332 Exercise 9.4.1
9.4.3 What is the maximal length of a codeword possible in a Huffman encoding of an alphabet of n characters?
9.4.5 Show that a Huffman tree can be constructed in linear time if the alphabet's characters are given in a sorted order of their frequencies.