raik 283 data structures & algorithms

Huffman Coding Dr. Ying Lu [email protected] RAIK 283 Data Structures & Algorithms


Post on 22-Mar-2016




Page 1: RAIK 283       Data Structures & Algorithms

Huffman Coding

Dr. Ying Lu [email protected]


Page 2: RAIK 283       Data Structures & Algorithms

Giving credit where credit is due: Most of the slides for this lecture are based on slides created by Dr. Richard Anderson, University of Washington.

I have modified them and added new slides.


Page 3: RAIK 283       Data Structures & Algorithms

Coding theory

ASCII coding

Conversion, Encryption, Compression

Binary coding

A B C D E F

For fixed-length binary coding of a 6-character alphabet, how many bits are needed?
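As a quick check of the question above: a fixed-length binary code needs the smallest number of bits k with 2^k at least the alphabet size, so that every character gets a distinct bit pattern. A minimal Python sketch (the function name `fixed_length_bits` is illustrative and not from the slides):

```python
def fixed_length_bits(alphabet_size: int) -> int:
    """Smallest k such that 2**k >= alphabet_size (at least 1 bit)."""
    if alphabet_size < 1:
        raise ValueError("alphabet must contain at least one character")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, without float rounding.
    return max(1, (alphabet_size - 1).bit_length())


print(fixed_length_bits(len("ABCDEF")))  # 3
```

Two bits give only four patterns, which is too few for six characters; three bits give eight, so two patterns stay unused.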

Page 4: RAIK 283       Data Structures & Algorithms

Coding theory (cont.)

ASCII coding

Conversion, Encryption, Compression

Binary coding

A  000
B  001
C  010
D  011
E  100
F  101
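With a fixed-length code, decoding is just cutting the bit string into equal-width chunks. The following Python sketch uses the 3-bit table from this slide; the helper names `encode_fixed` and `decode_fixed` and the sample word are illustrative assumptions.

```python
# 3-bit fixed-length code from the slide.
FIXED_CODE = {"A": "000", "B": "001", "C": "010",
              "D": "011", "E": "100", "F": "101"}


def encode_fixed(text, code=FIXED_CODE):
    """Concatenate the codeword of every character."""
    return "".join(code[ch] for ch in text)


def decode_fixed(bits, code=FIXED_CODE, width=3):
    """Split the bit string into width-sized chunks and look each one up."""
    if len(bits) % width:
        raise ValueError("bit string length is not a multiple of the code width")
    inverse = {cw: ch for ch, cw in code.items()}
    return "".join(inverse[bits[i:i + width]] for i in range(0, len(bits), width))


bits = encode_fixed("FACADE")
print(bits, decode_fixed(bits))
```

Every character costs exactly 3 bits here, regardless of how often it occurs.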

Page 5: RAIK 283       Data Structures & Algorithms

Coding theory (cont.)

ASCII coding

Conversion, Encryption, Compression

Binary coding

Variable length coding

Character  Fixed-length  Variable-length  Probability
A          000           00               0.3
B          001           010              0.1
C          010           011              0.1
D          011           100              0.1
E          100           11               0.3
F          101           101              0.1

Average bits/character = ?

Compression Ratio = ?
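The two questions can be worked out directly from the table: the average is the probability-weighted sum of codeword lengths. The Python sketch below assumes that "compression ratio" means the relative saving over the fixed-length code; other texts define it as the ratio of the two sizes, so treat that choice as an assumption.

```python
PROB = {"A": 0.3, "B": 0.1, "C": 0.1, "D": 0.1, "E": 0.3, "F": 0.1}
FIXED = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100", "F": "101"}
VARIABLE = {"A": "00", "B": "010", "C": "011", "D": "100", "E": "11", "F": "101"}


def average_bits(code, prob):
    """Expected codeword length: sum of probability * length."""
    return sum(prob[ch] * len(cw) for ch, cw in code.items())


fixed_avg = average_bits(FIXED, PROB)        # about 3.0 bits/character
variable_avg = average_bits(VARIABLE, PROB)  # about 2.4 bits/character
# Assumed definition: fraction of bits saved relative to the fixed-length code.
savings = (fixed_avg - variable_avg) / fixed_avg
print(f"{fixed_avg:.2f} {variable_avg:.2f} {savings:.0%}")
```

By hand: 2(0.3) + 3(0.1) + 3(0.1) + 3(0.1) + 2(0.3) + 3(0.1) = 2.4 bits per character, a saving of 0.6 out of 3 bits, or 20%.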

Page 6: RAIK 283       Data Structures & Algorithms

Decode the following

E  0
T  11
N  100
I  1010
S  1011

11010010010101011

E  0
T  10
N  100
I  0111
S  1010

100100101010
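The two decoding exercises behave differently, and a short Python sketch makes the contrast concrete. `decode_prefix` reads bits left to right and emits a character as soon as the buffer matches a codeword, which is only safe for a prefix-free code; `count_parsings` counts how many ways a bit string splits into codewords. Both function names are illustrative assumptions.

```python
CODE_1 = {"E": "0", "T": "11", "N": "100", "I": "1010", "S": "1011"}
CODE_2 = {"E": "0", "T": "10", "N": "100", "I": "0111", "S": "1010"}


def decode_prefix(bits, code):
    """Greedy left-to-right decoding; correct only for prefix-free codes."""
    inverse = {cw: ch for ch, cw in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


def count_parsings(bits, code):
    """Number of distinct ways to split bits into codewords (dynamic programming)."""
    ways = [0] * (len(bits) + 1)
    ways[0] = 1
    for i in range(len(bits)):
        if ways[i]:
            for cw in code.values():
                if bits.startswith(cw, i):
                    ways[i + len(cw)] += ways[i]
    return ways[-1]


print(decode_prefix("11010010010101011", CODE_1))  # TENNIS
print(count_parsings("11010010010101011", CODE_1))  # 1
print(count_parsings("100100101010", CODE_2) > 1)   # True
```

The second code is ambiguous because T (10) is a prefix of N (100) and S (1010): for example, the string splits as 100 100 1010 10 and also as 10 0 10 0 10 10 10.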

Page 7: RAIK 283       Data Structures & Algorithms

Prefix(-free) codes

No prefix of a codeword is a codeword

Uniquely decodable

A  00   1       00
B  01   001     10
C  011  001     11
D  100  0001    0001
E  11   00001   11000
F  101  000001  101
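The prefix-free condition can be checked mechanically by testing every ordered pair of codewords. The Python sketch below applies it to the two codes from the decoding slide and the code from the tree slide; the function name `is_prefix_free` is an illustrative assumption.

```python
from itertools import permutations


def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different entry in the list."""
    words = list(codewords)
    # permutations() pairs entries by position, so duplicate codewords also fail.
    return not any(b.startswith(a) for a, b in permutations(words, 2))


print(is_prefix_free(["0", "11", "100", "1010", "1011"]))        # True
print(is_prefix_free(["0", "10", "100", "0111", "1010"]))        # False
print(is_prefix_free(["00", "010", "0110", "0111", "10", "11"]))  # True
```

The pairwise test is quadratic in the number of codewords, which is fine for small alphabets like these.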

Page 8: RAIK 283       Data Structures & Algorithms

Prefix codes and binary trees

Tree representation of prefix codes

A  00
B  010
C  0110
D  0111
E  10
F  11

[Figure: binary tree for the code above, with edges labeled 0 and 1 and the characters A, B, C, D, E, F at the leaves]
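The tree view can be sketched in Python by storing each internal node as a dictionary keyed by the edge label and each leaf as its character. Decoding then walks from the root, following one edge per bit, and restarts at the root whenever a leaf is reached. The names `build_tree` and `decode_with_tree` and the nested-dictionary layout are illustrative assumptions.

```python
CODE = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}


def build_tree(code):
    """Internal nodes are dicts keyed by '0'/'1'; leaves are characters."""
    root = {}
    for ch, cw in code.items():
        node = root
        for bit in cw[:-1]:
            node = node.setdefault(bit, {})
        node[cw[-1]] = ch  # the last bit leads to the leaf
    return root


def decode_with_tree(bits, root):
    """Follow one edge per bit; emit a character at each leaf and restart."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):
            out.append(node)
            node = root
    return "".join(out)


tree = build_tree(CODE)
print(decode_with_tree("1100011010", tree))  # FACE
```

Because the code is prefix-free, every character sits at a leaf, so the walk never has to guess whether to stop or continue.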

Page 9: RAIK 283       Data Structures & Algorithms

Minimum length code

Character  Probability
A          1/4
B          1/8
C          1/16
D          1/16
E          1/2

How to code so that average bits/character is minimized?

Page 10: RAIK 283       Data Structures & Algorithms

Minimum length code (cont.)

Huffman tree – prefix codes tree with minimum weighted path length

C(T) – weighted path length
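Assuming the usual definition, the weighted path length C(T) is the sum over all leaves of weight times depth, which equals the average number of bits per character when the weights are probabilities. The Python sketch below evaluates it for the probabilities A 1/4, B 1/8, C 1/16, D 1/16, E 1/2; the candidate code is a hypothetical example chosen for illustration, not one given on the slides.

```python
from fractions import Fraction


def weighted_path_length(weights, code):
    """C(T): sum of weight * leaf depth, where depth = codeword length."""
    return sum(weights[ch] * len(code[ch]) for ch in weights)


WEIGHTS = {"A": Fraction(1, 4), "B": Fraction(1, 8), "C": Fraction(1, 16),
           "D": Fraction(1, 16), "E": Fraction(1, 2)}
# Hypothetical candidate: frequent characters get short codewords.
CANDIDATE = {"E": "0", "A": "10", "B": "110", "C": "1110", "D": "1111"}
FIXED3 = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100"}

print(weighted_path_length(WEIGHTS, CANDIDATE))  # 15/8
print(weighted_path_length(WEIGHTS, FIXED3))     # 3
```

The candidate averages 15/8 = 1.875 bits per character against 3 bits for the fixed-length code.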

Page 11: RAIK 283       Data Structures & Algorithms

Huffman code algorithm

Derivation

Two rarest items will have the longest codewords

Codewords for rarest items differ only in the last bit

Idea: suppose the weights are w1 <= w2 <= ... <= wn, with w1 and w2 the smallest weights

Start with an optimal code for w3, ..., wn and w1 + w2

Extend the codeword for w1 + w2 to get codewords for w1 and w2
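The derivation above suggests a recursive construction: merge the two smallest weights into one item, build a code for the smaller problem, then extend the merged item's codeword by one bit in each direction. The Python sketch below follows that idea literally; the function name and the use of a tuple as the merged placeholder symbol are illustrative assumptions.

```python
def huffman_recursive(weights):
    """weights: dict symbol -> weight. Returns dict symbol -> codeword."""
    if len(weights) == 1:
        (only,) = weights
        return {only: ""}  # a single item needs no bits at this level
    # Pick the two smallest weights (ties broken by the symbol's text form).
    a, b = sorted(weights, key=lambda s: (weights[s], str(s)))[:2]
    merged = (a, b)  # hypothetical placeholder symbol for the combined item
    reduced = {s: w for s, w in weights.items() if s not in (a, b)}
    reduced[merged] = weights[a] + weights[b]
    codes = huffman_recursive(reduced)  # code for the smaller alphabet
    prefix = codes.pop(merged)
    codes[a] = prefix + "0"             # the two rarest items differ
    codes[b] = prefix + "1"             # only in the last bit
    return codes


codes = huffman_recursive({"A": 0.25, "B": 0.125, "C": 0.0625,
                           "D": 0.0625, "E": 0.5})
print({ch: len(cw) for ch, cw in sorted(codes.items())})
```

Re-sorting at every level makes this sketch quadratic or worse; it is meant to mirror the argument, not to be efficient.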

Page 12: RAIK 283       Data Structures & Algorithms

Huffman code

    H = new minHeap()
    for each wi
        T = new Tree(wi)
        H.Insert(T)

    while H.Size() > 1
        T1 = H.DeleteMin()
        T2 = H.DeleteMin()
        T3 = Merge(T1, T2)
        H.Insert(T3)
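The pseudocode above maps closely onto Python's standard `heapq` module. The sketch below is one possible rendering, not the lecture's reference implementation: the tuple-based tree representation, the tie-breaking counter, and the `assign` helper that reads codewords off the finished tree are all assumptions added for illustration. Symbols must not themselves be tuples, since tuples mark internal nodes.

```python
import heapq
from itertools import count


def huffman_code(weights):
    """weights: dict symbol -> weight. Returns dict symbol -> codeword."""
    tie = count()  # tie-breaker so the heap never has to compare trees
    # A tree is either a symbol (leaf) or a (left, right) pair (internal node).
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    heapq.heapify(heap)                           # H = new minHeap()
    while len(heap) > 1:                          # while H.Size() > 1
        w1, _, t1 = heapq.heappop(heap)           # T1 = H.DeleteMin()
        w2, _, t2 = heapq.heappop(heap)           # T2 = H.DeleteMin()
        merged = (w1 + w2, next(tie), (t1, t2))   # T3 = Merge(T1, T2)
        heapq.heappush(heap, merged)              # H.Insert(T3)

    codes = {}

    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # a one-symbol alphabet still gets one bit

    if heap:
        assign(heap[0][2], "")
    return codes


print(huffman_code({"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}))
```

With n weights the loop performs n - 1 merges, each costing O(log n) heap work, so the construction runs in O(n log n) time.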

Page 13: RAIK 283       Data Structures & Algorithms

Example

character    A     B     C     D     _
probability  0.35  0.1   0.2   0.2   0.15
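To follow the example by hand, it helps to list the merges in order. The Python sketch below traces them using the probabilities scaled to integer percentages (35, 10, 20, 20, 15), a scaling chosen here only to avoid floating-point noise. It also uses the fact that the weighted path length equals the sum of all merged weights, since each merge adds one bit to every codeword beneath it.

```python
import heapq


def merge_trace(weights):
    """Return the list of (smallest, second smallest, merged) steps."""
    heap = list(weights)
    heapq.heapify(heap)
    steps = []
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        steps.append((a, b, a + b))
        heapq.heappush(heap, a + b)
    return steps


steps = merge_trace([35, 10, 20, 20, 15])  # A, B, C, D, _ in percent
for a, b, merged in steps:
    print(f"{a} + {b} -> {merged}")
# Sum of merged weights = weighted path length (here in percent-bits).
print(sum(merged for _, _, merged in steps) / 100)  # 2.25
```

The merges are 10 + 15, 20 + 20, 25 + 35 and 40 + 60, giving 2.25 bits per character on average, compared with 3 bits for a fixed-length code over five characters.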

Page 14: RAIK 283       Data Structures & Algorithms

In-class exercises

P332, Exercise 9.4.1

Page 15: RAIK 283       Data Structures & Algorithms

In-class exercises

9.4.3 What is the maximal length of a codeword possible in a Huffman encoding of an alphabet of n characters?

9.4.5 Show that a Huffman tree can be constructed in linear time if the alphabet’s characters are given in a sorted order of their frequencies.
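One common way to approach 9.4.5 is the two-queue technique, sketched below in Python as a possible starting point rather than a model answer. The sorted leaf weights sit in one queue and merged weights are appended to a second queue. Merged weights are produced in non-decreasing order, so the smallest remaining item is always at the front of one of the two queues and no heap is needed. The function computes only the weighted path length; the name `huffman_cost_sorted` is an illustrative assumption.

```python
from collections import deque


def huffman_cost_sorted(sorted_weights):
    """Weighted path length of a Huffman tree, given weights in ascending order."""
    leaves = deque(sorted_weights)  # untouched leaves, already sorted
    merged = deque()                # merged trees, created in ascending order

    def pop_min():
        # The overall minimum is at the front of one of the two queues.
        if not merged or (leaves and leaves[0] <= merged[0]):
            return leaves.popleft()
        return merged.popleft()

    total = 0
    while len(leaves) + len(merged) > 1:
        combined = pop_min() + pop_min()
        merged.append(combined)
        total += combined  # each merge adds one bit to every leaf below it
    return total


print(huffman_cost_sorted([10, 15, 20, 20, 35]))  # 225
```

Each of the n - 1 merges does a constant amount of queue work, so the loop is linear in n once the input is sorted.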