digital communication systems 452 - 2... · 2019. 1. 9. · [the arrl handbook for radio...

1 Digital Communication Systems ECS 452 Asst. Prof. Dr. Prapun Suksompong [email protected] 2. Source Coding Office Hours: BKD, 6th floor of Sirindhralai building Monday 10:00-10:40 Tuesday 12:00-12:40 Thursday 14:20-15:30

Post on 09-Mar-2021

Page 1: Digital Communication Systems ECS 452

Asst. Prof. Dr. Prapun Suksompong
[email protected]

2. Source Coding

Office Hours: BKD, 6th floor of Sirindhralai building
Monday 10:00-10:40
Tuesday 12:00-12:40
Thursday 14:20-15:30

Page 2: Elements of digital commu. sys.

[Block diagram]
Transmitter: Information Source → (Message) → Source Encoder → Channel Encoder → Digital Modulator → (Transmitted Signal)
Channel: Transmitted Signal, plus Noise & Interference → Received Signal
Receiver: (Received Signal) → Digital Demodulator → Channel Decoder → Source Decoder → (Recovered Message) → Destination
Source Encoder: remove redundancy. Channel Encoder: add systematic redundancy.

Page 3: System Under Consideration

[Block diagram with the same elements as on Page 2: Information Source, Source Encoder, Channel Encoder, Digital Modulator, Channel with Noise & Interference, Digital Demodulator, Channel Decoder, Source Decoder, Destination. Source Encoder: remove redundancy. Channel Encoder: add systematic redundancy.]

Page 4: Main Reference

Elements of Information Theory, 2nd Edition, 2006 (T. M. Cover and J. A. Thomas)

Chapters 2, 4 and 5

‘the jewel in Stanford's crown’

One of the greatest information theorists since Claude Shannon (and the one most like Shannon in approach, clarity, and taste).

Page 5: English Alphabet (Non-Technical Use)

Page 6: The ASCII Coded Character Set

(American Standard Code for Information Interchange)

[Table: the ASCII character set, with columns starting at codes 0, 16, 32, 48, 64, 80, 96, 112; labels "US" and "UK" appear in the table.]

[The ARRL Handbook for Radio Communications 2013]

Page 7: Example: ASCII Encoder

Information Source → “LOVE” → Source Encoder → “1001100100111110101101000101”

Character x    Codeword c(x)
⋮
E              1000101
⋮
L              1001100
⋮
O              1001111
⋮
V              1010110

MATLAB:
>> M = 'LOVE';
>> X = dec2bin(M,7);
>> X = reshape(X',1,numel(X))
X =
1001100100111110101101000101

Remark: numel(A) = prod(size(A)) (the number of elements in matrix A)
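Because every ASCII codeword here has the same length (7 bits), the encoded string can be cut back into codewords without ambiguity. The following is a minimal sketch of that inverse step in base MATLAB; the variable names (nBits, encoded, bitsPerChar, decoded) are illustrative and not from the slides.

```matlab
% Sketch: invert the 7-bit ASCII encoding of 'LOVE'.
nBits   = 7;                                      % codeword length in bits
M       = 'LOVE';
encoded = reshape(dec2bin(M, nBits)', 1, []);     % one row of '0'/'1' characters

% Cut the bit string into one row of nBits characters per source symbol.
bitsPerChar = reshape(encoded, nBits, [])';       % numel(M)-by-nBits char matrix
decoded     = char(bin2dec(bitsPerChar))';        % character codes back to text

disp(encoded)
disp(decoded)
```

The reshape in the decoder mirrors the one in the encoder: both rely on MATLAB filling matrices column by column.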

Page 8: English Redundancy: Ex. 1

J-st tr- t- r--d th-s s-nt-nc-.

Page 9: English Redundancy: Ex. 2

yxx cxn xndxrstxnd whxt x xm wrxtxng xvxn xf x rxplxcx xll thx vxwxls wxth xn 'x' (t gts lttl hrdr f y dn't vn kn whr th vwls r).

Page 10: English Redundancy: Ex. 3

To be, or xxx xx xx, xxxx xx xxx xxxxxxxx

Page 11: Entropy Rate of Thai Text

Page 12: Introduction to Data Compression

[ https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/compressioncodes ]

Page 13: Introduction to Data Compression

[ https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/compressioncodes ]

Page 14: ASCII: Source Alphabet of Size = 128

(American Standard Code for Information Interchange)

[Table: the ASCII character set, with columns starting at codes 0, 16, 32, 48, 64, 80, 96, 112.]

[The ARRL Handbook for Radio Communications 2013]

Page 15: Ex. Source alphabet of size = 4

Page 16: Ex. DMS (1)

Information Source

$S_X = \{a, b, c, d, e\}$

$p_X(x) = \begin{cases} 1/5, & x \in \{a, b, c, d, e\}, \\ 0, & \text{otherwise} \end{cases}$

a c a c e c d b c e
d a e e d a b b b d
b b a a b e b e d c
c e d b c e c a a c
a a e a c c a a d c
d e e a a c a a a b
b c a e b b e d b c
d e b c a e e d d c
d a b c a b c d d e
d c e a b a a c a d

Approximately 20% are letter ‘a’s.

[GenRV_Discrete_datasample_Ex1.m]

Page 17: Ex. DMS (1)

[GenRV_Discrete_datasample_Ex1.m]

clear all; close all;

S_X = 'abcde'; p_X = [1/5 1/5 1/5 1/5 1/5];

n = 100;
MessageSequence = datasample(S_X,n,'Weights',p_X)
MessageSequence = reshape(MessageSequence,10,10)

>> GenRV_Discrete_datasample_Ex1

MessageSequence =

eebbedddeceacdbcbedeecacaecedcaedabecccabbcccebdbbbeccbadeaaaecceccdaccedadabceddaceadacdaededcdcade

MessageSequence =

eeeabbacde
eacebeeead
bcadcccdce
bdcacccaed
ebabcbedac
dceeeacadd
dbccbdcbac
deecdedcca
eddcbaaedd
cecabacdae

Page 18: Ex. DMS (2)

Information Source

$S_X = \{1, 2, 3, 4\}$

$p_X(x) = \begin{cases} 1/2, & x = 1, \\ 1/4, & x = 2, \\ 1/8, & x \in \{3, 4\}, \\ 0, & \text{otherwise} \end{cases}$

2 1 1 2 1 4 1 1 1 1
1 1 4 1 1 2 4 2 2 1
3 1 1 2 3 2 4 1 2 4
2 1 1 2 1 1 3 3 1 1
1 3 4 1 4 1 1 2 4 1
4 1 4 1 2 2 1 4 2 1
4 1 1 1 1 2 1 4 2 4
2 1 1 1 2 1 2 1 3 2
2 1 1 1 1 1 1 2 3 2
2 1 1 2 1 4 2 1 2 1

Approximately 50% are number ‘1’s.

[GenRV_Discrete_datasample_Ex2.m]

Page 19: Ex. DMS (2)

[GenRV_Discrete_datasample_Ex2.m]

clear all; close all;

S_X = [1 2 3 4]; p_X = [1/2 1/4 1/8 1/8];

n = 20;

MessageSequence = randsrc(1,n,[S_X;p_X]);
%MessageSequence = datasample(S_X,n,'Weights',p_X);

rf = hist(MessageSequence,S_X)/n; % Rel. freq. calc.
stem(S_X,rf,'rx','LineWidth',2) % Plot rel. freq.
hold on
stem(S_X,p_X,'bo','LineWidth',2) % Plot pmf
xlim([min(S_X)-1,max(S_X)+1])
legend('Rel. freq. from sim.','pmf p_X(x)')
xlabel('x')
grid on

[Figure: stem plot versus x (horizontal axis 0 to 5, vertical axis 0 to 0.5). Legend: Rel. freq. from sim.; pmf p_X(x).]

Page 20: DMS in MATLAB

[GenRV_Discrete_datasample_Ex.m]

clear all; close all;

S_X = [1 2 3 4]; p_X = [1/2 1/4 1/8 1/8]; n = 1e6;

SourceString = randsrc(1,n,[S_X;p_X]);

rf = hist(SourceString,S_X)/n; % Rel. freq. calc.
stem(S_X,rf,'rx','LineWidth',2) % Plot rel. freq.
hold on
stem(S_X,p_X,'bo','LineWidth',2) % Plot pmf
xlim([min(S_X)-1,max(S_X)+1])
legend('Rel. freq. from sim.','pmf p_X(x)')
xlabel('x')
grid on

Alternatively, we can also use

SourceString = datasample(S_X,n,'Weights',p_X);
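Both generators used here come from add-on products: randsrc ships with the Communications Toolbox and datasample with the Statistics and Machine Learning Toolbox. When neither is available, the same DMS can be simulated with the inverse-CDF method. The following is a minimal sketch in base MATLAB; S_X and p_X follow the slides, while edges, u, idx and the smaller n are illustrative choices.

```matlab
% Sketch: draw n i.i.d. symbols from the pmf p_X using only base functions.
S_X = [1 2 3 4];
p_X = [1/2 1/4 1/8 1/8];
n   = 1e5;

edges      = cumsum(p_X);   % CDF value at each symbol
edges(end) = 1;             % guard against round-off in the last entry
u          = rand(1, n);    % uniform samples on (0,1)

% Symbol index = 1 + number of CDF edges lying strictly below the sample.
idx = ones(1, n);
for k = 1:numel(edges)-1
    idx = idx + (u > edges(k));
end
SourceString = S_X(idx);

% Relative frequencies, to compare against p_X.
rf = zeros(size(S_X));
for k = 1:numel(S_X)
    rf(k) = mean(SourceString == S_X(k));
end
disp([p_X; rf])
```

With n this large, each relative frequency should land close to the corresponding pmf value, which is the behaviour the stem plots illustrate.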

Page 21: A more realistic example of pmf:

Relative freq. of letters in the English language

[http://en.wikipedia.org/wiki/Letter_frequency]

Page 22: A more realistic example of pmf:

Relative freq. of letters in the English language, ordered by frequency

[http://en.wikipedia.org/wiki/Letter_frequency]

Page 23: Example: ASCII Encoder

Information Source → “LOVE” → Source Encoder → “1001100100111110101101000101”
The encoded string is c(“L”) c(“O”) c(“V”) c(“E”).

Codebook:

Character x    Codeword c(x)
⋮
E              1000101
⋮
L              1001100
⋮
O              1001111
⋮
V              1010110

MATLAB:
>> M = 'LOVE';
>> X = dec2bin(M,7);
>> X = reshape(X',1,numel(X))
X =
1001100100111110101101000101

Remark: numel(A) = prod(size(A)) (the number of elements in matrix A)

Page 24: The ASCII Coded Character Set

[Table: the ASCII character set, with columns starting at codes 0, 16, 32, 48, 64, 80, 96, 112.]

[The ARRL Handbook for Radio Communications 2013]

Page 25: A Byte (8 bits) vs. 7 bits

>> dec2bin('I Love ECS452',7)
ans =
1001001
0100000
1001100
1101111
1110110
1100101
0100000
1000101
1000011
1010011
0110100
0110101
0110010

>> dec2bin('I Love ECS452',8)
ans =
01001001
00100000
01001100
01101111
01110110
01100101
00100000
01000101
01000011
01010011
00110100
00110101
00110010

Page 26: Geeky ways to express your love

>> dec2bin('I Love You',8)
ans =
01001001
00100000
01001100
01101111
01110110
01100101
00100000
01011001
01101111
01110101

>> dec2bin('i love you',8)
ans =
01101001
00100000
01101100
01101111
01110110
01100101
00100000
01111001
01101111
01110101

https://www.etsy.com/listing/91473057/binary-i-love-you-printable-for-your?ref=sr_gallery_9&ga_search_query=binary&ga_filters=holidays+-supplies+valentine&ga_search_type=all&ga_view_type=gallery
http://mentalfloss.com/article/29979/14-geeky-valentines-day-cards
https://www.etsy.com/listing/174002615/binary-love-geeky-romantic-pdf-cross?ref=sr_gallery_26&ga_search_query=binary&ga_filters=holidays+-supplies+valentine&ga_search_type=all&ga_view_type=gallery
https://www.etsy.com/listing/185919057/i-love-you-binary-925-silver-dog-tag-can?ref=sc_3&plkey=cdf3741cf5c63291bbc127f1fa7fb03e641daafd%3A185919057&ga_search_query=binary&ga_filters=holidays+-supplies+valentine&ga_search_type=all&ga_view_type=gallery
http://www.cafepress.com/+binary-code+long_sleeve_tees

Page 27: Summary: Source Encoder

Information Source (Discrete Memoryless Source, DMS) → source string → Source Encoder → encoded string
“LOVE” → “1001100100111110101101000101” = c(“L”) c(“O”) c(“V”) c(“E”)

• The source alphabet is the collection of all possible source symbols.
• Each symbol that the source generates is assumed to be randomly selected from the source alphabet.
• An encoder c(·) is a function that maps each symbol in the source alphabet to a corresponding (binary) codeword.
• The list for such a mapping is called the codebook.
• The codeword corresponding to a source symbol x is denoted by c(x).
• ℓ(x) is the length of c(x).
• Each codeword is constructed from a code alphabet.
• For binary codewords, the code alphabet is {0,1}.

Source Symbol x    Codeword c(x)
⋮
E                  1000101
⋮
L                  1001100
⋮
O                  1001111
⋮
V                  1010110

w/o extension

Page 28: Morse code

Telegraph network (wired and wireless)

Samuel Morse, 1838

A sequence of on-off tones (or lights, or clicks)

Page 29: Example

[http://www.wolframalpha.com/input/?i=%22I+love+you.%22+in+Morse+code]

Page 30: Example

Page 31: Morse code: Key Idea

Frequently-used characters are mapped to short codewords.

Relative frequencies of letters in the English language

Page 32: Morse code: Key Idea

Frequently-used characters (e,t) are mapped to short codewords.

Relative frequencies of letters in the English language

Page 33: Morse code: Key Idea

Frequently-used characters (e,t) are mapped to short codewords.

Basic form of compression.

Page 34: Thai Morse Code (รหัสมอร์สภาษาไทย)

Page 35: Example: ASCII Encoder

Information Source → “LOVE” → Source Encoder → “1001100100111110101101000101”

Character    Codeword
⋮
E            1000101
⋮
L            1001100
⋮
O            1001111
⋮
V            1010110

MATLAB:
>> M = 'LOVE';
>> X = dec2bin(M,7);
>> X = reshape(X',1,numel(X))
X =
1001100100111110101101000101

Page 36: Another Example of non-UD code

Suppose we want to convey the sequence of outcomes from rolling a die.

x c(x)

1 1

2 10

3 11

4 100

5 101

6 110

A sequence of throws such as 53214 is encoded as 10111101100

Page 37: Another Example of non-UD code

Suppose we want to convey the sequence of outcomes from rolling a die.

x    c(x)
1    1
2    10
3    11
4    100
5    101
6    110

The encoded string 11 could be interpreted as
    “11”: 1 1
    “3”: 11

The encoded string 110 could be interpreted as
    “12”: 1 10
    “6”: 110
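The ambiguity can be confirmed by counting how many ways an encoded string splits into codewords. The following is a minimal sketch in base MATLAB using the die codebook; the dynamic-programming array ways and the other variable names are illustrative, not from the slides.

```matlab
% Sketch: count the parsings of an encoded string under a given codebook.
% More than one parsing for some string means the code is not uniquely decodable.
codewords = {'1', '10', '11', '100', '101', '110'};   % c(1), ..., c(6)
s = '110';                                            % encoded string to test

ways    = zeros(1, numel(s) + 1);   % ways(k+1) = number of parsings of s(1:k)
ways(1) = 1;                        % the empty prefix has exactly one parsing
for k = 1:numel(s)
    for j = 1:numel(codewords)
        w   = codewords{j};
        len = numel(w);
        % If s(1:k) ends with codeword w, extend every parsing of s(1:k-len).
        if len <= k && strcmp(s(k-len+1:k), w)
            ways(k+1) = ways(k+1) + ways(k-len+1);
        end
    end
end
numParsings = ways(end);
disp(numParsings)
```

For s = '110' the count is 2, matching the two readings “12” and “6”. Changing s to '11' gives the same count for the readings “11” and “3”.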

Page 38: Another Example of non-UD code

x c(x)

A 1

B 011

C 01110

D 1110

E 10011

Page 39: Another Example of non-UD code

x    c(x)
A    1
B    011
C    01110
D    1110
E    10011

Consider the encoded string 011101110011.

It can be interpreted as
    CDB: 01110 1110 011
    BABE: 011 1 011 10011

[ https://en.wikipedia.org/wiki/Sardinas%E2%80%93Patterson_algorithm ]

Page 40: Game: 20 Questions

20 Questions is a classic game that has been played since the 19th century.

One person thinks of something (an object, a person, an animal, etc.)

The others playing can ask 20 questions in an effort to guess what it is.

Page 41: 20 Questions: Example

Page 42: Shannon–Fano coding

Proposed in Shannon’s “A Mathematical Theory of Communication” in 1948.

The method was attributed to Fano, who later published it as a technical report: Fano, R.M. (1949). “The transmission of information”. Technical Report No. 65. Cambridge (Mass.), USA: Research Laboratory of Electronics at MIT.

Should not be confused with Shannon coding, the coding method used to prove Shannon's noiseless coding theorem, or with Shannon–Fano–Elias coding (also known as Elias coding), the precursor to arithmetic coding.

Prof. Robert Fano (1917-2016), Shannon Award (1976)

Page 43: Huffman Code

MIT, 1951: information theory class taught by Professor Fano.

Huffman and his classmates were given the choice of
    a term paper on the problem of finding the most efficient binary code, or
    a final exam.

Huffman, unable to prove any codes were the most efficient, was about to give up and start studying for the final when he hit upon the idea of using a frequency-sorted binary tree and quickly proved this method the most efficient.

Huffman avoided the major flaw of the suboptimal Shannon-Fano coding by building the tree from the bottom up instead of from the top down.

David Huffman (1925–1999), Hamming Medal (1999)

Page 44: Huffman’s paper (1952)

[D. A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes," in Proceedings of the IRE, vol. 40, no. 9, pp. 1098-1101, Sept. 1952.]
[ http://ieeexplore.ieee.org/document/4051119/ ]

Page 45: Summary:

[2.16-17] A good code must be uniquely decodable (UD). Difficult to check.

[2.24] Consider a special family of codes: prefix(-free) codes.
    [Defn 2.18] No codeword is a prefix of any other codeword.
    Always UD.
    Same as being instantaneous: each source symbol can be decoded as soon as we come to the end of the codeword corresponding to it.

[Defn 2.30] Huffman’s recipe: repeatedly combine the two least-likely (combined) symbols. Automatically gives a prefix-free code.

[2.37] For a given source’s pmf, Huffman codes are optimal [Defn 2.36] among all UD codes for that source.

[Diagram: nested classes of codes: All codes ⊃ Nonsingular codes ⊃ UD codes ⊃ Prefix-free codes ⊃ Huffman codes]
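Unlike unique decodability, the prefix-free condition is easy to check mechanically: compare every ordered pair of codewords. The following is a minimal sketch in base MATLAB. The first codebook is the Huffman code printed for Ex. 2.31 in these slides and the second is the A to E code from the non-UD example; the variable names are illustrative.

```matlab
% Sketch: test whether each codebook is prefix-free.
codes = { {'0', '10', '111', '110'}, ...            % Huffman code, Ex. 2.31
          {'1', '011', '01110', '1110', '10011'} }; % A..E code from the non-UD example

isPrefixFree = true(1, numel(codes));
for c = 1:numel(codes)
    cw = codes{c};
    for i = 1:numel(cw)
        for j = 1:numel(cw)
            a = cw{i};
            b = cw{j};
            % a is a prefix of b if b starts with all of a's characters.
            if i ~= j && numel(a) <= numel(b) && strcmp(a, b(1:numel(a)))
                isPrefixFree(c) = false;
            end
        end
    end
end
disp(isPrefixFree)
```

Failing this test only shows that a code is not prefix-free; it does not by itself show that the code is not UD, which needs a separate argument such as exhibiting a string with two parsings.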

Page 46: Huffman coding

[ https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/compressioncodes ]

Page 47: Ex. Huffman Coding in MATLAB

[Ex. 2.31] [Huffman_Demo_Ex1]

Observe that MATLAB automatically gives the expected length of the codewords.

pX = [0.5 0.25 0.125 0.125]; % pmf of X
SX = [1:length(pX)]; % Source Alphabet
[dict,EL] = huffmandict(SX,pX); % Create codebook

%% Pretty print the codebook.
codebook = dict;
for i = 1:length(codebook)
    codebook{i,2} = num2str(codebook{i,2});
end
codebook

%% Try to encode some random source string
n = 10; % Number of source symbols to be generated
sourceString = randsrc(1,n,[SX; pX]) % Create data using pX
encodedString = huffmanenco(sourceString,dict) % Encode the data

Page 48: Ex. Huffman Coding in MATLAB

[Huffman_Demo_Ex1]

codebook =

    [1]    '0'
    [2]    '1 0'
    [3]    '1 1 1'
    [4]    '1 1 0'

sourceString =

    1 4 4 1 3 1 1 4 3 4

encodedString =

    0 1 1 0 1 1 0 0 1 1 1 0 0 1 1 0 1 1 1 1 1 0

Page 49: Ex. Huffman Coding in MATLAB

[Ex. 2.32] [Huffman_Demo_Ex2]

pX = [0.4 0.3 0.1 0.1 0.06 0.04]; % pmf of X
SX = [1:length(pX)]; % Source Alphabet
[dict,EL] = huffmandict(SX,pX); % Create codebook

%% Pretty print the codebook.
codebook = dict;
for i = 1:length(codebook)
    codebook{i,2} = num2str(codebook{i,2});
end
codebook

EL

The codewords can be different from our answers found earlier.
The expected length is the same.

>> Huffman_Demo_Ex2

codebook =

    [1]    '1'
    [2]    '0 1'
    [3]    '0 0 0 0'
    [4]    '0 0 1'
    [5]    '0 0 0 1 0'
    [6]    '0 0 0 1 1'

EL =

    2.2000
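huffmandict requires the Communications Toolbox. When only the expected length is wanted, Huffman's recipe can be traced in base MATLAB: each merge of the two least-likely entries adds one bit to every source symbol beneath the merged node, so the expected length equals the sum of the merged probabilities. The following is a minimal sketch under that observation, using the pmf of Ex. 2.32; the variable names p and merged are illustrative.

```matlab
% Sketch: expected codeword length of a binary Huffman code, without a toolbox.
pX = [0.4 0.3 0.1 0.1 0.06 0.04];   % pmf from Ex. 2.32

p  = sort(pX);                      % working list of probabilities, ascending
EL = 0;                             % expected length accumulates here
while numel(p) > 1
    merged = p(1) + p(2);           % combine the two least-likely entries
    EL     = EL + merged;           % one extra bit for all symbols under this node
    p      = sort([merged, p(3:end)]);  % return the combined entry to the list
end
disp(EL)
```

The result should agree with the EL = 2.2000 reported by huffmandict for the same pmf. This sketch does not build the codewords themselves, only their average length.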

Page 50: Ex. Huffman Coding in MATLAB

[Exercise]

pX = [1/8, 5/24, 7/24, 3/8]; % pmf of X
SX = [1:length(pX)]; % Source Alphabet
[dict,EL] = huffmandict(SX,pX); % Create codebook

%% Pretty print the codebook.
codebook = dict;
for i = 1:length(codebook)
    codebook{i,2} = num2str(codebook{i,2});
end
codebook

EL

codebook =

    [1]    '0 0 1'
    [2]    '0 0 0'
    [3]    '0 1'
    [4]    '1'

EL =

    1.9583

>> -pX*(log2(pX)).'
ans =

    1.8956

Page 51

[Figure: curve plotted against x on the interval 0 to 1; vertical axis from -0.4 to 0.]

Page 52: Entropy and Description of RV

[ https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/information-entropy ]

Page 53: Entropy and Description of RV

[ https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/information-entropy ]

Page 54: Summary: Optimality of Huffman Codes

Consider a given DMS with known pmf …

[Defn 2.36] A code is optimal if it is UD and its corresponding expected length is the shortest among all possible UD codes for that source.

[2.37] Huffman codes are optimal.

[Diagram: nested classes of codes: All codes ⊃ Nonsingular codes ⊃ UD codes ⊃ Prefix-free codes ⊃ Huffman codes]

[2.49-2.54] Bounds on expected lengths:

$H(X) \le \mathbb{E}[\ell(X)] < H(X) + 1$,

where $\mathbb{E}[\ell(X)]$ is the expected length (per source symbol) of an optimal code, which equals the expected length (per source symbol) of a Huffman code.

Page 55: Summary: Entropy

Entropy measures the amount of uncertainty (randomness) in a RV.

Three formulas for calculating entropy:

[Defn 2.41] Given a pmf $p_X$ of a RV $X$, $H(X) \equiv -\sum_x p_X(x) \log_2 p_X(x)$.

[2.44] Given a probability vector $\mathbf{p}$, $H(\mathbf{p}) \equiv -\sum_i p_i \log_2 p_i$.

[Defn 2.47] Given a number $p \in [0, 1]$, $h(p) \equiv -p \log_2 p - (1 - p) \log_2 (1 - p)$ (binary entropy function).

Set $0 \log 0 = 0$.

[2.56] Operational meaning: Entropy of a random variable is the average length of its shortest description.
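The entropy formulas and the convention 0 log 0 = 0 translate directly into code. The following is a minimal sketch in base MATLAB, assuming base-2 logarithms (bits) as in the log2 call used elsewhere in these slides; the zero entry appended to the pmf and the names nz, H and hb are illustrative.

```matlab
% Sketch: entropy of a pmf (with the 0*log(0) = 0 convention) and binary entropy.
pX = [0.5 0.25 0.125 0.125 0];      % pmf of Ex. 2.31 plus a zero-probability symbol
nz = pX(pX > 0);                    % dropping zeros implements 0*log2(0) = 0
H  = -sum(nz .* log2(nz));          % entropy in bits

p  = 0.1;                           % Bernoulli parameter, assumed 0 < p < 1
hb = -p*log2(p) - (1-p)*log2(1-p);  % binary entropy function at p

fprintf('H = %.4f bits, hb(%.1f) = %.4f bits\n', H, p, hb);
```

Evaluating -pX*log2(pX).' directly on a pmf containing a zero would return NaN, which is why the zero entries are removed first.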

Page 56: Examples

Example 2.31
Huffman: $H(X) = \mathbb{E}[\ell(X)] = 1.75$
Efficiency = 100%

Example 2.32
Huffman: $H(X) \approx 2.14 < \mathbb{E}[\ell(X)] = 2.2$
Efficiency ≈ 97%
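The efficiency figures quoted here are consistent with taking efficiency as the ratio of the entropy to the expected codeword length. The symbol $\eta$ below is an assumed name, not one that appears in the slides, and the numbers are the rounded values shown for Examples 2.31 and 2.32:

```latex
\[
\eta = \frac{H(X)}{\mathbb{E}[\ell(X)]}, \qquad
\eta_{2.31} = \frac{1.75}{1.75} = 100\%, \qquad
\eta_{2.32} \approx \frac{2.14}{2.2} \approx 0.97 = 97\%.
\]
```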

Page 57: Examples

Example 2.33
Huffman: $H(X) \approx 2.29 < \mathbb{E}[\ell(X)] = 2.3$
Efficiency ≈ 99%

Example 2.34 (source symbols A, B, C, D)
Huffman: $H(X) \approx 1.86 < \mathbb{E}[\ell(X)] = 2$
Efficiency ≈ 93%

Page 58: Summary: Entropy

Important Bounds

$0 \le H(X) \le \log_2 |S_X|$
(the lower bound is attained when $X$ is deterministic, the upper bound when $X$ is uniform)

The entropy of a uniform (discrete) random variable: $H(X) = \log_2 |S_X|$.

The entropy of a Bernoulli random variable with parameter $p$: $H(X) = h(p)$, the binary entropy function.

Page 59: Huffman Coding: Source Extension

[Ex.2.40] $X_k$ i.i.d. Bernoulli($p$), $p = 0.1$

[Figure: $L_n$ versus $n$ (order of extension), $n$ = 1 to 8; vertical axis 0.4 to 1. Marked values: 1 ($n = 1$), 0.645 ($n = 2$), 0.533 ($n = 3$).]

Page 60: Huffman Coding: Source Extension

[Ex.2.40] $X_k$ i.i.d. Bernoulli($p$), $p = 0.1$

[Figure: $L_n$ versus order of source extension $n$ (1 to 8); vertical axis 0.4 to 1.8; plotted together with the bounds $H(X)$ and $H(X) + \frac{1}{n}$.]

Page 61: Summary: Source Extension

The encoder operates on blocks rather than on individual symbols.

[Defn 2.39] $n$-th extension coding: 1 block = $n$ successive source symbols.

$L_n$ = expected (average) codeword length per source symbol when Huffman coding is used with the $n$-th extension.

[Figure: $L_n$ versus order of source extension (1 to 8); vertical axis 0.4 to 1.8; plotted together with the bounds $H(X)$ and $H(X) + \frac{1}{n}$.]
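The first few values of the per-symbol length under source extension can be reproduced for the i.i.d. Bernoulli(0.1) source of Ex. 2.40 without any toolbox. The following is a minimal sketch in base MATLAB: the block pmf is built with kron, and the Huffman expected length is obtained as the sum of the merged probabilities. The names p1, pn, maxN and Ln are illustrative, and maxN = 3 is an arbitrary stopping point.

```matlab
% Sketch: per-symbol expected Huffman length for the n-th extension, n = 1..maxN.
p1   = [0.9 0.1];          % pmf of one Bernoulli(0.1) symbol: P(0), P(1)
maxN = 3;
Ln   = zeros(1, maxN);

pn = 1;
for n = 1:maxN
    pn = kron(pn, p1);     % pmf of a block of n i.i.d. symbols (2^n entries)

    % Huffman expected length for the block = sum of merged probabilities.
    p  = sort(pn);
    EL = 0;
    while numel(p) > 1
        merged = p(1) + p(2);
        EL     = EL + merged;
        p      = sort([merged, p(3:end)]);
    end

    Ln(n) = EL / n;        % bits per source symbol
end
disp(Ln)
```

The three values come out near 1, 0.645 and 0.533 and decrease toward the entropy of the source, which is about 0.469 bits for p = 0.1.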