T305: DIGITAL COMMUNICATIONS
Arab Open University - Lebanon
Tutorial Topic 3: Information Theory


Page 1

Page 2

Introduction

In a digital communication system, messages are produced by a source and transmitted over a channel to the destination. The messages produced by the source need to be encoded for transmission via the communication channel, and then decoded at the destination. A simple model is shown below.

Page 3

Messages from a digital source are made up from a fixed set of source symbols. These might be, for instance, numbers or letters. The set of source symbols is sometimes called the source alphabet.

Source symbols need to be represented by code words for transmission. In digital communication systems, code words are usually sequences of binary digits. The code words may be of equal length, or may vary in length. The set of symbols which are used in the code words is sometimes described as the code alphabet. For a binary code, the code alphabet consists of only the symbols 0 and 1. The number of symbols in the code alphabet is known as the radix of the code, so a binary code has radix 2. Morse code, which uses the symbols dot, dash and space, has radix 3.

Page 4

Messages from a source can be coded in different ways, and it is often important to choose an efficient code that will minimize the number of binary digits needed. Another example is ASCII coding, where the radix is 2 and there are 128 symbols. Using 7 bits to represent all characters gives 2^7 (or 128) combinations. This means that ASCII is an efficient coding, as there are no left-over combinations.

Example: Morse Code

Page 5

Question

If it is required to code the decimal digits 0-9:

How many binary digits are required per decimal digit?

Would you describe binary coded decimal as an efficient code?

The number 12345678 is here represented with single-digit coding (each decimal digit coded separately). If we represent the number using two-digit coding (a different grouping of source symbols, called the second extension), it would be represented as: 12 34 56 78. How many different values would we then need to represent, and how many binary digits are needed per group?

Hint:

n     2^n
1     2
2     4
3     8
4     16
5     32
6     64
7     128
8     256
9     512
10    1024
11    2048
12    4096
13    8192
14    16384

Page 6

[Chart: grouping decimal digits; binary digits per decimal digit plotted against decimal digits per group]

decimal digits per group    number of code words    binary digits needed    binary digits per decimal digit
 1                          10                       4                      4.000
 2                          100                      7                      3.500
 3                          1000                    10                      3.333
 4                          10000                   14                      3.500
 5                          100000                  17                      3.400
 6                          1000000                 20                      3.333
 7                          10000000                24                      3.429
 8                          100000000               27                      3.375
 9                          1000000000              30                      3.333
10                          10000000000             34                      3.400

Example: 12345678
Example: 12 34 56 78 (00-99)
Example: 123 456 (000-999)
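The table can be reproduced with a short calculation: a group of n decimal digits has 10^n possible values, so ceil(log2(10^n)) binary digits are needed per group. A minimal Python sketch (not part of the original tutorial):

```python
import math

# For a group of n decimal digits there are 10**n possible values, so
# ceil(log2(10**n)) binary digits are needed to give each value its own code word.
for n in range(1, 11):
    code_words = 10 ** n
    binary_digits = math.ceil(math.log2(code_words))
    print(f"{n:2d}  {code_words:12d}  {binary_digits:2d}  {binary_digits / n:.3f}")
```

The best figure in the table, 3.333 binary digits per decimal digit, occurs whenever 10^n falls just below a power of 2 (n = 3, 6, 9).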

Page 7

Information and entropy

The concept of information enables the efficiency of a code to be quantified.

In communication theory, the term ‘information’ has a very specific meaning.

A message is said to contain a certain amount of information, which is related to how much is learnt by the recipient of the message.

Page 8

The information conveyed by a message depends on the probability of receiving that particular message. The least probable messages convey the most information. The information, I, gained by receiving a message of probability P is given by:

I = -log(P)

(With logarithms taken to base 2, the information is measured in bits.)

Because probabilities are always less than 1, the value of log (P) is always a negative quantity; the minus sign ensures that information is always positive.


Page 9

If the probability of receiving a particular message is independent of the messages which have been received before, the source is described as memoryless. For a memoryless source, the amount of information provided by two separate messages can be added to give the information conveyed when both messages have been received. So:

I = -log(P1) + (-log(P2)) = -log(P1 P2)

Many sources do have memory. For instance, consecutive video frames typically have very similar content. In this case the information provided by two consecutive frames is considerably less than the sum of the information in each individual frame.
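As a quick numerical check of the additivity property (a Python sketch, not part of the original tutorial; base-2 logarithms are assumed so that information is in bits, and the probabilities are hypothetical):

```python
import math

p1, p2 = 0.5, 0.25              # probabilities of two independent messages
i1 = -math.log2(p1)             # information in the first message: 1 bit
i2 = -math.log2(p2)             # information in the second message: 2 bits
i_both = -math.log2(p1 * p2)    # information gained when both have been received
print(i1 + i2, i_both)          # both are 3.0 bits: -log(P1) - log(P2) = -log(P1 * P2)
```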


Page 10

If the probability of a given source symbol is known, the information provided by that symbol can be calculated. If the probabilities of all the possible source symbols are known, the average information for the source can be found. This is described as the entropy of the source.

For a memoryless source, if the source can generate n different symbols and the ith symbol has probability Pi, then the entropy H is given by:

H = -Σ Pi log(Pi)   (sum taken over i = 1 to n)

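A minimal sketch of this calculation, assuming base-2 logarithms (so that H is in bits per symbol) and a hypothetical set of symbol probabilities:

```python
import math

def entropy(probabilities):
    """Entropy H = -sum(Pi * log2(Pi)) of a memoryless source, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits per symbol
print(entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0 bits: all symbols equally probable
```

The second call illustrates the point made below: the entropy is largest when all the symbols are equally probable.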

Page 11

The entropy is a characteristic of the source, representing the average amount of information it provides. This is independent of how symbols from the source are coded.

A source has maximum entropy if all its symbols are equally probable.

This corresponds to maximum uncertainty about the outcome of a message.

Page 12

An efficient code will represent the information from the source using as few binary digits as possible.

Shannon has shown that, for any code, the source entropy H is the minimum achievable value for the average length L of the code words:

L ≥ H

The average code word length L is given by:

L = Σ Pi li

where li is the length of the ith code word and Pi is the probability of receiving it.

Page 13

If the entropy of a source is known, the efficiency of any code used for that source can be calculated. The efficiency E is defined as the ratio of the entropy H to the average length L of the code words:

E = H / L

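These three quantities can be computed together in a short sketch (the probabilities and code word lengths below are hypothetical, and base-2 logarithms are assumed):

```python
import math

# Hypothetical source: symbol probabilities and the lengths of the code words assigned to them
probs   = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]                          # e.g. the code words 0, 10, 110, 111

H = -sum(p * math.log2(p) for p in probs)       # source entropy
L = sum(p * l for p, l in zip(probs, lengths))  # average code word length, L = sum(Pi * li)
E = H / L                                       # efficiency

print(H, L, E)   # 1.75, 1.75, 1.0 -- this particular code meets the bound L >= H with equality
```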

Page 14

Source coding

The process of representing individual source symbols by appropriate code words is called source coding. In source coding the aim is to represent the source symbols for efficient transmission or storage.

In some cases the most efficient code is one which uses code words of fixed length. A fixed-length code with code words of length n binary digits can represent 2^n different source symbols.

Page 15

The average number of binary digits required per source symbol can sometimes be reduced by grouping source symbols before coding. In this case the messages are considered to come from a new source, described as an extension of the original source. For example, for a source with three symbols, the symbols can be taken in pairs to form the second extension of the source. This extended source would have nine symbols.


Page 16

If the probabilities of the source symbols are not all equal, a variable length code may be more efficient. When using a variable length code, the average code word length can be minimized by using short code words for the most probable source symbols and longer ones for the least probable.

When decoding a message using a fixed length code, the start of each new code word can be found by counting binary digits. Each code word can then be translated unambiguously to a single source symbol; the code is uniquely decodable. A code word can also be translated as soon as it arrives; so we say that the code is instantaneously decodable.


Page 17

Example of fixed length coding tree

Page 18

Example of variable length coding

Page 19

For a variable length code, however, the code needs to be carefully designed if it is to be uniquely and instantaneously decodable. For example, if four source symbols are represented by the binary sequences 0, 1, 01 and 10, the sequence 01 could be one code word or a sequence of two code words. Therefore this code is not uniquely decodable.

Not all uniquely decodable variable length codes are also instantaneously decodable. For example, if four source symbols are represented by the binary sequences 0, 01, 011 and 111, a typical message is:

00101101110

Although this cannot be decoded instantaneously (from left to right), it can be decoded uniquely by working from right to left. In terms of code words, the sequence then becomes:

0, 01, 011, 0, 111, 0

Because the message cannot be instantaneously decoded, the complete message must be stored before it can be decoded.

Page 20

To design a code which is instantaneously decodable, the code designer must ensure that no code word forms the first part (called the prefix) of any other code word. If this condition is met, the decoder can accumulate binary digits until the sequence received corresponds to a complete code word.

Instantaneous codes can be generated and decoded by means of coding trees, as shown below. Left and right 'branches' represent 0s and 1s, or vice versa. All the code words for a given instantaneous code (filled circles) correspond to the end-points of 'branches'.
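The prefix condition and the accumulate-until-match decoder can be sketched as follows (the code words and symbols are hypothetical, chosen only for illustration):

```python
def is_instantaneous(code_words):
    """True if no code word is a prefix of another (the prefix condition)."""
    for word in code_words:
        for other in code_words:
            if word != other and other.startswith(word):
                return False
    return True

def decode(bits, code_map):
    """Instantaneous decoding: accumulate digits until they match a complete code word."""
    symbols, buffer = [], ""
    for b in bits:
        buffer += b
        if buffer in code_map:
            symbols.append(code_map[buffer])
            buffer = ""
    return symbols

# Hypothetical instantaneous code for four symbols
code_map = {"0": "A", "10": "B", "110": "C", "111": "D"}
print(is_instantaneous(code_map.keys()))          # True
print(decode("0101100111", code_map))             # ['A', 'B', 'C', 'A', 'D']

# The code 0, 1, 01, 10 from the earlier example fails the prefix condition
print(is_instantaneous(["0", "1", "01", "10"]))   # False
```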

Page 21

Page 22

Which of the trees is most efficient if the symbols are equally probable?

Page 23

Page 24

Huffman coding.

Page 25

Huffman code saving

Page 26

Page 27

One way of finding an instantaneous code which is as efficient as possible is to use Huffman coding. In Huffman coding, source symbols with the largest probabilities are allocated systematically to the shortest code words as shown below.
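The allocation procedure can be sketched in code. This is only an illustrative sketch with hypothetical probabilities, not the course's worked example; real implementations differ in details such as how ties are broken:

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code by repeatedly merging the two least probable entries.

    probabilities: dict mapping symbol -> probability.
    Returns a dict mapping symbol -> code word (a string of 0s and 1s).
    """
    # Each heap entry: (probability, tie-breaker, {symbol: partial code word})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)       # least probable group
        p2, _, group2 = heapq.heappop(heap)       # next least probable group
        # Prepend a binary digit: 0 for one group, 1 for the other
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical source
print(huffman_code({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}))
# e.g. {'A': '0', 'B': '10', 'C': '110', 'D': '111'} (the exact assignment of 0s and 1s may differ)
```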

Page 28

Source coding

As with all codes, the average length Lh of the code words for Huffman coding is greater than or equal to the source entropy H. However, for Huffman coding the average code word length exceeds the entropy by at most 1, i.e. H ≤ Lh < H + 1.

If the source messages are combined, forming a source extension, and then Huffman coded, this results in longer code words and a larger source entropy. One of the features of source extensions is that if H1 is the entropy of the first extension (i.e. the original source symbols), then the entropy, H2, of the second extension (i.e. pairs of the original source symbols) is:

H2 = 2 H1
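This property can be checked numerically with a short sketch (a hypothetical three-symbol memoryless source, base-2 logarithms assumed):

```python
import math
from itertools import product

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical memoryless source with three symbols
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
H1 = entropy(probs.values())

# Second extension: pairs of symbols. For a memoryless source the probability
# of a pair is the product of the two individual symbol probabilities.
pair_probs = [p1 * p2 for p1, p2 in product(probs.values(), repeat=2)]
H2 = entropy(pair_probs)

print(H1, H2, 2 * H1)   # H2 equals 2 * H1 (up to floating-point rounding)
```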

Page 29

This can be generalized to higher-order extensions, and the entropy of the rth extension is:

Hr = r H1

If the process of taking higher and higher extensions is continued, the entropy H and the average code-word length Lh become large, but Lh still exceeds H by less than 1, so in relative terms the average code-word length approaches the entropy. This result is Shannon's first theorem.


Page 30

Channel coding

Communication systems do experience noise, for example from electrical interference. A model often used to study the effects of this noise has a single source of noise connected to the channel.

Noise distorts the signals used for communication and can cause errors in the received messages. Errors can be detected and sometimes corrected by adding redundant digits to source-coded messages. This means that the average number of binary digits per source symbol, L, is necessarily larger than the entropy, H, and so the efficiency, E, of the code must be less than 1. The redundancy, R, of a code is defined as:

R = 1 - E

Page 31

Channel coding

In many cases, channel codes are designed assuming:

- that the error rate is low,
- that errors occur independently of each other,
- that there is a negligible probability of several errors occurring together.

For burst noise, where sequences of errors can occur, these assumptions are not all valid. Some codes are designed specifically to cope with burst noise; one example is the cyclic redundancy check. Alternatively, interleaving can be used to spread the effect of a noise burst over several code words, reducing the impact on any one of them.
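A minimal sketch of block interleaving, one common way of spreading a burst (this example is not taken from the course text; the block dimensions and data are hypothetical): digits are written into a matrix row by row and transmitted column by column, so after de-interleaving a burst of channel errors is spread across several code words.

```python
def interleave(digits, rows, cols):
    """Write the digits into a rows x cols block row by row, read them out column by column."""
    assert len(digits) == rows * cols
    return [digits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(digits, rows, cols):
    """The inverse operation: interleave again with the dimensions swapped."""
    return interleave(digits, cols, rows)

data = list("ABCDEFGHIJKL")                 # e.g. three 4-digit code words: ABCD, EFGH, IJKL
sent = interleave(data, rows=3, cols=4)
sent[4] = sent[5] = sent[6] = "*"           # a burst corrupts three consecutive transmitted digits
received = deinterleave(sent, rows=3, cols=4)
print("".join(received))                    # AB*DE*GHI*KL: each code word suffers only one error
```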

Page 32

The figure below illustrates a channel code called a rectangular code. Sequences of message digits are grouped as 'rows', making up a fixed-size block. A 'horizontal' parity check digit is inserted at the end of each row. Then a final row of 'vertical' parity checks is added, one for each 'column'. If no more than one error occurs per block, a rectangular code can locate and hence correct it.
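A short sketch of the idea, using even parity and a hypothetical 3 x 4 block of message digits (a single error shows up as one failed row check and one failed column check, which together locate it):

```python
def rectangular_encode(rows):
    """Append a horizontal (even) parity digit to each row, then a final row of vertical parity digits."""
    coded = [row + [sum(row) % 2] for row in rows]
    coded.append([sum(col) % 2 for col in zip(*coded)])
    return coded

def find_single_error(block):
    """Return the (row, column) of a single error, or None if every parity check passes."""
    bad_rows = [r for r, row in enumerate(block) if sum(row) % 2 != 0]
    bad_cols = [c for c, col in enumerate(zip(*block)) if sum(col) % 2 != 0]
    if bad_rows and bad_cols:
        return bad_rows[0], bad_cols[0]
    return None

message = [[1, 0, 1, 1],
           [0, 1, 1, 0],
           [1, 1, 0, 0]]
block = rectangular_encode(message)

block[1][2] ^= 1                      # one digit is corrupted in the channel
print(find_single_error(block))       # (1, 2): the error is located and can be corrected by inverting it
```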


Page 33


A more efficient use of parity digits is made in Hamming codes. Hamming showed that channel codes can be designed which use m parity check digits within a block (sequence) of n digits, where:

n = 2^m - 1

Here n includes the parity digits. A Hamming code with a block size of n = 15 uses only 4 parity digits for 11 message digits. Hamming codes can only correct one error per block.

Page 34

The figure below represents a 7-digit Hamming code for a 4-digit message. The original message digits become digits 3, 5, 6 and 7 of the 7-digit Hamming coded word. Digit 1 of this word forms a parity digit for digits 3, 5 and 7; digit 2 forms a parity digit for digits 3, 6 and 7; and digit 4 forms a parity digit for digits 5, 6 and 7.
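A sketch of this 7-digit code, following the parity assignments above and assuming even parity. When one digit is corrupted, the pattern of failed parity checks gives the position of the error directly:

```python
def hamming_encode(d3, d5, d6, d7):
    """Place the 4 message digits at positions 3, 5, 6 and 7 and compute even-parity check digits."""
    p1 = d3 ^ d5 ^ d7                            # digit 1 checks digits 3, 5, 7
    p2 = d3 ^ d6 ^ d7                            # digit 2 checks digits 3, 6, 7
    p4 = d5 ^ d6 ^ d7                            # digit 4 checks digits 5, 6, 7
    return [p1, p2, d3, p4, d5, d6, d7]          # the code word, positions 1 to 7

def hamming_correct(word):
    """Recompute the parity checks; the failed checks point at the position of a single error."""
    p1, p2, d3, p4, d5, d6, d7 = word
    c1 = p1 ^ d3 ^ d5 ^ d7
    c2 = p2 ^ d3 ^ d6 ^ d7
    c4 = p4 ^ d5 ^ d6 ^ d7
    error_position = c1 + 2 * c2 + 4 * c4        # 0 means no error detected
    if error_position:
        word[error_position - 1] ^= 1            # invert the corrupted digit
    return word[2], word[4], word[5], word[6]    # recovered message digits 3, 5, 6, 7

code_word = hamming_encode(1, 0, 1, 1)
code_word[4] ^= 1                                # a single error hits digit 5 in the channel
print(hamming_correct(code_word))                # (1, 0, 1, 1): the original message is recovered
```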

Page 35

Alternatively, look-up tables can be used by the encoder to convert the original 4-digit message to a 7-digit Hamming code, and by the decoder to convert the received 7-digit sequence to the corresponding 4-digit message.

Page 36

Channel capacity

If a transmission channel is noisy, this affects the rate at which information can be transmitted. If a message is transmitted over a noisy channel, one or more of the digits could be corrupted, and some of the information in the message would be lost.

The effect of errors can be minimized by using channel codes which add redundant digits for error detection and correction. So using redundant codes means that the information rate for a channel which can transmit a fixed number of binary digits per second is reduced.