information theory and channel coding

31
Information Theory and Channel Coding Prof. Rodrigo C. de Lamare CETUC, PUC-Rio, Brazil [email protected]

Upload: others

Post on 01-Nov-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Information Theory and Channel Coding

Prof. Rodrigo C. de Lamare

CETUC, PUC-Rio, Brazil

[email protected]

XI. Polar codes

A. Introduction

B. Encoding and code structure

C. Decoding

β€’ Polar codes were invented by Erdal Arikan in 2009.

β€’ Polar codes are the first provably capacity-achieving efficient coding scheme.

β€’ The encoding of polar codes has low complexity (𝑛 log2 𝑛 ).

β€’ The decoding of polar codes can be carried out via successive cancellationdecoding with cost (𝑛 log2 𝑛 ), list decoding or belief propagation.

β€’ They have been adopted for control channels in extended Mobile Broadband (eMBB) of 5G systems.

A. Introduction

β€’ Let us consider the block diagram of a polar coding system.

β€’ A unique feature of polar codes is the use of polarization of bit channels, which results in noisy useless channels and error-free channels.

β€’ Bits that are transmitted over useless channels are frozen.

Encoding Channel Decoding

π’Ž 𝒄 𝒓 ΰ·œπ’„, ΰ·π’Ž

1 Γ— π‘˜ 1 Γ— 𝑛 1 Γ— 𝑛

Polarization,Frozen bits

β€’ A building block of polar codes is the basic channel transformationillustrated by

where π‘Š is a discrete memoryless channel (DMC).

β€’ For an error-free/ perfect channel:o π‘Œ determines 𝑋o No need to encode data.

β€’ For a useless channel:o π‘Œ is independent of 𝑋o No need to encode.

π‘Šπ‘ˆ1

π‘ˆ2

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

β€’ Concatenation of DMCs:

o Channels that are perfect or useless but nothing in between-> polarization

o In a polar coding system a given DMC π‘Š is used 𝑛 times to transmit a codeword.

Definition: Basic Channel Transformation (BCT)

π‘Š β†’ (π‘Šβˆ’,π‘Š+)

π‘Š: ΰΈ–0,1𝒳

β†’ 𝒴

π‘Šβˆ’: 0,1 β†’ 𝒴2: 𝒰1β†’ π‘Œ1, π‘Œ2 (output)

π‘Š+: 0,1 β†’ 𝒴2 Γ— 0,1 : 𝒰2β†’ π‘Œ1, π‘Œ2, π‘ˆ1 (output)

Inputs: 𝑋1 = π‘ˆ1 βŠ•π‘ˆ2𝑋2 = π‘ˆ2

π‘Šπ‘ˆ1

π‘ˆ2

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

Example 1

Consider that π‘Š is a binary erasure channel (BEC) with erasure probabilityπœ–

and the BCT

What are π‘Šβˆ’,π‘Š+ ? Consider π‘ˆ1, π‘ˆ2 ∈ {0,1}

π‘Šπ‘ˆ1

π‘ˆ2

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘₯0 = 0

π‘₯1 = 1

𝑦0 = 0

𝑦1 =1

πœ–

1 βˆ’ 𝑝

1 βˆ’ 𝑝

𝑝

𝑝

Solution:

π‘Šβˆ’: input 𝒰1 , output (π‘Œ1, π‘Œ2)

i) No erasure: π‘Œ1, π‘Œ2 = (𝒰1 βŠ•π’°2, 𝒰2)

ii) Erasure on the 1st BEC: π‘Œ1, π‘Œ2 = (? , 𝒰2 )

iii) Erasure on the 2nd BEC: π‘Œ1, π‘Œ2 = (𝒰1 βŠ•π’°2, ? )

iv) Two erasures: π‘Œ1, π‘Œ2 = (? , ? )

π‘Šπ‘ˆ1

π‘ˆ2

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘₯0 = 0

π‘₯1 = 1

𝑦0 = 0

𝑦1 =1

πœ–

1 βˆ’ 𝑝

1 βˆ’ 𝑝

𝑝

𝑝

π‘Š+: input 𝒰2 , output (π‘Œ1, π‘Œ2, 𝒰1)

i) No erasure: π‘Œ1, π‘Œ2, 𝒰1 = (𝒰1 βŠ•π’°2, 𝒰2, 𝒰1)

ii) Erasure on the 1st BEC: π‘Œ1, π‘Œ2, 𝒰1 = (? , 𝒰2 , 𝒰1)

iii) Erasure on the 2nd BEC: π‘Œ1, π‘Œ2, 𝒰1 = (𝒰1 βŠ•π’°2, ? , 𝒰1)

iv) Two erasures: π‘Œ1, π‘Œ2, 𝒰1 = (? , ? , 𝒰1)

π‘Šπ‘ˆ1

π‘ˆ2

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘₯0 = 0

π‘₯1 = 1

𝑦0 = 0

𝑦1 =1

πœ–

1 βˆ’ 𝑝

1 βˆ’ 𝑝

𝑝

𝑝

Polarization

β€’ The polarization effect can be obtained by recursively applying theBCT.

π‘Š

ΰ·ͺΰ·ͺπ‘ˆ1

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘Š

π‘Š

𝑋3

𝑋4

π‘Œ3

π‘Œ4

ΰ·ͺΰ·ͺπ‘ˆ2

π‘ˆ1 =ΰ·ͺΰ·ͺπ‘ˆ1

π‘ˆ3 = ΰ·ͺπ‘ˆ2

ΰ·ͺΰ·ͺπ‘ˆ3

ΰ·ͺΰ·ͺπ‘ˆ4

π‘ˆ2 = ΰ·ͺπ‘ˆ3

π‘ˆ4 = ΰ·ͺπ‘ˆ4

β€’ 1st copy of π‘Š: input 𝑋1, output π‘Œ1β€’ 2nd copy of π‘Š: input 𝑋2, output π‘Œ2

π‘Šβˆ’: ΰ·ͺΰ·ͺπ‘ˆ1 β†’ π‘Œ1, π‘Œ2 , where 𝑋1 =ΰ·ͺΰ·ͺπ‘ˆ1 βŠ•

ΰ·ͺΰ·ͺπ‘ˆ2

π‘Š+:ΰ·ͺΰ·ͺπ‘ˆ2 β†’ π‘Œ1, π‘Œ2,ΰ·ͺΰ·ͺπ‘ˆ1 , where 𝑋2 =

ΰ·ͺΰ·ͺπ‘ˆ2

β€’ 3rd copy of π‘Š: input 𝑋3, output π‘Œ3β€’ 4th copy of π‘Š: input 𝑋4, output π‘Œ4

π‘Šβˆ’:ΰ·ͺΰ·ͺπ‘ˆ1 β†’ π‘Œ1, π‘Œ2 , where 𝑋1 =ΰ·ͺΰ·ͺπ‘ˆ1 βŠ•

ΰ·ͺΰ·ͺπ‘ˆ2

π‘Š+: ΰ·ͺΰ·ͺπ‘ˆ2 β†’ π‘Œ1, π‘Œ2,ΰ·ͺΰ·ͺπ‘ˆ1 , where 𝑋2 =

ΰ·ͺΰ·ͺπ‘ˆ2

π‘Š

ΰ·ͺΰ·ͺπ‘ˆ1

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘Š

π‘Š

𝑋3

𝑋4

π‘Œ3

π‘Œ4

ΰ·ͺΰ·ͺπ‘ˆ2

π‘ˆ1 =ΰ·ͺΰ·ͺπ‘ˆ1

π‘ˆ3 = ΰ·ͺπ‘ˆ2

ΰ·ͺΰ·ͺπ‘ˆ3

ΰ·ͺΰ·ͺπ‘ˆ4

π‘ˆ2 = ΰ·ͺπ‘ˆ3

π‘ˆ4 = ΰ·ͺπ‘ˆ4

β€’ Application of BCT to π‘Šβˆ’:

1st copy of π‘Š: input ΰ·ͺΰ·ͺπ‘ˆ1, output (Y1, π‘Œ2)

2nd copy of π‘Š: input ΰ·ͺΰ·ͺπ‘ˆ3, output (π‘Œ3, π‘Œ4)

π‘Šβˆ’βˆ’: ΰ·ͺΰ·ͺπ‘ˆ1 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4 , where ΰ·ͺΰ·ͺπ‘ˆ1 = ΰ·ͺπ‘ˆ1 βŠ• ΰ·ͺπ‘ˆ3

π‘Šβˆ’+: ΰ·ͺΰ·ͺπ‘ˆ2 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4, ΰ·ͺπ‘ˆ1 , whereΰ·ͺΰ·ͺπ‘ˆ3 = ΰ·ͺπ‘ˆ3

β€’ Application of BCT to π‘Š+:

1st copy of π‘Š: input ΰ·ͺΰ·ͺπ‘ˆ2, output (Y1, π‘Œ2, π‘ˆ1)

2nd copy of π‘Š: input ΰ·ͺΰ·ͺπ‘ˆ4, output (π‘Œ3, π‘Œ4,ΰ·ͺΰ·ͺπ‘ˆ3)

π‘Š+βˆ’: ΰ·ͺΰ·ͺπ‘ˆ2 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4,ΰ·ͺΰ·ͺπ‘ˆ1,

ΰ·ͺΰ·ͺπ‘ˆ3, , where ΰ·ͺΰ·ͺπ‘ˆ2 = ΰ·ͺπ‘ˆ2 βŠ• ΰ·ͺπ‘ˆ4

π‘Š++: ΰ·ͺΰ·ͺπ‘ˆ2 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4,ΰ·ͺΰ·ͺπ‘ˆ1,

ΰ·ͺΰ·ͺπ‘ˆ3, π‘ˆ2 ǁ , whereΰ·ͺΰ·ͺπ‘ˆ4 = ΰ·ͺπ‘ˆ4

π‘Š

ΰ·ͺΰ·ͺπ‘ˆ1

π‘Š

𝑋1

𝑋2

π‘Œ1

π‘Œ2

π‘Š

π‘Š

𝑋3

𝑋4

π‘Œ3

π‘Œ4

ΰ·ͺΰ·ͺπ‘ˆ2

π‘ˆ1 = ΰ·ͺπ‘ˆ1

π‘ˆ3 = ΰ·ͺπ‘ˆ2

ΰ·ͺΰ·ͺπ‘ˆ3

ΰ·ͺΰ·ͺπ‘ˆ4

π‘ˆ2 = ΰ·ͺπ‘ˆ3

π‘ˆ4 = ΰ·ͺπ‘ˆ4

β€’ By renaming π‘ˆ1 = ΰ·ͺπ‘ˆ1, π‘ˆ2 = ΰ·ͺπ‘ˆ2, π‘ˆ3 = ΰ·ͺπ‘ˆ3, π‘ˆ4 = ΰ·ͺπ‘ˆ4, we obtain

π‘Šβˆ’βˆ’: π‘ˆ1 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4

π‘Šβˆ’ +: π‘ˆ2 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4, π‘ˆ1

π‘Š+βˆ’ : π‘ˆ3 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4, π‘ˆ1, π‘ˆ2

π‘Š+ +: π‘ˆ4 β†’ π‘Œ1, π‘Œ2, π‘Œ3, π‘Œ4, π‘ˆ1, π‘ˆ2, π‘ˆ3

General procedure

β€’ π‘Šπ‘›π‘ 1…𝑠𝑙 ∢ 𝒰 𝑠1…𝑠𝑙 +1 β†’ π‘Œ1, … , π‘Œ2𝑙 , 𝒰1, … , 𝒰(𝑠1…𝑠𝑙) ,

where 𝑠1…𝑠𝑙 ∈ 0,βˆ’, 1, +

𝑛 = 2𝑙

Inputs are 𝑒(𝑠1…𝑠𝑙)

Binary string is (𝑠1…𝑠𝑙)

β€’ The mutual information associated with this procedure is given by

𝐼 π‘Šπ‘›π‘ 1…𝑠𝑙 = 𝐼(𝒰 𝑠1…𝑠𝑙 , π‘Œ1, … , π‘Œ2𝑙 , 𝒰1, … , 𝒰 𝑠1…𝑠𝑙 )

β€’ By measuring the capacity of these concatenated channels, we obtain

𝐢 π‘Šπ‘› = 𝐼 π‘Šπ‘›π‘ 1…𝑠𝑙 ,

which is illustrated by

𝐢 π‘Šπ‘›

𝑙

𝐢 π‘Šπ‘›

1 1

channels

Example 2

Consider the BCT as the following Kronecker matrix

𝑻 =1 11 0

Illustrate how this matrix can be used for polarization with 𝑙 = 1,2

Solution:

We can write for 𝑙 = 1 the inputs 𝑋1 and 𝑋2 as follows:

𝒙 = 𝑋1 𝑋2 = 𝒰1 𝒰21 11 0

= 𝒖 𝑻

For 𝑙 = 2 the inputs 𝑋1, 𝑋2, 𝑋3 and 𝑋4 as follows:

𝒙 = 𝑋1 𝑋2 𝑋3 𝑋4 = 𝒰1 𝒰2𝒰3 𝒰4

1 11 0

1 11 0

1 11 0

0 00 0

= 𝒖 π‘»βŠ— 𝑰 = 𝒖𝑻 𝑻𝑻 𝟎

B. Encoding

β€’ The encoding stage of polar codes relies on the following steps:

o To repeatedly apply the BCT to 𝑛 DMCs.

o To select among the polarized channels those that are best.

o The message bits π’Ž are transmited over good channels

o The useless channels transmit the frozen bits 𝒖𝑓.

β€’ The encoding can be illustrated by

Encoding

π’Ž 𝒄

1 Γ— π‘˜ 1 Γ— 𝑛

Polarization,Frozen bits

Frozenbits

π’Ž

1 Γ— π‘˜ 1 Γ— 𝑛

𝒖BCT𝑻𝑛

𝒄

1 Γ— 𝑛

≑

β€’ For a codeword with length 𝑛 = 2𝑙, we define the (𝑛, π‘˜, β„±, 𝒖𝑓) polar coding scheme π’žπ‘‡π‘› as follows.

β€’ The length 𝑛 codeword 𝒄 of the polar code is given by

𝒄 = 𝒖𝑻𝑛,

where 𝑻𝑛 = π‘·π‘›π‘»βŠ™π’ is the 𝑙-fold BCT transformation, 𝑷𝑛 is a permutation

matrix and 𝒖 ∈ ℝ𝑛 is the input vector structured as

𝒖 = π’Ž | 𝒖𝑓 ,

where 𝒖𝑓 ∈ β„π‘›βˆ’π‘˜ contains the 𝑛 βˆ’ π‘˜ frozen bits.

β€’ The code rate is given by

𝑅 =π‘˜

𝑛

β€’ The encoding complexity is given by O(𝑛 π‘™π‘œπ‘”2𝑛), which involves savingarithmetic operations with the 𝑙-fold BCT.

β€’ It is common to employ other channel codes such as cyclic redundancycheck (CRC) codes to further enhance the performance of polar codes.

Example 3

Consider the BCT as the following Kronecker matrix

𝑻 =1 11 0

Show how this matrix can be used to produce a codeword with length 𝑛 = 4using bit reversal ordering via the matrix

π‘·πŸ’ =

1 00 0

0 01 0

0 10 0

0 00 1

Solution:

We can write for 𝑙 = 1 the inputs 𝑋1 and 𝑋2 as follows:

𝒄 = 𝑋1 𝑋2 = 𝒰1 𝒰21 11 0

= 𝒖 𝑻

For 𝑙 = 2 the inputs 𝑋1, 𝑋2, 𝑋3 and 𝑋4 constitute the codeword as follows:

𝒄 = 𝑋1 𝑋2 𝑋3 𝑋4 = 𝒰1 𝒰2𝒰3 𝒰4

1 11 0

1 11 0

1 11 0

0 00 0

= 𝒖 π‘»βŠ— 𝑰 = 𝒖𝑻 𝑻𝑻 𝟎

Using the bit-reversal matrix, we can write

𝒖 = 𝒖𝑷4and

𝒄 = 𝒖 π‘»βŠ— 𝑰 = 𝒖𝑷4𝑻 𝑻𝑻 𝟎

= 𝒖𝑷4π‘»βŠ™πŸ = 𝒖𝑻4

= 𝒰1 𝒰2𝒰3 𝒰4

1 11 1

1 10 0

1 01 0

1 00 0

Therefore, in general we have

𝑻𝑛 = π‘·π‘›π‘»βŠ™π’

Puncturing

β€’ Puncturing is important to adjust the codeword and/or the rate to therequirements of standards and applications.

β€’ Due to the nature of polar codes they have codewords that are powersof 2, i.e., 𝑛 = 2𝑙, which often requires puncturing of the frozen bits.

Encoding Puncturing

π’Ž 𝒄

1 Γ— π‘˜ 1 Γ— 𝑛 1 Γ— 𝑛’

Polarization,Frozen bits

𝒄′

𝑅 =π‘˜

𝑛𝑅′ =

π‘˜

𝑛′

Example 4

Encode the message π’Ž = 1 0 0 1 0 = 𝒰3 𝒰5𝒰6 𝒰7 𝒰8 using the frozen bits

𝒖𝒇 = 0 1 0 designated by the set β„± = 1,2,4 and the matrix

𝑻8 =

1 11 1

1 11 1

1 11 1

0 00 0

1 10 0

1 10 0

1 10 0

0 00 0

1 01 0

1 01 0

1 01 0

0 00 0

1 00 0

1 00 0

1 00 0

0 00 0

Consider also puncturing to produce a rate R =5

7polar code

Solution:

The codeword is given by

𝒄 = 𝒖 π‘»βŠ— 𝑰 = 𝒖𝑷8π‘»βŠ™πŸ‘ = 𝒖𝑻8

= 0 1 1 0 0 0 1 0

1 11 1

1 11 1

1 11 1

0 00 0

1 10 0

1 10 0

1 10 0

0 00 0

1 01 0

1 01 0

1 01 0

0 00 0

1 00 0

1 00 0

1 00 0

0 00 0

= 1 0 1 1 0 1 1 1

In order to puncture this code to obtain a rate R =5

7polar code, we need

to discard 1 frozen bit, which would result in

𝒄 = 0 1 1 0 1 1 1

C. Decoding

β€’ The most used decoding strategy is based on successive cancellation and isgiven by the following decisions:

ΰ·œπ‘’π‘˜ = α‰Šπ‘’π‘˜, 𝑖𝑓 π‘˜ ∈ β„±

πœ“π‘˜(𝒓 , ΰ·œπ‘’π‘˜βˆ’1 ) 𝑖𝑓 π‘˜ βˆ‰ β„±,

where the decoding functions is given by

πœ“π‘˜ 𝒓 , ΰ·œπ‘’π‘˜βˆ’1 = ࡞1, 𝑖𝑓 log

𝑃 𝒓 , ΰ·œπ‘’π‘˜βˆ’1 |π‘š = 1

𝑃 𝒓 , ΰ·œπ‘’π‘˜βˆ’1 |π‘š = βˆ’1β‰₯ 0

0 otherwise

Channel Decoding

𝒄 𝒓 ΰ·œπ’„, ΰ·π’Ž

1 Γ— 𝑛

Example 5

Compare the performance of an LDPC code and a polar code with

puncturing, 𝑛 = 1920 and rate 𝑅 =1

4.