
Computing and Communications
2. Channel Coding and Modulation

Zhiyong Chen

Institute of Wireless Communications Technology
Shanghai Jiao Tong University

Lecture 4 - Oct. 16, 2017

Information theory review: Channel coding

- Channel coding is an operation to achieve reliable communication over an unreliable channel. It has two parts:

  - An encoder that maps messages to codewords
  - A decoder that maps channel outputs back to messages

Information theory review: Block code

Given a channel W : X → Y, a block code with length N and rate R is such that

- the message set consists of the integers {1, ..., M = 2^{NR}},

- the codeword for each message m is a sequence x^N(m) of length N over X^N,

- the decoder operates on channel output blocks y^N over Y^N and produces an estimate m̂ of the transmitted message m,

- the performance is measured by the probability of frame (block) error, also called the frame error rate (FER), which is defined as

  P_e = Pr(m̂ ≠ m).  (1)

Additive White Gaussian Noise (AWGN) channel: Discrete-time (DT) AWGN channel

The input at time i is a real number x_i; the output is given by

  y_i = x_i + z_i  (2)

where the noise sequence {z_i} over the entire time frame is i.i.d. Gaussian ∼ N(0, σ²).

Additive White Gaussian Noise (AWGN) channel: Capacity of the DT-AWGN channel

If a block code {x^N(m) : 1 ≤ m ≤ M} is employed subject to a power constraint

  ∑_{i=1}^{N} x_i²(m) ≤ NP  (3)

the capacity is given by

  C = (1/2) log₂(1 + P/σ²) bits.  (4)
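As a quick numerical check of Eq. (4), here is a minimal Python sketch; the values of P and σ² are illustrative, not from the lecture:

```python
import math

def capacity_dt_awgn(P, sigma2):
    """DT-AWGN capacity, Eq. (4): C = (1/2) log2(1 + P / sigma^2) bits per use."""
    return 0.5 * math.log2(1 + P / sigma2)

# Illustrative values: P / sigma^2 = 10 (i.e. 10 dB) gives C ~ 1.73 bits/use.
print(capacity_dt_awgn(P=10.0, sigma2=1.0))
```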

Additive White Gaussian Noise (AWGN) channel: Continuous-time (CT) AWGN channel

This is a waveform channel whose output is given by

  y(t) = x(t) + w(t)  (5)

where x(t) is the channel input and w(t) is white Gaussian noise with power spectral density N₀/2.

Additive White Gaussian Noise (AWGN) channel: Capacity of the CT-AWGN channel

If signaling over the CT-AWGN channel is restricted to waveforms x(t) that are time-limited to [0, T], band-limited to [−W, W], and power-limited to P, i.e.,

  ∫₀^T x²(t) dt ≤ PT  (6)

then the capacity is given by

  C = W log₂(1 + P/(N₀W)) bits/sec.  (7)

Additive White Gaussian Noise (AWGN) channel: DT model for the CT-AWGN model

- By Nyquist theory, each use of the CT-AWGN channel with signals of duration T and bandwidth W gives rise to 2WT independent DT-AWGN channels.

- It is customary to use the DT channels in pairs of "in-phase" and "quadrature" components of a complex number.

- Accordingly, the capacity of the two-dimensional (2D) DT-AWGN channel derived from a CT-AWGN channel is given by

  C_2D = log₂(1 + E_s/N₀) bits/2D or bits/Hz  (8)

where E_s is the signal energy per 2D,

  E_s ≜ P/W J/2D or J/Hz.  (9)

Additive White Gaussian Noise (AWGN) channel: Signal-to-Noise Ratio

- Primary parameters in an AWGN channel are: signal bandwidth W (Hz), signal power P (Watt), noise power spectral density N₀/2 (Joule/Hz).

- Capacity equals C = W log₂(1 + P/(N₀W)).

- Define SNR ≜ P/(N₀W) to write C = W log₂(1 + SNR).

- Writing SNR = (P/2W)/(N₀/2), SNR can be interpreted as the signal energy per real dimension divided by the noise energy per real dimension.

- For 2D complex signaling, one may write SNR = (P/W)/N₀ and interpret SNR as the signal energy per 2D divided by the noise energy per 2D.

Additive White Gaussian Noise (AWGN) channel: Spectral efficiency ρ and data rate R

- ρ is defined as the number of bits per two dimensions sent over the AWGN channel. Units: bits/two-dimensions or b/2D.

- R is defined as the number of bits per second sent over the AWGN channel. Units: bits/sec or b/s.

- Since there are W (2D/s) two-dimensional symbols per second, we have

  R = ρW  (10)

- Since ρ = R/W, the units of ρ can also be expressed as b/s/Hz (bits per second per Hertz).

Additive White Gaussian Noise (AWGN) channel: Normalized SNR

- Shannon's law says that for reliable communication one must have

  ρ < log₂(1 + SNR)  (11)

or

  SNR > 2^ρ − 1  (12)

- This motivates the definition

  SNR_norm ≜ SNR / (2^ρ − 1)  (13)

- The Shannon limit now reads

  SNR_norm > 1 (0 dB)  (14)

- The value of SNR_norm (in dB) for an operational system measures the "gap to capacity", indicating how much room there is for improvement.
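To make the "gap to capacity" concrete, a small sketch computing SNR_norm from Eq. (13); the operating point (SNR = 20 dB, ρ = 6 b/2D) is a made-up example:

```python
import math

def snr_norm_db(snr_db, rho):
    """Normalized SNR in dB, Eq. (13): SNRnorm = SNR / (2^rho - 1).
    A positive result is the system's gap to capacity in dB."""
    snr = 10 ** (snr_db / 10)
    return 10 * math.log10(snr / (2 ** rho - 1))

# Example: a system at rho = 6 b/2D and SNR = 20 dB is ~2.0 dB from capacity.
print(snr_norm_db(snr_db=20.0, rho=6.0))
```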


Additive White Gaussian Noise (AWGN) channel: Another measure of signal-to-noise ratio: Eb/N0

- The energy per bit is defined as

  E_b ≜ E_s/ρ,  (15)

and the signal-to-noise ratio per information bit as

  E_b/N₀ = E_s/(ρN₀) = SNR/ρ.  (16)

- Shannon's limit can be written in terms of E_b/N₀ as

  E_b/N₀ > (2^ρ − 1)/ρ  (17)

- The function (2^ρ − 1)/ρ is an increasing function of ρ > 0 and, as ρ → 0, it approaches ln 2 ≈ 0.69 (−1.59 dB), which is called the ultimate Shannon limit on E_b/N₀.
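The limit (17) and its ρ → 0 behavior are easy to check numerically; this sketch only evaluates the formula from the slide:

```python
import math

def eb_n0_limit_db(rho):
    """Shannon limit on Eb/N0 in dB, Eq. (17): Eb/N0 > (2^rho - 1) / rho."""
    return 10 * math.log10((2 ** rho - 1) / rho)

# The limit increases with rho and approaches ln 2 (about -1.59 dB) as rho -> 0.
for rho in (2.0, 1.0, 0.1, 0.001):
    print(rho, round(eb_n0_limit_db(rho), 3))   # 1.761, 0.0, -1.441, -1.59
```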

Additive White Gaussian Noise (AWGN) channel: Power-limited and band-limited regimes

- Operation over an AWGN channel is classified as "power-limited" if SNR ≪ 1 and "band-limited" if SNR ≫ 1.

- The Shannon limit on the spectral efficiency can be approximated as

  ρ < log₂(1 + SNR) ≈ SNR log₂ e, when SNR ≪ 1,  (18)

  ρ < log₂(1 + SNR) ≈ log₂ SNR, when SNR ≫ 1.  (19)

- In the power-limited regime, the Shannon limit on ρ is doubled by doubling the SNR (a 3 dB increase), while in the band-limited regime, doubling the SNR increases the Shannon limit by only 1 b/2D.

Additive White Gaussian Noise (AWGN) channel: Band-limited regime

- Doubling the bandwidth almost doubles the capacity in the deep band-limited regime.

- Doubling the bandwidth has a small effect if the SNR is low (power-limited regime).

Additive White Gaussian Noise (AWGN) channel: Power-limited regime

- Doubling the SNR almost doubles the capacity in the deep power-limited regime.

- Doubling the SNR increases the capacity by not more than 1 b/2D in the band-limited regime.

Additive White Gaussian Noise (AWGN) channel: Coding and modulation

Additive White Gaussian Noise (AWGN) channel: Signal constellations

- An N-dimensional signal constellation of size M is a set A = {a₁, ..., a_M} ⊂ R^N, where each element a_j = (a_{j1}, ..., a_{jN}) ∈ R^N is called a signal point.

- The average energy of the constellation is defined as

  E(A) = (1/M) ∑_{j=1}^{M} ‖a_j‖² = (1/M) ∑_{j=1}^{M} ∑_{i=1}^{N} a_{ji}²  (20)

- The minimum squared distance d²_min(A) is defined as

  d²_min(A) = min_{i≠j} ‖a_i − a_j‖²  (21)

- The average number of nearest neighbors K_min(A) is defined as the number of nearest neighbors (at distance d_min(A)) of a signal point, averaged over the points of A.
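The three figures of merit above can be computed directly for any small constellation; a minimal sketch (the 4-QAM example reappears on a later slide):

```python
import itertools, math

def constellation_stats(A):
    """E(A), dmin and Kmin of a constellation A, per Eqs. (20)-(21):
    average energy, minimum distance, average number of nearest neighbors."""
    M = len(A)
    E = sum(sum(c * c for c in a) for a in A) / M                     # Eq. (20)
    d2min = min(sum((x - y) ** 2 for x, y in zip(ai, aj))
                for ai, aj in itertools.combinations(A, 2))           # Eq. (21)
    def nn(i):  # number of neighbors of A[i] at squared distance d2min
        return sum(1 for j in range(M) if j != i and math.isclose(
            sum((x - y) ** 2 for x, y in zip(A[i], A[j])), d2min))
    Kmin = sum(nn(i) for i in range(M)) / M
    return E, math.sqrt(d2min), Kmin

# 4-QAM with alpha = 1: E(A) = 2, dmin = 2, Kmin = 2.
print(constellation_stats([(-1, -1), (-1, 1), (1, -1), (1, 1)]))
```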

Additive White Gaussian Noise (AWGN) channel: Signal constellation parameters

Some important derived parameters for each constellation are:

- Bit rate (nominal spectral efficiency): ρ = (2/N) log₂ M b/2D

- Average energy per two dimensions: E_s = (2/N) E(A)

- Average energy per bit: E_b = E(A)/log₂ M = E_s/ρ

- Energy-normalized figures of merit such as d²_min(A)/E(A), d²_min(A)/E_s, or d²_min(A)/E_b, which are independent of scale.

Additive White Gaussian Noise (AWGN) channel: Uncoded 2-PAM

- A = {−α, +α}, N = 1, M = 2, ρ = 2 b/2D

- E(A) = α², E_s = 2α², E_b = α²

- SNR = E_s/N₀ = 2α²/N₀, SNR_norm = SNR/3

- d_min = 2α, K_min = 1, d²_min/E_s = 2

- Probability of bit error:

  P_b(E) = Q(√SNR) = ∫_{√SNR}^{∞} (1/√(2π)) exp(−µ²/2) dµ  (22)

Additive White Gaussian Noise (AWGN) channel: Uncoded 2-PAM

- Spectral efficiency: ρ = 2 b/2D
- Shannon limit: E_b/N₀ > (2^ρ − 1)/ρ = 3/2 (1.76 dB)
- Target P_b(E) = 10⁻⁵ is achieved at E_b/N₀ = 9.6 dB
- Potential coding gain: 9.6 − 1.76 = 7.84 dB
- Ultimate coding gain: 9.6 − (−1.59) ≈ 11.2 dB, with ρ → 0
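The 9.6 dB operating point can be reproduced from Eq. (22); a minimal sketch using the standard erfc-based form of Q:

```python
from math import erfc, sqrt

def Q(x):
    """Gaussian tail function of Eq. (22), via Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2.0))

def pb_2pam(eb_n0_db):
    """Bit error probability of uncoded 2-PAM: Pb = Q(sqrt(2 Eb/N0))."""
    eb_n0 = 10 ** (eb_n0_db / 10)
    return Q(sqrt(2 * eb_n0))

print(pb_2pam(9.6))   # ~1e-5, the operating point quoted above
```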

Additive White Gaussian Noise (AWGN) channel: Uncoded M-PAM

- Signal set: A = α{±1, ±3, ..., ±(M−1)}

- Parameters:

  - ρ = 2 log₂ M b/2D
  - E(A) = α²(M² − 1)/3 J/D
  - E_s = 2E(A) = 2α²(M² − 1)/3
  - SNR = E_s/N₀ = 2α²(M² − 1)/(3N₀)
  - SNR_norm = SNR/(2^ρ − 1) = 2α²/(3N₀)

- The probability of symbol error P_s(E) is given by

  P_s(E) = (2(M − 1)/M) Q(α/σ) ≈ 2Q(α/σ) = 2Q(√(3 SNR_norm))  (23)

where σ = √(N₀/2).

Additive White Gaussian Noise (AWGN) channel: Uncoded M-PAM Performance

- This curve is valid for any M-PAM with M ≫ 1.
- Target P_s(E) = 10⁻⁵ is achieved at SNR_norm ≈ 8.1 dB.
- The Shannon limit is SNR_norm = 0 dB.
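Likewise, the 8.1 dB figure follows from the approximation (23); a sketch:

```python
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

def ps_mpam(snr_norm_db):
    """M-PAM symbol error rate for M >> 1, Eq. (23): Ps ~ 2 Q(sqrt(3 SNRnorm))."""
    snr_norm = 10 ** (snr_norm_db / 10)
    return 2 * Q(sqrt(3 * snr_norm))

print(ps_mpam(8.1))   # ~1e-5, matching the quoted operating point
```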

Additive White Gaussian Noise (AWGN) channel: Uncoded 4-QAM

- Signal set: A = {(−α,−α), (−α,α), (α,−α), (α,α)}

- Parameters:

  - N = 2, M = 4, ρ = 2 b/2D, E(A) = 2α²
  - E_s = 2α², E_b = α²
  - d_min = 2α
  - K_min = 2
  - d²_min/E_s = 2

Additive White Gaussian Noise (AWGN) channel: Uncoded M × M-QAM

- The signal set is A = A_{M-PAM} × A_{M-PAM}

- Parameters:

  - ρ = log₂ M² = 2 log₂ M b/2D
  - E(A) = 2α²(M² − 1)/3 J/2D
  - E_s = E(A) = 2α²(M² − 1)/3
  - SNR = E_s/N₀ = 2α²(M² − 1)/(3N₀)
  - SNR_norm = SNR/(2^ρ − 1) = 2α²/(3N₀)

- The probability of symbol error P_s(E) is given by

  P_s(E) ≈ 4Q(√(3 SNR_norm))  (24)

Additive White Gaussian Noise (AWGN) channel: Uncoded QAM Performance

- This curve is valid for any M × M-QAM with M ≫ 1.
- Target P_s(E) = 10⁻⁵ is achieved at SNR_norm ≈ 8.4 dB.
- Gap to the Shannon limit: 8.4 dB.

Additive White Gaussian Noise (AWGN) channel: Cartesian product constellations

- Given a constellation A, define a new constellation A′ as the K-th Cartesian power of A:

  A′ = A^K = A × A × ··· × A (K times)  (25)

- E.g., 4-QAM is the second Cartesian power of 2-PAM.

- N′ = KN, M′ = M^K

- E(A′) = KE(A)

- K′_min = K·K_min

- E′_s = E_s, E′_b = E_b

- d′_min = d_min

- ρ′ = ρ

Additive White Gaussian Noise (AWGN) channel: MAP and ML decision rules

- Consider transmission over an AWGN channel using a constellation A = {a₁, ..., a_M}. Suppose in each use of the system a signal a_j ∈ A is selected with probability p(a_j) and sent over the channel.

- Given the channel output y, the receiver needs to make a decision â on which signal point a was sent. There are various decision rules.

- The Maximum A-Posteriori Probability (MAP) rule sets

  â_MAP = arg max_{a∈A} p(a|y) = arg max_{a∈A} p(a) p(y|a)/p(y)  (26)

- The Maximum Likelihood (ML) rule sets

  â_ML = arg max_{a∈A} p(y|a)  (27)

- The ML and MAP rules are equivalent for the important special case where p(a_j) = 1/M for all j.

Additive White Gaussian Noise (AWGN) channel: Minimum Distance decision rule

- Given an observation y, the Minimum Distance (MD) decision rule sets

  â_MD = arg min_{a∈A} ‖y − a‖  (28)

- On an AWGN channel the ML rule is equivalent to the MD rule. This is because on an AWGN channel, with input-output relation y = a + n, the transition probability density is given by

  p(y|a) = (1/(πN₀)^{N/2}) exp{−‖y − a‖²/N₀}  (29)

Thus, the ML rule â_ML = arg max_{a∈A} p(y|a) simplifies to

  â_ML = arg min_{a∈A} ‖y − a‖  (30)
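Since ML reduces to MD on the AWGN channel, a decoder only needs a nearest-point search; a minimal sketch of Eq. (28) (the noisy observation is a made-up example):

```python
import numpy as np

def md_decide(y, A):
    """Minimum-distance decision, Eq. (28): the point of A closest to y."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    return A[np.argmin(np.sum((A - y) ** 2, axis=1))]

# 4-QAM example: the observation (0.8, -1.3) decodes to (1, -1).
print(md_decide((0.8, -1.3), [(-1, -1), (-1, 1), (1, -1), (1, 1)]))
```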

Additive White Gaussian Noise (AWGN) channel: Decision regions

- Consider a decision rule for a given N-dimensional constellation A of size M. Let R_j ⊂ R^N be the set of observation points y ∈ R^N that are decided as a_j.

- For a complete decision rule, the decision regions partition the observation space:

  R^N = ⋃_{j=1}^{M} R_j,  R_j ∩ R_i = ∅ for i ≠ j  (31)

- Conversely, any partition of R^N into M regions defines a decision rule for an N-dimensional signal constellation of size M.

Additive White Gaussian Noise (AWGN) channel: Probability of decision error

- Let E be the decision error event. For a receiver with decision regions R_j, the conditional probability of E given that a_j is sent is given by

  Pr(E|a_j) = Pr(y ∉ R_j | a_j)  (32)

while the average probability of error equals

  Pr(E) = ∑_{j=1}^{M} Pr(a_j) Pr(E|a_j)  (33)

- The MAP rule minimizes Pr(E).

Additive White Gaussian Noise (AWGN) channel: Decision regions under the MD decision rule

- Under the MD decision rule, the decision regions are given by

  R_j = {y ∈ R^N : ‖y − a_j‖² ≤ ‖y − a_i‖² for all i ≠ j}  (34)

- The regions R_j are also called the Voronoi regions.

- Each region R_j is the intersection of M − 1 pairwise decision regions R_ji defined as

  R_ji = {y ∈ R^N : ‖y − a_j‖² ≤ ‖y − a_i‖²}  (35)

In other words, R_j = ⋂_{i≠j} R_ji.

Additive White Gaussian Noise (AWGN) channel: Probability of error under the MD rule on AWGN

- Under any rigid motion (translation or rotation) of a constellation A, the Voronoi regions move in the same way.

- Under the MD decision rule, on an AWGN channel we have

  Pr(E|a_j) = 1 − ∫_{R_j} p(y|a_j) dy = 1 − ∫_{R_j − a_j} p_N(n) dn  (36)

This probability of error is invariant under rigid motions. (The proof is left as an exercise. Is this true for any additive noise?)

- Likewise, Pr(E) is invariant under rigid motions.

- If the mean m = (1/M) ∑_j a_j of a constellation A is not zero, we may translate it by −m to reduce the mean energy from E(A) to E(A) − ‖m‖² without changing Pr(E).

Additive White Gaussian Noise (AWGN) channel: Probability of decision error for some constellations

- For 2-PAM,

  Pr(E|a_j) = Q(√(2E_b/N₀))  (37)

where Q(x) = ∫_x^∞ (1/√(2π)) exp(−µ²/2) dµ.

- For 4-QAM,

  Pr(E|a_j) = 1 − (1 − Q(√(2E_b/N₀)))² ≈ 2Q(√(2E_b/N₀))  (38)

- One can express the exact error probabilities for M-PAM and (M × M)-QAM in terms of the Q function. (Exercise)

- However, for general constellations it becomes impractical to determine the exact error probability. Often one uses bounds and approximations instead of the exact forms.

Additive White Gaussian Noise (AWGN) channel: Pairwise error probabilities

We consider MD decision rules and AWGN channels here.

- The pairwise error probability Pr(a_j → a_i) is defined as the probability that, conditional on a_j being transmitted, the received point y is closer to a_i than to a_j. In other words,

  Pr(a_j → a_i) = Pr(‖y − a_i‖ ≤ ‖y − a_j‖ | a_j)  (39)

- Recalling the pairwise decision regions

  R_ji = {y ∈ R^N : ‖y − a_j‖² ≤ ‖y − a_i‖²}  (40)

it can be shown that

  Pr(a_j → a_i) = (1/√(πN₀)) ∫_{d(a_i,a_j)/2}^{∞} exp(−x²/N₀) dx = Q(d(a_i, a_j)/√(2N₀))  (41)

where d(a_i, a_j) = ‖a_i − a_j‖.

Additive White Gaussian Noise (AWGN) channel: The union bound

- The conditional probability of error is bounded (under the MD decision rule on an AWGN channel) as

  Pr(E|a_j) ≤ ∑_{i≠j} Pr(a_j → a_i) = ∑_{i≠j} Q(‖a_i − a_j‖/√(2N₀))  (42)

- This leads to

  Pr(E) ≤ (1/M) ∑_{j=1}^{M} ∑_{i≠j} Q(‖a_i − a_j‖/√(2N₀))  (43)

- One may also use the approximation

  Pr(E) ≈ K_min(A) Q(d_min(A)/√(2N₀))  (44)

- The union bound is tight at sufficiently high SNR.
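Both the union bound (43) and the nearest-neighbor approximation (44) are easy to evaluate for small constellations; a sketch using 4-QAM with α = 1 and an illustrative N₀:

```python
import numpy as np
from math import erfc, sqrt

def Q(x):
    return 0.5 * erfc(x / sqrt(2.0))

def union_bound(A, N0):
    """Union bound on Pr(E) under MD decoding on AWGN, Eq. (43)."""
    A = np.asarray(A, dtype=float)
    return sum(Q(np.linalg.norm(ai - aj) / sqrt(2 * N0))
               for j, aj in enumerate(A)
               for i, ai in enumerate(A) if i != j) / len(A)

def nn_approx(dmin, Kmin, N0):
    """Nearest-neighbor approximation, Eq. (44): Kmin * Q(dmin / sqrt(2 N0))."""
    return Kmin * Q(dmin / sqrt(2 * N0))

# 4-QAM, alpha = 1 (dmin = 2, Kmin = 2), illustrative N0 = 0.2:
A = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
print(union_bound(A, 0.2), nn_approx(2.0, 2.0, 0.2))  # bound slightly above approx
```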

Algebraic coding: Motivation for coding

- In general, channel transmission produces errors, and the choice of decoding criterion alone does not eliminate them. How, then, can the error probability be reduced? Coding is used to reduce the error probability.

- For example, p = 0.01 for a BSC → P_E = 0.01 = 10⁻²

Algebraic coding: Motivation for coding

- How can we improve the accuracy of channel transmission? We can try the following method.

- Map 0 → 000 and 1 → 111, i.e., use the 3rd extension of the BSC. Writing p̄ = 1 − p, the transition matrix (inputs 000 and 111; outputs 000, 001, ..., 111) is

  P = [ p̄³  p̄²p  p̄²p  p̄p²  p̄²p  p̄p²  p̄p²  p³
        p³  p̄p²  p̄p²  p̄²p  p̄p²  p̄²p  p̄²p  p̄³ ]

- Based on MLD, we have the following decoding function:

  F(000) = 000, F(001) = 000, F(010) = 000, F(011) = 111,
  F(100) = 000, F(101) = 111, F(110) = 111, F(111) = 111

- As a result, P_E = 3 × 10⁻⁴.

- This decoding is majority decoding: if the received sequence contains more 0s than 1s, decide 0; if it contains more 1s than 0s, decide 1.

- The error probability is reduced by roughly two orders of magnitude, and this code can correct any single bit error in a codeword. If the repetition is increased further, the error rate can be reduced further.
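The stated P_E follows by counting the error patterns that defeat majority decoding (two or three flips); a one-line check:

```python
def pe_repetition3(p):
    """Block error probability of the 3-repetition code over a BSC(p)
    under majority (ML) decoding: 3 p^2 (1 - p) + p^3."""
    return 3 * p**2 * (1 - p) + p**3

print(pe_repetition3(0.01))   # ~2.98e-4, i.e. the 3e-4 quoted above
```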

Algebraic coding: Motivation for coding

- But there is a new problem: when n is large, the rate of information transmission is reduced considerably:

  - M = 2, n = 1, R = 1
  - n = 3, R = 1/3
  - n = 5, R = 1/5
  - ···

Algebraic coding: Motivation for coding

- This is clearly a contradiction; is there no solution to it?

- Shannon's second theorem (the noisy channel coding theorem) says there is.

- So far we have used only M = 2 codewords, which is why extension decreases R.

- For the n-extended BSC, R and P_E both increase with M.

- M = 4, e.g., with 000, 011, 101, 110 as the information set, gives P_E = 2 × 10⁻² and R = 2/3.

- However, there is another problem: for M = 4 there are 70 possible choices of codewords.

- Different choices give different P_E, e.g.,

  Method 1: 000, 011, 101, 110
  Method 2: 000, 001, 010, 100

- Method 1 gives P_E = 2 × 10⁻², Method 2 gives P_E = 2.28 × 10⁻².

- Method 1 is the better choice. With Method 1, a single bit error (e.g., in 000) can be detected, since it never produces another legal codeword. With Method 2, a single error in 000 can turn it into another legal codeword, so we cannot even tell that an error occurred.

Algebraic coding: Motivation for coding

- We can see that with the second method the codewords are too "similar", i.e., too close to each other.

- Another example: M = 4, n = 5, so R = 2/5.

- Input sequence: a_i = (a_{i1}, a_{i2}, a_{i3}, a_{i4}, a_{i5}), where a_{i1}, a_{i2} carry the information and the other three symbols are parity checks, e.g. a_{i3} = a_{i1} ⊕ a_{i2}, a_{i4} = a_{i1}, a_{i5} = a_{i2}.

- Based on MLD, this code can correct any single bit error in a codeword, and also corrects two of the two-bit error patterns.

- P_E = 7.8 × 10⁻⁴

Algebraic coding: Motivation for coding

- Objective: introduce the rationale for coding and discuss some important algebraic codes.

- Topics:

  - Why coding?
  - Some important algebraic codes:
    - Reed-Muller codes
    - Reed-Solomon codes
    - BCH codes

Algebraic coding: Motivation for coding

- Simple constellations such as PAM and QAM are far from delivering Shannon's promise. They have a large gap to the Shannon limit.

- Signaling schemes such as orthogonal, bi-orthogonal, and simplex signaling achieve Shannon capacity when one can expand the bandwidth indefinitely; however, after a certain point they become impractical both in terms of complexity per bit and in terms of bandwidth.

- Shannon's proof shows that in the power-limited regime, the key to achieving capacity is to begin with a simple 1D or 2D constellation A, consider Cartesian powers A^N of increasingly high order, and select a subset A′ ⊂ A^N to improve the minimum distance of the constellation at the expense of spectral efficiency.

Algebraic coding: Coding and Modulation

- Design codes in a finite field F, taking advantage of the algebraic structure to simplify encoding and decoding.

- Algebraic codes typically map a binary data sequence u^K ∈ F₂^K into a codeword x^N ∈ F_{2^m}^N for some m ≥ 1.

- Modulation maps F_{2^m} into a signal set A ⊂ R^n for some n ≥ 1 (typically n = 1, 2).

- For example, if A = {−α, α}, one may use the mapping 0 → α and 1 → −α.

Algebraic coding: Spectral efficiency with coding and modulation

- For a typical 2D signal set A ⊂ R² (such as a QAM scheme) and a binary code of rate K/N, the spectral efficiency is

  ρ = (log₂ |A|)(K/N) b/2D  (45)

- Thus, coding reduces the spectral efficiency of the uncoded constellation by a factor of K/N.

- It is hoped that coding will make up for the deficit in spectral efficiency by improving the distance profile of the signal set.

- Goal: design codes that have large minimum Hamming distance in F₂^N (Hamming metric) and modulate them to have correspondingly large Euclidean distances.

Algebraic coding: Binary block codes

Definition. A binary block code of length n is any subset C ⊆ {0, 1}^n of the set of all binary n-tuples.

Definition. A code C is called linear if C is a subspace of the vector space F₂^n.

Algebraic coding: Generators of a binary linear block code

- Let C ⊂ F₂^n be a binary linear code. Since C is a vector space, it has a dimension k, and there exists a set of basis vectors G = {g₁, ..., g_k} that generate C in the sense that

  C = {∑_{j=1}^{k} a_j g_j : a_j ∈ F₂, 1 ≤ j ≤ k}.  (46)

- Such a code C is called an (n, k) binary linear code. The set G is called the set of generators of C.

- An encoder for a code C with generators G can be implemented as a matrix multiplication x = aG, where G is the generator matrix whose i-th row is g_i, a ∈ F₂^k is the information word, and x is the codeword.
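Encoding x = aG is a mod-2 matrix product; a minimal sketch (the (5,2) generator matrix below is a hypothetical example matching the parity checks sketched earlier, not taken from the slides):

```python
import numpy as np

def encode(a, G):
    """Encode the information word a as x = aG over F2."""
    return np.mod(np.asarray(a) @ np.asarray(G), 2)

G_rep = np.array([[1, 1, 1]])          # (3,1) repetition code
print(encode([1], G_rep))              # -> [1 1 1]

G52 = np.array([[1, 0, 1, 1, 0],       # hypothetical (5,2) generator matrix
                [0, 1, 1, 0, 1]])
print(encode([1, 1], G52))             # -> [1 1 0 1 1]
```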

Algebraic coding: The Hamming weight

Definition. For x ∈ F₂^n, the Hamming weight of x is defined as

  w_H(x) = number of ones in x  (47)

The Hamming weight has the following properties:

- Non-negativity: w_H(x) ≥ 0, with equality iff x = 0.

- Symmetry: w_H(x) = w_H(−x).

- Triangle inequality: w_H(x + y) ≤ w_H(x) + w_H(y).

Algebraic coding: The Hamming distance

Definition. For x, y ∈ F₂^n, the Hamming distance between x and y is defined as

  d_H(x, y) = w_H(x − y)  (48)

The Hamming distance has the following properties for any x, y, z ∈ F₂^n:

- Non-negativity: d_H(x, y) ≥ 0, with equality iff x = y.

- Symmetry: d_H(x, y) = d_H(y, x).

- Triangle inequality: d_H(x, y) ≤ d_H(x, z) + d_H(z, y).

Thus, the Hamming distance is a metric in the mathematical sense of the word, and the space F₂^n with this metric is called the Hamming space.
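Both quantities of Eqs. (47)-(48) take a few lines in Python; a sketch over 0/1 integer lists:

```python
def w_hamming(x):
    """Hamming weight, Eq. (47): number of ones in the binary vector x."""
    return sum(x)

def d_hamming(x, y):
    """Hamming distance, Eq. (48): weight of the mod-2 difference of x and y."""
    return sum((xi + yi) % 2 for xi, yi in zip(x, y))

print(w_hamming([1, 0, 1, 1]))           # 3
print(d_hamming([1, 0, 1], [0, 0, 1]))   # 1
```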

Algebraic coding: Distance invariance

Theorem. The set of Hamming distances d_H(x, y) from any codeword x ∈ C to all codewords y ∈ C is independent of x, and is equal to the set of Hamming weights w_H(y) of all codewords y ∈ C.

Proof. The set of distances from x is {d_H(x, y) : y ∈ C} = {w_H(x + y) : y ∈ C}, which is the set of weights of the codewords in x + C. But x + C = C for a linear code, so this set equals {w_H(y) : y ∈ C}, which proves the claim.

Algebraic coding: Minimum distance

Definition. The minimum distance d of a code C is defined as the minimum of d_H(x, y) over all x, y ∈ C with x ≠ y.

- For a linear code, the minimum distance d equals the minimum of w_H(x) over all non-zero codewords x ∈ C.

- We refer to an (n, k) code with minimum distance d as an (n, k, d) code.

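By distance invariance, d of a linear code can be found by enumerating the nonzero codewords and taking the minimum weight; a brute-force sketch (reusing the hypothetical (5,2) generator from the encoding example):

```python
import itertools
import numpy as np

def dmin_linear(G):
    """Minimum distance of a binary linear code = minimum Hamming weight
    over all nonzero codewords aG, a in F2^k (distance invariance)."""
    G = np.asarray(G)
    k = G.shape[0]
    return min(int(np.mod(np.array(a) @ G, 2).sum())
               for a in itertools.product((0, 1), repeat=k) if any(a))

G52 = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]
print(dmin_linear(G52))   # -> 3, so this is a (5, 2, 3) code
```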

Algebraic coding: The Hamming distance

- The larger d_min is, the smaller P_E is.

Algebraic coding: Euclidean Images of Binary Codes

Algebraic coding: Minimum distances

Algebraic coding: Nominal coding gain, union bound

Algebraic coding: Decision rules

Algebraic coding: Hard-decision decoding

Algebraic coding: Performance of some early codes