Abelian Group Codes for Channel Coding and
Source Coding
Aria G. Sahebi and S. Sandeep Pradhan
Department of Electrical Engineering and Computer Science,
University of Michigan, Ann Arbor, MI 48109, USA. Email: [email protected], [email protected]
Abstract
In this paper, we study the asymptotic performance of Abelian group codes for the channel coding
problem for arbitrary discrete (finite alphabet) memoryless channels as well as the lossy source coding
problem for arbitrary discrete (finite alphabet) memoryless sources. For the channel coding problem,
we find the capacity characterized in a single-letter information-theoretic form. This simplifies to the
symmetric capacity of the channel when the underlying group is a field. For the source coding problem, we
derive the achievable rate-distortion function that is characterized in a single-letter information-theoretic
form. When the underlying group is a field, it simplifies to the symmetric rate-distortion function. We give
several illustrative examples. Due to the non-symmetric nature of the sources and channels considered,
our analysis uses a synergy of information-theoretic and group-theoretic tools.
I. INTRODUCTION
Approaching the information-theoretic performance limits of communication using structured codes
has been of great interest for the last several decades [1], [7], [14], [15]. The earlier attempts to design
computationally efficient encoding and decoding algorithms for point-to-point communication (both
channel coding and source coding) resulted in injection of finite field structures to the coding schemes
[12]. In the channel coding problem [24], the channel input alphabets are matched to algebraic structure
and encoders are represented by matrices. Similarly in source coding problem [18], the reconstruction
alphabets are matched to algebraic structure and decoders are represented by matrices. Later these coding
This work was supported by NSF grant CCF-1116021. This work was presented in part at the IEEE International Symposium on
Information Theory (ISIT), July 2011, and the Allerton Conference on Communication, Control and Computing, October 2012.
approaches were extended to weaker algebraic structures such as rings and groups [2], [3], [11], [19],
[20].¹ The motivation for this is twofold: a) finite fields exist only for alphabets with size equal to a
prime power, and b) for communication under certain constraints, codes with weaker algebraic structures
have better properties. For example, when communicating over an additive white Gaussian noise channel
with 8-PSK constellation, codes over Z8, the cyclic group of size 8, are preferable to binary linear
codes because the structure of the code is matched to the structure of the signal set [10] (also see [3]),
and hence the former have superior error correcting properties. As another example, construction of polar
codes over alphabets of size pr, for r > 1 and p prime, is simpler with a module structure rather than
a vector space structure [29], [32], [33]. Subsequently, as interest in network information theory grew,
these codes were used to approach the information-theoretic performance limits of certain special cases
of multi-terminal communication problems [4], [17], [31], [36], [37]. These limits were obtained earlier
using the random coding ensembles in the information theory literature.
In 1979, Korner and Marton, in a significant departure from tradition, showed that for a binary
distributed source coding problem, the asymptotic average performance of binary linear code ensembles
can be superior to that of the standard random coding ensembles. Although structured codes were
being used in communication mainly for computational complexity reasons, the duo showed that, in
contrast, even when computational complexity is not an issue, the use of structured codes leads to superior
asymptotic performance limits in multi-terminal communication problems. In the recent past, such gains
were shown for a wide class of problems [5], [23], [25], [30]. In our prior work, we developed an
inner bound to the optimal rate-distortion region for the distributed source coding problem [23] in which
cyclic group codes were used as building blocks in the coding schemes. Similar coding approaches were
applied for the interference channel and the broadcast channel in [26], [27]. The motivation for studying
Abelian group codes beyond the non-existence of finite fields over arbitrary alphabets is the following. The
algebraic structure of the code imposes certain restrictions on the performance. For certain communication
problems, linear codes were shown to be not optimal [23], and Abelian group codes exhibit a superior
performance. For example, consider a distributed source coding problem with two statistically correlated
but individually uniform quaternary sources X and Y that are related via the relation Y = −X + Z,
where + denotes addition modulo-4 and Z is a hidden quaternary random variable that has a non-uniform distribution and is independent of X. The joint decoder wishes to reconstruct Z losslessly. In
this problem, random codes over Z4 perform better than random linear codes over the Galois field of size
¹Note that this is an incomplete list. There is a vast body of work on group codes. See [12] for a more complete bibliography.
4. In summary, the main reason for using algebraic structured codes in this context is performance rather
than complexity of encoding and decoding. Hence information-theoretic characterizations of asymptotic
performance of Abelian group code ensembles for various communication problems and under various
decoding constraints became important.
Such performance limits have been characterized in certain special cases. It is well-known that binary
linear codes achieve the capacity of binary symmetric channels [16]. More generally, it has also been
shown that q-ary linear codes can achieve the capacity of symmetric channels [14] and linear codes can
be used to compress a source losslessly down to its entropy [22]. Goblick [1] showed that binary linear
codes achieve the rate-distortion function of binary uniform sources with Hamming distortion criterion.
Group codes were first studied by Slepian [34] for the Gaussian channel. In [6], the capacity of group
codes for certain classes of channels has been computed. Further results on the capacity of group codes
were established in [7], [8]. The capacity of group codes over a class of channels exhibiting symmetries
with respect to the action of a finite Abelian group has been investigated in [11].
In this work, we focus on two problems. In the first part, we consider the channel coding problem
for arbitrary discrete memoryless channels. We assume that the channel input alphabet is equipped with
the structure of a finite Abelian group G. We provide an information-theoretic characterization of the
capacity of such channels achievable using group codes which are cosets of subgroups of Gn, where n
denotes the block length of encoding which is arbitrarily large. This performance limit is equal to the
symmetric capacity of the channel when the underlying group is a field; i.e., it is equal to the Shannon
mutual information between the channel input and the channel output when the channel input is uniformly
distributed. In the general case, additional constraints corresponding to subgroups of the underlying group
appear in the characterization, and the achievable rate can be smaller than the symmetric capacity of the
channel.
In the second part, we consider the lossy source coding problem for arbitrary discrete memoryless sources
with single-letter distortion measures and the reconstruction alphabet being equipped with the structure
of a finite Abelian group G. We provide an information-theoretic characterization of the rate-distortion
function achievable using group codes which are cosets of subgroups of Gn. The performance limit
is equal to the symmetric rate-distortion function of the source when the underlying group is a field;
i.e., the Shannon rate-distortion function with the additional constraint that the reconstruction variable is
uniformly distributed. For the general case, as in channel coding, additional constraints corresponding
to subgroups of the underlying group appear in the characterization, and this can result in a larger rate
compared to the symmetric rate for a given distortion level.
We use joint typicality encoding and decoding [13] for both problems at hand, which makes the
analysis more tractable. In this approach we use a synergy of information-theoretic and group-theoretic
tools. The traditional approaches have looked at encoding and decoding of structured codes based on
either minimum-distance or maximum-likelihood decoding. However, unlike the traditional approaches,
the approach based on joint typicality does not provide insight into error exponents.
The paper is organized as follows: In Section II, some definitions and basic facts are stated which are
used in the paper. In Section III, we give two examples of multi-terminal communication problems where
the performance of group code ensembles is strictly superior to that of unstructured code ensembles. In
Section IV, we introduce the ensemble of Abelian group codes used in the paper. In Section V, we
state the main results of the paper for both the source coding problem as well as the channel coding
problem. In Section VI, we prove the converse results for both problems and, in Section VII, we prove
the achievability results. We conclude in Section VIII.
II. PRELIMINARIES
1) Channel Model: We consider discrete memoryless channels used without feedback. We associate
two finite sets X and Y with the channel as the channel input and output alphabets. The input-output
relation of the channel is characterized by a conditional probability law WY |X(y|x) for x ∈ X and y ∈ Y .
The channel is specified by (X ,Y,WY |X).
2) Source Model: The source is modeled as a discrete-time memoryless random process X with
each sample taking values from a finite set X , called the alphabet, according to the distribution PX. The
reconstruction alphabet is denoted by a finite set U and the quality of reconstruction is measured by a
bounded single-letter distortion function d : X × U → R+. We denote this source by (X ,U , PX , d).
3) Groups: All groups referred to in this paper are finite Abelian groups. Given a group (G,+), a
subgroup H of G is denoted by H ≤ G. A coset C of a subgroup H is a shift of H by an arbitrary
element a ∈ G (i.e., C = a + H for some a ∈ G). For a subgroup H of G, the number of cosets of
H in G is called the index of H in G and is denoted by |G : H|. The index of H in G is equal to
|G|/|H| where |G| and |H| are the cardinality or size of G and H respectively. For a prime p dividing
the cardinality of G, the Sylow-p subgroup of G is the largest subgroup of G whose cardinality is a
power of p. Group isomorphism is denoted by ∼=.
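As a concrete illustration of these definitions, the following sketch (our own, not part of the paper) computes cosets, the index, and Sylow subgroups inside the cyclic group Z_12:

```python
# Illustrative sketch (not from the paper): cosets, index |G : H|, and
# Sylow-p subgroups inside the cyclic group Z_12, using plain Python sets.

n = 12
G = set(range(n))

def subgroup_generated(g, n):
    """Cyclic subgroup of Z_n generated by g."""
    H, x = set(), 0
    while True:
        H.add(x)
        x = (x + g) % n
        if x == 0:
            return H

H = subgroup_generated(3, n)                    # H = {0, 3, 6, 9}
cosets = {frozenset((a + h) % n for h in H) for a in G}
index = len(cosets)                             # |G : H| = |G| / |H|
assert index == len(G) // len(H) == 3

def sylow(p, n):
    """Sylow-p subgroup of Z_n: the subgroup whose order is the largest power of p dividing n."""
    pk = 1
    while n % (pk * p) == 0:
        pk *= p
    return subgroup_generated(n // pk, n)       # generated by n / p^k

assert sylow(2, 12) == {0, 3, 6, 9}             # order 4, isomorphic to Z_4
assert sylow(3, 12) == {0, 4, 8}                # order 3, isomorphic to Z_3
```

Since 12 = 4 · 3, the two Sylow subgroups recover the decomposition Z_12 ≅ Z_4 ⊕ Z_3 of the kind used throughout Section IV.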
4) Group Codes: Given a group G, a group code C over G with block length n is any subgroup of
Gn. A shifted group code over G, C+B is a translation of a group code C by a fixed vector B ∈ Gn.
Group codes generalize the notion of linear codes over fields to sources with reconstruction alphabets
(and channels with input alphabets) having composite sizes.
5) Achievability for Channel Coding: For a group G, a group transmission system with parameters
(n,Θ, τ) for reliable communication over a given channel (X = G,Y,WY |X) consists of a codebook,
an encoding mapping and a decoding mapping. The codebook C is a shifted subgroup of Gn whose size
is equal to Θ and the mappings are defined as
$$\mathrm{Enc}: \{1, 2, \cdots, \Theta\} \to \mathcal{C}, \qquad \mathrm{Dec}: \mathcal{Y}^n \to \{1, 2, \cdots, \Theta\}$$
such that
$$\max_{1 \le m \le \Theta} \sum_{x \in \mathcal{X}^n} 1_{\{x = \mathrm{Enc}(m)\}} \sum_{y \in \mathcal{Y}^n} 1_{\{m \ne \mathrm{Dec}(y)\}} W^n(y|x) \le \tau$$
Given a channel (X = G,Y,WY |X), a rate R is said to be achievable using group codes if for all ε > 0
and for all sufficiently large n, there exists a group transmission system for reliable communication with
parameters (n,Θ,τ) such that
$$\frac{1}{n}\log\Theta \ge R - \epsilon, \qquad \tau \le \epsilon$$
The group capacity C of the channel is defined as the supremum of the set of all achievable rates using
group codes.
6) Achievability for Source Coding and the Rate-Distortion Function: For a group G, a group transmission system with parameters (n,Θ,∆) for compressing a given source (X ,U = G,PX , d) consists
of a codebook, an encoding mapping and a decoding mapping. The codebook C is a shifted subgroup
of G^n whose size is equal to Θ and the mappings are defined as
$$\mathrm{Enc}: \mathcal{X}^n \to \{1, 2, \ldots, \Theta\}, \qquad \mathrm{Dec}: \{1, 2, \ldots, \Theta\} \to \mathcal{C}$$
such that E[d(X^n, U^n)] ≤ ∆, where X^n is the random vector of length n generated by the source,
and U^n is the reconstruction vector given by U^n = Dec(Enc(X^n)). In this transmission system,
n denotes the block length, log Θ denotes the number of bits needed to index the codebook, ∆ denotes
the distortion level, and the distortion d(x, u) between two vectors x and u is assumed to be the average
of the single-letter distortions between the components x_i and u_i. Let Θ_i denote the size of the range of the ith component of
the shifted group code. Given a source (X ,U = G,PX , d), a pair of non-negative real numbers (R,D)
is said to be achievable using group codes if for every ε > 0 and for all sufficiently large n,
there exists a group transmission system with parameters (n,Θ,∆) for compressing the source such that
$$\frac{1}{n}\log\Theta \le R + \epsilon, \qquad \Delta \le D + \epsilon, \qquad \frac{1}{n}\sum_{i=1}^{n}\left[\log\Theta_i - H(U_i)\right] \le \epsilon.$$
The optimal group rate-distortion function R∗(D) of the source is given by the infimum of the rates R
such that (R,D) is achievable using group codes.
The rationale for imposing the third constraint regarding the average entropy of single-letter recon-
struction samples is the following. Recall that in channel coding all codewords are used by definition.
However, in source coding we may have a situation where only a small fraction of the codewords in the
group code is used, in which case one may have to use some form of entropy coding to remove the
redundancy. This leads to a different class of codes, called nested group codes, which we do not study
in this paper. To prevent this situation, we impose the third constraint. This also puts the source
coding problem on a footing similar to that of the channel coding problem.
7) Typicality: We follow the notion of typicality as found in [13]. Consider two random variables
X and Y with joint probability mass function PX,Y (x, y) over X × Y . Let n be an integer and ε be a
positive real number. The sequence pair (xn, yn) belonging to X n × Yn is said to be jointly ε-typical
with respect to PX,Y(x, y) if
$$\forall a \in \mathcal{X},\ \forall b \in \mathcal{Y}: \quad \left|\frac{1}{n}N(a,b|x^n,y^n) - P_{X,Y}(a,b)\right| \le \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|}$$
and none of the pairs (a, b) with PX,Y(a, b) = 0 occurs in (x^n, y^n). Here, N(a, b|x^n, y^n) counts the
number of occurrences of the pair (a, b) in the sequence pair (xn, yn). We denote the set of all jointly
ε-typical sequence pairs in X n × Yn by Anε (X,Y ).
Given a sequence x^n ∈ A^n_ε(X), the set of conditionally ε-typical sequences A^n_ε(Y|x^n) is defined as
$$A^n_\epsilon(Y|x^n) = \{y^n \in \mathcal{Y}^n \mid (x^n, y^n) \in A^n_\epsilon(X,Y)\}$$
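The typicality test above can be sketched directly in code. The following is a minimal illustration (our own, with a made-up joint distribution), implementing the strong-typicality condition verbatim:

```python
# A minimal sketch (our own, not the paper's) of the strong-typicality test:
# (x^n, y^n) is jointly eps-typical if every empirical pair frequency is within
# eps/(|X||Y|) of P_{XY}(a,b), and no zero-probability pair occurs.
from collections import Counter

def jointly_typical(xn, yn, P, X, Y, eps):
    n = len(xn)
    counts = Counter(zip(xn, yn))
    for a in X:
        for b in Y:
            N = counts.get((a, b), 0)
            if P[(a, b)] == 0 and N > 0:      # forbidden pair occurred
                return False
            if abs(N / n - P[(a, b)]) > eps / (len(X) * len(Y)):
                return False
    return True

# Hypothetical doubly symmetric binary source: P(x,y) = 0.4 on the diagonal, 0.1 off.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
xn = [0] * 5 + [1] * 5
yn = [0] * 4 + [1] + [0] + [1] * 4   # empirical frequencies: 0.4, 0.1, 0.1, 0.4
assert jointly_typical(xn, yn, P, [0, 1], [0, 1], 0.5)
```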
8) Notation: In our notation, O(ε) is any function of ε such that lim_{ε↓0} O(ε) = 0, P is the set of
all primes, Z+ is the set of positive integers and R+ is the set of non-negative reals. Let |x|+ denote
max{x, 0}. Since we deal with summations over several groups in this paper, when not clear from the
context, we indicate the underlying group in each summation; e.g., summation over the group G is denoted
by $\sum^{(G)}$. Direct sum of groups is denoted by ⊕ and direct product of sets is denoted by ⊗.
III. MOTIVATING EXAMPLES
In this section, we present examples of the use of group codes for two multi-terminal communication
problems, namely the distributed source coding problem and the interference channel coding problem.
In these problems, one can use group code ensembles to improve upon the asymptotic performance of
the standard random coding ensembles used in information theory.
A. Example in Distributed Source Coding
Consider a two-user distributed source coding problem in which the two sources X and Y take values
from Z4 and a centralized decoder is interested in decoding the sum of the two sources losslessly. The
two sources need to be compressed distributively. Furthermore, assume that X is uniformly distributed
over Z4 and Y = −X + Z, where Z is independent of X and is distributed over Z4 such that
PZ(0) = 1 − τ and PZ(1) = PZ(2) = PZ(3) = τ/3 for some τ ∈ (0, 1). Let R1 and R2 be the rates of
the two encoders. Using Slepian-Wolf coding [35], which uses random unstructured codes, one can show
that the following sum rate is achievable:
$$R_U = R_1 + R_2 = H(X,Y) = H(X, -X+Z) = H(X) + H(Z) = 2 + h(\tau) + \tau\log 3,$$
where h(·) denotes the binary entropy function. We present a coding scheme based on group codes that is
optimal in the sum rate and strictly improves upon the rate achievable using random unstructured codes.
Consider a fictitious discrete memoryless channel with input U and output V related via V = U +Z,
where Z is independent of U . Let C be a good group channel code for the channel PV |U . Using the
channel coding result in this paper (see Section VII), the rate of C can be arbitrarily close to
$$r = \min\big(I(U;V),\ 2I(U;V\,|\,[U])\big) = \min\big(2 - H(U|V),\ 2 - 2H(U\,|\,[U],V)\big)$$
where for g ∈ Z4, [g] = g + {0, 2}. Note that H(U|V) = H(Z) and
$$H(U\,|\,[U],V) \overset{(a)}{=} H(V - Z\,|\,[Z],V) \overset{(b)}{=} H(Z\,|\,[Z])$$
where (a) follows since there is a one-to-one correspondence between ([V − Z], V) and ([Z], V), and (b)
follows since V and Z are independent. Therefore, we have
$$r = \min\big(2 - H(Z),\ 2 - 2H(Z\,|\,[Z])\big) \overset{(a)}{=} 2 - H(Z) = 2 - h(\tau) - \tau\log 3,$$
where in (a) we have used the relation $h(\tau) + \tau\log_2 3 \le 2h(2\tau/3)$ to show that $I(U;V) \le 2I(U;V\,|\,[U])$.
The encoding scheme for the distributed source coding problem is as follows: given a pair of source
sequences x^n and y^n, the X- and Y-encoders send x^n + C and y^n + C, respectively, to the decoder. In other
words, each encoder sends the index of the coset of C that contains its source word. Because of the
closure of C with respect to group addition and the commutativity of the group, the decoder can compute
z^n + C, from which it can recover z^n with high probability using the property of the code.
Therefore, using group codes, the following sum rate is achievable:
$$R_G = R_1 + R_2 = 2(2 - r) = 2\max\big(H(Z),\ 2H(Z\,|\,[Z])\big) = 2H(Z) = 2h(\tau) + 2\tau\log 3 < R_U.$$
It can be shown that this rate is not achievable using random linear codes over the Galois field of size 4.
The optimality follows from standard information-theoretic arguments. The algebraic property of the
code, namely closure with respect to addition modulo-4, is exploited in reducing the sum rate from H(X,Y) to
2H(Z).
Consider another example with PZ(0) = 0.4014, PZ(1) = 0.2035, PZ(2) = 0.3356 and PZ(3) = 0.0595.
Here H(Z) = 1.7669 and 2H(Z|[Z]) = 1.8712. The achievable sum rate for distributed compression
using random unstructured codes or random linear codes is RU = 3.7669, and that using random group
codes is RG = 3.7424. We do not claim optimality for this example. Although random group codes over
Z4 are inferior to random unstructured codes in terms of point-to-point compression of the source Z,
they outperform the latter in the problem of distributed compression of X and Y.
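The numbers in the second example can be checked mechanically. The following sketch (our own) recomputes H(Z) and 2H(Z|[Z]) for the stated distribution and compares the two sum rates:

```python
# Numerical check (our own sketch) of the second example: H(Z) and 2H(Z|[Z])
# for P_Z = (0.4014, 0.2035, 0.3356, 0.0595) on Z_4, where [z] = z + {0, 2}.
from math import log2

pZ = [0.4014, 0.2035, 0.3356, 0.0595]

def H(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * log2(x) for x in p if x > 0)

HZ = H(pZ)

# [Z] partitions Z_4 into the cosets {0, 2} and {1, 3} of the subgroup {0, 2}.
p_even = pZ[0] + pZ[2]
p_odd = pZ[1] + pZ[3]
H_cond = (p_even * H([pZ[0] / p_even, pZ[2] / p_even])
          + p_odd * H([pZ[1] / p_odd, pZ[3] / p_odd]))

RU = 2 + HZ                     # unstructured / linear-code sum rate H(X,Y)
RG = 2 * max(HZ, 2 * H_cond)    # group-code sum rate
assert abs(HZ - 1.7669) < 1e-3 and abs(2 * H_cond - 1.8712) < 1e-3
assert RG < RU                  # 3.7424 < 3.7669, as stated in the text
```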
B. Example for the Interference Channel
Consider the problem of communication over the following interference channel between three pairs
of encoders and decoders. The 3-user interference channel [28, Example 7] has three inputs X1, X2 and
X3 which take values from Z4 and three outputs, which are given by Y1 = X1 + X2 + X3 + N1,
Y2 = X2 + N2 and Y3 = X3 + N3, where the additions are mod-4 operations and N1, N2 and N3 are
independent random variables distributed according to PNi(0) = 1 − δi, PNi(1) = PNi(2) = PNi(3) = δi/3
for i = 1, 2, 3. For simplicity, we assume δ2 = δ3 = δ. The input X1 is subject to the cost function w(0) = 0,
w(1) = w(2) = w(3) = 1, but X2 and X3 are not costed. Let τ be the cost constraint on X1. Let us
assume for simplicity that δ1, δ < 1/4 and τ < 3/4. Observe that the channel inputs of the second and the third
transmitters interfere with that of the first. There is no interference for the second and third receivers.
Consider the following optimal coding scheme based on group codes. Let C be a good group channel
code for the channel with input X2 and output Y2. The same code is employed for communication
between X3 and Y3. Using the arguments used in distributed source coding, it follows that the following
rates are achievable for transmitters 2 and 3:
$$R_2 = R_3 = 2 - h(\delta) - \delta\log 3.$$
Note that X2 + X3 interferes with X1 at Y1, and the decoder wishes to decode the interference first. Using
the closure of C with respect to group addition, note that the rate of X_2^n + X_3^n is 2 − h(δ) − δ log 3.
The interference can be decoded reliably if the effective noise given by X1 + N1 satisfies P(X1 + N1 ≠ 0) ≤ P(N2 ≠ 0) = P(N3 ≠ 0). This condition implies that δ1 + τ − 4δ1τ/3 ≤ δ. After decoding the
interference, the effective channel becomes (Y1 − X2 − X3) = X1 + N1, and the first transmitter can
communicate at the following rate, given by the capacity of this effective channel,
$$R_1 = C^* \triangleq \sup_{P_{X_1}:\ E[w(X_1)] \le \tau} I(X_1; Y_1 \mid X_2 + X_3).$$
In summary, the rate triple (C∗, 2−h(δ)− δ log(3), 2−h(δ)− δ log(3)) is achievable using group codes.
It can be shown [28, Lemma 8] that if
$$C^* + 2(2 - h(\delta) - \delta\log 3) > 2 - h(\delta_1) - \delta_1\log 3,$$
then the above triple of rates cannot be achieved using either random unstructured codes or random linear
codes over the Galois field of size 4. The following choice of δ1, τ and δ satisfies all the conditions:
δ1 = τ = 3/4 − √30/8, δ = 1/8. Note that group codes outperform random codes in this case because there is
a match between the structure of the group code and the structure of the channel.
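The stated choice of parameters can be verified numerically; the following sketch (our own) checks the decoding condition δ1 + τ − 4δ1τ/3 ≤ δ together with the simplifying assumptions:

```python
# Sanity check (our own sketch) that delta_1 = tau = 3/4 - sqrt(30)/8 and
# delta = 1/8 satisfy the interference-decoding condition
#   delta_1 + tau - 4*delta_1*tau/3 <= delta
# together with delta_1, delta < 1/4 and tau < 3/4.
from math import sqrt, isclose

d1 = tau = 3 / 4 - sqrt(30) / 8
d = 1 / 8

assert d1 < 1 / 4 and d < 1 / 4 and tau < 3 / 4
lhs = d1 + tau - 4 * d1 * tau / 3
assert isclose(lhs, d, rel_tol=1e-9)   # this choice meets the bound with equality
```

In fact, with x = 3/4 − √30/8, an exact computation gives 2x − (4/3)x² = 1/8, so the bound holds with equality.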
IV. ABELIAN GROUP CODE ENSEMBLE
In this section, we use a standard characterization of Abelian groups and introduce the ensemble of
Abelian group codes used in the paper.
A. Abelian Groups
For an Abelian group G, let P(G) denote the set of all distinct primes which divide |G| and for a
prime p ∈ P(G) let Sp(G) be the corresponding Sylow subgroup of G. It is known [21, Theorem 3.3.1]
that any Abelian group G can be decomposed as a direct sum of its Sylow subgroups in the following
manner:
$$G = \bigoplus_{p \in \mathcal{P}(G)} S_p(G) \qquad (1)$$
Furthermore, each Sylow subgroup Sp(G) can be decomposed into $\mathbb{Z}_{p^r}$ groups as follows:
$$S_p(G) \cong \bigoplus_{r \in \mathcal{R}_p(G)} \mathbb{Z}_{p^r}^{M_{p,r}} \qquad (2)$$
where R_p(G) ⊆ Z+ and for r ∈ R_p(G), M_{p,r} is a positive integer. Note that $\mathbb{Z}_{p^r}^{M_{p,r}}$ is defined as the
direct sum of the ring $\mathbb{Z}_{p^r}$ with itself $M_{p,r}$ times. Combining Equations (1) and (2), we can represent
any Abelian group as follows:
$$G \cong \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p(G)} \mathbb{Z}_{p^r}^{M_{p,r}} = \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p(G)} \bigoplus_{m=1}^{M_{p,r}} \mathbb{Z}_{p^r}^{(m)} \qquad (3)$$
where $\mathbb{Z}_{p^r}^{(m)}$ is called the mth $\mathbb{Z}_{p^r}$ ring of G or the (p, r, m)th ring of G. Equivalently, this can be written
as follows:
$$G \cong \bigoplus_{(p,r) \in \mathcal{Q}(G)} \mathbb{Z}_{p^r}^{M_{p,r}} = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \mathbb{Z}_{p^r}^{(m)}$$
where Q(G) ⊆ P × Z+ is defined as
$$\mathcal{Q}(G) = \{(p, r) \mid p \in \mathcal{P}(G),\ r \in \mathcal{R}_p(G)\}, \qquad (4)$$
and G(G) ⊆ P × Z+ × Z+ is defined as
$$\mathcal{G}(G) = \{(p, r, m) \in \mathbb{P} \times \mathbb{Z}^+ \times \mathbb{Z}^+ \mid (p, r) \in \mathcal{Q}(G),\ m \in \{1, 2, \cdots, M_{p,r}\}\}$$
This means that any element a of the Abelian group can be regarded as a vector whose components
are indexed by (p, r, m) ∈ G(G) and whose (p, r, m)th component $a_{p,r,m}$ takes values from the ring $\mathbb{Z}_{p^r}$,
or as a vector whose components are indexed by (p, r) ∈ Q(G) and whose (p, r)th component $a_{p,r}$ takes
values from the ring $\mathbb{Z}_{p^r}^{M_{p,r}}$.
With a slight abuse of notation, we represent an element a of G as
$$a = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m}$$
Furthermore, for two elements a, b ∈ G, we have
$$a + b = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m} +_{p^r} b_{p,r,m}$$
where + denotes the group operation and $+_{p^r}$ denotes addition mod-$p^r$. More generally, let a, b, · · · , z
be any number of elements of G. Then we have
$$a + b + \cdots + z = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \left(a_{p,r,m} +_{p^r} b_{p,r,m} +_{p^r} \cdots +_{p^r} z_{p,r,m}\right) \qquad (5)$$
This can equivalently be written as
$$[a + b + \cdots + z]_{p,r,m} = a_{p,r,m} +_{p^r} b_{p,r,m} +_{p^r} \cdots +_{p^r} z_{p,r,m}$$
where $[\cdot]_{p,r,m}$ denotes the (p, r, m)th component of its argument.
Let $I_{G:p,r,m} \in G$ be a generator for the group which is isomorphic to the (p, r, m)th ring of G. Then
we have
$$a = \sum^{(G)}_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m} I_{G:p,r,m} \qquad (6)$$
where the summations are done with respect to the group operation and the multiplication $a_{p,r,m} I_{G:p,r,m}$
is by definition the summation (with respect to the group operation) of $I_{G:p,r,m}$ with itself $a_{p,r,m}$ times.
In other words, $a_{p,r,m} I_{G:p,r,m}$ is the shorthand notation for
$$a_{p,r,m} I_{G:p,r,m} = \sum^{(G)}_{i \in \{1, \cdots, a_{p,r,m}\}} I_{G:p,r,m}$$
where the summation is the group operation.
Example: Let $G = \mathbb{Z}_4 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_9^2$. Then we have P(G) = {2, 3}, $S_2(G) = \mathbb{Z}_4$ and $S_3(G) = \mathbb{Z}_3 \oplus \mathbb{Z}_9^2$,
R2(G) = {2}, R3(G) = {1, 2}, M2,2 = 1, M3,1 = 1, M3,2 = 2 and
$$\mathcal{G}(G) = \{(2, 2, 1), (3, 1, 1), (3, 2, 1), (3, 2, 2)\}$$
Each element a of G can be represented by a quadruple $(a_{2,2,1}, a_{3,1,1}, a_{3,2,1}, a_{3,2,2})$ where $a_{2,2,1} \in \mathbb{Z}_4$,
$a_{3,1,1} \in \mathbb{Z}_3$ and $a_{3,2,1}, a_{3,2,2} \in \mathbb{Z}_9$. Finally, we have $I_{G:2,2,1} = (1, 0, 0, 0)$, $I_{G:3,1,1} = (0, 1, 0, 0)$, $I_{G:3,2,1} = (0, 0, 1, 0)$, $I_{G:3,2,2} = (0, 0, 0, 1)$ so that Equation (6) holds.
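The example can be made concrete in code. The sketch below (our own notation) represents elements of G = Z4 ⊕ Z3 ⊕ Z9 ⊕ Z9 as quadruples, implements componentwise addition, and verifies Equation (6) for one element:

```python
# Sketch (our notation) of the running example G = Z_4 + Z_3 + Z_9 + Z_9:
# elements are quadruples indexed by G(G) = {(2,2,1), (3,1,1), (3,2,1), (3,2,2)},
# the group operation is componentwise addition mod p^r, and the four standard
# generators reproduce Equation (6).
MODULI = (4, 3, 9, 9)   # p^r for each (p, r, m) in G(G)

def add(a, b):
    """Group operation: componentwise addition mod p^r."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def scale(c, a):
    """c*a = a added to itself c times; componentwise, c mod p^r suffices."""
    return tuple((c * x) % m for x, m in zip(a, MODULI))

GENS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

a = (3, 2, 7, 5)
# Equation (6): a = sum over (p,r,m) of a_{p,r,m} * I_{G:p,r,m}
recon = (0, 0, 0, 0)
for coeff, gen in zip(a, GENS):
    recon = add(recon, scale(coeff, gen))
assert recon == a
assert add((3, 2, 8, 5), (2, 2, 3, 6)) == (1, 1, 2, 2)
```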
In the following section, we introduce the ensemble of Abelian group codes which we use in the paper.
B. The Image Ensemble
Recall that for a positive integer n, an Abelian group code of length n over the group G is a coset of
a subgroup of Gn. Our ensemble of codes consists of all Abelian group codes over G; i.e., we consider
all subgroups of Gn. We use the following fact to characterize all subgroups of Gn:
Lemma IV.1. For an Abelian group G, let φ : J → G be a homomorphism from some Abelian group
J to G. Then φ(J) ≤ G; i.e., the image of the homomorphism is a subgroup of G. Moreover, for any
subgroup H of G there exists a corresponding Abelian group J and a homomorphism φ : J → G such
that H = φ(J).
Proof: The first part of the lemma is proved in [9, Theorem 12-1]. For the second part, let J be
isomorphic to H and let φ be the identity mapping (more rigorously, let φ be the isomorphism between
J and H).
In order to use the above lemma to construct the ensemble of subgroups of Gn, we need to identify all
groups J from which there exist non-trivial homomorphisms to Gn. Then the above lemma implies that
for each such J and for each homomorphism φ : J → Gn, the image of the homomorphism is a group
code over G of length n and for each group code C ≤ Gn, there exists a group J and a homomorphism
such that C is the image of the homomorphism. This ensemble corresponds to the ensemble of linear
codes characterized by their generator matrix when the underlying group is a field of prime size. Note
that as in the case of standard ensembles of linear codes, the correspondence between this ensemble and
the set of Abelian group codes over G of length n may not be one-to-one.
Let G and J be two Abelian groups with decompositions
$$G = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \mathbb{Z}_{p^r}^{(m)}, \qquad J = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \mathbb{Z}_{q^s}^{(l)}$$
and let φ be a homomorphism from J to G. For (q, s, l) ∈ G(J) and (p, r, m) ∈ G(G), let
$$g_{(q,s,l) \to (p,r,m)} = [\phi(I_{J:q,s,l})]_{p,r,m}$$
where $I_{J:q,s,l} \in J$ is the standard generator for the (q, s, l)th ring of J and $[\phi(I_{J:q,s,l})]_{p,r,m}$ is the
(p, r, m)th component of $\phi(I_{J:q,s,l}) \in G$. For $a = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l} \in J$, let b = φ(a) and write
$b = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} b_{p,r,m}$. Note that as in Equation (6), we can write:
$$a = \sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l} I_{J:q,s,l} = \sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(J)}_{i \in \{1, \cdots, a_{q,s,l}\}} I_{J:q,s,l}$$
where the summations are the group summations. We have
$$\begin{aligned}
b_{p,r,m} &= [\phi(a)]_{p,r,m} = \left[\phi\left(\sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(J)}_{i \in \{1, \cdots, a_{q,s,l}\}} I_{J:q,s,l}\right)\right]_{p,r,m} \\
&\overset{(a)}{=} \left[\sum^{(G)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(G)}_{i \in \{1, \cdots, a_{q,s,l}\}} \phi(I_{J:q,s,l})\right]_{p,r,m} \\
&\overset{(b)}{=} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(\mathbb{Z}_{p^r})}_{i \in \{1, \cdots, a_{q,s,l}\}} [\phi(I_{J:q,s,l})]_{p,r,m} \\
&\overset{(c)}{=} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, [\phi(I_{J:q,s,l})]_{p,r,m} = \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)}
\end{aligned}$$
Note that (a) follows since φ is a homomorphism; (b) follows from Equation (5); and (c) follows by
using $a_{q,s,l}[\phi(I_{J:q,s,l})]_{p,r,m}$ as the shorthand notation for the summation of $[\phi(I_{J:q,s,l})]_{p,r,m}$ with itself
$a_{q,s,l}$ times.
Note that $g_{(q,s,l)\to(p,r,m)}$ represents the effect of the (q, s, l)th component of a on the (p, r, m)th
component of b dictated by the homomorphism. This means that the homomorphism φ can be represented
by
$$\phi(a) = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)} \qquad (7)$$
where $a_{q,s,l}\, g_{(q,s,l)\to(p,r,m)}$ is the shorthand notation for the mod-$p^r$ addition of $g_{(q,s,l)\to(p,r,m)}$ to itself
$a_{q,s,l}$ times. We have the following lemma on $g_{(q,s,l)\to(p,r,m)}$:
Lemma IV.2. For a homomorphism described by (7), we have
$$g_{(q,s,l)\to(p,r,m)} = 0 \quad \text{if } p \ne q, \qquad g_{(q,s,l)\to(p,r,m)} \in p^{r-s}\mathbb{Z}_{p^r} \quad \text{if } p = q,\ r \ge s.$$
Moreover, any mapping described by (7) and satisfying these conditions is a homomorphism.
Proof: The proof is provided in Appendix IX-A.
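The constraint in Lemma IV.2 can be illustrated for a single coordinate. The sketch below (our own, for the hypothetical choice p = 2, s = 1, r = 3) enumerates which multipliers g yield homomorphisms from Z_{p^s} to Z_{p^r}:

```python
# Sketch (our own, for the hypothetical choice p = 2, s = 1, r = 3) of the
# single-coordinate constraint in Lemma IV.2: the map a -> a*g mod p^r from
# Z_{p^s} to Z_{p^r} (r >= s) is a homomorphism iff g is a multiple of p^(r-s).

p, s, r = 2, 1, 3                  # J-component Z_2, G-component Z_8

def is_hom(g):
    """Check phi(a + b) = phi(a) + phi(b) for phi(a) = a*g mod p^r, a in Z_{p^s}."""
    ps, pr = p**s, p**r
    return all(((a + b) % ps) * g % pr == (a * g + b * g) % pr
               for a in range(ps) for b in range(ps))

valid = {g for g in range(p**r) if is_hom(g)}
assert valid == {g for g in range(p**r) if g % p**(r - s) == 0}   # p^(r-s) * Z_{p^r}
assert valid == {0, 4}             # i.e., the subgroup 4*Z_8 of Z_8
```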
This lemma implies that in order to construct a subgroup of G, we only need to consider homomorphisms
from an Abelian group J to G such that P(J) ⊆ P(G), since if q ∉ P(G) for some (q, s, l) ∈ G(J), then
φ(a) would not depend on $a_{q,s,l}$. For p ∈ P(G), define
$$r_p = \max \mathcal{R}_p(G) \qquad (8)$$
We show that we can restrict ourselves to J's such that for all (q, s, l) ∈ G(J), $s \le r_q$. Let (p, r, m) ∈
G(G) be such that p = q. Since $g_{(q,s,l)\to(p,r,m)} \in \mathbb{Z}_{p^r}$ and $r \le r_q$, we have
$$\left(a_{q,s,l}\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r = \left((a_{q,s,l} \bmod p^r)\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r = \left((a_{q,s,l} \bmod p^{r_q})\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r$$
This implies that for all a ∈ J and all (q, s, l) ∈ G(J), in the expression for the (p, r, m)th component of
φ(a) with p = q, $a_{q,s,l}$ appears only as $a_{q,s,l} \bmod q^{r_q}$. Therefore, it suffices for $a_{q,s,l}$ to take values from
$\mathbb{Z}_{q^{r_q}}$, and this happens if $s \le r_q$.
To construct Abelian group codes of length n over G, we consider the group $G^n$. We have
$$G^n \cong \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p} \mathbb{Z}_{p^r}^{nM_{p,r}} = \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p} \bigoplus_{m=1}^{nM_{p,r}} \mathbb{Z}_{p^r}^{(m)} = \bigoplus_{(p,r,m) \in \mathcal{G}(G^n)} \mathbb{Z}_{p^r}^{(m)} \qquad (9)$$
Define J as
$$J = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \mathbb{Z}_{q^s}^{k_{q,s}} = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \bigoplus_{l=1}^{k_{q,s}} \mathbb{Z}_{q^s}^{(l)} = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \mathbb{Z}_{q^s}^{(l)} \qquad (10)$$
for some positive integers $k_{q,s}$.
Example: Let $G = \mathbb{Z}_8 \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_5$. Then we have
$$J = \mathbb{Z}_2^{k_{2,1}} \oplus \mathbb{Z}_4^{k_{2,2}} \oplus \mathbb{Z}_8^{k_{2,3}} \oplus \mathbb{Z}_3^{k_{3,1}} \oplus \mathbb{Z}_9^{k_{3,2}} \oplus \mathbb{Z}_5^{k_{5,1}}$$
Define
$$k = \sum_{q \in \mathcal{P}(G)} \sum_{s=1}^{r_q} k_{q,s}$$
and $w_{q,s} = k_{q,s}/k$ for q ∈ P(G) and s = 1, · · · , $r_q$, so that we can write
$$J = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \bigoplus_{l=1}^{k w_{q,s}} \mathbb{Z}_{q^s}^{(l)} \qquad (11)$$
for some constants $w_{q,s}$ adding up to one. Note that
$$\mathcal{G}(J) = \{(q, s, l) : q \in \mathcal{P}(G),\ 1 \le s \le r_q,\ 1 \le l \le k w_{q,s}\}.$$
Define
$$\mathcal{S}(G) = \{(p, s) \mid p \in \mathcal{P}(G),\ 1 \le s \le r_p\}. \qquad (12)$$
Note that $\mathcal{S}(J) = \mathcal{Q}(J) = \mathcal{S}(G)$.
The ensemble of Abelian group encoders consists of all mappings φ : J → $G^n$ of the form
$$\phi(a) = \bigoplus_{(p,r,m) \in \mathcal{G}(G^n)} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)} \qquad (13)$$
for a ∈ J, where $g_{(q,s,l)\to(p,r,m)} = 0$ if p ≠ q, $g_{(q,s,l)\to(p,r,m)}$ is a uniform random variable over $\mathbb{Z}_{p^r}$
if p = q, r ≤ s, and $g_{(q,s,l)\to(p,r,m)}$ is a uniform random variable over $p^{r-s}\mathbb{Z}_{p^r}$ if p = q, r ≥ s. The
corresponding shifted group code with parameters (n, k, w) is defined by
$$\mathcal{C} = \{\phi(a) + B \mid a \in J\} \qquad (14)$$
where B is a uniform random variable over $G^n$. The rate of this code is given by
$$R = \frac{1}{n} \log|J| = \frac{k}{n} \sum_{q \in \mathcal{P}(G)} \sum_{s=1}^{r_q} s\, w_{q,s} \log q \qquad (15)$$
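As an illustration of the rate formula (15), the following sketch (with hypothetical parameters n, k and weights w, chosen only for this example) evaluates R for G = Z8:

```python
# Sketch (hypothetical parameters, not from the paper) of the rate formula (15):
#   R = (1/n) log|J| = (k/n) * sum_{q,s} s * w_{q,s} * log q
# for G = Z_8, so P(G) = {2} and r_2 = 3, with k = 4 and weights w_{2,s}.
from math import log2, isclose

n, k = 10, 4
w = {(2, 1): 0.25, (2, 2): 0.25, (2, 3): 0.5}   # w_{q,s}, must sum to 1
assert isclose(sum(w.values()), 1.0)

# log|J| for J = Z_2^(k*w_{2,1}) + Z_4^(k*w_{2,2}) + Z_8^(k*w_{2,3}):
# each Z_{q^s} factor contributes s*log2(q) bits.
log_J = sum(k * wqs * s * log2(q) for (q, s), wqs in w.items())
R = log_J / n
assert isclose(R, (k / n) * sum(s * wqs * log2(q) for (q, s), wqs in w.items()))
assert isclose(R, 0.9)   # (4/10) * (0.25*1 + 0.25*2 + 0.5*3) = 0.4 * 2.25
```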
Remark IV.3. An alternate approach to constructing Abelian group codes is to consider kernels of
homomorphisms (the kernel ensemble). To construct the ensemble of Abelian group codes in this manner,
let φ be a homomorphism from $G^n$ into J such that for a ∈ $G^n$,
$$\phi(a) = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \sum^{(\mathbb{Z}_{q^s})}_{(p,r,m) \in \mathcal{G}(G^n)} a_{p,r,m}\, g_{(p,r,m) \to (q,s,l)}$$
where $g_{(p,r,m)\to(q,s,l)} = 0$ if q ≠ p, $g_{(p,r,m)\to(q,s,l)}$ is a uniform random variable over $\mathbb{Z}_{q^s}$ if q = p, s ≤ r,
and $g_{(p,r,m)\to(q,s,l)}$ is a uniform random variable over $q^{s-r}\mathbb{Z}_{q^s}$ if q = p, s ≥ r. The code is given by
$\mathcal{C} = \{a \in G^n \mid \phi(a) = c\}$, where c is a uniform random variable over J.
In this paper, we use the image ensemble for both the channel and the source coding problems; however,
similar results can be derived using the kernel ensemble as well.
V. MAIN RESULTS
In this section, we provide an information-theoretic characterization of the optimal rate-distortion
function of a given source and the capacity of a given channel using group codes when the underlying
group is an arbitrary finite Abelian group represented by Equation (3). We define two subgroups of G
and then state two theorems using these subgroups, and finally provide an interpretation of the results
and these subgroups with two examples.
A. Definitions
Consider four vectors θ, w, η and b. The components θ_{p,s} and w_{p,s} of θ and w, respectively, are indexed by (p, s) ∈ S(G) and satisfy

0 ≤ θ_{p,s} ≤ s,    ∑_{(p,s)∈S(G)} w_{p,s} = 1,    w_{p,s} ≥ 0.

Let 000 denote the all-zero vector, and sss denote the vector whose components satisfy sss_{p,s} = s for all (p, s) ∈ S(G).
The components η_{p,r,m,s} of η are indexed by (p, r, m, s), for every (p, r, m) ∈ G(G) and every s ∈ {1, . . . , r_p}, and satisfy 0 ≤ η_{p,r,m,s} ≤ r − |r − s|^+. The components b_{p,r,m} of b are indexed by (p, r, m) ∈ G(G) and satisfy b_{p,r,m} ∈ Z_{p^r}. Let α be a probability distribution on the set of all η and b, where α_{η,b} denotes the probability assigned to a particular pair (η, b).
For a vector η (and analogously for η + θ below), define

θθθ(η) = ( min_{1≤s≤r_p, w_{p,s}≠0} { |r − s|^+ + η_{p,r,m,s} } )_{(p,r,m)∈G(G)}.
Define H_η ≤ G and H_{η+θ} ≤ H_η as

H_η = ⊕_{(p,r,m)∈G(G)} p^{θθθ(η)_{p,r,m}} Z_{p^r}^{(m)}   (16)

H_{η+θ} = ⊕_{(p,r,m)∈G(G)} p^{θθθ(η+θ)_{p,r,m}} Z_{p^r}^{(m)},   (17)

where η + θ denotes the vector whose components are given by (η + θ)_{p,r,m,s} = η_{p,r,m,s} + θ_{p,s}.
For a given θ and w, define

ω_θ = ( ∑_{(p,s)∈S(G)} θ_{p,s} w_{p,s} log p ) / ( ∑_{(p,s)∈S(G)} s w_{p,s} log p ).
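The quantity ω_θ can be computed mechanically; the sketch below evaluates it for the Z_8 running example of Section V-C, with a hypothetical weight vector w, and confirms the boundary values ω_{000} = 0 and ω_{sss} = 1.

```python
import math

def omega(theta, w):
    """omega_theta from Section V-A: the weighted fraction of the code
    rate attributable to the subgroup indexed by theta."""
    num = sum(theta[(p, s)] * w[(p, s)] * math.log2(p) for (p, s) in w)
    den = sum(s * w[(p, s)] * math.log2(p) for (p, s) in w)
    return num / den

# G = Z_8, so S(G) = {(2,1), (2,2), (2,3)}; hypothetical weights w.
w = {(2, 1): 0.25, (2, 2): 0.25, (2, 3): 0.5}
zero = {(2, 1): 0, (2, 2): 0, (2, 3): 0}   # theta = 000
sss = {(2, 1): 1, (2, 2): 2, (2, 3): 3}    # theta = sss

assert omega(zero, w) == 0.0
assert omega(sss, w) == 1.0
```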
B. Main Results
The following theorem is the first main result of this paper.
Consider a channel (X = G, Y, W_{Y|X}). For every η and b, let X_{η,b} be a random variable distributed uniformly over H_η + b. Let [X_{η,b}]_θ = X_{η,b} + H_{η+θ}, which takes values from the cosets of H_{η+θ} in H_η + b.
Theorem V.1. For a channel (X = G, Y, W_{Y|X}), the group capacity is given by

C = sup_{α,w} min_{θ≠sss} (1/(1 − ω_θ)) ∑_{η,b} α_{η,b} I(X_{η,b}; Y | [X_{η,b}]_θ).   (18)
Proof: The proof is provided in Section VII-A2.
The following theorem is the second main result of this paper.
Consider a source (X, U = G, P_X, d) and a distortion level D. For every η and b, let U_{η,b} be a reconstruction random variable distributed uniformly over H_η + b and statistically correlated with the source X. Let us denote this collection of random variables by UUU. Let [U_{η,b}]_θ = U_{η,b} + H_{η+θ}.
Theorem V.2. For a source (X, U = G, P_X, d), the group rate-distortion function is given by

R∗(D) = inf_{UUU} inf_{α,w} max_{θ≠000} (1/ω_θ) ∑_{η,b} α_{η,b} I([U_{η,b}]_θ; X),   (19)

where the infimum is over all α, w and UUU such that

D ≥ ∑_{η,b} α_{η,b} E[d(X, U_{η,b})].
Proof: The proof is provided in Section VII-B2.
C. Interpretation of the Result
In this section, we give some intuition about the result and the quantities defined above using several
examples. At a high level, wp,s characterizes the normalized weight given to the Zps component of
the input group J in constructing the homomorphism from J to Gn, and θ indexes subgroups of J . η
characterizes the the collection of input distributions used on the channel. 1(1−ωθ)
I(Xη,b;Y |[Xη,b]θ) in
channel coding and 1ωθI([Uη,b]θ;X) in source coding denote the rate constraints imposed by the subgroup
Hη+θ. Due to the algebraic structure of the code, in the ensemble, two random codewords corresponding
to two distinct indexes are statistically dependent, unless G is a finite field. For the channel coding
problem, when a random codeword corresponding to a given message index is transmitted over the
channel, consider the event that all components of the difference between the codeword transmitted and
a random codeword corresponding to another message index belong to a proper subgroup Hη+θ of G.
18
Then the probability that the latter is decoded instead of the former is higher than the case when no
algebraic structure on the code is enforced. For the source coding problem, when the code is chosen
randomly, consider the event that all components of their difference belong to a proper subgroup Hη+θ
of G. Then if one of them is a poor representation of a given source sequence, so is the other with a
probability that is higher than the case when no algebraic structure on the code is enforced. This means
that the code size has to be larger so that with high probability one can find a good representation of the
source.
Example: We start with the simple example where G = Z_8. In this case, we have P(G) = {2}, r_2 = 3, S(G) = {(2, 1), (2, 2), (2, 3)}, and Q(G) = {(2, 3)}. Let η be the all-zero vector and b = 0. For vectors w, θ and θθθ defined as above, we have w = (w_{2,1}, w_{2,2}, w_{2,3}), θ = (θ_{2,1}, θ_{2,2}, θ_{2,3}) and θθθ = θθθ_{2,3,1}. Recall that the ensemble of Abelian group codes used in the random coding argument consists of the set of all homomorphisms from some J = Z_2^{k w_{2,1}} ⊕ Z_4^{k w_{2,2}} ⊕ Z_8^{k w_{2,3}} to G^n, and hence the vector of weights w determines the input group of the homomorphism. Any vector θ = (θ_{2,1}, θ_{2,2}, θ_{2,3}) with 0 ≤ θ_{2,1} ≤ 1, 0 ≤ θ_{2,2} ≤ 2 and 0 ≤ θ_{2,3} ≤ 3 corresponds to a subgroup K_θ of the input group J given by

K_θ = 2^{θ_{2,1}} Z_2^{k w_{2,1}} ⊕ 2^{θ_{2,2}} Z_4^{k w_{2,2}} ⊕ 2^{θ_{2,3}} Z_8^{k w_{2,3}}.

Similarly, any θθθ(η + θ) = θθθ_{2,3,1} corresponds to a subgroup H_{η+θ} = 2^{θθθ_{2,3,1}} Z_8 of the group G.
Example: Next, we consider the case where G = Z_4 ⊕ Z_3. In this case, we have P(G) = {2, 3}, r_2 = 2, r_3 = 1, S(G) = {(2, 1), (2, 2), (3, 1)}, and Q(G) = {(2, 2), (3, 1)}. For vectors w, θ and θθθ defined as before, we have w = (w_{2,1}, w_{2,2}, w_{3,1}), θ = (θ_{2,1}, θ_{2,2}, θ_{3,1}) and θθθ = (θθθ_{2,2,1}, θθθ_{3,1,1}). Let η be the all-zero vector and b = 0. The ensemble of Abelian group codes consists of the set of all homomorphisms from some J = Z_2^{k w_{2,1}} ⊕ Z_4^{k w_{2,2}} ⊕ Z_3^{k w_{3,1}} to G^n. Any vector θ = (θ_{2,1}, θ_{2,2}, θ_{3,1}) with 0 ≤ θ_{2,1} ≤ 1, 0 ≤ θ_{2,2} ≤ 2 and 0 ≤ θ_{3,1} ≤ 1 corresponds to a subgroup K_θ of the input group J given by

K_θ = 2^{θ_{2,1}} Z_2^{k w_{2,1}} ⊕ 2^{θ_{2,2}} Z_4^{k w_{2,2}} ⊕ 3^{θ_{3,1}} Z_3^{k w_{3,1}}.

Similarly, any θθθ(η + θ) = (θθθ_{2,2,1}, θθθ_{3,1,1}) corresponds to a subgroup H_{η+θ} = 2^{θθθ_{2,2,1}} Z_4 ⊕ 3^{θθθ_{3,1,1}} Z_3 of the group G.
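The subgroup structure in the second example can be checked mechanically. The following sketch enumerates the subgroups 2^{t2} Z_4 ⊕ 3^{t3} Z_3 of G = Z_4 ⊕ Z_3 and verifies their orders and closure under the group operation.

```python
# For G = Z_4 ⊕ Z_3, every pair (t2, t3) with 0 <= t2 <= 2 and
# 0 <= t3 <= 1 gives a subgroup H = 2^{t2} Z_4 ⊕ 3^{t3} Z_3
# of order (4 / 2^{t2}) * (3 / 3^{t3}).
def subgroup(t2, t3):
    part2 = {(2 ** t2 * i) % 4 for i in range(4)}  # 2^{t2} Z_4
    part3 = {(3 ** t3 * i) % 3 for i in range(3)}  # 3^{t3} Z_3
    return {(x, y) for x in part2 for y in part3}

for t2 in range(3):
    for t3 in range(2):
        H = subgroup(t2, t3)
        assert len(H) == (4 // 2 ** t2) * (3 // 3 ** t3)
        # closure under the group operation (componentwise mod 4, mod 3)
        for (a, b) in H:
            for (c, d) in H:
                assert ((a + c) % 4, (b + d) % 3) in H
```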
VI. PROOF OF CONVERSE
Consider an arbitrary shifted group code C with parameters (n, k, w). We assume that the associated homomorphism is a one-to-one mapping. We can express the code compactly as follows:

C = { ⊕_{i=1}^{n} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + B^{(i)} ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.
For every pair of vectors (η, b) as defined in Section V-A, define

Γ_{η,b} = { i ∈ [1, n] : g^{(i)}_{(p,s)→(r,m)} ∈ p^{η_{p,r,m,s}+|r−s|^+} Z_{p^r}^{k w_{p,s}} \ p^{η_{p,r,m,s}+1+|r−s|^+} Z_{p^r}^{k w_{p,s}}, B^{(i)} = b, ∀(p, r, m, s) }.

Let θ be an arbitrary vector whose components θ_{p,s} are indexed by (p, s) ∈ S(G) and satisfy 0 ≤ θ_{p,s} ≤ s. Construct a one-to-one correspondence a_{p,s} ↔ (ā_{p,s}, â_{p,s}), where ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} and â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}.
A. Channel Coding
For an arbitrary â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}, consider the following subcode of C:

C_1(θ, â) = { ⊕_{i=1}^{n} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} (â_{p,s} + ā_{p,s}) g^{(i)}_{(p,s)→(r,m)} + B^{(i)} ] : ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.

The rate of the code C_1(θ, â) is given by (1 − ω_θ) (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p.
For a given channel (X = G, Y, W_{Y|X}), suppose that rate R is achievable using group codes. Consider an arbitrary ε > 0. This implies that there exists a shifted group code C with parameters (n, k, w) that yields a maximal error probability τ such that τ ≤ ε and (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p ≥ R − ε. Since the maximal error probability of the code is τ, the average error probability is no greater than τ. Using a uniform distribution on a, we let X_i denote the random channel input at the ith channel use induced by this code.
For the subcode C_1(θ, â), the average error probability is no greater than τ. Using the fact that â is uniformly distributed over its range, for i ∈ Γ_{η,b}, the channel input X_i(θ, â) at the ith channel use in the code C_1(θ, â) has the following distribution:

P(X_i(θ, â) = β) = ∏_{(p,r,m)∈G(G)} p^{−|r−θθθ(η+θ)_{p,r,m}|^+} = 1/|H_{η+θ}|,

if β_{(p,r,m)} ∈ ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b_{p,r,m} + p^{θθθ(η+θ)_{p,r,m}} Z_{p^r} for all (p, r, m) ∈ G(G), and P(X_i(θ, â) = β) = 0 otherwise. Using Fano's inequality and standard information-theoretic arguments, we have for every θ with 0 ≤ θ_{p,s} ≤ s and every â with â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}},

(1 − ω_θ)(R − ε)(1 − τ) − 1/n ≤ (1/n) ∑_{i=1}^{n} I(X_i(θ, â); Y_i).
Hence, averaging with a uniform distribution on â, we get for all θ ≠ sss,

(1 − ω_θ)(R − ε)(1 − τ) − 1/n ≤ (1/n) ∑_{i=1}^{n} ∑_{â} P(â) I(X_i(θ, â); Y_i)   (20)
  (a)= ∑_{η,b} ∑_{i∈Γ_{η,b}} (1/n) ∑_{â} P(â) I(X_i; Y_i | X_i ∈ â ggg^{(i)} + b + H_{η+θ})   (21)
  (b)= ∑_{η,b} ∑_{i∈Γ_{η,b}} (1/n) I(X_i; Y_i | X_i ∈ Â ggg^{(i)} + b + H_{η+θ})   (22)
  (c)= ∑_{η,b} (|Γ_{η,b}|/n) I(X_{η,b}; Y | [X_{η,b}]_θ)   (23)

where in (a) we have expressed ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b_{p,r,m} as â ggg^{(i)} + b, in (b) Â denotes the random variable corresponding to â, and in (c) we have used the fact that Â ggg^{(i)} + b + H_{η+θ} is uniform over the set of cosets of H_{η+θ} + b in H_η + b. Hence the converse follows.
B. Source Coding
Let us express C as

C = { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + b ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.
Consider the following code:

C_2(θ) = { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + b + H_{η+θ} ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }
  (a)= { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b + H_{η+θ} ] : â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }

where (a) follows from the fact that for i ∈ Γ_{η,b}, we have

{ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} ā_{p,s} g^{(i)}_{(p,s)→(r,m)} : ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) } = H_{η+θ}.

The rate of the code C_2(θ) is given by ω_θ (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p.
For a given source (X, U = G, P_X, d), suppose the pair of rate and distortion (R, D) is achievable using group codes. Consider an arbitrary ε > 0. This implies that there exists a shifted group code C with parameters (n, k, w) that yields a distortion ∆ such that ∆ ≤ D + ε, (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p ≤ R + ε, and (1/n) ∑_{i=1}^{n} [log Θ_i − H(U_i)] ≤ ε, where Θ_i denotes the size of the range of U_i. Let U^n = Dec(Enc(X^n)) be induced by the code C. Note that for i ∈ Γ_{η,b}, the ith sample U_i takes values in H_η + b. Let [U_i]_θ = U_i + H_{η+θ} denote the unique coset of H_{η+θ} in H_η + b that contains U_i. Observe that [U_i]_θ is a function of U_i for i ∈ [1, n]. Let [U^n]_θ = ⊕_{i∈[1,n]} [U_i]_θ. Note that [U^n]_θ ∈ C_2(θ). Hence we get the following conditions on the rate of the code for all θ ≠ 000:
ω_θ(R + ε) ≥ (1/n) H([U^n]_θ) = (1/n) I([U^n]_θ; X^n) ≥ (1/n) ∑_{i=1}^{n} I([U_i]_θ; X_i)   (24)
  = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) I([U_i]_θ; X_i)   (25)
  (a)= ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) I(P_X, P_{θ,i})   (26)
  (b)≥ ∑_{η,b} (|Γ_{η,b}|/n) I( P_X, ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) P_{θ,i} )   (27)
  (c)= ∑_{η,b} (|Γ_{η,b}|/n) I([U_{η,b}]_θ; X).   (28)
Here, (a) follows by denoting the conditional probability P([U_i]_θ = u | X_i = x) by P_{θ,i}(u|x) and writing the mutual information explicitly as a function of the source distribution and the conditional distribution of the corresponding function of the reconstruction given the source. (b) follows from the convexity of mutual information in the conditional distribution. In (c), [U_{η,b}]_θ denotes the random variable which is related to X through the conditional probability distribution (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} P_{θ,i}.
Note that for θ = sss we have [U_{η,b}]_sss = U_{η,b}. Hence we have the following conditions regarding the distortion of the code:

D + ε ≥ ∆ ≥ (1/n) ∑_{i=1}^{n} E[d(X_i, U_i)] = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) ∑_{x∈X} P_X(x) ∑_{u∈H_η+b} P_{sss,i}(u|x) d(x, u)
  = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{x∈X} P_X(x) ∑_{u∈H_η+b} P(U_{η,b} = u | X = x) d(x, u)   (29)
  = ∑_{η,b} (|Γ_{η,b}|/n) E[d(X, U_{η,b})].   (30)
Finally, we ensure that U_{η,b} is distributed nearly uniformly over its range. For every η and b, let Ū_{η,b} be uniformly distributed over H_η + b. Now, using the relation between the variational distance and the information divergence [13, p. 58], we have

∑_{η,b} (|Γ_{η,b}|/(2n ln 2)) d_v^2(U_{η,b}, Ū_{η,b}) ≤ ∑_{η,b} (|Γ_{η,b}|/n) D(U_{η,b} ‖ Ū_{η,b})   (31)
  = ∑_{η,b} (|Γ_{η,b}|/n) D( (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} U_i ∥ Ū_{η,b} )   (32)
  ≤ ∑_{η,b} (|Γ_{η,b}|/n) (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} D(U_i ‖ Ū_{η,b})   (33)
  = ∑_{η,b} (1/n) ∑_{i∈Γ_{η,b}} D(U_i ‖ Ū_i) = (1/n) ∑_{i=1}^{n} [log Θ_i − H(U_i)] ≤ ε,   (34)

where (33) follows from the convexity of the divergence, and Ū_i = Ū_{η,b} for i ∈ Γ_{η,b}. The converse follows from the continuity of mutual information.
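The divergence-to-variational-distance bound used in (31) (Pinsker's inequality in the form d_v^2/(2 ln 2) ≤ D) can be spot-checked numerically, along with the identity D(P‖uniform) = log |G| − H(P) that underlies (34); the distribution below is arbitrary.

```python
import math

def kl(p, q):
    """Relative entropy D(p||q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def variational(p, q):
    """Variational (L1) distance between p and q."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# A non-uniform distribution on Z_4 versus the uniform one.
p = [0.5, 0.25, 0.125, 0.125]
u = [0.25] * 4
# Pinsker-type inequality: d_v^2 / (2 ln 2) <= D(p||u).
assert variational(p, u) ** 2 / (2 * math.log(2)) <= kl(p, u)
# Against the uniform distribution, D(p||u) = log|G| - H(p).
H = -sum(pi * math.log2(pi) for pi in p)
assert abs(kl(p, u) - (2 - H)) < 1e-12
```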
VII. ACHIEVABILITY
A. Channel Coding
For conciseness, we give proofs of only the essential elements. Fix a triple of code parameters (n, k, w). Let R denote the rate of the code. First, we consider the special case where, at every channel use, the input can take all possible values in G. This corresponds to the choice α_{η∗,b} = 1, where η∗ is the all-zero vector and b is arbitrary, and hence H_{η∗} = G. The generalization to arbitrary probability distributions is relatively straightforward.
1) Encoding and Decoding: Following the analysis of Section IV-B, we construct the ensemble of
group codes of length n over G as the image of all homomorphisms φ from some Abelian group J into
Gn where J and Gn are as in Equations (11) and (9), respectively. The random homomorphism φ is
described in Equation (13).
To find an achievable rate, we use a random coding argument in which the random encoder is
characterized by the random homomorphism φ and a random vector B uniformly distributed over Gn.
Given a message a ∈ J , the encoder maps it to x = φ(a) + B and x is then fed to the channel. At
the receiver, after receiving the channel output y ∈ Y^n, the decoder looks for a unique a ∈ J such that φ(a) + B is jointly typical with y with respect to the distribution P_X W_{Y|X}, where P_X is uniform over G. If the decoder does not find such an a, or if such an a is not unique, it declares an error.
2) Error Analysis: Let a, x and y be the message, the channel input and the channel output, respectively. The error event can be characterized as the union of two events, E(a) = E_1(a) ∪ E_2(a), where E_1(a) is the event that φ(a) + B is not jointly typical with y, and E_2(a) is the event that there exists an ã ≠ a such that φ(ã) + B is jointly typical with y. We can upper-bound the probability of the error event as P(E(a)) ≤ P(E_1(a)) + P(E_2(a) ∩ E_1(a)^c). Using the standard approach, one can show that P(E_1(a)) → 0 as n → ∞. The probability of the error event E_2(a) ∩ E_1(a)^c can be written as

P_avg(E_2(a) ∩ E_1(a)^c) = ∑_{x∈G^n} 1_{{φ(a)+B=x}} ∑_{y∈A^n_ε(Y|x)} W^n_{Y|X}(y|x) 1_{{∃ã∈J : ã≠a, φ(ã)+B ∈ A^n_ε(X|y)}}.

The expected value of this probability over the ensemble is given by E{P_avg(E_2(a) ∩ E_1(a)^c)} = P_err, where

P_err = ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} W^n_{Y|X}(y|x) P(φ(a) + B = x, ∃ã ∈ J : ã ≠ a, φ(ã) + B ∈ A^n_ε(X|y)).

Using the union bound, we have

P_err ≤ ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈J, ã≠a} ∑_{x̃∈A^n_ε(X|y)} W^n_{Y|X}(y|x) P(φ(a) + B = x, φ(ã) + B = x̃).
We need the following lemmas to proceed.
Lemma VII.1. For a, ã ∈ J, x, x̃ ∈ G^n and for (p, s) ∈ Q(J) = S(G), let θ_{p,s} ∈ {0, 1, · · · , s} be such that

a_{p,s} − ã_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} \ p^{θ_{p,s}+1} Z_{p^s}^{k w_{p,s}}.

Then,

P(φ(a) + B = x, φ(ã) + B = x̃) = (1/|G|^n)(1/|H_{η∗+θ}|^n) if x − x̃ ∈ H^n_{η∗+θ}, and 0 otherwise.

Proof: The proof is provided in Appendix IX-B.
Lemma VII.2. Let X be a random variable taking values from the group G and, for a subgroup H of G, define [X] = X + H. For y ∈ A^n_ε(Y) and x ∈ A^n_ε(X|y), let z = [x] = x + H^n. Then we have

(x + H^n) ∩ A^n_ε(X|y) = A^n_ε(X|z, y)

and

(1 − ε) 2^{n[H(X|Y,[X])−O(ε)]} ≤ |(x + H^n) ∩ A^n_ε(X|y)| ≤ 2^{n[H(X|Y,[X])+O(ε)]}.

Proof: The proof is provided in Appendix IX-C.
For a ∈ J and for θ with θ_{p,s} ∈ {0, 1, · · · , s} for all (p, s) ∈ S(G), let

T_θ(a) = {ã ∈ J | a_{p,s} − ã_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} \ p^{θ_{p,s}+1} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G)}.

It follows that for all a ∈ J,

|T_θ(a)| ≤ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}},

and

P_err ≤ ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{θ≠sss} ∑_{ã∈T_θ(a)} ∑_{x̃∈A^n_ε(X|y)} W^n_{Y|X}(y|x) P(φ(a) + B = x, φ(ã) + B = x̃).
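The cardinality bound on T_θ(a) can be verified by brute force on a toy input group; here J = Z_4 (a single factor with p = 2, s = 2, and k w_{2,2} = 1), and the count of ã whose difference from a has exact 2-adic order θ is compared with the bound 2^{(2−θ)·1}.

```python
# Brute-force check of |T_theta(a)| <= prod p^{(s - theta_{p,s}) k w_{p,s}}
# for the toy input group J = Z_4 (single factor, p = 2, s = 2, k*w = 1).
def order2(x, s=2):
    """Largest t <= s with x in 2^t Z_{2^s} \\ 2^{t+1} Z_{2^s} (t = s for x = 0)."""
    if x % 4 == 0:
        return 2
    return 0 if x % 2 else 1

for a in range(4):
    for theta in range(2):  # theta in {0, 1}; theta = 2 corresponds to a~ = a
        T = [at for at in range(4) if order2((a - at) % 4) == theta]
        assert len(T) <= 2 ** (2 - theta)
```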
Using Lemmas VII.1 and VII.2, we have

P_err ≤ ∑_{θ≠sss} ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈T_θ(a)} ∑_{x̃∈A^n_ε(X|y), x̃∈x+H^n_{η∗+θ}} W^n_{Y|X}(y|x) (1/|G|^n)(1/|H_{η∗+θ}|^n)
  ≤ ∑_{θ≠sss} ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈T_θ(a)} W^n_{Y|X}(y|x) 2^{n[H(X_{η∗,b}|Y,[X_{η∗,b}]_θ)+O(ε)]} (1/|G|^n)(1/|H_{η∗+θ}|^n)
  ≤ ∑_{θ≠sss} [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] 2^{n[H(X_{η∗,b}|Y,[X_{η∗,b}]_θ)+O(ε)]} (1/|H_{η∗+θ}|^n)
  = ∑_{θ≠sss} exp_2 { n ( (k/n) ∑_{(p,s)∈S(G)} (s − θ_{p,s}) w_{p,s} log p + H(X_{η∗,b}|Y, [X_{η∗,b}]_θ) − log |H_{η∗+θ}| + O(ε) ) }.
Recall that R = (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p. Noting that P_err is the probability of error for message a averaged over the ensemble, in order for the maximal error probability to go to zero, we require the exponents of all the terms to be negative; or equivalently, for all θ ≠ sss,

R ( ∑_{(p,s)∈S(G)} (s − θ_{p,s}) w_{p,s} log p ) / ( ∑_{(p,s)∈S(G)} s w_{p,s} log p ) < log |H_{η∗+θ}| − H(X_{η∗,b}|Y, [X_{η∗,b}]_θ) − O(ε).

Therefore, the achievability conditions are

R ≤ (1/(1 − ω_θ)) I(X_{η∗,b}; Y | [X_{η∗,b}]_θ)

for all θ ≠ sss. This means that the following rate is achievable:

R = min_{θ≠sss} (1/(1 − ω_θ)) I(X_{η∗,b}; Y | [X_{η∗,b}]_θ).
For a general α, we use an extended random coding argument. We construct the ensemble of group codes of length n over G as the image of all homomorphisms φ from some Abelian group J into ⊕_{η,b} (H_η + b)^{n α_{η,b}}. The random homomorphism φ is described in Equation (13). Given a message a ∈ J, the encoder maps it to x = φ(a) + B, and x is then fed to the channel. At the receiver, after receiving the channel output y ∈ Y^n, the decoder looks for a unique a ∈ J such that, for every (η, b), the appropriate set of n α_{η,b} samples of φ(a) + B is jointly typical with the corresponding set of n α_{η,b} samples of y with respect to the distribution P_{η,b} W_{Y|X}, where P_{η,b} is uniform over H_η + b. If the decoder does not find such an a, or if such an a is not unique, it declares an error. Using a similar analysis, one gets the desired achievability result.
3) Z_4: Evaluating the expression for the group capacity for codes over Z_4, we get three non-redundant terms as follows:

R < sup_{α_1,α_2,w_0} min{T_1, T_2, T_3}

where the supremum is over all α_1 and α_2 such that 0 ≤ α_1, α_2, α_1 + α_2 ≤ 1, over 0 ≤ w_0 ≤ 1, and

T_1 = α_1 I_4 + α_2 I_2 + (1 − α_1 − α_2) I′_2,
T_2 = (1 + w_0) [ (α_1/2)(I_2 + I′_2) + α_2 I_2 + (1 − α_1 − α_2) I′_2 ],
T_3 = ((1 + w_0)/w_0) (α_1/2)(I_2 + I′_2),

with I_4 = I(X; Y), I_2 = I(X; Y | X ∈ 2Z_4) and I′_2 = I(X; Y | X ∉ 2Z_4). This can be solved to obtain the following:

C = max{ min{I_4, I_2 + I′_2}, I_2, I′_2 }.
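The Z_4 formula can be evaluated numerically for a concrete channel. The sketch below uses a hypothetical Z_4 additive-noise channel, chosen only for illustration, to compute I_4, I_2 and I′_2 from the transition matrix and evaluate C = max{min{I_4, I_2 + I′_2}, I_2, I′_2}.

```python
import math

# Hypothetical channel: Y = X + N (mod 4), with noise N on Z_4.
noise = [0.7, 0.1, 0.1, 0.1]
W = [[noise[(y - x) % 4] for y in range(4)] for x in range(4)]

def mutual_info(px, W):
    """I(X;Y) in bits for input distribution px and channel matrix W."""
    py = [sum(px[x] * W[x][y] for x in range(4)) for y in range(4)]
    return sum(px[x] * W[x][y] * math.log2(W[x][y] / py[y])
               for x in range(4) for y in range(4)
               if px[x] > 0 and W[x][y] > 0)

I4 = mutual_info([0.25] * 4, W)         # X uniform on Z_4
I2 = mutual_info([0.5, 0, 0.5, 0], W)   # X uniform on 2Z_4 = {0, 2}
I2p = mutual_info([0, 0.5, 0, 0.5], W)  # X uniform on 1 + 2Z_4 = {1, 3}
C = max(min(I4, I2 + I2p), I2, I2p)
assert 0 <= C <= 2  # capacity of a Z_4-input channel is at most log2(4)
```

For this circularly symmetric noise, I_2 and I′_2 coincide, since the two cosets of 2Z_4 differ only by a relabeling of the output.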
B. Source Coding
Fix a triple of code parameters (n, k, w). Let R denote the rate of the code. We consider the special
case where αη∗,b = 1, where η∗ is the all-zero vector, and hence Hη∗ = G. Generalization to arbitrary
probability distributions is straightforward. Fix a conditional distribution PU |X on G such that U is
uniform on G.
1) Encoding and Decoding: To find an achievable rate for a distortion level D, we use a random coding
argument as in channel coding. Consider a random shifted group code as C = {φ(a) + B : a ∈ J},
where φ and B are uniformly distributed over their range and are independent of each other. Given the
source output sequence x ∈ X n, the random encoder looks for a codeword u ∈ C such that u is jointly
typical with x with respect to pXU . If it finds at least one such u, it encodes x to u (if it finds more
than one such u it picks one of them at random). Otherwise, it declares error. The decoder outputs u as
the source reconstruction.
2) Error Analysis: Let x = (x_1, · · · , x_n) and u = (u_1, · · · , u_n) be the source output and the decoder output, respectively. Note that if the encoder declares no error, then since x and u are jointly typical, (d(x_i, u_i))_{i=1,··· ,n} is typical with respect to the distribution of d(X, U). Therefore, for large n, (1/n) d(x, u) = (1/n) ∑_{i=1}^{n} d(x_i, u_i) ≈ E{d(X, U)} ≤ D. It remains to show that the rate can be as small as given in the theorem while keeping the probability of encoding error small.
Given the source output x ∈ X^n, define

α(x) = ∑_{u∈A^n_ε(U|x)} 1_{{u∈C}} = ∑_{u∈A^n_ε(U|x)} ∑_{a∈J} 1_{{φ(a)+B=u}}.

An encoding error occurs if and only if α(x) = 0. We use Chebyshev's inequality to show that, under certain conditions, the probability of error can be made arbitrarily small:

P(α(x) = 0) ≤ var{α(x)} / E{α(x)}^2.
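This second-moment bound holds because the event {α = 0} implies |α − E{α}| ≥ E{α}; a quick numeric check on an arbitrary nonnegative random variable:

```python
# Chebyshev-type bound P(alpha = 0) <= var(alpha)/E[alpha]^2 for a
# nonnegative integer random variable; values below are an arbitrary example.
values = [0, 0, 1, 2, 3]          # equally likely outcomes of alpha
n = len(values)
mean = sum(values) / n
var = sum((v - mean) ** 2 for v in values) / n
p_zero = values.count(0) / n
assert p_zero <= var / mean ** 2
```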
We have

E{α(x)} = ∑_{u∈A^n_ε(U|x)} ∑_{a∈J} P(φ(a) + B = u) = |A^n_ε(U|x)| · |J| / |G|^n

and

E{α(x)^2} = E{ ∑_{u,ũ∈A^n_ε(U|x)} ∑_{a,ã∈J} 1_{{φ(a)+B=u, φ(ã)+B=ũ}} }
  = ∑_{u,ũ∈A^n_ε(U|x)} ∑_{a,ã∈J} P(φ(a) + B = u, φ(ã) + B = ũ)
  = ∑_{θ} ∑_{a∈J} ∑_{u∈A^n_ε(U|x)} ∑_{ã∈T_θ(a)} ∑_{ũ∈A^n_ε(U|x), ũ−u∈H^n_{η∗+θ}} (1/|G|^n) · (1/|H_{η∗+θ}|^n).
Note that the term corresponding to θ = 000 is bounded from above by E{α(x)}^2. Using Lemma VII.2, we have

|A^n_ε(U|x) ∩ (u + H^n_{η∗+θ})| ≤ 2^{n[H(U_{η∗,b}|[U_{η∗,b}]_θ,X)+O(ε)]}.

Therefore,

var{α(x)} = E{α(x)^2} − E{α(x)}^2 ≤ ∑_{θ≠000} |J| · |A^n_ε(U|x)| · [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] · 2^{n[H(U_{η∗,b}|[U_{η∗,b}]_θ,X)+O(ε)]} / ( |G|^n · |H_{η∗+θ}|^n ).
Therefore,

P(α(x) = 0) ≤ var{α(x)} / E{α(x)}^2 ≤ ∑_{θ≠000} [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] 2^{−n[H(U_{η∗,b}|X) − H(U_{η∗,b}|[U_{η∗,b}]_θ,X) − O(ε)]} |G|^n / ( |J| · |H_{η∗+θ}|^n ).

Note that H(U_{η∗,b}|X) − H(U_{η∗,b}|[U_{η∗,b}]_θ, X) = H([U_{η∗,b}]_θ|X) and

|J| = ∏_{(p,s)∈S(G)} p^{k s w_{p,s}}.

Therefore,

P(α(x) = 0) ≤ ∑_{θ≠000} exp_2 { −n ( H([U_{η∗,b}]_θ|X) − log |H_{η∗} : H_{η∗+θ}| + (k/n) ∑_{(p,s)∈S(G)} θ_{p,s} w_{p,s} log p − O(ε) ) }.
In order for the probability of error to go to zero as n increases, we require the exponents of all the terms to be negative; or equivalently,

R = max_{θ≠000} (1/ω_θ) I([U_{η∗,b}]_θ; X)

is achievable. Using an extension of the random coding argument to general α_{η,b}, similar to that considered in channel coding, we get the achievability part of the theorem.
VIII. CONCLUSION
We derived the achievable set of rates using Abelian group codes for arbitrary discrete memoryless channels. In the case of linear codes, it simplifies to the symmetric capacity of the channel, i.e., the Shannon capacity with the additional constraint that the channel input distribution is uniform. For the case where the underlying group is not a field, we observe that several subgroups of the group appear in the achievable rate, and this causes the rate to be smaller than the symmetric capacity of the channel in general.
We derived a similar result for the source coding problem, i.e., the achievable rate-distortion function using Abelian group codes for arbitrary discrete memoryless sources. When the underlying group is a field, these group codes are linear codes, and this function is equivalent to the symmetric rate-distortion function, i.e., the Shannon rate-distortion function with the additional constraint that the reconstruction random variable is uniformly distributed. We showed that when the underlying group is not a field, due to the algebraic structure of the code, certain subgroups of the group appear in the rate-distortion function and cause a larger rate for a given distortion level.
IX. APPENDIX
A. Proof of Lemma IV.2
We first prove that for a homomorphism φ, g_{(q,s,l)→(p,r,m)} satisfies the above conditions. First, assume p ≠ q. Note that the only nonzero component of I_{J:q,s,l} takes values from Z_{q^s} and therefore

q^s I_{J:q,s,l} = ∑^{(J)}_{i=1,··· ,q^s} I_{J:q,s,l} = 0.

Since φ is a homomorphism, we have φ(q^s I_{J:q,s,l}) = 0. On the other hand,

φ(q^s I_{J:q,s,l}) = φ( ∑^{(J)}_{i=1,··· ,q^s} I_{J:q,s,l} )
  = ∑^{(G)}_{i=1,··· ,q^s} φ(I_{J:q,s,l})
  = ⊕_{(p,r,m)∈G(G)} [ ∑^{(G)}_{i=1,··· ,q^s} φ(I_{J:q,s,l}) ]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} ∑^{(Z_{p^r})}_{i=1,··· ,q^s} [φ(I_{J:q,s,l})]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} q^s [φ(I_{J:q,s,l})]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} q^s g_{(q,s,l)→(p,r,m)}.

Therefore, we have q^s g_{(q,s,l)→(p,r,m)} = 0 (mod p^r), or equivalently q^s g_{(q,s,l)→(p,r,m)} = C p^r for some integer C. Since p ≠ q, this implies p^r | g_{(q,s,l)→(p,r,m)}, and since g_{(q,s,l)→(p,r,m)} takes values from Z_{p^r}, we have g_{(q,s,l)→(p,r,m)} = 0.
Next, assume p = q and r ≥ s. As above, we have φ(q^s I_{J:q,s,l}) = 0 and

φ(q^s I_{J:q,s,l}) = ⊕_{(p,r,m)∈G(G)} q^s g_{(q,s,l)→(p,r,m)},

and therefore q^s g_{(q,s,l)→(p,r,m)} = 0 (mod p^r). Since g_{(q,s,l)→(p,r,m)} takes values from Z_{p^r} and p = q, this implies p^{r−s} | g_{(q,s,l)→(p,r,m)}, or equivalently g_{(q,s,l)→(p,r,m)} ∈ p^{r−s}Z_{p^r}.
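Both divisibility conclusions can be checked exhaustively for small parameters; the sketch below verifies that q^s g ≡ 0 (mod p^r) forces g = 0 when p ≠ q, and forces g ∈ p^{r−s} Z_{p^r} when p = q and r ≥ s.

```python
# Exhaustive check of the two divisibility arguments above.
# Case p != q: take p = 2, r = 3 (Z_8) and q = 3, s = 2.
p, r, q, s = 2, 3, 3, 2
sols = [g for g in range(p ** r) if (q ** s * g) % p ** r == 0]
assert sols == [0]  # only g = 0 survives when gcd(q^s, p^r) = 1

# Case p = q, r >= s: take p = q = 2, r = 3, s = 1.
p, r, s = 2, 3, 1
sols = [g for g in range(p ** r) if (p ** s * g) % p ** r == 0]
assert sols == [g for g in range(p ** r) if g % p ** (r - s) == 0]
```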
Next, we show that any mapping described by (7) satisfying the conditions of the lemma is a homomorphism. For two elements a, b ∈ J and for (p, r, m) ∈ G(G), we have

[φ(a + b)]_{p,r,m} = [ φ( ⊕_{(q,s,l)∈G(J)} (a_{q,s,l} +_{q^s} b_{q,s,l}) ) ]_{p,r,m}
  = [ φ( ∑^{(J)}_{(q,s,l)∈G(J)} (a_{q,s,l} +_{q^s} b_{q,s,l}) I_{J:q,s,l} ) ]_{p,r,m}
  = [ φ( ∑^{(J)}_{(q,s,l)∈G(J)} ∑^{(J)}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} I_{J:q,s,l} ) ]_{p,r,m}
  = [ ∑^{(G)}_{(q,s,l)∈G(J)} ∑^{(G)}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} φ(I_{J:q,s,l}) ]_{p,r,m}
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} [φ(I_{J:q,s,l})]_{p,r,m}
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)}.   (35)

On the other hand, we have

[φ(a) + φ(b)]_{p,r,m} = [φ(a)]_{p,r,m} +_{p^r} [φ(b)]_{p,r,m}
  = [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} a_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} b_{q,s,l} g_{(q,s,l)→(p,r,m)} ]
  = [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}} g_{(q,s,l)→(p,r,m)} ] +_{p^r} [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,b_{q,s,l}} g_{(q,s,l)→(p,r,m)} ]
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)},   (36)

where the addition in a_{q,s,l} + b_{q,s,l} is integer addition.
In order to show that φ is a homomorphism, it suffices to show that, under the conditions of the lemma, Equations (35) and (36) are equivalent. We show that for a fixed (q, s, l) ∈ G(J), if the conditions of the lemma are satisfied, then

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)}.   (37)

Note that if p ≠ q, then both summations are zero. Note also that we have

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+b_{q,s,l}) (mod p^r)} g_{(q,s,l)→(p,r,m)}

and

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+_{q^s}b_{q,s,l}) (mod p^r)} g_{(q,s,l)→(p,r,m)}.

If p = q and r ≤ s, then we have (a_{q,s,l} +_{q^s} b_{q,s,l}) (mod p^r) = (a_{q,s,l} + b_{q,s,l}) (mod p^r), and hence it follows that Equation (37) is satisfied. If p = q and r ≥ s, since g_{(q,s,l)→(p,r,m)} ∈ p^{r−s}Z_{p^r}, we have

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+b_{q,s,l}) (mod p^s)} g_{(q,s,l)→(p,r,m)},

and hence it follows that Equation (37) is satisfied.
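The construction of Lemma IV.2 can be tested exhaustively for a small case. The sketch below builds every map φ : Z_2 ⊕ Z_4 → Z_4 of the form (7) satisfying the stated conditions (g drawn from 2Z_4 for the s = 1 component, unrestricted for the s = 2 component) and confirms that each is a homomorphism.

```python
from itertools import product

# J = Z_2 ⊕ Z_4, G = Z_4 (p = q = 2, r = 2; components s = 1 and s = 2).
# Condition of the lemma: g for the Z_2 component must lie in
# 2^{r-s} Z_4 = 2Z_4 = {0, 2}; g for the Z_4 component is unrestricted.
J = [(a1, a2) for a1 in range(2) for a2 in range(4)]

def add_J(a, b):
    """Componentwise addition in J = Z_2 ⊕ Z_4."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 4)

for g1, g2 in product([0, 2], range(4)):
    phi = lambda a: (a[0] * g1 + a[1] * g2) % 4
    for a in J:
        for b in J:
            assert phi(add_J(a, b)) == (phi(a) + phi(b)) % 4
```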
B. Proof of Lemma VII.1
Note that since the g_{(q,s,l)→(p,r,m)}'s and B are uniformly distributed, in order to find the desired joint probability, we need to count the number of choices for the g_{(q,s,l)→(p,r,m)}'s and B such that, for (p, r, m) ∈ G(G^n),

[ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} a_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} B_{p,r,m} = u_{p,r,m}

[ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ã_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} B_{p,r,m} = ũ_{p,r,m},

and divide this number by the total number of choices, which is equal to

|G|^n · ∏_{(p,r,m)∈G(G^n)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} = |G|^n · [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} ]^n,

where the term p^{min(r,s)} appears since the number of choices for g_{(q,s,l)→(p,r,m)} is p^r if p = q, r ≤ s, and is equal to p^s if p = q, r ≥ s. Since B can take values arbitrarily from G^n, the number of choices for the above set of conditions is equal to the number of choices for the g_{(q,s,l)→(p,r,m)}'s such that

∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)} = u_{p,r,m} − ũ_{p,r,m}.

Let θ_{p,s,l} ∈ {0, 1, . . . , s} be such that a_{p,s,l} − ã_{p,s,l} ∈ p^{θ_{p,s,l}} Z_{p^s} \ p^{θ_{p,s,l}+1} Z_{p^s}. Note that

θ_{p,s} = min_{1≤l≤k w_{p,s}} θ_{p,s,l}.

Note that for all (q, s, l) ∈ G(J), (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)} ∈ p^{θθθ_{p,r,m}} Z_{p^r}, where θθθ = θθθ(η∗ + θ). Therefore, we require u_{p,r,m} − ũ_{p,r,m} ∈ p^{θθθ_{p,r,m}} Z_{p^r}, and hence u − ũ ∈ H^n_{η∗+θ}; otherwise the probability is zero.
For fixed p ∈ P(G) and r ∈ R_p(G), let (q∗, s∗, l∗) ∈ G(J) be such that q∗ = p and

θθθ(η∗ + θ)_{p,r,m} = |r − s∗|^+ + θ_{q∗,s∗,l∗}.

For fixed (p, r, m) ∈ G(G^n) and for (q, s, l) ≠ (q∗, s∗, l∗), choose g_{(q,s,l)→(p,r,m)} arbitrarily from its domain. The number of choices for this is equal to

[ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n.

For each (p, r, m) ∈ G(G^n), we need to have

(a_{q∗,s∗,l∗} − ã_{q∗,s∗,l∗}) g_{(q∗,s∗,l∗)→(p,r,m)} = u_{p,r,m} − ũ_{p,r,m} − ∑^{(Z_{p^r})}_{(q,s,l)∈G(J), (q,s,l)≠(q∗,s∗,l∗)} (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)}.

Note that the right-hand side is included in p^{θθθ_{p,r,m}} Z_{p^r} and (a_{q∗,s∗,l∗} − ã_{q∗,s∗,l∗}) is included in p^{θ_{q∗,s∗,l∗}} Z_{(q∗)^{s∗}}. We need to count the number of solutions for g_{(q∗,s∗,l∗)→(p,r,m)} in p^{|r−s∗|^+} Z_{p^r}. Using Lemma IX.1, we can show that the number of solutions is equal to p^{θ_{q∗,s∗,l∗}}. The total number of solutions for φ is equal to

[ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} · ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n.
Hence, we have

P(φ(a) + B = u, φ(ã) + B = ũ) = [ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} · ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n / ( |G|^n [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} ]^n )
  = (1/|G|^n) [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p, (q,s,l)=(q∗,s∗,l∗)} p^{θ_{q∗,s∗,l∗}} / p^{min(r,s)} ]^n,

where the factor 1/|G|^n accounts for the uniform dither B. Note that for (q, s, l) = (q∗, s∗, l∗), we have

min(r, s) = min(r, s∗) = r − |r − s∗|^+ = r − (θθθ_{p,r,m} − θ_{q∗,s∗,l∗}).

Therefore, the above probability is equal to

(1/|G|^n) [ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} / p^{r−(θθθ_{p,r,m}−θ_{q∗,s∗,l∗})} ]^n = (1/|G|^n) [ ∏_{(p,r,m)∈G(G)} p^{θθθ_{p,r,m}} / p^r ]^n = (1/|G|^n) · (1/|H_{η∗+θ}|^n).

We conclude that

P(φ(a) + B = u, φ(ã) + B = ũ) = (1/|G|^n)(1/|H_{η∗+θ}|^n).
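For a minimal instance (n = 1, J = G = Z_4, so φ(a) = a·g with g uniform over Z_4 and a uniform dither B), the pairwise probability of Lemma VII.1 can be verified by enumerating all 16 pairs (g, B):

```python
from fractions import Fraction

# n = 1, G = J = Z_4; phi(a) = a*g mod 4, code shifted by dither B.
G = range(4)
a, a_t = 1, 3          # a - a~ = 2 in 2Z_4 \ 4Z_4, so theta = 1
H = {0, 2}             # H_{eta*+theta} = 2Z_4

for u in G:
    for u_t in G:
        hits = sum(1 for g in G for B in G
                   if (a * g + B) % 4 == u and (a_t * g + B) % 4 == u_t)
        prob = Fraction(hits, 16)
        if (u - u_t) % 4 in H:
            assert prob == Fraction(1, 4) * Fraction(1, 2)  # 1/(|G|·|H|)
        else:
            assert prob == 0
```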
C. Proof of Lemma VII.2
First, we show that (x + H^n) ∩ A^n_ε(X|y) is contained in A^n_ε(X|z, y). Since z is a function of x, we have (x, z, y) ∈ A^n_ε(X, [X], Y). For x′ ∈ (x + H^n) ∩ A^n_ε(X|y), we have [x′] = x′ + H^n = x + H^n = z and (x′, z, y) = (x′, [x′], y) ∈ A^n_ε(X, [X], Y). Therefore, x′ ∈ A^n_ε(X|z, y) and hence

(x + H^n) ∩ A^n_ε(X|y) ⊆ A^n_ε(X|z, y).

Conversely, for x′ ∈ A^n_ε(X|z, y), since (x′, z) ∈ A^n_ε(X, [X]) where [X] is a function of X, we have [x′] = z. This implies x′ ∈ z + H^n = x + H^n. Clearly, we also have x′ ∈ A^n_ε(X|y). The claim on the size of the set follows since (z, y) ∈ A^n_ε([X], Y).
D. Useful Lemma
Lemma IX.1. Let p be a prime and s, r positive integers such that s ≤ r. For a ∈ Z_{p^s} and b ∈ Z_{p^r}, let 0 ≤ θ ≤ s and θ ≤ θ̂ ≤ r be such that

a ∈ p^θ Z_{p^s} \ p^{θ+1} Z_{p^s},    b ∈ p^{θ̂} Z_{p^r}.

Write a = p^θ α for some invertible element α ∈ Z_{p^r} and b = p^{θ̂} β for some β ∈ {0, 1, · · · , p^{r−θ̂} − 1}. Then, the set of solutions to the equation ax (mod p^r) = b is

{ p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }.

Proof: Note that the representation of b as b = p^{θ̂} β is not unique: for any β̄ of the form β̄ = β + i p^{r−θ̂} with i = 0, 1, · · · , p^{θ̂} − 1, b can be written as p^{θ̂} β̄. Similarly, the representation of a as a = p^θ α is not unique: for any ᾱ = α + i p^{r−θ} with i = 0, 1, · · · , p^θ − 1, we have a = p^θ ᾱ. The set of solutions to ax = b is identical to the set of solutions to p^θ x = p^{θ̂} α^{−1} β. The set of solutions to the latter is

{ p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }.

It remains to show that this set of solutions is independent of the choices of α and β. First, we show that the set of solutions is independent of the choice of β. For β̄ = β + j p^{r−θ̂} for some j ∈ {0, 1, · · · , p^{θ̂} − 1}, we have

{ p^{θ̂−θ} α^{−1} β̄ + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} (β + j p^{r−θ̂}) + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} β + (i + j) α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  (a)= { p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 },

where (a) follows since the set p^{r−θ}{0, 1, · · · , p^θ − 1} is a subgroup of Z_{p^r} and j p^{r−θ} lies in this set. Next, we show that the set of solutions is independent of the choice of α. For ᾱ = α + j p^{r−θ} for some j ∈ {0, 1, · · · , p^θ − 1}, we have

ᾱ ( α^{−1} − ᾱ^{−1} j p^{r−θ} α^{−1} ) = ᾱ α^{−1} − j p^{r−θ} α^{−1} = 1.
Therefore, it follows that the unique inverse of ᾱ satisfies ᾱ^{−1} − α^{−1} ∈ α^{−1} p^{r−θ} Z_{p^r}. Write ᾱ^{−1} = α^{−1} + k α^{−1} p^{r−θ}. We have

{ p^{θ̂−θ} ᾱ^{−1} β + i ᾱ^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} (α^{−1} + k α^{−1} p^{r−θ}) β + i (α^{−1} + k α^{−1} p^{r−θ}) p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} β + (i + i k p^{r−θ} + k β p^{θ̂−θ}) α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  (a)= { p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 },

where, as above, (a) follows since the set p^{r−θ}{0, 1, · · · , p^θ − 1} is a subgroup of Z_{p^r} and (i k p^{r−θ} + k β p^{θ̂−θ}) p^{r−θ} lies in this set.
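Lemma IX.1 is easy to verify exhaustively for small parameters. The sketch below checks, for p = 2, s = 2, r = 3, that the equation ax ≡ b (mod p^r) has exactly p^θ solutions whenever θ̂ ≥ θ, and none otherwise.

```python
# Exhaustive verification of the solution count in Lemma IX.1
# for p = 2, s = 2, r = 3.
p, s, r = 2, 2, 3

def valuation(x, cap):
    """Largest t <= cap with p^t dividing x (t = cap for x = 0)."""
    t = 0
    while t < cap and x % (p ** (t + 1)) == 0:
        t += 1
    return t

for a in range(1, p ** s):
    theta = valuation(a, s)
    for b in range(p ** r):
        theta_hat = valuation(b, r)
        sols = {x for x in range(p ** r) if (a * x) % p ** r == b}
        if theta_hat >= theta:
            assert len(sols) == p ** theta   # exactly p^theta solutions
        else:
            assert sols == set()             # no solution when theta_hat < theta
```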
REFERENCES
[1] T. J. Goblick, Jr., “Coding for a discrete information source with a distortion measure”, Ph.D. dissertation, Dept. Electr.
Eng., MIT , Cambridge, MA, 1962.
[2] H. A. Loeliger and T. Mittelholzer, “Convolutional codes over groups”, IEEE Trans. Inform. Theory,vol. 42, no. 6, pp. 1660–
1686, November 1996.
[3] H. A. Loeliger, “Signal sets matched to groups”, IEEE Trans. Inform. Theory, vol. 37, no. 6, pp. 1675–1682, November
1991.
[4] I. Csiszar, “Linear codes for sources and source networks: Error exponents, universal coding,” IEEE Trans. Inform. Theory,
vol. IT- 28, pp. 585–592, July 1982.
[5] S. Shridharan, A. Jafarian, S. Vishwanath, and S. A. Jafar, “Lattice Coding for K User Gaussian Interference Channels”,
IEEE Transactions on Information Theory, April 2010. .
[6] R. Ahlswede. Group codes do not achieve Shannons’s channel capacity for general discrete channels. The annals of
Mathematical Statistics, 42(1):224–240, Feb. 1971.
[7] R. Ahlswede and J. Gemma. Bounds on algebraic code capacities for noisy channels I. Information and Control, 19(2):124–
145, 1971.
[8] R. Ahlswede and J. Gemma. Bounds on algebraic code capacities for noisy channels II. Information and Control,
19(2):146–158, 1971.
[9] N. J. Bloch. Abstract Algebra With Applications. Prentice-Hall, Inc, Englewood Cliffs, New Jersey, 1987.
[10] G. Como. Group codes outperform binary coset codes on non-binary symmetric memoryless channels. IEEE
Trans. Information Theory, 56(9):4321–4334, Sept. 2010.
[11] G. Como and F. Fagnani. The capacity of finite abelian group codes over symmetric memoryless channels. IEEE
Transactions on Information Theory, 55(5):2037–2054, 2009.
[12] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, New York, 1988.
[13] I. Csiszar and J. Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, 1981.
[14] R. L. Dobrushin. Asymptotic optimality of group and systematic codes for some channels. Theor. Probab. Appl., 8:47–59,
1963.
[15] R. L. Dobrushin. Asymptotic bounds of the probability of error for the transmission of messages over a discrete memoryless
channel with a symmetric transition probability matrix. Teor. Veroyatnost. i Primenen, pages 283–311, 1962.
[16] P. Elias. Coding for noisy channels. IRE Conv. Record, part. 4:37–46, 1955.
[17] U. Erez and S. ten Brink. A close-to-capacity dirty paper coding scheme. IEEE Trans. Inform. Theory, 51:3417–3432,
October 2005.
[18] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein. Covering Codes. Elsevier-North-Holland, Amsterdam, 1997.
[19] R. Garello and S. Benedetto. Multilevel construction of block and trellis group codes. IEEE Trans. Inform. Theory,
41:1257–1264, Sep. 1995.
[20] G. D. Forney Jr and M. Trott. The dynamics of group codes: State spaces, trellis diagrams, and canonical encoders. IEEE
Transactions on Information Theory, 39(9):1491–1513, 1993.
[21] M. Hall Jr. The Theory of Groups. The Macmillan Company, New York, 1959.
[22] J. Korner and K. Marton. How to encode the modulo-two sum of binary sources. IEEE Transactions on Information
Theory, IT-25:219–221, Mar. 1979.
[23] D. Krithivasan and S. S. Pradhan. Distributed source coding using abelian group codes. IEEE Transactions on Information
Theory, 57:1495–1519, 2011.
[24] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. Elsevier-North-Holland, 1977.
[25] B. Nazer and M. Gastpar. Computation over multiple-access channels. IEEE Transactions on Information Theory,
53(10):3498–3516, Oct. 2007.
[26] A. Padakandla and S.S. Pradhan. A new coding theorem for three user discrete memoryless broadcast channel. 2012.
Online: http://128.84.158.119/abs/1207.3146v2.
[27] A. Padakandla, A.G. Sahebi, and S.S. Pradhan. A new achievable rate region for the 3-user discrete memoryless interference
channel. In Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on, pages 2256–2260, 2012.
[28] A. Padakandla, A.G. Sahebi, and S.S. Pradhan. An achievable rate region for the 3-user interference channel based on
coset codes. 2014. Online: http://arxiv.org/abs/1403.4583.
[29] W. Park and A. Barg. Polar codes for q-ary channels, q = 2^r. 2012. Online: http://arxiv.org/abs/1107.4965.
[30] T. Philosof and R. Zamir. On the loss of single-letter characterization: The dirty multiple access channel. IEEE Transactions
on Information Theory, 55:2442–2454, June 2009.
[31] S. S. Pradhan and K. Ramchandran. Distributed source coding using syndromes (DISCUS): Design and construction. IEEE
Transactions on Information Theory, 49(3):626–643, 2003.
[32] A. G. Sahebi and S. S. Pradhan. Multilevel Polarization of Polar Codes Over Arbitrary Discrete Memoryless Channels.
Proc. 49th Allerton Conference on Communication, Control and Computing, Sept. 2011.
[33] E. Sasoglu, E. Telatar, and E. Arikan. Polarization for arbitrary discrete memoryless channels. IEEE Information Theory
Workshop, Dec. 2009, Lausanne, Switzerland.
[34] D. Slepian. Group codes for the Gaussian channel. Bell Syst. Tech. Journal, 1968.
[35] D. Slepian and J. K. Wolf. A coding theorem for multiple access channels with correlated sources. Bell Syst. Tech. J.,
52:1037–1076, 1973.
[36] A. D. Wyner. Recent results in the Shannon theory. IEEE Trans. Inform. Theory, IT-20:2–10, January 1974.
[37] R. Zamir, S. Shamai, and U. Erez. Nested linear/lattice codes for structured multiterminal binning. IEEE Trans. Inform.
Theory, 48:1250–1276, June 2002.