Abelian Group Codes for Channel Coding and
Source Coding
Aria G. Sahebi and S. Sandeep Pradhan
Department of Electrical Engineering and Computer Science,
University of Michigan, Ann Arbor, MI 48109, USA. Email: [email protected], [email protected]
Abstract
In this paper, we study the asymptotic performance of Abelian group codes for the channel coding
problem for arbitrary discrete (finite alphabet) memoryless channels as well as the lossy source coding
problem for arbitrary discrete (finite alphabet) memoryless sources. For the channel coding problem,
we find the capacity characterized in a single-letter information-theoretic form. This simplifies to the
symmetric capacity of the channel when the underlying group is a field. For the source coding problem, we
derive the achievable rate-distortion function that is characterized in a single-letter information-theoretic
form. When the underlying group is a field, it simplifies to the symmetric rate-distortion function. We give
several illustrative examples. Due to the non-symmetric nature of the sources and channels considered,
our analysis uses a synergy of information-theoretic and group-theoretic tools.
I. INTRODUCTION
Approaching the information-theoretic performance limits of communication using structured codes
has been of great interest for the last several decades [1], [7], [14], [15]. The earlier attempts to design
computationally efficient encoding and decoding algorithms for point-to-point communication (both
channel coding and source coding) resulted in injection of finite field structures to the coding schemes
[12]. In the channel coding problem [24], the channel input alphabets are matched to algebraic structure
and encoders are represented by matrices. Similarly in source coding problem [18], the reconstruction
alphabets are matched to algebraic structure and decoders are represented by matrices. Later these coding
This work was supported by NSF grant CCF-1116021. This work was presented in part at the IEEE International Symposium on
Information Theory (ISIT), July 2011, and the Allerton Conference on Communication, Control and Computing, October 2012.
approaches were extended to weaker algebraic structures such as rings and groups [2], [3], [11], [19],
[20].¹ The motivation for this is twofold: a) finite fields exist only for alphabets with size equal to a
prime power, and b) for communication under certain constraints, codes with weaker algebraic structures
have better properties. For example, when communicating over an additive white Gaussian noise channel
with 8-PSK constellation, codes over Z8, the cyclic group of size 8, are preferable to binary linear
codes because the structure of the code is matched to the structure of the signal set [10] (also see [3]),
and hence the former have superior error correcting properties. As another example, construction of polar
codes over alphabets of size pr, for r > 1 and p prime, is simpler with a module structure rather than
a vector space structure [29], [32], [33]. Subsequently, as interest in network information theory grew,
these codes were used to approach the information-theoretic performance limits of certain special cases
of multi-terminal communication problems [4], [17], [31], [36], [37]. These limits were obtained earlier
using the random coding ensembles in the information theory literature.
In 1979, Korner and Marton, in a significant departure from tradition, showed that for a binary
distributed source coding problem, the asymptotic average performance of binary linear code ensembles
can be superior to that of the standard random coding ensembles. Although structured codes were
being used in communication mainly for computational complexity reasons, the duo showed that, in
contrast, even when computational complexity is not an issue, the use of structured codes leads to superior
asymptotic performance limits in multi-terminal communication problems. In the recent past, such gains
were shown for a wide class of problems [5], [23], [25], [30]. In our prior work, we developed an
inner bound to the optimal rate-distortion region for the distributed source coding problem [23] in which
cyclic group codes were used as building blocks in the coding schemes. Similar coding approaches were
applied for the interference channel and the broadcast channel in [26], [27]. The motivation for studying
Abelian group codes beyond the non-existence of finite fields over arbitrary alphabets is the following. The
algebraic structure of the code imposes certain restrictions on the performance. For certain communication
problems, linear codes were shown to be not optimal [23], and Abelian group codes exhibit a superior
performance. For example, consider a distributed source coding problem with two statistically correlated
but individually uniform quaternary sources X and Y that are related via the relation Y = −X + Z,
where + denotes addition modulo-4 and Z is a hidden quaternary random variable that has a non-uniform distribution and is independent of X. The joint decoder wishes to reconstruct Z losslessly. In
this problem, random codes over Z4 perform better than random linear codes over the Galois field of size
¹Note that this is an incomplete list. There is a vast body of work on group codes. See [12] for a more complete bibliography.
4. In summary, the main reason for using algebraic structured codes in this context is performance rather
than complexity of encoding and decoding. Hence information-theoretic characterizations of asymptotic
performance of Abelian group code ensembles for various communication problems and under various
decoding constraints became important.
Such performance limits have been characterized in certain special cases. It is well-known that binary
linear codes achieve the capacity of binary symmetric channels [16]. More generally, it has also been
shown that q-ary linear codes can achieve the capacity of symmetric channels [14] and linear codes can
be used to compress a source losslessly down to its entropy [22]. Goblick [1] showed that binary linear
codes achieve the rate-distortion function of binary uniform sources with Hamming distortion criterion.
Group codes were first studied by Slepian [34] for the Gaussian channel. In [6], the capacity of group
codes for certain classes of channels has been computed. Further results on the capacity of group codes
were established in [7], [8]. The capacity of group codes over a class of channels exhibiting symmetries
with respect to the action of a finite Abelian group has been investigated in [11].
In this work, we focus on two problems. In the first part, we consider the channel coding problem
for arbitrary discrete memoryless channels. We assume that the channel input alphabet is equipped with
the structure of a finite Abelian group G. We provide an information-theoretic characterization of the
capacity of such channels achievable using group codes which are cosets of subgroups of Gn, where n
denotes the block length of encoding which is arbitrarily large. This performance limit is equal to the
symmetric capacity of the channel when the underlying group is a field; i.e., it is equal to the Shannon
mutual information between the channel input and the channel output when the channel input is uniformly
distributed. In the general case, additional constraints corresponding to subgroups of the underlying group
appear in the characterization, and the achievable rate can be smaller than the symmetric capacity of the
channel.
In the second part, we consider the lossy source coding problem for arbitrary discrete memoryless sources
with single-letter distortion measures and the reconstruction alphabet being equipped with the structure
of a finite Abelian group G. We provide an information-theoretic characterization of the rate-distortion
function achievable using group codes which are cosets of subgroups of Gn. The performance limit
is equal to the symmetric rate-distortion function of the source when the underlying group is a field;
i.e., the Shannon rate-distortion function with the additional constraint that the reconstruction variable is
uniformly distributed. For the general case, as in channel coding, additional constraints corresponding
to subgroups of the underlying group appear in the characterization, and this can result in a larger rate
compared to the symmetric rate for a given distortion level.
We use joint typicality encoding and decoding [13] for both problems at hand, which makes the
analysis more tractable. In this approach we use a synergy of information-theoretic and group-theoretic
tools. The traditional approaches have looked at encoding and decoding of structured codes based on
either minimum-distance or maximum-likelihood decoding. However, unlike the traditional approaches,
the approach based on joint typicality does not provide insight into error exponents.
The paper is organized as follows: In Section II, some definitions and basic facts are stated which are
used in the paper. In Section III, we give two examples of multi-terminal communication problems where
the performance of group code ensembles is strictly superior to that of unstructured code ensembles. In
Section IV, we introduce the ensemble of Abelian group codes used in the paper. In Section V, we
state the main results of the paper for both the source coding problem as well as the channel coding
problem. In Section VI, we prove the converse results for both problems and, in Section VII, we prove
the achievability results. We conclude in Section VIII.
II. PRELIMINARIES
1) Channel Model: We consider discrete memoryless channels used without feedback. We associate
two finite sets X and Y with the channel as the channel input and output alphabets. The input-output
relation of the channel is characterized by a conditional probability law WY |X(y|x) for x ∈ X and y ∈ Y .
The channel is specified by (X ,Y,WY |X).
2) Source Model: The source is modeled as a discrete-time memoryless random process X with
each sample taking values from a finite set X , called the alphabet, according to the distribution PX. The
reconstruction alphabet is denoted by a finite set U and the quality of reconstruction is measured by a
bounded single-letter distortion function d : X × U → R+. We denote this source by (X ,U , PX , d).
3) Groups: All groups referred to in this paper are finite Abelian groups. Given a group (G,+), a
subgroup H of G is denoted by H ≤ G. A coset C of a subgroup H is a shift of H by an arbitrary
element a ∈ G (i.e., C = a + H for some a ∈ G). For a subgroup H of G, the number of cosets of
H in G is called the index of H in G and is denoted by |G : H|. The index of H in G is equal to
|G|/|H| where |G| and |H| are the cardinality or size of G and H respectively. For a prime p dividing
the cardinality of G, the Sylow-p subgroup of G is the largest subgroup of G whose cardinality is a
power of p. Group isomorphism is denoted by ∼=.
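As a concrete illustration of these definitions, the following sketch (our own, not part of the paper) computes cosets, the index, and Sylow subgroups inside the cyclic group Z_12:

```python
# Illustrative sketch (not from the paper): cosets, index |G : H|, and
# Sylow-p subgroups inside the cyclic group Z_12, using plain Python sets.

n = 12
G = set(range(n))

def subgroup_generated(g, n):
    """Cyclic subgroup of Z_n generated by g."""
    H, x = set(), 0
    while True:
        H.add(x)
        x = (x + g) % n
        if x == 0:
            return H

H = subgroup_generated(3, n)                    # H = {0, 3, 6, 9}
cosets = {frozenset((a + h) % n for h in H) for a in G}
index = len(cosets)                             # |G : H| = |G| / |H|
assert index == len(G) // len(H) == 3

def sylow(p, n):
    """Sylow-p subgroup of Z_n: the subgroup whose order is the largest power of p dividing n."""
    pk = 1
    while n % (pk * p) == 0:
        pk *= p
    return subgroup_generated(n // pk, n)       # generated by n / p^k

assert sylow(2, 12) == {0, 3, 6, 9}             # order 4, isomorphic to Z_4
assert sylow(3, 12) == {0, 4, 8}                # order 3, isomorphic to Z_3
```

Since 12 = 4 · 3, the two Sylow subgroups recover the decomposition Z_12 ≅ Z_4 ⊕ Z_3 of the kind used throughout Section IV.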
4) Group Codes: Given a group G, a group code C over G with block length n is any subgroup of
Gn. A shifted group code over G, C+B is a translation of a group code C by a fixed vector B ∈ Gn.
Group codes generalize the notion of linear codes over fields to sources with reconstruction alphabets
(and channels with input alphabets) having composite sizes.
5) Achievability for Channel Coding: For a group G, a group transmission system with parameters
(n,Θ, τ) for reliable communication over a given channel (X = G,Y,WY |X) consists of a codebook,
an encoding mapping and a decoding mapping. The codebook C is a shifted subgroup of Gn whose size
is equal to Θ and the mappings are defined as
$$\mathrm{Enc}: \{1, 2, \cdots, \Theta\} \to \mathcal{C}, \qquad \mathrm{Dec}: \mathcal{Y}^n \to \{1, 2, \cdots, \Theta\}$$
such that
$$\max_{1 \le m \le \Theta} \sum_{x \in \mathcal{X}^n} 1_{\{x = \mathrm{Enc}(m)\}} \sum_{y \in \mathcal{Y}^n} 1_{\{m \ne \mathrm{Dec}(y)\}} W^n(y|x) \le \tau$$
Given a channel (X = G,Y,WY |X), a rate R is said to be achievable using group codes if for all ε > 0
and for all sufficiently large n, there exists a group transmission system for reliable communication with
parameters (n,Θ,τ) such that
$$\frac{1}{n}\log\Theta \ge R - \epsilon, \qquad \tau \le \epsilon$$
The group capacity C of the channel is defined as the supremum of the set of all achievable rates using
group codes.
6) Achievability for Source Coding and the Rate-Distortion Function: For a group G, a group transmission system with parameters (n,Θ,∆) for compressing a given source (X ,U = G,PX , d) consists
of a codebook, an encoding mapping and a decoding mapping. The codebook C is a shifted subgroup
of G^n whose size is equal to Θ and the mappings are defined as
$$\mathrm{Enc}: \mathcal{X}^n \to \{1, 2, \ldots, \Theta\}, \qquad \mathrm{Dec}: \{1, 2, \ldots, \Theta\} \to \mathcal{C}$$
such that E[d(X^n, U^n)] ≤ ∆, where X^n is the random vector of length n generated by the source,
and U^n is the reconstruction vector given by U^n = Dec(Enc(X^n)). In this transmission system,
n denotes the block length, log Θ denotes the number of bits needed to index the codebook, ∆ denotes
the distortion level, and the distortion d(x, u) between two vectors x and u is assumed to be the average
of the single-letter distortions between the components x_i and u_i. Let Θ_i denote the size of the range of the ith component of
the shifted group code. Given a source (X ,U = G,PX , d), a pair of non-negative real numbers (R,D)
is said to be achievable using group codes if for every ε > 0 and for all sufficiently large n,
there exists a group transmission system with parameters (n,Θ,∆) for compressing the source such that
$$\frac{1}{n}\log\Theta \le R + \epsilon, \qquad \Delta \le D + \epsilon, \qquad \frac{1}{n}\sum_{i=1}^{n}\left[\log\Theta_i - H(U_i)\right] \le \epsilon.$$
The optimal group rate-distortion function R∗(D) of the source is given by the infimum of the rates R
such that (R,D) is achievable using group codes.
The rationale for imposing the third constraint regarding the average entropy of single-letter recon-
struction samples is the following. Recall that in channel coding all codewords are used by definition.
However, in source coding we may have a situation where only a small fraction of the codewords in the
group code is used, in which case one may have to use some form of entropy coding to remove the
redundancy. This leads to a different class of codes, called nested group codes, which we do not study
in this paper. To prevent this situation, we impose the third constraint. This also puts the source
coding problem on a footing similar to that of the channel coding problem.
7) Typicality: We follow the notion of typicality as found in [13]. Consider two random variables
X and Y with joint probability mass function PX,Y (x, y) over X × Y . Let n be an integer and ε be a
positive real number. The sequence pair (xn, yn) belonging to X n × Yn is said to be jointly ε-typical
with respect to PX,Y(x, y) if
$$\forall a \in \mathcal{X},\ \forall b \in \mathcal{Y}: \quad \left|\frac{1}{n}N(a,b|x^n,y^n) - P_{X,Y}(a,b)\right| \le \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|}$$
and none of the pairs (a, b) with PX,Y(a, b) = 0 occurs in (x^n, y^n). Here, N(a, b|x^n, y^n) counts the
number of occurrences of the pair (a, b) in the sequence pair (xn, yn). We denote the set of all jointly
ε-typical sequence pairs in X n × Yn by Anε (X,Y ).
Given a sequence x^n ∈ A^n_ε(X), the set of conditionally ε-typical sequences A^n_ε(Y|x^n) is defined as
$$A^n_\epsilon(Y|x^n) = \{y^n \in \mathcal{Y}^n \mid (x^n, y^n) \in A^n_\epsilon(X,Y)\}$$
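The typicality test above can be sketched directly in code. The following is a minimal illustration (our own, with a made-up joint distribution), implementing the strong-typicality condition verbatim:

```python
# A minimal sketch (our own, not the paper's) of the strong-typicality test:
# (x^n, y^n) is jointly eps-typical if every empirical pair frequency is within
# eps/(|X||Y|) of P_{XY}(a,b), and no zero-probability pair occurs.
from collections import Counter

def jointly_typical(xn, yn, P, X, Y, eps):
    n = len(xn)
    counts = Counter(zip(xn, yn))
    for a in X:
        for b in Y:
            N = counts.get((a, b), 0)
            if P[(a, b)] == 0 and N > 0:      # forbidden pair occurred
                return False
            if abs(N / n - P[(a, b)]) > eps / (len(X) * len(Y)):
                return False
    return True

# Hypothetical doubly symmetric binary source: P(x,y) = 0.4 on the diagonal, 0.1 off.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
xn = [0] * 5 + [1] * 5
yn = [0] * 4 + [1] + [0] + [1] * 4   # empirical frequencies: 0.4, 0.1, 0.1, 0.4
assert jointly_typical(xn, yn, P, [0, 1], [0, 1], 0.5)
```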
8) Notation: In our notation, O(ε) is any function of ε such that lim_{ε↓0} O(ε) = 0, P is the set of
all primes, Z+ is the set of positive integers and R+ is the set of non-negative reals. Let |x|+ denote
max{x, 0}. Since we deal with summations over several groups in this paper, when not clear from the
context, we indicate the underlying group in each summation; e.g., summation over the group G is denoted
by $\sum^{(G)}$. Direct sum of groups is denoted by ⊕ and direct product of sets is denoted by ⊗.
III. MOTIVATING EXAMPLES
In this section, we present examples of the use of group codes for two multi-terminal communication
problems, namely the distributed source coding problem and the interference channel coding problem.
In these problems, one can use group code ensembles to improve upon the asymptotic performance of
the standard random coding ensembles used in information theory.
A. Example in Distributed Source Coding
Consider a two-user distributed source coding problem in which the two sources X and Y take values
from Z4 and a centralized decoder is interested in decoding the sum of the two sources losslessly. The
two sources need to be compressed distributively. Furthermore, assume that X is uniformly distributed
over Z4 and Y = −X + Z, where Z is independent of X and is distributed over Z4 such that
PZ(0) = 1 − τ and PZ(1) = PZ(2) = PZ(3) = τ/3 for some τ ∈ (0, 1). Let R1 and R2 be the rates of
the two encoders. Using Slepian-Wolf coding [35], which uses random unstructured codes, one can show
that the following sum rate is achievable:
$$R_U = R_1 + R_2 = H(X,Y) = H(X, -X+Z) = H(X) + H(Z) = 2 + h(\tau) + \tau\log 3,$$
where h(·) denotes the binary entropy function. We present a coding scheme based on group codes that is
optimal in the sum rate and strictly improves upon the rate achievable using random unstructured codes.
Consider a fictitious discrete memoryless channel with input U and output V related via V = U +Z,
where Z is independent of U . Let C be a good group channel code for the channel PV |U . Using the
channel coding result in this paper (see Section VII), the rate of C can be arbitrarily close to
$$r = \min\big(I(U;V),\ 2I(U;V\,|\,[U])\big) = \min\big(2 - H(U|V),\ 2 - 2H(U\,|\,[U],V)\big)$$
where for g ∈ Z4, [g] = g + {0, 2}. Note that H(U|V) = H(Z) and
$$H(U\,|\,[U],V) \overset{(a)}{=} H(V - Z\,|\,[Z],V) \overset{(b)}{=} H(Z\,|\,[Z])$$
where (a) follows since there is a one-to-one correspondence between ([V − Z], V) and ([Z], V), and (b)
follows since V and Z are independent. Therefore, we have
$$r = \min\big(2 - H(Z),\ 2 - 2H(Z\,|\,[Z])\big) \overset{(a)}{=} 2 - H(Z) = 2 - h(\tau) - \tau\log 3,$$
where in (a) we have used the relation $h(\tau) + \tau\log_2 3 \le 2h(2\tau/3)$ to show that $I(U;V) \le 2I(U;V\,|\,[U])$.
The encoding scheme for the distributed source coding problem is as follows: given a pair of source
sequences x^n and y^n, the X- and Y-encoders send x^n + C and y^n + C, respectively, to the decoder. In other
words, each encoder sends the index of the coset of C that contains its source word. Because of the
closure of C with respect to group addition and the commutativity of the group, the decoder can compute
z^n + C, from which it can recover z^n with high probability using the property of the code.
Therefore, using group codes, the following sum rate is achievable:
$$R_G = R_1 + R_2 = 2(2 - r) = 2\max\big(H(Z),\ 2H(Z\,|\,[Z])\big) = 2H(Z) = 2h(\tau) + 2\tau\log 3 < R_U.$$
It can be shown that this rate is not achievable using random linear codes over the Galois field of size 4.
The optimality follows from standard information-theoretic arguments. The algebraic property of the
code, namely closure with respect to addition modulo-4, is exploited in reducing the sum rate from H(X,Y) to
2H(Z).
Consider another example with PZ(0) = 0.4014, PZ(1) = 0.2035, PZ(2) = 0.3356 and PZ(3) = 0.0595.
Here H(Z) = 1.7669 and 2H(Z|[Z]) = 1.8712. The achievable sum rate for distributed compression
using random unstructured codes or random linear codes is RU = 3.7669, and that using random group
codes is RG = 3.7424. We do not claim optimality for this example. Although random group codes over
Z4 are inferior to random unstructured codes in terms of point-to-point compression of the source Z,
they outperform the latter in the problem of distributed compression of X and Y.
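The numbers in the second example can be checked mechanically. The following sketch (our own) recomputes H(Z) and 2H(Z|[Z]) for the stated distribution and compares the two sum rates:

```python
# Numerical check (our own sketch) of the second example: H(Z) and 2H(Z|[Z])
# for P_Z = (0.4014, 0.2035, 0.3356, 0.0595) on Z_4, where [z] = z + {0, 2}.
from math import log2

pZ = [0.4014, 0.2035, 0.3356, 0.0595]

def H(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * log2(x) for x in p if x > 0)

HZ = H(pZ)

# [Z] partitions Z_4 into the cosets {0, 2} and {1, 3} of the subgroup {0, 2}.
p_even = pZ[0] + pZ[2]
p_odd = pZ[1] + pZ[3]
H_cond = (p_even * H([pZ[0] / p_even, pZ[2] / p_even])
          + p_odd * H([pZ[1] / p_odd, pZ[3] / p_odd]))

RU = 2 + HZ                     # unstructured / linear-code sum rate H(X,Y)
RG = 2 * max(HZ, 2 * H_cond)    # group-code sum rate
assert abs(HZ - 1.7669) < 1e-3 and abs(2 * H_cond - 1.8712) < 1e-3
assert RG < RU                  # 3.7424 < 3.7669, as stated in the text
```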
B. Example for the Interference Channel
Consider the problem of communication over the following interference channel between three pairs
of encoders and decoders. The 3-user interference channel [28, Example 7] has three inputs X1, X2 and
X3 which take values from Z4 and three outputs, which are given by Y1 = X1 + X2 + X3 + N1,
Y2 = X2 + N2 and Y3 = X3 + N3, where the additions are mod-4 operations and N1, N2 and N3 are
independent random variables distributed according to PNi(0) = 1 − δi, PNi(1) = PNi(2) = PNi(3) = δi/3
for i = 1, 2, 3. For simplicity, we assume δ2 = δ3 = δ. The input X1 is subject to the cost function w(0) = 0,
w(1) = w(2) = w(3) = 1, but X2 and X3 are not costed. Let τ be the cost constraint on X1. Let us
assume for simplicity that δ1, δ < 1/4 and τ < 3/4. Observe that the channel inputs of the second and the third
transmitters interfere with that of the first. There is no interference for the second and third receivers.
Consider the following optimal coding scheme based on group codes. Let C be a good group channel
code for the channel with input X2 and output Y2. The same code is employed for communication
between X3 and Y3. Using the arguments used in distributed source coding, it follows that the following
rates are achievable for transmitters 2 and 3:
$$R_2 = R_3 = 2 - h(\delta) - \delta\log 3.$$
Note that X2 + X3 interferes with X1 at Y1, and the decoder wishes to decode the interference first. Using
the closure of C with respect to group addition, note that the rate of X_2^n + X_3^n is 2 − h(δ) − δ log 3.
The interference can be decoded reliably if the effective noise given by X1 + N1 satisfies P(X1 + N1 ≠ 0) ≤ P(N2 ≠ 0) = P(N3 ≠ 0). This condition implies that δ1 + τ − 4δ1τ/3 ≤ δ. After decoding the
interference, the effective channel becomes (Y1 − X2 − X3) = X1 + N1, and the first transmitter can
communicate at the following rate, given by the capacity of this effective channel,
$$R_1 = C^* \triangleq \sup_{P_{X_1}:\ E[w(X_1)] \le \tau} I(X_1; Y_1 \mid X_2 + X_3).$$
In summary, the rate triple (C∗, 2−h(δ)− δ log(3), 2−h(δ)− δ log(3)) is achievable using group codes.
It can be shown [28, Lemma 8] that if
$$C^* + 2(2 - h(\delta) - \delta\log 3) > 2 - h(\delta_1) - \delta_1\log 3,$$
then the above triple of rates cannot be achieved using either random unstructured codes or random linear
codes over the Galois field of size 4. The following choice of δ1, τ and δ satisfies all the conditions:
δ1 = τ = 3/4 − √30/8, δ = 1/8. Note that group codes outperform random codes in this case because there is
a match between the structure of the group code and the structure of the channel.
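The stated choice of parameters can be verified numerically; the following sketch (our own) checks the decoding condition δ1 + τ − 4δ1τ/3 ≤ δ together with the simplifying assumptions:

```python
# Sanity check (our own sketch) that delta_1 = tau = 3/4 - sqrt(30)/8 and
# delta = 1/8 satisfy the interference-decoding condition
#   delta_1 + tau - 4*delta_1*tau/3 <= delta
# together with delta_1, delta < 1/4 and tau < 3/4.
from math import sqrt, isclose

d1 = tau = 3 / 4 - sqrt(30) / 8
d = 1 / 8

assert d1 < 1 / 4 and d < 1 / 4 and tau < 3 / 4
lhs = d1 + tau - 4 * d1 * tau / 3
assert isclose(lhs, d, rel_tol=1e-9)   # this choice meets the bound with equality
```

In fact, with x = 3/4 − √30/8, an exact computation gives 2x − (4/3)x² = 1/8, so the bound holds with equality.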
IV. ABELIAN GROUP CODE ENSEMBLE
In this section, we use a standard characterization of Abelian groups and introduce the ensemble of
Abelian group codes used in the paper.
A. Abelian Groups
For an Abelian group G, let P(G) denote the set of all distinct primes which divide |G| and for a
prime p ∈ P(G) let Sp(G) be the corresponding Sylow subgroup of G. It is known [21, Theorem 3.3.1]
that any Abelian group G can be decomposed as a direct sum of its Sylow subgroups in the following
manner:
$$G = \bigoplus_{p \in \mathcal{P}(G)} S_p(G) \qquad (1)$$
Furthermore, each Sylow subgroup Sp(G) can be decomposed into $\mathbb{Z}_{p^r}$ groups as follows:
$$S_p(G) \cong \bigoplus_{r \in \mathcal{R}_p(G)} \mathbb{Z}_{p^r}^{M_{p,r}} \qquad (2)$$
where R_p(G) ⊆ Z+ and for r ∈ R_p(G), M_{p,r} is a positive integer. Note that $\mathbb{Z}_{p^r}^{M_{p,r}}$ is defined as the
direct sum of the ring $\mathbb{Z}_{p^r}$ with itself $M_{p,r}$ times. Combining Equations (1) and (2), we can represent
any Abelian group as follows:
$$G \cong \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p(G)} \mathbb{Z}_{p^r}^{M_{p,r}} = \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p(G)} \bigoplus_{m=1}^{M_{p,r}} \mathbb{Z}_{p^r}^{(m)} \qquad (3)$$
where $\mathbb{Z}_{p^r}^{(m)}$ is called the mth $\mathbb{Z}_{p^r}$ ring of G or the (p, r, m)th ring of G. Equivalently, this can be written
as follows:
$$G \cong \bigoplus_{(p,r) \in \mathcal{Q}(G)} \mathbb{Z}_{p^r}^{M_{p,r}} = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \mathbb{Z}_{p^r}^{(m)}$$
where Q(G) ⊆ P × Z+ is defined as
$$\mathcal{Q}(G) = \{(p, r) \mid p \in \mathcal{P}(G),\ r \in \mathcal{R}_p(G)\}, \qquad (4)$$
and G(G) ⊆ P × Z+ × Z+ is defined as
$$\mathcal{G}(G) = \{(p, r, m) \in \mathbb{P} \times \mathbb{Z}^+ \times \mathbb{Z}^+ \mid (p, r) \in \mathcal{Q}(G),\ m \in \{1, 2, \cdots, M_{p,r}\}\}$$
This means that any element a of the Abelian group can be regarded as a vector whose components
are indexed by (p, r, m) ∈ G(G) and whose (p, r, m)th component $a_{p,r,m}$ takes values from the ring $\mathbb{Z}_{p^r}$,
or as a vector whose components are indexed by (p, r) ∈ Q(G) and whose (p, r)th component $a_{p,r}$ takes
values from the ring $\mathbb{Z}_{p^r}^{M_{p,r}}$.
With a slight abuse of notation, we represent an element a of G as
$$a = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m}$$
Furthermore, for two elements a, b ∈ G, we have
$$a + b = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m} +_{p^r} b_{p,r,m}$$
where + denotes the group operation and $+_{p^r}$ denotes addition mod-$p^r$. More generally, let a, b, · · · , z
be any number of elements of G. Then we have
$$a + b + \cdots + z = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \left(a_{p,r,m} +_{p^r} b_{p,r,m} +_{p^r} \cdots +_{p^r} z_{p,r,m}\right) \qquad (5)$$
This can equivalently be written as
$$[a + b + \cdots + z]_{p,r,m} = a_{p,r,m} +_{p^r} b_{p,r,m} +_{p^r} \cdots +_{p^r} z_{p,r,m}$$
where $[\cdot]_{p,r,m}$ denotes the (p, r, m)th component of its argument.
Let $I_{G:p,r,m} \in G$ be a generator for the group which is isomorphic to the (p, r, m)th ring of G. Then
we have
$$a = \sum^{(G)}_{(p,r,m) \in \mathcal{G}(G)} a_{p,r,m} I_{G:p,r,m} \qquad (6)$$
where the summations are done with respect to the group operation and the multiplication $a_{p,r,m} I_{G:p,r,m}$
is by definition the summation (with respect to the group operation) of $I_{G:p,r,m}$ with itself $a_{p,r,m}$ times.
In other words, $a_{p,r,m} I_{G:p,r,m}$ is the shorthand notation for
$$a_{p,r,m} I_{G:p,r,m} = \sum^{(G)}_{i \in \{1, \cdots, a_{p,r,m}\}} I_{G:p,r,m}$$
where the summation is the group operation.
Example: Let $G = \mathbb{Z}_4 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_9^2$. Then we have P(G) = {2, 3}, $S_2(G) = \mathbb{Z}_4$ and $S_3(G) = \mathbb{Z}_3 \oplus \mathbb{Z}_9^2$,
R2(G) = {2}, R3(G) = {1, 2}, M2,2 = 1, M3,1 = 1, M3,2 = 2 and
$$\mathcal{G}(G) = \{(2, 2, 1), (3, 1, 1), (3, 2, 1), (3, 2, 2)\}$$
Each element a of G can be represented by a quadruple $(a_{2,2,1}, a_{3,1,1}, a_{3,2,1}, a_{3,2,2})$ where $a_{2,2,1} \in \mathbb{Z}_4$,
$a_{3,1,1} \in \mathbb{Z}_3$ and $a_{3,2,1}, a_{3,2,2} \in \mathbb{Z}_9$. Finally, we have $I_{G:2,2,1} = (1, 0, 0, 0)$, $I_{G:3,1,1} = (0, 1, 0, 0)$, $I_{G:3,2,1} = (0, 0, 1, 0)$, $I_{G:3,2,2} = (0, 0, 0, 1)$ so that Equation (6) holds.
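The example can be made concrete in code. The sketch below (our own notation) represents elements of G = Z4 ⊕ Z3 ⊕ Z9 ⊕ Z9 as quadruples, implements componentwise addition, and verifies Equation (6) for one element:

```python
# Sketch (our notation) of the running example G = Z_4 + Z_3 + Z_9 + Z_9:
# elements are quadruples indexed by G(G) = {(2,2,1), (3,1,1), (3,2,1), (3,2,2)},
# the group operation is componentwise addition mod p^r, and the four standard
# generators reproduce Equation (6).
MODULI = (4, 3, 9, 9)   # p^r for each (p, r, m) in G(G)

def add(a, b):
    """Group operation: componentwise addition mod p^r."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def scale(c, a):
    """c*a = a added to itself c times; componentwise, c mod p^r suffices."""
    return tuple((c * x) % m for x, m in zip(a, MODULI))

GENS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

a = (3, 2, 7, 5)
# Equation (6): a = sum over (p,r,m) of a_{p,r,m} * I_{G:p,r,m}
recon = (0, 0, 0, 0)
for coeff, gen in zip(a, GENS):
    recon = add(recon, scale(coeff, gen))
assert recon == a
assert add((3, 2, 8, 5), (2, 2, 3, 6)) == (1, 1, 2, 2)
```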
In the following section, we introduce the ensemble of Abelian group codes which we use in the paper.
B. The Image Ensemble
Recall that for a positive integer n, an Abelian group code of length n over the group G is a coset of
a subgroup of Gn. Our ensemble of codes consists of all Abelian group codes over G; i.e., we consider
all subgroups of Gn. We use the following fact to characterize all subgroups of Gn:
Lemma IV.1. For an Abelian group G, let φ : J → G be a homomorphism from some Abelian group
J to G. Then φ(J) ≤ G; i.e., the image of the homomorphism is a subgroup of G. Moreover, for any
subgroup H of G there exists a corresponding Abelian group J and a homomorphism φ : J → G such
that H = φ(J).
Proof: The first part of the lemma is proved in [9, Theorem 12-1]. For the second part, let J be
isomorphic to H and let φ be the identity mapping (more rigorously, let φ be the isomorphism between
J and H).
In order to use the above lemma to construct the ensemble of subgroups of Gn, we need to identify all
groups J from which there exist non-trivial homomorphisms to Gn. Then the above lemma implies that
for each such J and for each homomorphism φ : J → Gn, the image of the homomorphism is a group
code over G of length n and for each group code C ≤ Gn, there exists a group J and a homomorphism
such that C is the image of the homomorphism. This ensemble corresponds to the ensemble of linear
codes characterized by their generator matrix when the underlying group is a field of prime size. Note
that as in the case of standard ensembles of linear codes, the correspondence between this ensemble and
the set of Abelian group codes over G of length n may not be one-to-one.
Let G and J be two Abelian groups with decompositions
$$G = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \mathbb{Z}_{p^r}^{(m)}, \qquad J = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \mathbb{Z}_{q^s}^{(l)}$$
and let φ be a homomorphism from J to G. For (q, s, l) ∈ G(J) and (p, r, m) ∈ G(G), let
$$g_{(q,s,l) \to (p,r,m)} = [\phi(I_{J:q,s,l})]_{p,r,m}$$
where $I_{J:q,s,l} \in J$ is the standard generator for the (q, s, l)th ring of J and $[\phi(I_{J:q,s,l})]_{p,r,m}$ is the
(p, r, m)th component of $\phi(I_{J:q,s,l}) \in G$. For $a = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l} \in J$, let b = φ(a) and write
$b = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} b_{p,r,m}$. Note that as in Equation (6), we can write:
$$a = \sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l} I_{J:q,s,l} = \sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(J)}_{i \in \{1, \cdots, a_{q,s,l}\}} I_{J:q,s,l}$$
where the summations are the group summations. We have
$$\begin{aligned}
b_{p,r,m} &= [\phi(a)]_{p,r,m} = \left[\phi\left(\sum^{(J)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(J)}_{i \in \{1, \cdots, a_{q,s,l}\}} I_{J:q,s,l}\right)\right]_{p,r,m} \\
&\overset{(a)}{=} \left[\sum^{(G)}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(G)}_{i \in \{1, \cdots, a_{q,s,l}\}} \phi(I_{J:q,s,l})\right]_{p,r,m} \\
&\overset{(b)}{=} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} \sum^{(\mathbb{Z}_{p^r})}_{i \in \{1, \cdots, a_{q,s,l}\}} [\phi(I_{J:q,s,l})]_{p,r,m} \\
&\overset{(c)}{=} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, [\phi(I_{J:q,s,l})]_{p,r,m} = \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)}
\end{aligned}$$
Note that (a) follows since φ is a homomorphism; (b) follows from Equation (5); and (c) follows by
using $a_{q,s,l}[\phi(I_{J:q,s,l})]_{p,r,m}$ as the shorthand notation for the summation of $[\phi(I_{J:q,s,l})]_{p,r,m}$ with itself
$a_{q,s,l}$ times.
Note that $g_{(q,s,l)\to(p,r,m)}$ represents the effect of the (q, s, l)th component of a on the (p, r, m)th
component of b dictated by the homomorphism. This means that the homomorphism φ can be represented
by
$$\phi(a) = \bigoplus_{(p,r,m) \in \mathcal{G}(G)} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)} \qquad (7)$$
where $a_{q,s,l}\, g_{(q,s,l)\to(p,r,m)}$ is the shorthand notation for the mod-$p^r$ addition of $g_{(q,s,l)\to(p,r,m)}$ to itself
$a_{q,s,l}$ times. We have the following lemma on $g_{(q,s,l)\to(p,r,m)}$:
Lemma IV.2. For a homomorphism described by (7), we have
$$g_{(q,s,l)\to(p,r,m)} = 0 \quad \text{if } p \ne q, \qquad g_{(q,s,l)\to(p,r,m)} \in p^{r-s}\mathbb{Z}_{p^r} \quad \text{if } p = q,\ r \ge s.$$
Moreover, any mapping described by (7) and satisfying these conditions is a homomorphism.
Proof: The proof is provided in Appendix IX-A.
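The constraint in Lemma IV.2 can be illustrated for a single coordinate. The sketch below (our own, for the hypothetical choice p = 2, s = 1, r = 3) enumerates which multipliers g yield homomorphisms from Z_{p^s} to Z_{p^r}:

```python
# Sketch (our own, for the hypothetical choice p = 2, s = 1, r = 3) of the
# single-coordinate constraint in Lemma IV.2: the map a -> a*g mod p^r from
# Z_{p^s} to Z_{p^r} (r >= s) is a homomorphism iff g is a multiple of p^(r-s).

p, s, r = 2, 1, 3                  # J-component Z_2, G-component Z_8

def is_hom(g):
    """Check phi(a + b) = phi(a) + phi(b) for phi(a) = a*g mod p^r, a in Z_{p^s}."""
    ps, pr = p**s, p**r
    return all(((a + b) % ps) * g % pr == (a * g + b * g) % pr
               for a in range(ps) for b in range(ps))

valid = {g for g in range(p**r) if is_hom(g)}
assert valid == {g for g in range(p**r) if g % p**(r - s) == 0}   # p^(r-s) * Z_{p^r}
assert valid == {0, 4}             # i.e., the subgroup 4*Z_8 of Z_8
```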
This lemma implies that in order to construct a subgroup of G, we only need to consider homomorphisms
from an Abelian group J to G such that P(J) ⊆ P(G), since if q ∉ P(G) for some (q, s, l) ∈ G(J), then
φ(a) would not depend on $a_{q,s,l}$. For p ∈ P(G), define
$$r_p = \max \mathcal{R}_p(G) \qquad (8)$$
We show that we can restrict ourselves to J's such that for all (q, s, l) ∈ G(J), $s \le r_q$. Let (p, r, m) ∈
G(G) be such that p = q. Since $g_{(q,s,l)\to(p,r,m)} \in \mathbb{Z}_{p^r}$ and $r \le r_q$, we have
$$\left(a_{q,s,l}\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r = \left((a_{q,s,l} \bmod p^r)\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r = \left((a_{q,s,l} \bmod p^{r_q})\, g_{(q,s,l)\to(p,r,m)}\right) \bmod p^r$$
This implies that for all a ∈ J and all (q, s, l) ∈ G(J), in the expression for the (p, r, m)th component of
φ(a) with p = q, $a_{q,s,l}$ appears only as $a_{q,s,l} \bmod q^{r_q}$. Therefore, it suffices for $a_{q,s,l}$ to take values from
$\mathbb{Z}_{q^{r_q}}$, and this happens if $s \le r_q$.
To construct Abelian group codes of length n over G, we consider the group $G^n$. We have
$$G^n \cong \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p} \mathbb{Z}_{p^r}^{nM_{p,r}} = \bigoplus_{p \in \mathcal{P}(G)} \bigoplus_{r \in \mathcal{R}_p} \bigoplus_{m=1}^{nM_{p,r}} \mathbb{Z}_{p^r}^{(m)} = \bigoplus_{(p,r,m) \in \mathcal{G}(G^n)} \mathbb{Z}_{p^r}^{(m)} \qquad (9)$$
Define J as
$$J = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \mathbb{Z}_{q^s}^{k_{q,s}} = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \bigoplus_{l=1}^{k_{q,s}} \mathbb{Z}_{q^s}^{(l)} = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \mathbb{Z}_{q^s}^{(l)} \qquad (10)$$
for some positive integers $k_{q,s}$.
Example: Let $G = \mathbb{Z}_8 \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_5$. Then we have
$$J = \mathbb{Z}_2^{k_{2,1}} \oplus \mathbb{Z}_4^{k_{2,2}} \oplus \mathbb{Z}_8^{k_{2,3}} \oplus \mathbb{Z}_3^{k_{3,1}} \oplus \mathbb{Z}_9^{k_{3,2}} \oplus \mathbb{Z}_5^{k_{5,1}}$$
Define
$$k = \sum_{q \in \mathcal{P}(G)} \sum_{s=1}^{r_q} k_{q,s}$$
and $w_{q,s} = k_{q,s}/k$ for q ∈ P(G) and s = 1, · · · , $r_q$, so that we can write
$$J = \bigoplus_{q \in \mathcal{P}(G)} \bigoplus_{s=1}^{r_q} \bigoplus_{l=1}^{k w_{q,s}} \mathbb{Z}_{q^s}^{(l)} \qquad (11)$$
for some constants $w_{q,s}$ adding up to one. Note that
$$\mathcal{G}(J) = \{(q, s, l) : q \in \mathcal{P}(G),\ 1 \le s \le r_q,\ 1 \le l \le k w_{q,s}\}.$$
Define
$$\mathcal{S}(G) = \{(p, s) \mid p \in \mathcal{P}(G),\ 1 \le s \le r_p\}. \qquad (12)$$
Note that $\mathcal{S}(J) = \mathcal{Q}(J) = \mathcal{S}(G)$.
The ensemble of Abelian group encoders consists of all mappings φ : J → $G^n$ of the form
$$\phi(a) = \bigoplus_{(p,r,m) \in \mathcal{G}(G^n)} \sum^{(\mathbb{Z}_{p^r})}_{(q,s,l) \in \mathcal{G}(J)} a_{q,s,l}\, g_{(q,s,l) \to (p,r,m)} \qquad (13)$$
for a ∈ J, where $g_{(q,s,l)\to(p,r,m)} = 0$ if p ≠ q, $g_{(q,s,l)\to(p,r,m)}$ is a uniform random variable over $\mathbb{Z}_{p^r}$
if p = q, r ≤ s, and $g_{(q,s,l)\to(p,r,m)}$ is a uniform random variable over $p^{r-s}\mathbb{Z}_{p^r}$ if p = q, r ≥ s. The
corresponding shifted group code with parameters (n, k, w) is defined by
$$\mathcal{C} = \{\phi(a) + B \mid a \in J\} \qquad (14)$$
where B is a uniform random variable over $G^n$. The rate of this code is given by
$$R = \frac{1}{n} \log|J| = \frac{k}{n} \sum_{q \in \mathcal{P}(G)} \sum_{s=1}^{r_q} s\, w_{q,s} \log q \qquad (15)$$
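As an illustration of the rate formula (15), the following sketch (with hypothetical parameters n, k and weights w, chosen only for this example) evaluates R for G = Z8:

```python
# Sketch (hypothetical parameters, not from the paper) of the rate formula (15):
#   R = (1/n) log|J| = (k/n) * sum_{q,s} s * w_{q,s} * log q
# for G = Z_8, so P(G) = {2} and r_2 = 3, with k = 4 and weights w_{2,s}.
from math import log2, isclose

n, k = 10, 4
w = {(2, 1): 0.25, (2, 2): 0.25, (2, 3): 0.5}   # w_{q,s}, must sum to 1
assert isclose(sum(w.values()), 1.0)

# log|J| for J = Z_2^(k*w_{2,1}) + Z_4^(k*w_{2,2}) + Z_8^(k*w_{2,3}):
# each Z_{q^s} factor contributes s*log2(q) bits.
log_J = sum(k * wqs * s * log2(q) for (q, s), wqs in w.items())
R = log_J / n
assert isclose(R, (k / n) * sum(s * wqs * log2(q) for (q, s), wqs in w.items()))
assert isclose(R, 0.9)   # (4/10) * (0.25*1 + 0.25*2 + 0.5*3) = 0.4 * 2.25
```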
Remark IV.3. An alternate approach to constructing Abelian group codes is to consider kernels of
homomorphisms (the kernel ensemble). To construct the ensemble of Abelian group codes in this manner,
let φ be a homomorphism from $G^n$ into J such that for a ∈ $G^n$,
$$\phi(a) = \bigoplus_{(q,s,l) \in \mathcal{G}(J)} \sum^{(\mathbb{Z}_{q^s})}_{(p,r,m) \in \mathcal{G}(G^n)} a_{p,r,m}\, g_{(p,r,m) \to (q,s,l)}$$
where $g_{(p,r,m)\to(q,s,l)} = 0$ if q ≠ p, $g_{(p,r,m)\to(q,s,l)}$ is a uniform random variable over $\mathbb{Z}_{q^s}$ if q = p, s ≤ r,
and $g_{(p,r,m)\to(q,s,l)}$ is a uniform random variable over $q^{s-r}\mathbb{Z}_{q^s}$ if q = p, s ≥ r. The code is given by
$\mathcal{C} = \{a \in G^n \mid \phi(a) = c\}$, where c is a uniform random variable over J.
In this paper, we use the image ensemble for both the channel and the source coding problems; however,
similar results can be derived using the kernel ensemble as well.
V. MAIN RESULTS
In this section, we provide an information-theoretic characterization of the optimal rate-distortion
function of a given source and the capacity of a given channel using group codes when the underlying
group is an arbitrary finite Abelian group represented by Equation (3). We define two subgroups of G
and then state two theorems using these subgroups, and finally provide an interpretation of the results
and these subgroups with two examples.
A. Definitions
Consider four vectors θ, w, η and b. The components θ_{p,s} and w_{p,s} of θ and w, respectively, are indexed by (p, s) ∈ S(G) and satisfy

0 ≤ θ_{p,s} ≤ s,    ∑_{(p,s)∈S(G)} w_{p,s} = 1,    w_{p,s} ≥ 0.

Let 000 denote the all-zero vector, and sss denote the vector whose components satisfy sss_{p,s} = s for all (p, s) ∈ S(G).
The components η_{p,r,m,s} of η are indexed by (p, r, m, s), for every (p, r, m) ∈ G(G) and every s ∈ {1, . . . , r_p}, and satisfy 0 ≤ η_{p,r,m,s} ≤ r − |r − s|^+. The components b_{p,r,m} of b are indexed by (p, r, m) ∈ G(G) and satisfy b_{p,r,m} ∈ Z_{p^r}. Let α be a probability distribution on the set of all η and b, where α_{η,b} denotes the probability assigned to a particular pair (η, b).
For a vector η (and analogously for η + θ below), define

θθθ(η) = ( min_{1≤s≤r_p, w_{p,s}≠0} { |r − s|^+ + η_{p,r,m,s} } )_{(p,r,m)∈G(G)}.
Define H_η ≤ G and H_{η+θ} ≤ H_η as

H_η = ⊕_{(p,r,m)∈G(G)} p^{θθθ(η)_{p,r,m}} Z_{p^r}^{(m)}   (16)

H_{η+θ} = ⊕_{(p,r,m)∈G(G)} p^{θθθ(η+θ)_{p,r,m}} Z_{p^r}^{(m)},   (17)

where η + θ denotes the vector whose components are given by (η + θ)_{p,r,m,s} = η_{p,r,m,s} + θ_{p,s}.
For a given θ and w, define

ω_θ = ( ∑_{(p,s)∈S(G)} θ_{p,s} w_{p,s} log p ) / ( ∑_{(p,s)∈S(G)} s w_{p,s} log p ).
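The quantity ω_θ can be computed mechanically; the sketch below evaluates it for the Z_8 running example of Section V-C, with a hypothetical weight vector w, and confirms the boundary values ω_{000} = 0 and ω_{sss} = 1.

```python
import math

def omega(theta, w):
    """omega_theta from Section V-A: the weighted fraction of the code
    rate attributable to the subgroup indexed by theta."""
    num = sum(theta[(p, s)] * w[(p, s)] * math.log2(p) for (p, s) in w)
    den = sum(s * w[(p, s)] * math.log2(p) for (p, s) in w)
    return num / den

# G = Z_8, so S(G) = {(2,1), (2,2), (2,3)}; hypothetical weights w.
w = {(2, 1): 0.25, (2, 2): 0.25, (2, 3): 0.5}
zero = {(2, 1): 0, (2, 2): 0, (2, 3): 0}   # theta = 000
sss = {(2, 1): 1, (2, 2): 2, (2, 3): 3}    # theta = sss

assert omega(zero, w) == 0.0
assert omega(sss, w) == 1.0
```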
B. Main Results
The following theorem is the first main result of this paper.
Consider a channel (X = G, Y, W_{Y|X}). For every η and b, let X_{η,b} be a random variable distributed uniformly over H_η + b. Let [X_{η,b}]_θ = X_{η,b} + H_{η+θ}, which takes values from the cosets of H_{η+θ} in H_η + b.
Theorem V.1. For a channel (X = G, Y, W_{Y|X}), the group capacity is given by

C = sup_{α,w} min_{θ≠sss} (1/(1 − ω_θ)) ∑_{η,b} α_{η,b} I(X_{η,b}; Y | [X_{η,b}]_θ).   (18)
Proof: The proof is provided in Section VII-A2.
The following theorem is the second main result of this paper.
Consider a source (X, U = G, P_X, d) and a distortion level D. For every η and b, let U_{η,b} be a reconstruction random variable distributed uniformly over H_η + b and statistically correlated with the source X. Let us denote this collection of random variables by UUU. Let [U_{η,b}]_θ = U_{η,b} + H_{η+θ}.
Theorem V.2. For a source (X, U = G, P_X, d), the group rate-distortion function is given by

R∗(D) = inf_{UUU} inf_{α,w} max_{θ≠000} (1/ω_θ) ∑_{η,b} α_{η,b} I([U_{η,b}]_θ; X),   (19)

where the infimum is over all α, w and UUU such that

D ≥ ∑_{η,b} α_{η,b} E[d(X, U_{η,b})].
Proof: The proof is provided in Section VII-B2.
C. Interpretation of the Result
In this section, we give some intuition about the result and the quantities defined above using several
examples. At a high level, wp,s characterizes the normalized weight given to the Zps component of
the input group J in constructing the homomorphism from J to Gn, and θ indexes subgroups of J . η
characterizes the the collection of input distributions used on the channel. 1(1−ωθ)
I(Xη,b;Y |[Xη,b]θ) in
channel coding and 1ωθI([Uη,b]θ;X) in source coding denote the rate constraints imposed by the subgroup
Hη+θ. Due to the algebraic structure of the code, in the ensemble, two random codewords corresponding
to two distinct indexes are statistically dependent, unless G is a finite field. For the channel coding
problem, when a random codeword corresponding to a given message index is transmitted over the
channel, consider the event that all components of the difference between the codeword transmitted and
a random codeword corresponding to another message index belong to a proper subgroup Hη+θ of G.
18
Then the probability that the latter is decoded instead of the former is higher than the case when no
algebraic structure on the code is enforced. For the source coding problem, when the code is chosen
randomly, consider the event that all components of their difference belong to a proper subgroup Hη+θ
of G. Then if one of them is a poor representation of a given source sequence, so is the other with a
probability that is higher than the case when no algebraic structure on the code is enforced. This means
that the code size has to be larger so that with high probability one can find a good representation of the
source.
Example: We start with the simple example where G = Z_8. In this case, we have P(G) = {2}, r_2 = 3, S(G) = {(2, 1), (2, 2), (2, 3)}, and Q(G) = {(2, 3)}. Let η be the all-zero vector and b = 0. For vectors w, θ and θθθ defined as above, we have w = (w_{2,1}, w_{2,2}, w_{2,3}), θ = (θ_{2,1}, θ_{2,2}, θ_{2,3}) and θθθ = θθθ_{2,3,1}. Recall that the ensemble of Abelian group codes used in the random coding argument consists of the set of all homomorphisms from some J = Z_2^{k w_{2,1}} ⊕ Z_4^{k w_{2,2}} ⊕ Z_8^{k w_{2,3}} to G^n, and hence the vector of weights w determines the input group of the homomorphism. Any vector θ = (θ_{2,1}, θ_{2,2}, θ_{2,3}) with 0 ≤ θ_{2,1} ≤ 1, 0 ≤ θ_{2,2} ≤ 2 and 0 ≤ θ_{2,3} ≤ 3 corresponds to a subgroup K_θ of the input group J given by

K_θ = 2^{θ_{2,1}} Z_2^{k w_{2,1}} ⊕ 2^{θ_{2,2}} Z_4^{k w_{2,2}} ⊕ 2^{θ_{2,3}} Z_8^{k w_{2,3}}.

Similarly, any θθθ(η + θ) = θθθ_{2,3,1} corresponds to a subgroup H_{η+θ} = 2^{θθθ_{2,3,1}} Z_8 of the group G.
Example: Next, we consider the case where G = Z_4 ⊕ Z_3. In this case, we have P(G) = {2, 3}, r_2 = 2, r_3 = 1, S(G) = {(2, 1), (2, 2), (3, 1)}, and Q(G) = {(2, 2), (3, 1)}. For vectors w, θ and θθθ defined as before, we have w = (w_{2,1}, w_{2,2}, w_{3,1}), θ = (θ_{2,1}, θ_{2,2}, θ_{3,1}) and θθθ = (θθθ_{2,2,1}, θθθ_{3,1,1}). Let η be the all-zero vector and b = 0. The ensemble of Abelian group codes consists of the set of all homomorphisms from some J = Z_2^{k w_{2,1}} ⊕ Z_4^{k w_{2,2}} ⊕ Z_3^{k w_{3,1}} to G^n. Any vector θ = (θ_{2,1}, θ_{2,2}, θ_{3,1}) with 0 ≤ θ_{2,1} ≤ 1, 0 ≤ θ_{2,2} ≤ 2 and 0 ≤ θ_{3,1} ≤ 1 corresponds to a subgroup K_θ of the input group J given by

K_θ = 2^{θ_{2,1}} Z_2^{k w_{2,1}} ⊕ 2^{θ_{2,2}} Z_4^{k w_{2,2}} ⊕ 3^{θ_{3,1}} Z_3^{k w_{3,1}}.

Similarly, any θθθ(η + θ) = (θθθ_{2,2,1}, θθθ_{3,1,1}) corresponds to a subgroup H_{η+θ} = 2^{θθθ_{2,2,1}} Z_4 ⊕ 3^{θθθ_{3,1,1}} Z_3 of the group G.
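The subgroup structure in the second example can be checked mechanically. The following sketch enumerates the subgroups 2^{t2} Z_4 ⊕ 3^{t3} Z_3 of G = Z_4 ⊕ Z_3 and verifies their orders and closure under the group operation.

```python
# For G = Z_4 ⊕ Z_3, every pair (t2, t3) with 0 <= t2 <= 2 and
# 0 <= t3 <= 1 gives a subgroup H = 2^{t2} Z_4 ⊕ 3^{t3} Z_3
# of order (4 / 2^{t2}) * (3 / 3^{t3}).
def subgroup(t2, t3):
    part2 = {(2 ** t2 * i) % 4 for i in range(4)}  # 2^{t2} Z_4
    part3 = {(3 ** t3 * i) % 3 for i in range(3)}  # 3^{t3} Z_3
    return {(x, y) for x in part2 for y in part3}

for t2 in range(3):
    for t3 in range(2):
        H = subgroup(t2, t3)
        assert len(H) == (4 // 2 ** t2) * (3 // 3 ** t3)
        # closure under the group operation (componentwise mod 4, mod 3)
        for (a, b) in H:
            for (c, d) in H:
                assert ((a + c) % 4, (b + d) % 3) in H
```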
VI. PROOF OF CONVERSE
Consider an arbitrary shifted group code C with parameters (n, k, w). We assume that the associated homomorphism is a one-to-one mapping. We can express the code compactly as follows:

C = { ⊕_{i=1}^{n} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + B^{(i)} ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.
For every pair of vectors (η, b) as defined in Section V-A, define

Γ_{η,b} = { i ∈ [1, n] : g^{(i)}_{(p,s)→(r,m)} ∈ p^{η_{p,r,m,s}+|r−s|^+} Z_{p^r}^{k w_{p,s}} \ p^{η_{p,r,m,s}+1+|r−s|^+} Z_{p^r}^{k w_{p,s}}, B^{(i)} = b, ∀(p, r, m, s) }.

Let θ be an arbitrary vector whose components θ_{p,s} are indexed by (p, s) ∈ S(G) and satisfy 0 ≤ θ_{p,s} ≤ s. Construct a one-to-one correspondence a_{p,s} ↔ (ā_{p,s}, â_{p,s}), where ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} and â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}.
A. Channel Coding
For an arbitrary â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}, consider the following subcode of C:

C_1(θ, â) = { ⊕_{i=1}^{n} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} (â_{p,s} + ā_{p,s}) g^{(i)}_{(p,s)→(r,m)} + B^{(i)} ] : ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.

The rate of the code C_1(θ, â) is given by (1 − ω_θ) (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p.
For a given channel (X = G, Y, W_{Y|X}), suppose that rate R is achievable using group codes. Consider an arbitrary ε > 0. This implies that there exists a shifted group code C with parameters (n, k, w) that yields a maximal error probability τ such that τ ≤ ε and (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p ≥ R − ε. Since the maximal error probability of the code is τ, the average error probability is no greater than τ. Using a uniform distribution on a, we let X_i denote the random channel input at the ith channel use induced by this code.
For the subcode C_1(θ, â), the average error probability is no greater than τ. Using the fact that â is uniformly distributed over its range, for i ∈ Γ_{η,b}, the channel input X_i(θ, â) at the ith channel use in the code C_1(θ, â) has the following distribution:

P(X_i(θ, â) = β) = ∏_{(p,r,m)∈G(G)} p^{−|r−θθθ(η+θ)_{p,r,m}|^+} = 1/|H_{η+θ}|,

if β_{(p,r,m)} ∈ ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b_{p,r,m} + p^{θθθ(η+θ)_{p,r,m}} Z_{p^r} for all (p, r, m) ∈ G(G), and P(X_i(θ, â) = β) = 0 otherwise. Using Fano's inequality and standard information-theoretic arguments, we have for every θ with 0 ≤ θ_{p,s} ≤ s and every â with â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}},

(1 − ω_θ)(R − ε)(1 − τ) − 1/n ≤ (1/n) ∑_{i=1}^{n} I(X_i(θ, â); Y_i).
Hence, averaging with a uniform distribution on â, we get for all θ ≠ sss,

(1 − ω_θ)(R − ε)(1 − τ) − 1/n ≤ (1/n) ∑_{i=1}^{n} ∑_{â} P(â) I(X_i(θ, â); Y_i)   (20)
  (a)= ∑_{η,b} ∑_{i∈Γ_{η,b}} (1/n) ∑_{â} P(â) I(X_i; Y_i | X_i ∈ â ggg^{(i)} + b + H_{η+θ})   (21)
  (b)= ∑_{η,b} ∑_{i∈Γ_{η,b}} (1/n) I(X_i; Y_i | X_i ∈ Â ggg^{(i)} + b + H_{η+θ})   (22)
  (c)= ∑_{η,b} (|Γ_{η,b}|/n) I(X_{η,b}; Y | [X_{η,b}]_θ)   (23)

where in (a) we have expressed ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b_{p,r,m} as â ggg^{(i)} + b, in (b) Â denotes the random variable corresponding to â, and in (c) we have used the fact that Â ggg^{(i)} + b + H_{η+θ} is uniform over the set of cosets of H_{η+θ} + b in H_η + b. Hence the converse follows.
B. Source Coding
Let us express C as

C = { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + b ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }.
Consider the following code:

C_2(θ) = { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} a_{p,s} g^{(i)}_{(p,s)→(r,m)} + b + H_{η+θ} ] : a_{p,s} ∈ Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }
  (a)= { ⊕_{η,b} ⊕_{i∈Γ_{η,b}} [ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} â_{p,s} g^{(i)}_{(p,s)→(r,m)} + b + H_{η+θ} ] : â_{p,s} ∈ Z_{p^{θ_{p,s}}}^{k w_{p,s}}, ∀(p, s) ∈ S(G) }

where (a) follows from the fact that for i ∈ Γ_{η,b}, we have

{ ⊕_{(p,r,m)∈G(G)} ∑_{s=1}^{r_p} ā_{p,s} g^{(i)}_{(p,s)→(r,m)} : ā_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G) } = H_{η+θ}.

The rate of the code C_2(θ) is given by ω_θ (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p.
For a given source (X, U = G, P_X, d), suppose the pair of rate and distortion (R, D) is achievable using group codes. Consider an arbitrary ε > 0. This implies that there exists a shifted group code C with parameters (n, k, w) that yields a distortion ∆ such that ∆ ≤ D + ε, (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p ≤ R + ε, and (1/n) ∑_{i=1}^{n} [log Θ_i − H(U_i)] ≤ ε, where Θ_i denotes the size of the range of U_i. Let U^n = Dec(Enc(X^n)) be induced by the code C. Note that for i ∈ Γ_{η,b}, the ith sample U_i takes values in H_η + b. Let [U_i]_θ = U_i + H_{η+θ} denote the unique coset of H_{η+θ} in H_η + b that contains U_i. Observe that [U_i]_θ is a function of U_i for i ∈ [1, n]. Let [U^n]_θ = ⊕_{i∈[1,n]} [U_i]_θ. Note that [U^n]_θ ∈ C_2(θ). Hence we get the following conditions on the rate of the code for all θ ≠ 000:
ω_θ(R + ε) ≥ (1/n) H([U^n]_θ) = (1/n) I([U^n]_θ; X^n) ≥ (1/n) ∑_{i=1}^{n} I([U_i]_θ; X_i)   (24)
  = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) I([U_i]_θ; X_i)   (25)
  (a)= ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) I(P_X, P_{θ,i})   (26)
  (b)≥ ∑_{η,b} (|Γ_{η,b}|/n) I( P_X, ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) P_{θ,i} )   (27)
  (c)= ∑_{η,b} (|Γ_{η,b}|/n) I([U_{η,b}]_θ; X).   (28)
Here, (a) follows by denoting the conditional probability P([U_i]_θ = u | X_i = x) by P_{θ,i}(u|x) and writing the mutual information explicitly as a function of the source distribution and the conditional distribution of the corresponding function of the reconstruction given the source. (b) follows from the convexity of mutual information in the conditional distribution. In (c), [U_{η,b}]_θ denotes the random variable which is related to X through the conditional probability distribution (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} P_{θ,i}.
Note that for θ = sss we have [U_{η,b}]_sss = U_{η,b}. Hence we have the following conditions regarding the distortion of the code:

D + ε ≥ ∆ ≥ (1/n) ∑_{i=1}^{n} E[d(X_i, U_i)] = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{i∈Γ_{η,b}} (1/|Γ_{η,b}|) ∑_{x∈X} P_X(x) ∑_{u∈H_η+b} P_{sss,i}(u|x) d(x, u)
  = ∑_{η,b} (|Γ_{η,b}|/n) ∑_{x∈X} P_X(x) ∑_{u∈H_η+b} P(U_{η,b} = u | X = x) d(x, u)   (29)
  = ∑_{η,b} (|Γ_{η,b}|/n) E[d(X, U_{η,b})].   (30)
Finally, we ensure that U_{η,b} is distributed nearly uniformly over its range. For every η and b, let Ū_{η,b} be uniformly distributed over H_η + b. Now, using the relation between the variational distance and the information divergence [13, p. 58], we have

∑_{η,b} (|Γ_{η,b}|/(2n ln 2)) d_v^2(U_{η,b}, Ū_{η,b}) ≤ ∑_{η,b} (|Γ_{η,b}|/n) D(U_{η,b} ‖ Ū_{η,b})   (31)
  = ∑_{η,b} (|Γ_{η,b}|/n) D( (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} U_i ∥ Ū_{η,b} )   (32)
  ≤ ∑_{η,b} (|Γ_{η,b}|/n) (1/|Γ_{η,b}|) ∑_{i∈Γ_{η,b}} D(U_i ‖ Ū_{η,b})   (33)
  = ∑_{η,b} (1/n) ∑_{i∈Γ_{η,b}} D(U_i ‖ Ū_i) = (1/n) ∑_{i=1}^{n} [log Θ_i − H(U_i)] ≤ ε,   (34)

where (33) follows from the convexity of the divergence, and Ū_i = Ū_{η,b} for i ∈ Γ_{η,b}. The converse follows from the continuity of mutual information.
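The divergence-to-variational-distance bound used in (31) (Pinsker's inequality in the form d_v^2/(2 ln 2) ≤ D) can be spot-checked numerically, along with the identity D(P‖uniform) = log |G| − H(P) that underlies (34); the distribution below is arbitrary.

```python
import math

def kl(p, q):
    """Relative entropy D(p||q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def variational(p, q):
    """Variational (L1) distance between p and q."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

# A non-uniform distribution on Z_4 versus the uniform one.
p = [0.5, 0.25, 0.125, 0.125]
u = [0.25] * 4
# Pinsker-type inequality: d_v^2 / (2 ln 2) <= D(p||u).
assert variational(p, u) ** 2 / (2 * math.log(2)) <= kl(p, u)
# Against the uniform distribution, D(p||u) = log|G| - H(p).
H = -sum(pi * math.log2(pi) for pi in p)
assert abs(kl(p, u) - (2 - H)) < 1e-12
```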
VII. ACHIEVABILITY
A. Channel Coding
For conciseness, we give proofs of only the essential elements. Fix a triple of code parameters (n, k, w). Let R denote the rate of the code. First, we consider the special case where, at every channel use, the input can take all possible values in G. This corresponds to the choice α_{η∗,b} = 1, where η∗ is the all-zero vector and b is arbitrary, and hence H_{η∗} = G. The generalization to arbitrary probability distributions is relatively straightforward.
1) Encoding and Decoding: Following the analysis of Section IV-B, we construct the ensemble of
group codes of length n over G as the image of all homomorphisms φ from some Abelian group J into
Gn where J and Gn are as in Equations (11) and (9), respectively. The random homomorphism φ is
described in Equation (13).
To find an achievable rate, we use a random coding argument in which the random encoder is
characterized by the random homomorphism φ and a random vector B uniformly distributed over Gn.
Given a message a ∈ J , the encoder maps it to x = φ(a) + B and x is then fed to the channel. At
the receiver, after receiving the channel output y ∈ Y^n, the decoder looks for a unique a ∈ J such that φ(a) + B is jointly typical with y with respect to the distribution P_X W_{Y|X}, where P_X is uniform over G. If the decoder does not find such an a, or if such an a is not unique, it declares an error.
2) Error Analysis: Let a, x and y be the message, the channel input and the channel output, respectively. The error event can be characterized as the union of two events, E(a) = E_1(a) ∪ E_2(a), where E_1(a) is the event that φ(a) + B is not jointly typical with y, and E_2(a) is the event that there exists an ã ≠ a such that φ(ã) + B is jointly typical with y. We can upper-bound the probability of the error event as P(E(a)) ≤ P(E_1(a)) + P(E_2(a) ∩ E_1(a)^c). Using the standard approach, one can show that P(E_1(a)) → 0 as n → ∞. The probability of the error event E_2(a) ∩ E_1(a)^c can be written as

P_avg(E_2(a) ∩ E_1(a)^c) = ∑_{x∈G^n} 1_{{φ(a)+B=x}} ∑_{y∈A^n_ε(Y|x)} W^n_{Y|X}(y|x) 1_{{∃ã∈J : ã≠a, φ(ã)+B ∈ A^n_ε(X|y)}}.

The expected value of this probability over the ensemble is given by E{P_avg(E_2(a) ∩ E_1(a)^c)} = P_err, where

P_err = ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} W^n_{Y|X}(y|x) P(φ(a) + B = x, ∃ã ∈ J : ã ≠ a, φ(ã) + B ∈ A^n_ε(X|y)).

Using the union bound, we have

P_err ≤ ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈J, ã≠a} ∑_{x̃∈A^n_ε(X|y)} W^n_{Y|X}(y|x) P(φ(a) + B = x, φ(ã) + B = x̃).
We need the following lemmas to proceed.
Lemma VII.1. For a, ã ∈ J, x, x̃ ∈ G^n and for (p, s) ∈ Q(J) = S(G), let θ_{p,s} ∈ {0, 1, · · · , s} be such that

a_{p,s} − ã_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} \ p^{θ_{p,s}+1} Z_{p^s}^{k w_{p,s}}.

Then,

P(φ(a) + B = x, φ(ã) + B = x̃) = (1/|G|^n)(1/|H_{η∗+θ}|^n) if x − x̃ ∈ H^n_{η∗+θ}, and 0 otherwise.

Proof: The proof is provided in Appendix IX-B.
Lemma VII.2. Let X be a random variable taking values from the group G and, for a subgroup H of G, define [X] = X + H. For y ∈ A^n_ε(Y) and x ∈ A^n_ε(X|y), let z = [x] = x + H^n. Then we have

(x + H^n) ∩ A^n_ε(X|y) = A^n_ε(X|z, y)

and

(1 − ε) 2^{n[H(X|Y,[X])−O(ε)]} ≤ |(x + H^n) ∩ A^n_ε(X|y)| ≤ 2^{n[H(X|Y,[X])+O(ε)]}.

Proof: The proof is provided in Appendix IX-C.
For a ∈ J and for θ with θ_{p,s} ∈ {0, 1, · · · , s} for all (p, s) ∈ S(G), let

T_θ(a) = {ã ∈ J | a_{p,s} − ã_{p,s} ∈ p^{θ_{p,s}} Z_{p^s}^{k w_{p,s}} \ p^{θ_{p,s}+1} Z_{p^s}^{k w_{p,s}}, ∀(p, s) ∈ S(G)}.

It follows that for all a ∈ J,

|T_θ(a)| ≤ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}},

and

P_err ≤ ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{θ≠sss} ∑_{ã∈T_θ(a)} ∑_{x̃∈A^n_ε(X|y)} W^n_{Y|X}(y|x) P(φ(a) + B = x, φ(ã) + B = x̃).
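The cardinality bound on T_θ(a) can be verified by brute force on a toy input group; here J = Z_4 (a single factor with p = 2, s = 2, and k w_{2,2} = 1), and the count of ã whose difference from a has exact 2-adic order θ is compared with the bound 2^{(2−θ)·1}.

```python
# Brute-force check of |T_theta(a)| <= prod p^{(s - theta_{p,s}) k w_{p,s}}
# for the toy input group J = Z_4 (single factor, p = 2, s = 2, k*w = 1).
def order2(x, s=2):
    """Largest t <= s with x in 2^t Z_{2^s} \\ 2^{t+1} Z_{2^s} (t = s for x = 0)."""
    if x % 4 == 0:
        return 2
    return 0 if x % 2 else 1

for a in range(4):
    for theta in range(2):  # theta in {0, 1}; theta = 2 corresponds to a~ = a
        T = [at for at in range(4) if order2((a - at) % 4) == theta]
        assert len(T) <= 2 ** (2 - theta)
```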
Using Lemmas VII.1 and VII.2, we have

P_err ≤ ∑_{θ≠sss} ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈T_θ(a)} ∑_{x̃∈A^n_ε(X|y), x̃∈x+H^n_{η∗+θ}} W^n_{Y|X}(y|x) (1/|G|^n)(1/|H_{η∗+θ}|^n)
  ≤ ∑_{θ≠sss} ∑_{x∈G^n} ∑_{y∈A^n_ε(Y|x)} ∑_{ã∈T_θ(a)} W^n_{Y|X}(y|x) 2^{n[H(X_{η∗,b}|Y,[X_{η∗,b}]_θ)+O(ε)]} (1/|G|^n)(1/|H_{η∗+θ}|^n)
  ≤ ∑_{θ≠sss} [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] 2^{n[H(X_{η∗,b}|Y,[X_{η∗,b}]_θ)+O(ε)]} (1/|H_{η∗+θ}|^n)
  = ∑_{θ≠sss} exp_2 { n ( (k/n) ∑_{(p,s)∈S(G)} (s − θ_{p,s}) w_{p,s} log p + H(X_{η∗,b}|Y, [X_{η∗,b}]_θ) − log |H_{η∗+θ}| + O(ε) ) }.
Recall that R = (k/n) ∑_{(p,s)∈S(G)} s w_{p,s} log p. Noting that P_err is the probability of error for message a averaged over the ensemble, in order for the maximal error probability to go to zero, we require the exponents of all the terms to be negative; or equivalently, for all θ ≠ sss,

R ( ∑_{(p,s)∈S(G)} (s − θ_{p,s}) w_{p,s} log p ) / ( ∑_{(p,s)∈S(G)} s w_{p,s} log p ) < log |H_{η∗+θ}| − H(X_{η∗,b}|Y, [X_{η∗,b}]_θ) − O(ε).

Therefore, the achievability conditions are

R ≤ (1/(1 − ω_θ)) I(X_{η∗,b}; Y | [X_{η∗,b}]_θ)

for all θ ≠ sss. This means that the following rate is achievable:

R = min_{θ≠sss} (1/(1 − ω_θ)) I(X_{η∗,b}; Y | [X_{η∗,b}]_θ).
For a general α, we use an extended random coding argument. We construct the ensemble of group codes of length n over G as the image of all homomorphisms φ from some Abelian group J into ⊕_{η,b} (H_η + b)^{n α_{η,b}}. The random homomorphism φ is described in Equation (13). Given a message a ∈ J, the encoder maps it to x = φ(a) + B, and x is then fed to the channel. At the receiver, after receiving the channel output y ∈ Y^n, the decoder looks for a unique a ∈ J such that, for every (η, b), the appropriate set of n α_{η,b} samples of φ(a) + B is jointly typical with the corresponding set of n α_{η,b} samples of y with respect to the distribution P_{η,b} W_{Y|X}, where P_{η,b} is uniform over H_η + b. If the decoder does not find such an a, or if such an a is not unique, it declares an error. Using a similar analysis, one gets the desired achievability result.
3) Z_4: Evaluating the expression for the group capacity for codes over Z_4, we get three non-redundant terms as follows:

R < sup_{α_1,α_2,w_0} min{T_1, T_2, T_3}

where the supremum is over all α_1 and α_2 such that 0 ≤ α_1, α_2, α_1 + α_2 ≤ 1, over 0 ≤ w_0 ≤ 1, and

T_1 = α_1 I_4 + α_2 I_2 + (1 − α_1 − α_2) I′_2,
T_2 = (1 + w_0) [ (α_1/2)(I_2 + I′_2) + α_2 I_2 + (1 − α_1 − α_2) I′_2 ],
T_3 = ((1 + w_0)/w_0) (α_1/2)(I_2 + I′_2),

with I_4 = I(X; Y), I_2 = I(X; Y | X ∈ 2Z_4) and I′_2 = I(X; Y | X ∉ 2Z_4). This can be solved to obtain the following:

C = max{ min{I_4, I_2 + I′_2}, I_2, I′_2 }.
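The Z_4 formula can be evaluated numerically for a concrete channel. The sketch below uses a hypothetical Z_4 additive-noise channel, chosen only for illustration, to compute I_4, I_2 and I′_2 from the transition matrix and evaluate C = max{min{I_4, I_2 + I′_2}, I_2, I′_2}.

```python
import math

# Hypothetical channel: Y = X + N (mod 4), with noise N on Z_4.
noise = [0.7, 0.1, 0.1, 0.1]
W = [[noise[(y - x) % 4] for y in range(4)] for x in range(4)]

def mutual_info(px, W):
    """I(X;Y) in bits for input distribution px and channel matrix W."""
    py = [sum(px[x] * W[x][y] for x in range(4)) for y in range(4)]
    return sum(px[x] * W[x][y] * math.log2(W[x][y] / py[y])
               for x in range(4) for y in range(4)
               if px[x] > 0 and W[x][y] > 0)

I4 = mutual_info([0.25] * 4, W)         # X uniform on Z_4
I2 = mutual_info([0.5, 0, 0.5, 0], W)   # X uniform on 2Z_4 = {0, 2}
I2p = mutual_info([0, 0.5, 0, 0.5], W)  # X uniform on 1 + 2Z_4 = {1, 3}
C = max(min(I4, I2 + I2p), I2, I2p)
assert 0 <= C <= 2  # capacity of a Z_4-input channel is at most log2(4)
```

For this circularly symmetric noise, I_2 and I′_2 coincide, since the two cosets of 2Z_4 differ only by a relabeling of the output.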
B. Source Coding
Fix a triple of code parameters (n, k, w). Let R denote the rate of the code. We consider the special
case where αη∗,b = 1, where η∗ is the all-zero vector, and hence Hη∗ = G. Generalization to arbitrary
probability distributions is straightforward. Fix a conditional distribution PU |X on G such that U is
uniform on G.
1) Encoding and Decoding: To find an achievable rate for a distortion level D, we use a random coding
argument as in channel coding. Consider a random shifted group code as C = {φ(a) + B : a ∈ J},
where φ and B are uniformly distributed over their range and are independent of each other. Given the
source output sequence x ∈ X n, the random encoder looks for a codeword u ∈ C such that u is jointly
typical with x with respect to pXU . If it finds at least one such u, it encodes x to u (if it finds more
than one such u it picks one of them at random). Otherwise, it declares error. The decoder outputs u as
the source reconstruction.
2) Error Analysis: Let x = (x_1, · · · , x_n) and u = (u_1, · · · , u_n) be the source output and the decoder output, respectively. Note that if the encoder declares no error, then since x and u are jointly typical, (d(x_i, u_i))_{i=1,··· ,n} is typical with respect to the distribution of d(X, U). Therefore, for large n, (1/n) d(x, u) = (1/n) ∑_{i=1}^{n} d(x_i, u_i) ≈ E{d(X, U)} ≤ D. It remains to show that the rate can be as small as given in the theorem while keeping the probability of encoding error small.
Given the source output x ∈ X^n, define

α(x) = ∑_{u∈A^n_ε(U|x)} 1_{{u∈C}} = ∑_{u∈A^n_ε(U|x)} ∑_{a∈J} 1_{{φ(a)+B=u}}.

An encoding error occurs if and only if α(x) = 0. We use Chebyshev's inequality to show that, under certain conditions, the probability of error can be made arbitrarily small:

P(α(x) = 0) ≤ var{α(x)} / E{α(x)}^2.
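This second-moment bound holds because the event {α = 0} implies |α − E{α}| ≥ E{α}; a quick numeric check on an arbitrary nonnegative random variable:

```python
# Chebyshev-type bound P(alpha = 0) <= var(alpha)/E[alpha]^2 for a
# nonnegative integer random variable; values below are an arbitrary example.
values = [0, 0, 1, 2, 3]          # equally likely outcomes of alpha
n = len(values)
mean = sum(values) / n
var = sum((v - mean) ** 2 for v in values) / n
p_zero = values.count(0) / n
assert p_zero <= var / mean ** 2
```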
We have

E{α(x)} = ∑_{u∈A^n_ε(U|x)} ∑_{a∈J} P(φ(a) + B = u) = |A^n_ε(U|x)| · |J| / |G|^n

and

E{α(x)^2} = E{ ∑_{u,ũ∈A^n_ε(U|x)} ∑_{a,ã∈J} 1_{{φ(a)+B=u, φ(ã)+B=ũ}} }
  = ∑_{u,ũ∈A^n_ε(U|x)} ∑_{a,ã∈J} P(φ(a) + B = u, φ(ã) + B = ũ)
  = ∑_{θ} ∑_{a∈J} ∑_{u∈A^n_ε(U|x)} ∑_{ã∈T_θ(a)} ∑_{ũ∈A^n_ε(U|x), ũ−u∈H^n_{η∗+θ}} (1/|G|^n) · (1/|H_{η∗+θ}|^n).
Note that the term corresponding to θ = 000 is bounded from above by E{α(x)}^2. Using Lemma VII.2, we have

|A^n_ε(U|x) ∩ (u + H^n_{η∗+θ})| ≤ 2^{n[H(U_{η∗,b}|[U_{η∗,b}]_θ,X)+O(ε)]}.

Therefore,

var{α(x)} = E{α(x)^2} − E{α(x)}^2 ≤ ∑_{θ≠000} |J| · |A^n_ε(U|x)| · [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] · 2^{n[H(U_{η∗,b}|[U_{η∗,b}]_θ,X)+O(ε)]} / ( |G|^n · |H_{η∗+θ}|^n ).
Therefore,

P(α(x) = 0) ≤ var{α(x)} / E{α(x)}^2 ≤ ∑_{θ≠000} [ ∏_{(p,s)∈S(G)} p^{(s−θ_{p,s}) k w_{p,s}} ] 2^{−n[H(U_{η∗,b}|X) − H(U_{η∗,b}|[U_{η∗,b}]_θ,X) − O(ε)]} |G|^n / ( |J| · |H_{η∗+θ}|^n ).

Note that H(U_{η∗,b}|X) − H(U_{η∗,b}|[U_{η∗,b}]_θ, X) = H([U_{η∗,b}]_θ|X) and

|J| = ∏_{(p,s)∈S(G)} p^{k s w_{p,s}}.

Therefore,

P(α(x) = 0) ≤ ∑_{θ≠000} exp_2 { −n ( H([U_{η∗,b}]_θ|X) − log |H_{η∗} : H_{η∗+θ}| + (k/n) ∑_{(p,s)∈S(G)} θ_{p,s} w_{p,s} log p − O(ε) ) }.
In order for the probability of error to go to zero as n increases, we require the exponents of all the terms to be negative; or equivalently,

R = max_{θ≠000} (1/ω_θ) I([U_{η∗,b}]_θ; X)

is achievable. Using an extension of the random coding argument to general α_{η,b}, similar to that considered in channel coding, we get the achievability part of the theorem.
VIII. CONCLUSION
We derived the achievable set of rates using Abelian group codes for arbitrary discrete memoryless channels. In the case of linear codes, it simplifies to the symmetric capacity of the channel, i.e., the Shannon capacity with the additional constraint that the channel input distribution is uniform. For the case where the underlying group is not a field, we observe that several subgroups of the group appear in the achievable rate, and this causes the rate to be smaller than the symmetric capacity of the channel in general.
We derived a similar result for the source coding problem, i.e., the achievable rate-distortion function using Abelian group codes for arbitrary discrete memoryless sources. When the underlying group is a field, these group codes are linear codes, and this function is equivalent to the symmetric rate-distortion function, i.e., the Shannon rate-distortion function with the additional constraint that the reconstruction random variable is uniformly distributed. We showed that when the underlying group is not a field, due to the algebraic structure of the code, certain subgroups of the group appear in the rate-distortion function and cause a larger rate for a given distortion level.
IX. APPENDIX
A. Proof of Lemma IV.2
We first prove that for a homomorphism φ, g_{(q,s,l)→(p,r,m)} satisfies the above conditions. First, assume p ≠ q. Note that the only nonzero component of I_{J:q,s,l} takes values from Z_{q^s} and therefore

q^s I_{J:q,s,l} = ∑^{(J)}_{i=1,··· ,q^s} I_{J:q,s,l} = 0.

Since φ is a homomorphism, we have φ(q^s I_{J:q,s,l}) = 0. On the other hand,

φ(q^s I_{J:q,s,l}) = φ( ∑^{(J)}_{i=1,··· ,q^s} I_{J:q,s,l} )
  = ∑^{(G)}_{i=1,··· ,q^s} φ(I_{J:q,s,l})
  = ⊕_{(p,r,m)∈G(G)} [ ∑^{(G)}_{i=1,··· ,q^s} φ(I_{J:q,s,l}) ]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} ∑^{(Z_{p^r})}_{i=1,··· ,q^s} [φ(I_{J:q,s,l})]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} q^s [φ(I_{J:q,s,l})]_{p,r,m}
  = ⊕_{(p,r,m)∈G(G)} q^s g_{(q,s,l)→(p,r,m)}.

Therefore, we have q^s g_{(q,s,l)→(p,r,m)} = 0 (mod p^r), or equivalently q^s g_{(q,s,l)→(p,r,m)} = C p^r for some integer C. Since p ≠ q, this implies p^r | g_{(q,s,l)→(p,r,m)}, and since g_{(q,s,l)→(p,r,m)} takes values from Z_{p^r}, we have g_{(q,s,l)→(p,r,m)} = 0.
Next, assume p = q and r ≥ s. As above, we have φ(q^s I_{J:q,s,l}) = 0 and

φ(q^s I_{J:q,s,l}) = ⊕_{(p,r,m)∈G(G)} q^s g_{(q,s,l)→(p,r,m)},

and therefore q^s g_{(q,s,l)→(p,r,m)} = 0 (mod p^r). Since g_{(q,s,l)→(p,r,m)} takes values from Z_{p^r} and p = q, this implies p^{r−s} | g_{(q,s,l)→(p,r,m)}, or equivalently g_{(q,s,l)→(p,r,m)} ∈ p^{r−s}Z_{p^r}.
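Both divisibility conclusions can be checked exhaustively for small parameters; the sketch below verifies that q^s g ≡ 0 (mod p^r) forces g = 0 when p ≠ q, and forces g ∈ p^{r−s} Z_{p^r} when p = q and r ≥ s.

```python
# Exhaustive check of the two divisibility arguments above.
# Case p != q: take p = 2, r = 3 (Z_8) and q = 3, s = 2.
p, r, q, s = 2, 3, 3, 2
sols = [g for g in range(p ** r) if (q ** s * g) % p ** r == 0]
assert sols == [0]  # only g = 0 survives when gcd(q^s, p^r) = 1

# Case p = q, r >= s: take p = q = 2, r = 3, s = 1.
p, r, s = 2, 3, 1
sols = [g for g in range(p ** r) if (p ** s * g) % p ** r == 0]
assert sols == [g for g in range(p ** r) if g % p ** (r - s) == 0]
```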
Next, we show that any mapping described by (7) satisfying the conditions of the lemma is a homomorphism. For two elements a, b ∈ J and for (p, r, m) ∈ G(G), we have

[φ(a + b)]_{p,r,m} = [ φ( ⊕_{(q,s,l)∈G(J)} (a_{q,s,l} +_{q^s} b_{q,s,l}) ) ]_{p,r,m}
  = [ φ( ∑^{(J)}_{(q,s,l)∈G(J)} (a_{q,s,l} +_{q^s} b_{q,s,l}) I_{J:q,s,l} ) ]_{p,r,m}
  = [ φ( ∑^{(J)}_{(q,s,l)∈G(J)} ∑^{(J)}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} I_{J:q,s,l} ) ]_{p,r,m}
  = [ ∑^{(G)}_{(q,s,l)∈G(J)} ∑^{(G)}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} φ(I_{J:q,s,l}) ]_{p,r,m}
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} [φ(I_{J:q,s,l})]_{p,r,m}
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)}.   (35)

On the other hand, we have

[φ(a) + φ(b)]_{p,r,m} = [φ(a)]_{p,r,m} +_{p^r} [φ(b)]_{p,r,m}
  = [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} a_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} b_{q,s,l} g_{(q,s,l)→(p,r,m)} ]
  = [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}} g_{(q,s,l)→(p,r,m)} ] +_{p^r} [ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,b_{q,s,l}} g_{(q,s,l)→(p,r,m)} ]
  = ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)},   (36)

where the addition in a_{q,s,l} + b_{q,s,l} is integer addition.
In order to show that φ is a homomorphism, it suffices to show that, under the conditions of the lemma, Equations (35) and (36) are equivalent. We show that for a fixed (q, s, l) ∈ G(J), if the conditions of the lemma are satisfied, then

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)}.   (37)

Note that if p ≠ q, then both summations are zero. Note also that we have

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+b_{q,s,l}) (mod p^r)} g_{(q,s,l)→(p,r,m)}

and

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+_{q^s}b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+_{q^s}b_{q,s,l}) (mod p^r)} g_{(q,s,l)→(p,r,m)}.

If p = q and r ≤ s, then we have (a_{q,s,l} +_{q^s} b_{q,s,l}) (mod p^r) = (a_{q,s,l} + b_{q,s,l}) (mod p^r), and hence it follows that Equation (37) is satisfied. If p = q and r ≥ s, since g_{(q,s,l)→(p,r,m)} ∈ p^{r−s}Z_{p^r}, we have

∑^{(Z_{p^r})}_{i=1,··· ,a_{q,s,l}+b_{q,s,l}} g_{(q,s,l)→(p,r,m)} = ∑^{(Z_{p^r})}_{i=1,··· ,(a_{q,s,l}+b_{q,s,l}) (mod p^s)} g_{(q,s,l)→(p,r,m)},

and hence it follows that Equation (37) is satisfied.
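The construction of Lemma IV.2 can be tested exhaustively for a small case. The sketch below builds every map φ : Z_2 ⊕ Z_4 → Z_4 of the form (7) satisfying the stated conditions (g drawn from 2Z_4 for the s = 1 component, unrestricted for the s = 2 component) and confirms that each is a homomorphism.

```python
from itertools import product

# J = Z_2 ⊕ Z_4, G = Z_4 (p = q = 2, r = 2; components s = 1 and s = 2).
# Condition of the lemma: g for the Z_2 component must lie in
# 2^{r-s} Z_4 = 2Z_4 = {0, 2}; g for the Z_4 component is unrestricted.
J = [(a1, a2) for a1 in range(2) for a2 in range(4)]

def add_J(a, b):
    """Componentwise addition in J = Z_2 ⊕ Z_4."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 4)

for g1, g2 in product([0, 2], range(4)):
    phi = lambda a: (a[0] * g1 + a[1] * g2) % 4
    for a in J:
        for b in J:
            assert phi(add_J(a, b)) == (phi(a) + phi(b)) % 4
```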
B. Proof of Lemma VII.1
Note that since the g_{(q,s,l)→(p,r,m)}'s and B are uniformly distributed, in order to find the desired joint probability, we need to count the number of choices for the g_{(q,s,l)→(p,r,m)}'s and B such that, for (p, r, m) ∈ G(G^n),

[ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} a_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} B_{p,r,m} = u_{p,r,m}

[ ∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} ã_{q,s,l} g_{(q,s,l)→(p,r,m)} ] +_{p^r} B_{p,r,m} = ũ_{p,r,m},

and divide this number by the total number of choices, which is equal to

|G|^n · ∏_{(p,r,m)∈G(G^n)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} = |G|^n · [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} ]^n,

where the term p^{min(r,s)} appears since the number of choices for g_{(q,s,l)→(p,r,m)} is p^r if p = q, r ≤ s, and is equal to p^s if p = q, r ≥ s. Since B can take values arbitrarily from G^n, the number of choices for the above set of conditions is equal to the number of choices for the g_{(q,s,l)→(p,r,m)}'s such that

∑^{(Z_{p^r})}_{(q,s,l)∈G(J)} (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)} = u_{p,r,m} − ũ_{p,r,m}.

Let θ_{p,s,l} ∈ {0, 1, . . . , s} be such that a_{p,s,l} − ã_{p,s,l} ∈ p^{θ_{p,s,l}} Z_{p^s} \ p^{θ_{p,s,l}+1} Z_{p^s}. Note that

θ_{p,s} = min_{1≤l≤k w_{p,s}} θ_{p,s,l}.

Note that for all (q, s, l) ∈ G(J), (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)} ∈ p^{θθθ_{p,r,m}} Z_{p^r}, where θθθ = θθθ(η∗ + θ). Therefore, we require u_{p,r,m} − ũ_{p,r,m} ∈ p^{θθθ_{p,r,m}} Z_{p^r}, and hence u − ũ ∈ H^n_{η∗+θ}; otherwise the probability is zero.
For fixed p ∈ P(G) and r ∈ R_p(G), let (q∗, s∗, l∗) ∈ G(J) be such that q∗ = p and

θθθ(η∗ + θ)_{p,r,m} = |r − s∗|^+ + θ_{q∗,s∗,l∗}.

For fixed (p, r, m) ∈ G(G^n) and for (q, s, l) ≠ (q∗, s∗, l∗), choose g_{(q,s,l)→(p,r,m)} arbitrarily from its domain. The number of choices for this is equal to

[ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n.

For each (p, r, m) ∈ G(G^n), we need to have

(a_{q∗,s∗,l∗} − ã_{q∗,s∗,l∗}) g_{(q∗,s∗,l∗)→(p,r,m)} = u_{p,r,m} − ũ_{p,r,m} − ∑^{(Z_{p^r})}_{(q,s,l)∈G(J), (q,s,l)≠(q∗,s∗,l∗)} (a_{q,s,l} − ã_{q,s,l}) g_{(q,s,l)→(p,r,m)}.

Note that the right-hand side is included in p^{θθθ_{p,r,m}} Z_{p^r} and (a_{q∗,s∗,l∗} − ã_{q∗,s∗,l∗}) is included in p^{θ_{q∗,s∗,l∗}} Z_{(q∗)^{s∗}}. We need to count the number of solutions for g_{(q∗,s∗,l∗)→(p,r,m)} in p^{|r−s∗|^+} Z_{p^r}. Using Lemma IX.1, we can show that the number of solutions is equal to p^{θ_{q∗,s∗,l∗}}. The total number of solutions for φ is equal to

[ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} · ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n.
Hence, we have

P(φ(a) + B = u, φ(ã) + B = ũ) = [ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} · ∏_{(q,s,l)∈G(J), q=p, (q,s,l)≠(q∗,s∗,l∗)} p^{min(r,s)} ]^n / ( |G|^n [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p} p^{min(r,s)} ]^n )
  = (1/|G|^n) [ ∏_{(p,r,m)∈G(G)} ∏_{(q,s,l)∈G(J), q=p, (q,s,l)=(q∗,s∗,l∗)} p^{θ_{q∗,s∗,l∗}} / p^{min(r,s)} ]^n,

where the factor 1/|G|^n accounts for the uniform dither B. Note that for (q, s, l) = (q∗, s∗, l∗), we have

min(r, s) = min(r, s∗) = r − |r − s∗|^+ = r − (θθθ_{p,r,m} − θ_{q∗,s∗,l∗}).

Therefore, the above probability is equal to

(1/|G|^n) [ ∏_{(p,r,m)∈G(G)} p^{θ_{q∗,s∗,l∗}} / p^{r−(θθθ_{p,r,m}−θ_{q∗,s∗,l∗})} ]^n = (1/|G|^n) [ ∏_{(p,r,m)∈G(G)} p^{θθθ_{p,r,m}} / p^r ]^n = (1/|G|^n) · (1/|H_{η∗+θ}|^n).

We conclude that

P(φ(a) + B = u, φ(ã) + B = ũ) = (1/|G|^n)(1/|H_{η∗+θ}|^n).
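For a minimal instance (n = 1, J = G = Z_4, so φ(a) = a·g with g uniform over Z_4 and a uniform dither B), the pairwise probability of Lemma VII.1 can be verified by enumerating all 16 pairs (g, B):

```python
from fractions import Fraction

# n = 1, G = J = Z_4; phi(a) = a*g mod 4, code shifted by dither B.
G = range(4)
a, a_t = 1, 3          # a - a~ = 2 in 2Z_4 \ 4Z_4, so theta = 1
H = {0, 2}             # H_{eta*+theta} = 2Z_4

for u in G:
    for u_t in G:
        hits = sum(1 for g in G for B in G
                   if (a * g + B) % 4 == u and (a_t * g + B) % 4 == u_t)
        prob = Fraction(hits, 16)
        if (u - u_t) % 4 in H:
            assert prob == Fraction(1, 4) * Fraction(1, 2)  # 1/(|G|·|H|)
        else:
            assert prob == 0
```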
C. Proof of Lemma VII.2
First, we show that (x + H^n) ∩ A^n_ε(X|y) is contained in A^n_ε(X|z, y). Since z is a function of x, we have (x, z, y) ∈ A^n_ε(X, [X], Y). For x′ ∈ (x + H^n) ∩ A^n_ε(X|y), we have [x′] = x′ + H^n = x + H^n = z and (x′, z, y) = (x′, [x′], y) ∈ A^n_ε(X, [X], Y). Therefore, x′ ∈ A^n_ε(X|z, y) and hence

(x + H^n) ∩ A^n_ε(X|y) ⊆ A^n_ε(X|z, y).

Conversely, for x′ ∈ A^n_ε(X|z, y), since (x′, z) ∈ A^n_ε(X, [X]) where [X] is a function of X, we have [x′] = z. This implies x′ ∈ z + H^n = x + H^n. Clearly, we also have x′ ∈ A^n_ε(X|y). The claim on the size of the set follows since (z, y) ∈ A^n_ε([X], Y).
D. Useful Lemma
Lemma IX.1. Let p be a prime and s, r positive integers such that s ≤ r. For a ∈ Z_{p^s} and b ∈ Z_{p^r}, let 0 ≤ θ ≤ s and θ ≤ θ̂ ≤ r be such that

a ∈ p^θ Z_{p^s} \ p^{θ+1} Z_{p^s},    b ∈ p^{θ̂} Z_{p^r}.

Write a = p^θ α for some invertible element α ∈ Z_{p^r} and b = p^{θ̂} β for some β ∈ {0, 1, · · · , p^{r−θ̂} − 1}. Then, the set of solutions to the equation ax (mod p^r) = b is

{ p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }.

Proof: Note that the representation of b as b = p^{θ̂} β is not unique: for any β̄ of the form β̄ = β + i p^{r−θ̂} with i = 0, 1, · · · , p^{θ̂} − 1, b can be written as p^{θ̂} β̄. Similarly, the representation of a as a = p^θ α is not unique: for any ᾱ = α + i p^{r−θ} with i = 0, 1, · · · , p^θ − 1, we have a = p^θ ᾱ. The set of solutions to ax = b is identical to the set of solutions to p^θ x = p^{θ̂} α^{−1} β. The set of solutions to the latter is

{ p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }.

It remains to show that this set of solutions is independent of the choices of α and β. First, we show that the set of solutions is independent of the choice of β. For β̄ = β + j p^{r−θ̂} for some j ∈ {0, 1, · · · , p^{θ̂} − 1}, we have

{ p^{θ̂−θ} α^{−1} β̄ + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} (β + j p^{r−θ̂}) + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} β + (i + j) α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  (a)= { p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 },

where (a) follows since the set p^{r−θ}{0, 1, · · · , p^θ − 1} is a subgroup of Z_{p^r} and j p^{r−θ} lies in this set. Next, we show that the set of solutions is independent of the choice of α. For ᾱ = α + j p^{r−θ} for some j ∈ {0, 1, · · · , p^θ − 1}, we have

ᾱ ( α^{−1} − ᾱ^{−1} j p^{r−θ} α^{−1} ) = ᾱ α^{−1} − j p^{r−θ} α^{−1} = 1.
Therefore, it follows that the unique inverse of ᾱ satisfies ᾱ^{−1} − α^{−1} ∈ α^{−1} p^{r−θ} Z_{p^r}. Write ᾱ^{−1} = α^{−1} + k α^{−1} p^{r−θ}. We have

{ p^{θ̂−θ} ᾱ^{−1} β + i ᾱ^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} (α^{−1} + k α^{−1} p^{r−θ}) β + i (α^{−1} + k α^{−1} p^{r−θ}) p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  = { p^{θ̂−θ} α^{−1} β + (i + i k p^{r−θ} + k β p^{θ̂−θ}) α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 }
  (a)= { p^{θ̂−θ} α^{−1} β + i α^{−1} p^{r−θ} | i = 0, 1, · · · , p^θ − 1 },

where, as above, (a) follows since the set p^{r−θ}{0, 1, · · · , p^θ − 1} is a subgroup of Z_{p^r} and (i k p^{r−θ} + k β p^{θ̂−θ}) p^{r−θ} lies in this set.
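Lemma IX.1 is easy to verify exhaustively for small parameters. The sketch below checks, for p = 2, s = 2, r = 3, that the equation ax ≡ b (mod p^r) has exactly p^θ solutions whenever θ̂ ≥ θ, and none otherwise.

```python
# Exhaustive verification of the solution count in Lemma IX.1
# for p = 2, s = 2, r = 3.
p, s, r = 2, 2, 3

def valuation(x, cap):
    """Largest t <= cap with p^t dividing x (t = cap for x = 0)."""
    t = 0
    while t < cap and x % (p ** (t + 1)) == 0:
        t += 1
    return t

for a in range(1, p ** s):
    theta = valuation(a, s)
    for b in range(p ** r):
        theta_hat = valuation(b, r)
        sols = {x for x in range(p ** r) if (a * x) % p ** r == b}
        if theta_hat >= theta:
            assert len(sols) == p ** theta   # exactly p^theta solutions
        else:
            assert sols == set()             # no solution when theta_hat < theta
```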
REFERENCES
[1] T. J. Goblick, Jr., “Coding for a discrete information source with a distortion measure”, Ph.D. dissertation, Dept. Electr.
Eng., MIT , Cambridge, MA, 1962.
[2] H. A. Loeliger and T. Mittelholzer, “Convolutional codes over groups”, IEEE Trans. Inform. Theory,vol. 42, no. 6, pp. 1660–
1686, November 1996.
[3] H. A. Loeliger, “Signal sets matched to groups”, IEEE Trans. Inform. Theory, vol. 37, no. 6, pp. 1675–1682, November
1991.
[4] I. Csiszar, “Linear codes for sources and source networks: Error exponents, universal coding,” IEEE Trans. Inform. Theory,
vol. IT- 28, pp. 585–592, July 1982.
[5] S. Shridharan, A. Jafarian, S. Vishwanath, and S. A. Jafar, “Lattice Coding for K User Gaussian Interference Channels”,
IEEE Transactions on Information Theory, April 2010. .
[6] R. Ahlswede. Group codes do not achieve Shannons’s channel capacity for general discrete channels. The annals of
Mathematical Statistics, 42(1):224–240, Feb. 1971.
[7] R. Ahlswede and J. Gemma. Bounds on algebraic code capacities for noisy channels I. Information and Control, 19(2):124–
145, 1971.
[8] R. Ahlswede and J. Gemma. Bounds on algebraic code capacities for noisy channels II. Information and Control,
19(2):146–158, 1971.
[9] N. J. Bloch. Abstract Algebra With Applications. Prentice-Hall, Inc, Englewood Cliffs, New Jersey, 1987.
[10] G. Como. Group codes outperform binary coset codes on non-binary symmetric memoryless channels. IEEE
Trans. Information Theory, 56(9):4321–4334, Sept. 2010.
[11] G. Como and F. Fagnani. The capacity of finite abelian group codes over symmetric memoryless channels. IEEE
Transactions on Information Theory, 55(5):2037–2054, 2009.
[12] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, New York, 1988.
[13] I. Csiszar and J. Korner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, 1981.
[14] R. L. Dobrushin. Asymptotic optimality of group and systematic codes for some channels. Theor. Probab. Appl., 8:47–59,
1963.
[15] R. L. Dobrushin. Asymptotic bounds of the probability of error for the transmission of messages over a discrete memoryless
channel with a symmetric transition probability matrix. Teor. Veroyatnost. i Primenen, pages 283–311, 1962.
[16] P. Elias. Coding for noisy channels. IRE Conv. Record, part. 4:37–46, 1955.
[17] U. Erez and S. ten Brink. A close-to-capacity dirty paper coding scheme. IEEE Trans. Inform. Theory, 51:3417–3432,
October 2005.
[18] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein. Covering Codes. Elsevier-North-Holland, Amsterdam, 1997.
[19] R. Garello and S. Benedetto. Multilevel construction of block and trellis group codes. IEEE Trans. Inform. Theory,
41:1257–1264, Sep. 1995.
[20] G. D. Forney Jr and M. Trott. The dynamics of group codes: State spaces, trellis diagrams, and canonical encoders. IEEE
Transactions on Information Theory, 39(9):1491–1513, 1993.
[21] M. Hall Jr. The Theory of Groups. The Macmillan Company, New York, 1959.
[22] J. Korner and K. Marton. How to encode the modulo-two sum of binary sources. IEEE Transactions on Information
Theory, IT-25:219–221, Mar. 1979.
[23] D. Krithivasan and S. S. Pradhan. Distributed source coding using abelian group codes. IEEE Transactions on Information
Theory, 57:1495–1519, 2011.
[24] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. Elsevier-North-Holland, 1977.
[25] B. Nazer and M. Gastpar. Computation over multiple-access channels. IEEE Transactions on Information Theory,
53(10):3498–3516, Oct. 2007.
[26] A. Padakandla and S.S. Pradhan. A new coding theorem for three user discrete memoryless broadcast channel. 2012.
Online: http://128.84.158.119/abs/1207.3146v2.
[27] A. Padakandla, A.G. Sahebi, and S.S. Pradhan. A new achievable rate region for the 3-user discrete memoryless interference
channel. In Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on, pages 2256–2260, 2012.
[28] A. Padakandla, A.G. Sahebi, and S.S. Pradhan. An achievable rate region for the 3-user interference channel based on
coset codes. 2014. Online: http://arxiv.org/abs/1403.4583.
[29] W. Park and A. Barg. Polar codes for q-ary channels, q = 2^r. 2012. Online: http://arxiv.org/abs/1107.4965.
[30] T. Philosof and R. Zamir. On the loss of single-letter characterization: The dirty multiple access channel. IEEE Transactions
on Information Theory, 55:2442–2454, June 2009.
[31] S. S. Pradhan and K. Ramchandran. Distributed source coding using syndromes (DISCUS): Design and construction. IEEE
Transactions on Information Theory, 49(3):626–643, 2003.
[32] A. G. Sahebi and S. S. Pradhan. Multilevel Polarization of Polar Codes Over Arbitrary Discrete Memoryless Channels.
Proc. 49th Allerton Conference on Communication, Control and Computing, Sept. 2011.
[33] E. Sasoglu, E. Telatar, and E. Arikan. Polarization for arbitrary discrete memoryless channels. IEEE Information Theory
Workshop, Dec. 2009, Lausanne, Switzerland.
[34] D. Slepian. Group codes for the Gaussian channel. Bell Syst. Tech. Journal, 1968.
[35] D. Slepian and J. K. Wolf. A coding theorem for multiple access channels with correlated sources. Bell Syst. Tech. J.,
52:1037–1076, 1973.
[36] A. D. Wyner. Recent results in the Shannon theory. IEEE Trans. Inform. Theory, IT-20:2–10, January 1974.
[37] R. Zamir, S. Shamai, and U. Erez. Nested linear/lattice codes for structured multiterminal binning. IEEE Trans. Inform.
Theory, 48:1250–1276, June 2002.