classical and quantum low-density parity-check codes: constructions and properties

Upload: pavithran

Post on 07-Jul-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    1/39

    Classical and Quantum Low-density

    parity-check codes: Constructions and

    properties

    Pavithran Iyer

    Doctorat En Physique, Universitè de Sherbrooke

    email:   [email protected]

    Term project for the course

    QIC890: Quantum Error Correction and Fault Tolerance

    at

    Institute for Quantum Computing (IQC), University of Waterloo.

    Instructors: Robert König and Daniel Gottesman

    Submitted on April 4, 2014

    1

    mailto:[email protected]:[email protected]

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    2/39

    Contents

    1 Introduction   2

    2 Essentials of classical coding theory   3

    2.1 Linear codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    2.2 Graphical representation of linear codes   . . . . . . . . . . . . . . . . . . . . . . . . . 4

    3 Classical LDPC codes   5

    3.1 Shannon’s construction of parity check codes   . . . . . . . . . . . . . . . . . . . . . . 6

    3.2 The Gallager code   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    4 Essentials of quantum coding theory   18

    4.1 The quantum setting   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    4.2 Stabilizer formalism and graphical representation   . . . . . . . . . . . . . . . . . . . . 19

    4.3 CSS construction of quantum codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

    5 Quantum   LDPC   codes   21

    5.1 Constructing the Hypergraph product codes   . . . . . . . . . . . . . . . . . . . . . . . 23

    5.2 Visualizing the graph product  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

    5.3 Properties of hyper graph product codes   . . . . . . . . . . . . . . . . . . . . . . . . . 27

    6 Other constructions   33

    6.1 Homological product codes   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    6.2 Mckay codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    7 Conclusion   36

    References   36

    A Hardness of decoding   LDPC   Codes   38

    1 Introduction

    It is well known that to the first order of approximation, the rate and distance of a code play

    a crucial part in assessing the desirability of its application in error correction   [1,   2,   3]. Ideally,

    we would like to use a code with the maximum possible rate and distance, for a choice of a code

    length. The celebrated result of Shannon in his seminal paper  [1]  indicates that the rate of a code

    cannot exceed a quantity that depends only on the channel used, called the  capacity . However inthe same paper, Shannon also mentioned that of all (n, k) codes, if we were to randomly choose

    a code, it will have a rate that is almost equal to the capacity. See any standard text, such as

    [2, 4], on information theory for detailed discussions. But Shannon does not provide a construction

    to realize such codes. About thirty years later, Robert McLiecee and Berlekemp provided a useful

    result in [5] explaining that the decoding process becomes extremely difficult (NP-Hard) if one is

    not given any prior information about the code, as in the case of a random code (and later a

    similar result was shown for computing the distance of a linear code in [6]). Hence it seemed that

    2

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    3/39

    though codes actually exist with a high encoding rate, they cannot be used in any application as

    the corresponding decoding problem is intractable. In the middle of these two discoveries, Robert

    Gallager (1963) proposed an alternative type of codes [7] which do not have all the features of a

    random code, used in Shannon’s proof, but yet they have a good encoding rate. The bargain for the

    loss in randomness is the ease of decoding such codes. Gallager termed these codes as   Low-density 

    Parity-check codes   (LDPC). From the name suggests they are indeed a type of linear codes, butthey have an additional structure – they are specified by parity check matrices that are sparse.

    However, this remarkable discovery was largely ignored for the next thirty years, until 1990 when

    their potential was rediscovered  [8]. Today, there exists proposals for applications of  LDPC   codes,

    in almost any context of classical communication.

    The purpose of the content in the sections to follow is to highlight important properties of  LDPC

    codes and compare them to the properties of random linear codes that were used by Shannon to

    prove his result. Almost all of the major results on   LDPC   codes have been studied from Robert

    Gallager’s PhD thesis [7, 9]. In some cases, the proofs have been expanded to include details. In the

    LDPC  setting, it turns out that unfortunately the distance of a code alone does not access its error

    correcting capabilities at best [10, 11]. Also, the distance distribution of a linear code, which is alsoits weight distribution, is very useful in computing various properties of the code with reference

    to the performance of decoding algorithms. In this regard, we will also mention an expression

    for the asymptotic weight distribution of the type of   LDPC   codes, introduced by Gallager. Even

    otherwise, computing the weight distribution of a linear code is an interesting problem by itself and

    has been studied widely [12, 13]. We have also used the PhD thesis of Sarah Sweatlock [14] to show

    derivations for the weight distribution of  LDPC  codes.

    A lot of the motivation for designing  LDPC  codes is the ease in their decoding. The decoding

    procedure is tailored to take special advantage of the sparse structure of the parity check matrix

    defining these codes. We will not touch upon any details of the decoding procedure and merely

    mention that an efficient decoder exists. We would like to refer the interested reader to [7, 15, 14]

    for an extensive study.

    2 Essentials of classical coding theory

    2.1 Linear codes

    A classical code is simply a collection of strings, in this case, over the binary alphabet. By itself a

    set of binary sequences does not have enough structure for us to be able to represent it compactly.

    Hence we will resort to those codes that have a compact description. This is essential for various

    reasons, one of the important ones being the efficiency of detecting errors on this code. A particular

    subset of classical binary codes can be represented as the kernel of a matrix. By definition, codesin this subset must form vector spaces over the binary field and such codes are called Linear Codes.

    For the sake of completeness, we will define a linear code.

    Definition 2.1  Linear Code 

    A  (n ,k,d)   linear code  C   is a  k−dimensional subspace of   Fn2 , such that any two sequences in  C   are separated by a hamming distance of at least  d. The linear code  C  is presented in two ways, namely,as 

    3

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    4/39

    1. The kernel of a  m × n  matrix  H , whose row rank is  n − k, called the Parity Check Matrix,

    C  =  Ker (H ) = {x ∈ Fn2   :  H  · x = 0} ,   (2.1)

    2. The Row space of a  k × n  full rank matrix  G , called the Generator Matrix,

    C  =  Row (G ) = {x · G  :  x ∈  Fk2} .   (2.2)

    The quantity   n   is referred to as the “length” of the code,   k/n   as the “rate” and   k   as the 

    “number of encoded bits”.

    In situations where we do not want to specify the distance of  C, we will simply refer to C  as a  (n, k)linear code.

    The parity check matrix  H  is usually assumed to be full rank for linear codes, in which case, one

    can tell that the difference between the number of columns and the number of rows, of  H, is the

    number of encoded bits or the dimension of the code,  k . Unless stated explicitly, we will relax this

    assumption. Nevertheless, the the rows of  H  will span the dual code C⊥ [15]. As a consequence of this relaxation, we cannot be precise about the rate of such codes.

    2.2 Graphical representation of linear codes

    We are always interested in compactly representing a code. Since a sparse parity check matrix can

    have large dimensions, specifying the entire matrix can take too much space and moreover, the

    relevant information, which consists of the position of ones, in the parity check matrix, does not

    grow as rapidly as the size of the matrix. For this reason, we would like to store only the positions

    of ones in the parity check matrix. One method of realizing this was first introduced by Tanner

    in 1978. In  [16], he suggested that a linear code can be identified with a bipartite graph, whose

    adjacency matrix is closely related to H. Such a graph is called a  Tanner graph . Since we will recall

    this graph multiple times, let us define it.

    Definition 2.2  Tanner graph 

    A Tanner graph corresponding to a   m × n  parity check matrix  H   is a bipartite graph, denoted by T 

      def  = (C ,V,E  )  whose nodes are  C  ∪ V  and edges are  E , where  |C | =  m, |V | =  n  and 

    E  = {(ci, v j) : 1 ≤ i ≤ m,   1 ≤ j ≤ n,   H ij  = 1}.   (2.3)

    Elements of  C  are referred to as   check nodes  and elements of  V   as   vertex nodes.

    Below is a simple illustration of a Tanner graph corresponding to the parity check matrix of a(5, 1, 3) repetition code. We will subsequently see more complicated examples.

    4

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    5/39

    H =

    1 1 0 0 00 1 1 0 0

    0 0 1 1 0

    0 0 0 1 1

      Figure 1: The Tanner graph for a (5, 1, 3) repeti-

    tion code, whose parity check matrix is given on

    the left.

    The following properties a Tanner graph corresponding to a parity check matrix can be inferred

    straightforwardly.

    Lemma 2.1  Properties of a Tanner graph 

    Let  T   be a Tanner graph corresponding to a   (n,k,d)   linear code  C   specified by the parity check matrix  H . Then,

    1. For  V  ⊂  V   such that  #V < d, if  T  is the vertex induced subgraph of  T   of the set  V  ∪ C ,then there are no basis vectors of  C  whose support is entirely inside  T .

    2. For a tanner graph  T , its transpose  T T  is a Tanner graph  T T    def  = (V,C ,E  )  that describes all the sequences in the kernel of  H T , which specifies a linear code  CT  with  kT  encoded bits, where kT  = n − m + k.

    We say that a check is satisfied by a binary sequence if the corresponding row of the parity checkmatrix is orthogonal to the binary sequence. Along these lines, the decoding problem can be

    formulated analogously in the Tanner graph setting. See [7, 15]  for details.

    3 Classical LDPC codes

    We finally impose the requirement of sparseness directly on the parity check matrix and equivalently

    impose that the corresponding Tanner graph will have all (check and vertex) nodes of constant

    bounded degree. This will lead us to formally define a  Low Density Parity Check Code  or an LDPC

    code, as introduced1 by Gallager in [9].

    Definition 3.1   Classical   LDPC  Codes A family of   (n, k)   linear codes is called a Low-density parity-check ( LDPC ) family if each   (n, k)

    code of the family has a parity check matrix that has   ωc   ones in each column and   ωr   ones in 

    each row, for some constants  ωc, ωr, independent of  n. The   LDPC   family of codes is referred to a 

    (n, ωc, ωr)−regular   LDPC  family.1Gallager introduced only regular-LDPC  codes. Subsequently, analogous definitions of an   irregular   LDPC   family 

    has been provided by assuming  H to have a column weight of  at most  ωc  and a row weight of  at most  ωr. See [17, 14]

    for details.

    5

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    6/39

    Equivalently, each vertex node of the Tanner graph corresponding to every (n, k) code in the LDPC

    family must have a degree of at most   ωc  and each check node of the Tanner graph must have a

    degree at most  ωr, where  ωc, ωr  are constants, independent of  n. Since the rows of a parity check

    matrix are not assumed to be linearly independent, we cannot precisely tell the rate of a code from

    its parity check matrix. However, we can provide the following lower bound on the rate, given  ωr, ωc

    for a  LDPC family.

    Lemma 3.1   Rate of   LDPC  codes 

    The rate of any code in a  (n, ωc, ωr) − LDPC  family is at least  1 −  ωc/ωr .

    Proof :   For a linear code described by a m ×n parity check matrix H, the rate R  is defined in (Def.2.1) as

    R = n − rank(H)

    n  .   (3.1)

    As the rows of  H  are not guaranteed to be linearly independent,  rank(H) ≤ m. The number of onesin  H  can be expressed in two ways, i.e,   ωc × n  and   ωr × m, both of which are indeed equal, i.e,m =  n ωc/ωr . The lemma now follows from (Eq.   3.1).  

    The quantity 1 − ωc/ωr   is often referred to as the  design rate  of a (n, ωc, ωr) − LDPC  code family.The one other property that is required to assess the error correcting property of a family of 

    LDPC codes is the code distance. For any code in the family, its distance is given by the weight of its

    least weight codeword. The number of binary sequences of a given weight w   is

    n

    w

     and only some

    of them belong to a particular code. The only method to verify which ones do and which ones do

    not, is to naively test the membership of every possible sequence in this set. Hence the complexity

    of determining the lowest distance in this way, will increase sharply with  n  and furthermore, the

    problem of determining the distance of a  LDPC  code is known to be  NP-Hard, see [6, 18].On the other hand, as   n   increases, the number of sparse matrices for given (n, ωc, ωr) also

    increases rapidly  [17,  14, 9]. So, rather than precisely telling the distance of a particular code, it

    makes more sense to talk about the distance of “most” of the codes in a family. We will formalize

    this notion by an expression for the probabilistic distance of a family of  LDPC  codes.

    3.1 Shannon’s construction of parity check codes

    Let us first start by studying the probabilistic distance of the family of all (n, k) linear codes, which

    was used by Shannon to prove his result. Each sequence can be associated to a probability, describing

    the likelihood of the sequence being in a (n, k) linear code. Likewise, we can then associate the

    probability of an (n, k) linear code to have a distance  d  for some  d > 0.

    Suppose a binary sequence has    ones in it. It will belong to a code C  described by a parity checkmatrix  H  if and only if it is orthogonal to every row of this parity check matrix. Let us denote the

    probability of this event, by  Prob(). Now the number of binary sequences in a (n, k) linear code,

    of weight , denoted by N () can be expressed as

    N () =

    n

    Prob() .   (3.2)

    6

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    7/39

    As a result, the probability of a (n, k) code to have a distance  d, of at least  δn, can be expressed as

    Prob(d > δn) ≥ 1 −δn=2

    n

    Prob() ,   (3.3)

    for some  δ > 0. We have assumed that the linear code has distance that is greater than one (Oneway of enforcing this is by requiring that all codewords in the code to be of even weight). By itself,

    such an assumption is completely genuine since a distance 1 code cannot handle bit flip errors any

    better than the unencoded message itself. We have not elaborated on the specific form or bound on

    Prob() for a linear code. In what follows, we will first provide a simplification of (Eq.   3.2) and (Eq.

    3.3) assuming a general (n, k) linear code, with no further restrictions on the parity check matrix.

    Subsequently, we will provide a similar statement of (Eq.   3.2) and (Eq.   3.3) for an  LDPC  family,

    assuming the restrictions on  H provided by (Def.   3.1).

    For the sake of formality we must define an  ensemble  of (n, k) linear codes so that any computed

    parameter of a general (n, k) code can be viewed as the average over all values of the parameter,

    computed for each code in the ensemble. Since every code is completely specified by a parity

    check matrix, the ensemble of codes can be held in correspondence with an ensemble of parity check

    matrices. So, we can talk about ensembles of parity check matrices, in the exact same way as we talk

    about ensembles of the corresponding linear codes that each of them specify, see  [7, 17, 14, 19, 15].

    One of the most common ensembles of codes, is the one formulated by Shannon, described below.

    Definition 3.2   Shannon Ensemble 

    The Shannon ensemble is an equiprobable ensemble of   m × n   parity check matrices, where each matrix is constructed by choosing each bit in the matrix, independently, with identical probabilities 

     for all bits.

    In other words, each parity check matrix is described as a collection of  nm  statistically independent

    binary digits, each of which have an equal probability of being one or zero. Clearly, each row

    would contain on an average, n/2 ones, which indicates that the Shannon ensemble is not an  LDPC

    ensemble. All the rows of the parity check matrix need not be linearly independent and so the

    corresponding code will have a rate that is at least 1 − m/n. The definition in (Def.   3.2) allowsus to study the average weight distribution and subsequently the average distance of codes in this

    ensemble. Let us begin with the following result on the average weight distribution.

    Lemma 3.2   The weight distribution of a   (n, k)  code, averaged over all codes in the Shannon en-

    semble, denoted by  N (), is bounded above as follows 2.

    N () ≤

     2[H 2( /n )−(1−k/n )] (3.4)

    where  H 2  is the binary entropy function defined by 

    H 2(λ) = λ log2 λ − (1 − λ)log2(1 − λ) (3.5)

    2We would like to point our that the left side of the inequality in (Eq. 2.1 of  [9]) must be  N () =

    n

    2n(1−R).

    7

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    8/39

    Proof :  The average of  N () over equiprobable ensemble of all binary (n, k) linear codes, can be

    expressed similar to (Eq.   3.2) as

    N () =

    n

    Prob() ,   (3.6)

    where  Prob() is the probability of a binary sequence to be orthogonal to all the rows of a paritycheck matrix in the Shannon ensemble, averaged over all possible parity check matrices in the

    ensemble. Consider the  jth row of a parity check matrix. This row will have n  bits independently

    chosen, each being zero or one with equal probability. Let x be the position of a one in this row. Any

    n−bit sequence which satisfies this parity check will violate this check if its  xth bit is flipped andany binary string which violates this check will satisfy this check if its  xth bit is flipped. Therefore

    if every bit of an  n−bit sequence is chosen independently, to be zero or one, with equal probability,then the probability of the sequence satisfying a row of the parity check matrix is the probability of 

    choosing the xth bit to be a particular value (either zero or one, depending upon the particular parity

    check), which is always 1/2. Since each row of the parity check matrix is chosen independently,

    the probability that a binary sequence will satisfy all rows of the parity check matrix, i.e, will be acodeword is just 1/2m . Therefore, the probability that any weight   sequence will be a codeword is

    Prob() = 2−m .   (3.7)

    Hence the total number of weight    sequences in a code (following Eq.   3.2) is

    N () =

    n

    2−m.   (3.8)

    The combinatorial factor in the above expression can be bounded above using the Stirling’s approx-

    imation formula in [20], given by3 n

    λn

     ≤   1 

    2πnλ(1 − λ) 2log2 p

    λn(1−λ)n(1−λ) =  1 2πnλ(1 − λ) 2

    −nH 2(λ) ,   (3.9)

    where λ  =  n/   and H 2  is the binary entropy function described in (Eq.   3.5). Combining the above

    with (Eq.   3.8) gives (Eq.   3.4).  

    Notice that the number of codewords of a particular weight grows sharply with  n, in fact, it scales

    exponentially. Hence, we will define the logarithm of the quantity in (Eq.  3.4), in the large  n   limit,

    to be of interest. Adopting the notations in  [14], this quantity is defined by

    E (θ) = limn→∞

    1

    n

     ln N 1 (θ n) (3.10)

    where E θ is called the Spectral distribution  associated with a particular ensemble. We have implicitly

    taken    to be   θ/n . See [17, 19, 14] for this terminology. In the case of the Shannon ensemble, we

    find the spectral distribution, given by  E 0(θ), where

    E 0(θ) = H 2(θ) − (1 − R) ,   (3.11)

    3In Eq. 2.3 of  [9], the

    n

     must be replaced by  n!.

    8

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    9/39

    and  R  is the design rate of the Shannon ensemble consisting of  m × n  parity check matrices.Combining (Lemma.   3.2) with (Eq.   3.3) yields the average distance of a code in the Shannon

    ensemble. We will show below that the probability of a code in the ensemble to have a distance

    at least   δn   for some 0 ≤   δ  ≤   1 approaches one for any constant   δ   as  n → ∞. This just meansthat as the block length increases, there are many codes in the Shannon ensemble that have a large

    distance, i.e, that which scales linearly with the length of the code. This is a particularly desirablefeature and indicates that there are many “good” codes in this ensemble.

    Corollary 3.1   Average distance of a  (n, k)   linear code 

    The probability of a  (n, k)  code from the Shannon ensemble, to have a distance  d > δn   is bounded 

    below as 

    Prob (d > δn) ≥ 1 − 2−n(1−k/n−H 2(δ))  1 − δ 1 − 2δ   ,   (3.12)

    where  H 2(δ )  is the binary entropy function, in (Eq.   3.5 ).

    Proof :   The corollary follows by combining the expression for the average number of codewords of a given weight in (Eq.   3.4), with (Eq.   3.3). For details, see [7, 21].  

    Despite the simple description and generality of the Shannon ensemble, there are some remark-

    able features. It can be shown that the first  d⊥ moments of the weight distribution for  any   (n,k,d)

    linear code C  is the same as the first  d⊥ moments of the Shannon ensemble, where  d⊥ denotes thedistance of the dual code C⊥. See [14] for details. Though this fact is unrelated to the progress of this writeup, it is worth mentioning it.

    3.2 The Gallager code

    As we promised, we will now derive average weight distributions and a result analogous to (Lemma.3.2) as well as (Eq.   3.11), eventually describing the average distance of a (n, ωc, ωr) − LDPC  codeensemble. Since, the parity check matrices in the Shannon ensemble need not necessary satisfy

    constrains of an   LDPC   parity check matrix (because a random choice of a binary sequence, to

    represent a row of the parity check matrix, will imply  n/2 ones in a row of the parity check matrix),

    in order to specify an LDPC ensemble, one must adopt a different specification of an ensemble. There

    are numerous methods to do this, a few straightforward methods involving independent choices of 

    n−bit sequences of bounded weight, constituting the rows of the parity check matrices in theensemble. Alternatively, one can define a   LDPC  ensemble as a collection of   m × n   parity checkmatrices, each of which are constructed by independently choosing  αn + βm  and assigning the value

    of the corresponding bit in the parity check matrix at the respective locations to be one. It isimportant that  α, β  be a constant independent of  m. Though these are valid considerations for an

    ensemble (and many more are also provided in  [17, 14, 19]), we will analyze the ensemble proposed

    by Robert B. Gallager in his PhD thesis in 1964  [7]. This ensemble is called the  Gallager ensemble 

    and is defined by a subset of parity check matrices having weight  ωc   columns and weight  ωr   rows.

    This only those parity check matrices realized by the following construction are defined to be in

    this subset.

    9

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    10/39

    Definition 3.3   Gallager construction 

    A parity check matrix  H   in a   (n, ωc, ωr)  Gallager ensemble, if and only if it can be constructed as 

     follows. Let  H 1  be a  ( n/ωr ) × n binary matrix whose  ith  row has 1 at column positions  {(i − 1)ωr +1, (i − 1)ωr + 2, (i − 1)ωr + 3, . . . , i ωr}. Let  H 2, . . . , H  ωc   be  ( n/ωr ) × n  matrices, each defined as a random column permutation of  H 1. The parity check matrix  H   is then specified as 

    H  =

    H 1H 2

    ...

    H ωc

    .   (3.13)

    A few constraints have been implicitly assumed, namely  n  must divide  ωr   and  m  =   nωc/ωr , see

    [9, 15] for details. A simple example of a parity check matrix from the (24, 3, 6) Gallager ensemble

    is given below.

    H =

    1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

    0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0

    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1

    0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1

    1 0 1 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0

    0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0

    0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 1 0

    0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 0 1 1

    0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0

    1 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0

    0 1 0 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0

    (3.14)

    Another example of a parity check matrix and the associated Tanner graph for a code from the

    (36, 3, 4) Gallager ensemble.

    10

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    11/39

    Figure 2: Above is a Parity check matrix from

    the (36, 3, 4) − LDPC   Gallager ensemble and its

    corresponding Tanner graph is given on the right.The embedding of nodes of the Tanner graph is

    chosen to emphasize the constant degree of each

    vertex as well as check nodes.

    Though this construction may seem artificial, it is remarked in [17,  19,  14]  that many other con-

    structions of  LDPC code ensembles (which by definition are not contained in the Gallager ensemble)

    are also observed to have the same average values of the code parameters.

    As for any regular (n, ωc, ωr) − LDPC  code family, it follows from (Lemma.   3.1) that rate of theabove code is at least 2/5. The distance of a code in the Gallager ensemble cannot be efficiently

    exactly computed as in the case of the Shannon ensemble. We will analyze the average distance as

    well as weight distribution for codes in the Gallager ensemble, following similar techniques as in theanalysis of the case of Shannon’s ensemble (in Eq.   3.11, Cor.   3.1), but the specific construction of 

    parity check matrices, in (Def.   3.3), of the Gallager ensemble will lead to some distinctions here.

    Let us introduce some terminologies first. The m  rows of a parity check matrix,  H  (in Def.   3.3),

    of the Gallager ensemble can be separated into a  ωc  blocks, each containing   n/ωr   rows4. Within a

    block, each bit is checked at most by one of the   n/ωr   rows only. Conversely, all the parity check

    sets are distinct supports and together they check all of the  n−bits. Indeed any codeword whichsatisfies all the   m  parity checks in  H   is a codeword, but since each of the   ωc   blocks are chosen

    independently, the event that a binary sequence satisfies all  m  checks in  H   is the joint outcome of 

    ωc  independent events – each one being the event that the corresponding binary sequence satisfies

    an independently chosen block of  H. Let us suppose that the number of binary sequences of weight that satisfy any one the j  blocks of  H  for  ωc  >  1, is denoted by  N 1(). Then the probability that a

    binary sequence of weight    satisfies all  m  checks in  H, and subsequently in a code of the ensemble

    is given by

    Prob() =

    N 1()n

    ωc

    .   (3.15)

    4In his original text [7], Gallager refers to each of the rows within a block as a   parity check set .

    11

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    12/39

    The above statement is very similar to the one in (Eq.   3.7) and consequently, the average number

    of codewords of weight   in the Gallager ensemble is just

    N () =

    n

    N 1()

    n

    ωc.   (3.16)

    What remains is to determine N 1(). In the case of the Shannon ensemble, P () was 2−m regardless

    of   . This made much of the analysis simple. However, in this case, each row of  H   is not chosen

    independently, but every set of   n/ωr   rows is. So, the  P () will depend on  . Consider the block

     j0   for some fixed  j0  > 1. Let us describe a construction of a set,  S , of all binary sequences which

    satisfy every row of  j0. Since each row has a distinct support which doesn’t have any overlap with

    the others, it suffices to choose the union of all binary strings that satisfy each of the individual

    rows of  j0. Consider the following construction of  S .

    1. Let   S   contain the all zero binary sequence. This satisfies all the rows of   j0   trivially and

    |S |  = 1. For the first row of   j0, let   i1, i2, . . . , iωr   denote the position of 1s in this row. Forevery sequence in S , add every possible sequence that can result from this sequence by flippingbits an an even number of positions in   i1, i2, . . . , iωr . Each such binary sequence produced

    by these bit flips will have an even overlap with the first row if  j0. Moreover since the rows

    of  j0  have distinct supports (mutually exclusive), the new binary sequence will have a trivial

    overlap with all rows of   j0. Note that the number of locations at which the bit flip occurs

    is not important. We will choose every even subset of   i1, i2, · · ·  , iωr  with equal probabilities.Consequently, the probability of a weight  r   sequence in the ensemble is just

    ωrr

    2−(ωr−1),

    where 2ωr−1 = |S |, is the total number of even weight of weight at most  ωr. The next stepis clear – we will repeat the same process in set of distinct positions   i1, . . . , i

    ωr , and choose

    to flip corresponding bits independent of which bits were flipped previously. However, it

    will be difficult to compute the updated probability of a sequence in this new ensemble to

    have weight  r   since it will involve a convolution of the probability distribution functions of 

    the first two steps. In order to ease the updating of the probability distribution function,

    let us instead update the moment generating function. It is well known that the moment

    generating function for a sum of random variables is merely the product of the individual

    moment generating functions. The moment generating function for the number of ones in a

    binary sequence in  S  is given by  g (s) expressed as follows5.

    g(s) =ωr

    i=2i∈even

    ωri

    2−(ωr−1) = 2−ωr

    (1 + e−s)ωr + (1 − e−s)ωr

      (3.17)

    ⇒ g(s)   def = 2−(ωr−1)Z (s) (3.18)

    where Z (s) = (1 + e−s)ωr + (1 − e−s)ωr

    2  (3.19)

    5Note: for a symmetric random variable, the moment generating function is also symmetric about the origin.

    Hence the  s  in the exponent of (Eq.   3.17) can be replaced by  −s  to see an equivalent expression, as presented in  [7].

    We choose to follow the notation in  [19, 14].

    12

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    13/39

    2. Let i1, . . . , iωr  now denote the position of ones in the second row of  j0. Clearly, ix = iy  for any

    x =  y. We take each sequence in  S   and add all possible sequences that can result from bitflips at an even number of locations in  i 1, . . . , i

    ωr . Now the number of strings in the ensemble

    S   is 2ωr−1 ×2ωr−1. The number of ones in a n-bit string of the ensemble is simply the numberof ones added in this step plus the number of ones added in the previous step. The two steps

    are independent, they can be performed in any order and will not interfere with each other.Hence the moment generating function for the number of ones (produced by two independent

    events) is  g(s) × g(s).3. Repeating the above two steps for each row of  j0  produces an ensemble  S  of size 2

    n(ωr−1)/ωr

    with the corresponding moment generating function for the weight distribution in the ensemble

    given by [g(s)]n/ωr .

    Finally, we must infer the probability of a sequence to have weight     in the ensemble from the

    moment generating function.

    Denoting the probability of a weight    binary sequence to satisfy any one of  ωc  blocks of  H  by

    Q(), we can express the probability that a weight    sequence satisfied all the  m  checks in  H as

    [g(s)]n/ωr Q() =n=0

    exp(s)Q() (3.20)

    which subsequently provides an upper bound on  Q() given by (see [7, 21, 22] for details)

    Q() ≤ exp

     n

    ωrln g(s) − s

      (3.21)

    = expn

    k ((ωr − 1) ln2 + ln Z (s)) − s

      (3.22)

    where we have used (Eq.   3.18). The average number of sequences in  S  of weight  as

    N 1() ≤ 2 n(ωr−1)/ωr exp

     nωr

    (−(ωr − 1) ln2 + ln Z (s)) − s

      (3.23)

    = exp

     n

    ωr(ωr − 1) ln 2 +   n

    ωr(−(ωr − 1) ln2 + ln Z (s)) − s

      (3.24)

    Notice that  s   is a free parameter in the above expression, however since we are only interested in

    the upper bound for  N 1(), we can assume    to be the function of  s  that maximizes the right hand

    side of the above inequality. This is found by setting the derivative of the expression in (Eq.   3.24)

    to zero and it gives

     =  n

    ωr

     d

    ds(ln Z (s))

      .   (3.25)

    Hence using this value of  , we find

    N 1() ≤ exp

     n

    ωr

    ln Z (s) − s  d

    ds(ln Z (s))

      (3.26)

    and combining the above expression with (Eq.   3.16) yields the average number of words in a

    Gallager code, of weight  , given by

    N () =

    n

    −(ωc−1)exp

    ωc

    n

    ωr

    ln Z (s) − s  d

    ds(ln g(s))

      .   (3.27)

    13

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    14/39

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    15/39

    Figure 3: Similar plots can be found in many articles that discuss the asymptotic weight distribution

    of  LDPC  codes, like [14, 19, 21, 22] to name a few.

    We will close this section with the discussion about the average distance of a code in the

    (n, ωc, ωr)−Gallager   LDPC   family. This will follow as a corollary to the result in (Eq.   3.27) inthe same way as in (Cor.   3.1).

    Corollary 3.2  Distance of codes in the Gallager ensemble 

    The probability of the distance of a code in the  (n, ωc, ωr)−Gallager’s  LDPC  family realized from the construction in (Def.   3.3 ), to be greater than  δn  is bounded below by the following inequality.

    Prob (d > δn) ≥ 1 −   (ωr − 1)ωc

    (n − 1)ωc−2  + [2πnδ (1 − δ )](ωc−1)/2

    δn=4

    exp

    nE (θ) +

      ωc − 112nθ(1 − θ)

      (3.33)

    Proof :  Combining (Eq.   3.16) with (Eq.   3.3) which relates the probability of a code to have a

    distance at least  d  in an ensemble to the average weight distribution of the ensemble, we find

    Prob(d > δn) ≥ 1 −δn=2

    ∈  even

    n

    −(ωc−1)N ωc1   () (3.34)

    15

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    16/39

    = 1 −

    n

    2

    −ωc+1N ωc1   (2) −

    δn=4

    ∈  even

    n

    −(ωc−1)N ωc1   () (3.35)

    Now,   N 2() is simply the number of weight two sequences that will satisfy any block of a parity

    check matrix of the ensemble. Take the  j th block, it has   n/ωc  rows, each of weight  ωr. Any binary

    sequence b   that is orthogonal to a row must have even overlap with it and remember that all the

    rows in a block are mutually exclusive, which means that  b  must have even overlap with each row

    of the j th block. To have an even overlap, we can choose 2 locations out of  ωr  (and there are

    ωr2

    ways of doing so), at each row of the  jth block, which leads to the expression

    N 1(2) =  n

    ωr

    ωr2

     .   (3.36)

    Combining this with (Eq.   3.35) gives

    Prob(d > δn) ≥ 1 −  (ωr

     −1)ωc

    (n − 1)ωc−2 −δn=4

    ∈  even

    n−(ωc−1)

    N ωc1   () ,   (3.37)

    where to obtain the inequality, we have assumed  n ≤   2(n − 1) in the expansion of the binomialcoefficient. For a bound on the second term in the right side of (Eq.   3.37), we will use a lower

    bound to the combinatorial factor, resulting from the form of the stirling approximation, presented

    in [20]. n

     ≤   1 

    2nπθ(1 − θ) exp

    nH 2(θ) −   112nθ(1 − θ)

      (3.38)

    This results in the below set of simplifications, starting from the second term in the right side of 

    (Eq.   3.37).

    ≤δn=4

    [2πnθ(1 − θ)] (ωc−1)/2 exp

    nH 2(θ) −   112nθ(1 − θ)

    −ωc+1exp

    ωcn

    ωr

    sf s − f (s)   (3.39)

    ≤δn=4

    [2πnθ(1 − θ)] (ωc−1)/2 exp−n(ωc − 1)H 2(θ) +  nωc

    ωr

    sf (s) − f (s) +   ωc − 1

    12nθ(1 − θ)

      (3.40)

    ≤ [2πnδ (1 − δ )] (ωc−1)/2δn=4

    exp

    n

    ωcωr

    sf (s) − f (s)− (ωc − 1)H 2(θ)

    +

      ωc − 112nθ(1 − θ)

      (3.41)

    ≤ [2πnδ (1 − δ )] (ωc−1)/2δn=4

    exp

    nE (θ) +   ωc − 112nθ(1 − θ)

      (3.42)

    where in (Eq.   3.40) we have used the functions we defined in (Eq.   3.31), along with  θ  =   /n   and

    in (Eq.   3.42) we have used the spectral distribution function from (Eq.   3). Finally, combining (Eq.

    3.42) with (Eq.   3.37) provides the statement in the corollary.  

    The exponent in the upper bound in (Eq.   3.33) suggests that if the exponent is positive, then the

    probability of having a distance d > δn is vanishingly small. However, when the exponent is negative

    16

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    17/39

    we see that the probability of having a distance  d > δn   code in the Gallager (n, ωc, ωr) − LDPCfamily is almost equal to one. We can now plot the exponent of (Eq.   3.42) to observe that the

    upper bound behaves likes a step function, guaranteeing a very good distance of the   LDPC   code

    when the exponent is negative.

    Figure 4: The above plot shows the exponent in (Eq.   3.33) as a function of  θ =   /n . Here  ωc  = 6

    and the design rate   ωc/ωr   = 1/2. Note that as  n   increases, the exponent becomes negative for

    some θ . Moreover a larger value of  n  denotes a higher value of  θ  corresponding to which the  y−co-ordinate is negative. This means codes of larger lengths have (with probability arbitrarily close

    to one) a distance that increases with the the code length. Though the linear scaling cannot be

    inferred from this plot alone, better bounds in [9] suggest that the distance of a code scales linearly

    with n, with a probability that is arbitrarily close to one, as long as the scaling factor, δ , is boundedby a function of the resign rate of the code, see [ 9].

    17

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    18/39

    4 Essentials of quantum coding theory

    4.1 The quantum setting

    In the previous sections, we saw a class of classical parity check codes that have sparse parity

    check matrices. In this section, we will develop the necessary framework to introduce the quantum

    analogues of classical   LDPC   codes. The similarity in the decoding problem for general classicallinear codes and quantum codes is that they are both intractable. Hence there is yet again a need

    to consider some structure on the codewords of a quantum code in order to achieve fast decoding.

    One such requirement is imposed on a quantum code, which is very much related to the requirement

    of a spree parity check matrix. This leads us into the study of  Quantum   LDPC  Codes . However,

    the study of quantum  LDPC  codes has started only recently, unlike its classical counterpart.

    The stabilizer formalism developed by D. Gottesman in his PhD thesis  [23] allows us to represent

    a quantum code as a common eigenspace of an abeliean group formed by  n−qubit Pauli operators,known as the stabilizer subgroup. He also showed that one can perform error correction without

    having to explicitly worry about the structure of the  n−qubit sequences in the code. Conversely,any abelian subgroup of the Pauli group on   n−qubits can be associated to a quantum code onn−qubits. The construction of quantum codes is therefore achieved by constructing the suitablestabilizer subgroup [23, 24]. It turns out that the complexity of the decoding algorithm increases

    sharply with the weight of the generators of the stabilizer group. Hence it is beneficial to consider

    codes that have stabilizer groups whose generators are all low weight, in particular, families of 

    stabilizer codes whose stabilizer generators have at most a constant weight. In the symplectic

    encoding of Pauli matrices   [23] this condition is synonymous with the requirement of a sparse

    parity check matrix. The parity check matrix of a stabilizer code is a matrix whose rows represent

    the symplectic encoding of the corresponding stabilizer generators. This defines for us a quantum

    LDPC  code.

    Recall that in the classical case, constructing a parity check matrix for a linear code was trivial– it required a arbitrary choice of  m  sequences of length  n  each and assigning them to the rows

    of the parity check matrix. However, in this case, the abelian structure of the stabilizer group

    rules out an analogous construction technique for a stabilizer group. It is important that the the

    random choice of Pauli operators, all commute with one another. One method of achieving this was

    proposed by Calderbank, Steane and Shor in 1998 [25] which involved the construction of an abelian

    group, starting from the parity check matrix of a classical linear code. In particular, the  X −andZ −type operators were encoded using orthogonal binary strings that ensured that the correspondingn−qubit operators always anti commuted on an even number of qubits, eventually making themmutually commuting n−qubit operators. This meant that given a parity check matrix, we require aset of vectors that are orthogonal to every row of the parity check matrix. Following the discussion

    on classical codes in (Eq.   2.1,   2.2) we find that the set of orthogonal vectors to the rows of the

    parity check matrix is precisely the rows of the corresponding Generator matrix.

    However, there are some serious flaws in the procedure when we try to construct a quantum

    LDPC code naively following the  CSS  construction. This is because of the observation in (Cor.   3.2)

    that good   LDPC   codes have a distance that scales linearly with  n, in other words, the weights of 

    the rows of the generator matrix scale with  n. So, they cannot be used to encode the stabilizer

    generators for an  LDPC  code. We will elaborate on this point later on too, in (Sec.   4.3). However,

    18

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    19/39

    some ideas [26,  27,  28,   29,  30] have been proposed to achieve a  CSS  construction for a quantum

    LDPC   code. We will study one of them in detail in the following sections. The key feature of this

    approach is that it involves the construction of stabilizer generators of a  CSS code, starting from two

    classical  LDPC  codes whose words are not necessarily orthogonal to each other. This construction

    was first proposed by Jean Pierre Tillich and Gilles Zemor in [27] under the name of  Hypergraph 

    product codes . Other constructions of  CSS codes which are proposed include one by David McKay,which requires special types of classical linear codes, whose generator matrix as well as parity check

    matrices are both sparse. They are called dual containing  LDPC  codes  and the family is also called

    LDGM   (Low-Density generator-matrix codes). We will briefly mention the construction of   CSS

    codes proposed by McKay in [29]  in (Sec.   6.2) and a recent construction of  CSS  codes, by Sergey

    Bravyi and Barbara Tehral in  [28], called   Homological product codes , in (Sec.   6.1).

    4.2 Stabilizer formalism and graphical representation

    A quantum code is simply a set of  n−qubit states. For similar reasons as in the classical case, wewill again confine to those quantum codes which have a compact description. It turns out that these

    codes can be identified with an abelian set of Pauli operators. This is the treatment of Stabilizer

    formalism. For the sake of completeness, let us list the necessary properties of a stabilizer code that

    we will recall later.

    Definition 4.1  Stabilizer codes 

    A   [[n,k,d]]  stabilizer code  Q  corresponding to a subgroup S , is the common eigenspace of all oper-ators in  S   with the following properties.

    1. S   is generated by a set of  n − k  independent generators and it has  2n−k distinct elements.2. The minimum weight of an operator in  N (S )\S   is  dQ.

    3. The number of generators for  N (S )/S   is  2kQ.For a stabilizer code  Q,  n  is referred to as the length,  kQ  as the number of encoded qubits and  d  as its distance.

    We refer the reader to [24, 23] to get an overview of the various terminologies and concepts involved

    with the Stabilizer formalism.

    As we mentioned in the previous section, one can associate a parity check matrix to a stabilizer

    code. This matrix is constructed by encoding each generator of the corresponding stabilizer sub-

    group in the symplectic method, outlined in  [24, 23]. Like in the classical case, we can also define

    the Tanner graph for a quantum code.

    Definition 4.2  Tanner graph for Quantum codes 

    A Tanner graph for a   [[n, k]]   stabilizer code  Q  with the stabilizer generators   S 1, S 2, . . . , S  n−k   is a bipartite graph  T   def  = (C ,V,E  )  whose vertex nodes are given by the set  V  ∪ C   where 

    V   = {vi : 1 ≤ i ≤ n} , C  = {ci : 1 ≤ i ≤ n − k} .   (4.1)The edges between nodes are specified by  E  =  E X  ∪ E Y  ∪ E Z   where 

    E X  = {(ci, v j) : 1 ≤ i ≤ n − k,   1 ≤ j ≤ n, S  ji   = X } ,   (4.2)

    19

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    20/39

    E Y   = {(ci, v j) : 1 ≤ i ≤ n − k,   1 ≤ j ≤ n, S  ji   = Y } ,   (4.3)E Z  = {(ci, v j) : 1 ≤ i ≤ n − k,   1 ≤ j ≤ n, S  ji   = Z } .   (4.4)

    where   S  ji   denotes the   2 × 2   Pauli matrix which forms the   jth tensor factor of the   ith stabilizer generator. The edges in the sets   E X , E Y    and   E Z   are distinguished by explicit labels over these 

    edges.Though the Tanner graph picture provides a succinct description of the stabilizer group S , it failsautomatically imply the restriction that the checks represented by nodes in C  of the above definition,

    mutually commute with each other. This is something that we must explicitly verify before assuming

    T   to be a Tanner graph for a stabilizer code. The Tanner graph notation is widely used in graphicaldescription of stabilizer codes. The [[5, 1, 3]] code in (Eqs. 3.17, 3.18 of [23]) has the following

    Tanner graph associated to it.

    S  = XZZXI,IXZZX,XIXZZ,ZXIXZ  ⇒ HS  =

    1 0 0 1 0 0 1 1 0 1

    0 1 0 0 1 0 0 1 1 0

    1 0 1 0 0 0 0 0 1 1

    0 1 0 1 0 1 0 0 0 1

    (4.5)

    Figure 5: Tanner graph for the [[5, 1, 3]] code.

    Note that every two checks nodes must represent not only any Pauli operators, but those which

    mutually commute as well. As a result it is not completely trivial to produce a new Tanner graphfrom an old one, by merely altering the neighbourhood of a check node. This is trivially possible in

    the classical setting. For instance, in the above example (of Fig.   5), we cannot delete all outgoing

    edges in the check node   c1, leaving out the edge (c1, v1) alone, just because the new operator

    which will correspond to   c1   will be  X 1, and it will fail to commute with the stabilizer generator

    represented by   c2. Therefore, the resulting graph will no more be a valid Tanner graph for a

    stabilizer code. Furthermore, at each vertex there are three possible type of operations –  X, Y   and

    Z , which significantly increases the complexity of constructing a valid Tanner graph.

    20

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    21/39

    4.3 CSS construction of quantum codes

    One of the widely studied type of stabilizer codes are the Calderbank Shor Steane (CSS) codes.

    These code were not originally proposed in the stabilizer formalism, we will refer to their definitions

    in [24,  23]. For a CSS   code, the corresponding stabilizer subgroup has the property that each of 

    its generators is either a tensor product of only  Z −

    type operators or only  X −

    type operators. In

    the symplectic encoding this implies that the parity check matrix of the stabilizer code is block

    diagonal. Each of the diagonal blocks specify the  X −type and  Z −type generators respectively.

    HQ =

    H1   0

    0   H2

      (4.6)

    Hence the condition that the generators must commute can be restated as

    H1 · HT 2   = 0 (4.7)

    Hence if  H1   is the parity check matrix for a classical code

     C1   and  H2   for a classical code

     C2, the

    above condition implies that C⊥2   ⊆ C1   and subsequently C⊥1   ∈ C2. The   CSS   code Q   is thereforecompletely specified by providing the parity check matrices for two classical codes that obey the

    condition in (Eq.   4.7). Consequently, the  CSS   code is denoted by Q  =  CSS(C1, C2). It turns outthat the parameters n, k  of the quantum code Q code can be related to the parameters  n1, k1, n2, k2of the two classical codes C1  and C2, respectively, as below.Lemma 4.1   Properties of a  CSS  Code 

    A   [[n,k,d]]  stabilizer code  Q  =  CSS (CX , CZ ), for a   (n, kX , dX )  classical code  CX   and a   (n, kZ , dZ )classical code  CZ   satisfies the following properties.

    1.   kQ = dim(

    CX /

    C⊥Z ) = dim(

    CZ /

    C⊥X ) = kX  + kZ 

     −n,

    2.   d = min{dX , dZ }  where 

    dX  = min{|e| :  e ∈ CX /C ⊥Z } ,   (4.8)dZ  = min{|e| :  e ∈ CZ /C ⊥X } .   (4.9)

    5 Quantum  LDPC   codes

    The Tanner graph for a  CSS  code will have at most two types of edges. We will choose to represent

    them by distinct colours in our diagrams, whenever necessary. The Shor’s [[9, 1, 3]] code is a well

    known example of a   CSS   code. For another extensive example, see (Fig. 11.7 of   [31]). The

    abelian nature of  S  indicates that there are always an even number of vertex nodes in the commonneighbourhood6 of any two pair of opposite type check nodes. It suffices to ensure the validity of 

    this property for every pair of opposite type checks for the resulting Tanner graph to describe a

    CSS codes. One can now describe families or ensembles of  CSS  codes. Since every  CSS  code can be

    associated to a parity check matrix, let us consider an ensemble of parity check matrices again to

    6By a “common neighbourhood” of two nodes  x1, x2, we mean all the nodes of the graph (excluding  x1, x2) that

    are connected by an edge to both  x1  as well as  x2.

    21

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    22/39

    represent an ensemble of  CSS codes, based on (Eq.   4.6). This immediately leads to the possibility

    of introducing another constraint on the parity matrices in an ensemble, i.e, they be sparse. This

    ensemble of  CSS  codes is known as a  quantum LDPC code  ensemble.

    Definition 5.1   Quantum   LDPC  codes 

    A quantum   LDPC   ensemble is a family of   [n, k]   stabilizer codes, such that for each code  Q

      in 

    the ensemble, the corresponding stabilizer group S  can be described by a choice of generators, S  =S 1, S 2, . . . , S  n−k  such that  wt (S 1) < c  as  n → ∞, for some constant  c > 0, independent of  n.By this definition, the family of generalized Shor codes is not an   LDPC   family. A very famous

    example a  LDPC   ensemble of  CSS  stabilizer codes is the Toric Code ensemble, see [32, 27], and is

    given by the following Tanner graph.

    Figure 6: Tanner graph for a [[18, 2, 3]] code Kitaev’s Toric code ensemble. The codes can be made

    larger by increasing the size of the lattice and introducing stabilizer generators on the new vertices.

    The Kitaev’s Toric code family would have codes of the type [[2n2, 2, n]]. However, the generators

    are all exactly of weight 4,  irrespective of n .

    In this case, all stabilizer generators are of weight exactly 4, i.,e each parity check matrix

    has a constant row degree, which is 4 and a constant column degree, which is 4 again. One

    particular desirable feature of having a stabilizer generator set is that a quantum circuit to measure

    any syndrome for Q   can be achieved with at most a constant number of gates. This has severeimplications on the value of the error threshold for fault tolerant quantum computation using these

    codes [24].

    The   CSS   construction alone does not guarantee that any ensemble of   CSS   codes will be an

    LDPC   ensemble. There is an even more serious problem. In (Cor.   3.1) we had seen that random

    classical codes, those in the Shannon ensemble, all have distances that increase proportionately

    with  n. This is also the case with the Gallager ensemble, as shown in (Cor.   3.2). Furthermore, if 

    a classical ensemble does not guarantee the presence of codes with a growing minimum distance,

    22

  • 8/18/2019 Classical and Quantum Low-density parity-check codes: Constructions and properties

    23/39

then it does not offer good error correction properties either. So a growing minimum distance is indispensable for “good” classical codes. On the other hand, if we allow for a growing minimum distance, the dual space will have a parity check matrix (Tanner graph) whose row weights (check vertex degrees) increase with the code length. Naively adopting the CSS construction then leads to the inclusion of these unbounded weight rows, and hence we will never realize an LDPC code ensemble by combining the parity check matrix of an arbitrary LDPC ensemble with that of its dual, as suggested in (Eq. 4.7). The question therefore remains as to how one could combine two parity check matrices of classical code ensembles to produce an LDPC ensemble of quantum codes. One solution to this problem is an explicit construction of the parity check matrices of a quantum stabilizer code, starting with two parity check matrices from an LDPC ensemble, which do not need to obey (Eq. 4.7); in particular, they can even be equal. We will review this construction in the form it was first proposed in [27].

    5.1 Constructing the Hypergraph product codes

In [27] the authors have motivated the general idea behind producing a Tanner graph for a CSS code, given the Tanner graphs T_1 and T_2 of two classical codes. They have shown that the Tanner graph of the Toric code (Fig. 6) can be described as a cartesian product^7 (see also the illustration in [34]) of two (bipartite) graphs. In what follows, we will attempt a slightly different motivation, but largely following the same idea as in [27]. Let us take two Tanner graphs T_1 and T_2, respectively describing two classical codes C_1 and C_2, drawn below in (Fig. 7a).

    7See any standard text in graph theory, such as [33].



Figure 7: Motivation for the hyper graph product operation (panels (a) to (d), referred to in the text).

The most straightforward way of constructing a Tanner graph describing a quantum code is to simply take the disjoint union of T_1 and T_2, where T_1 describes the X-type checks and T_2 describes the Z-type checks, as shown in (Fig. 7a). Of course, the goal is to make a better code. In (Fig. 7b) this involves drawing connections between nodes within a particular column (marked by dashed lines). Naively this is not possible, because all nodes in a column are of one type, either vertex nodes or check nodes. At this point, let us relax this rule and connect them as if they were of alternating type (vertex, check, vertex, check, and so on); this is denoted by the gray outer circles in (Fig. 7c). Now let us look at the outer markers of the vertices and connect them; precisely how is specified by the second graph T_2. It is not hard to convince ourselves that there will be no resulting edges joining two check nodes or joining two vertex nodes. Moreover, if a check-vertex pair in T_1 is connected by an edge and a check-vertex pair in T_2 is also connected by an edge, then the corresponding mixed nodes, of type (check, vertex) and (vertex, check), will have two nodes in their common neighbourhood, namely the (check, check) and (vertex, vertex) nodes. This says that we can associate the (check, vertex) and (vertex, check) nodes to checks of opposite types, and the (check, check) and (vertex, vertex) nodes to qubit vertices, as in (Fig. 7d).



This suggests a Tanner graph T for a CSS code. What we have done with T_1 and T_2 is formally called a graph cartesian product (defined in [27], illustrated in [34]). In what follows, we will formalize this construction. If T_1 and T_2 are simple cycles, T will be the Tanner graph of the Toric code, as in (Fig. 6). The Toric code also has an equivalent formulation (as it was first formulated, see [32]) where the vertices of a square lattice are identified with the X-type stabilizer generators (check nodes of the Tanner graph) and the edges of the lattice are identified with the qubits (vertex nodes of the Tanner graph). The Z-type stabilizer generators are identified with the vertices of the line dual (see [32]) of the lattice. It then turns out that the parity check matrix of the X-type checks of the Toric code is precisely the vertex-edge incidence matrix of the square lattice, and the parity check matrix of the Z-type checks is precisely the vertex-edge incidence matrix of the line dual of the square lattice. However, we want a similar property to hold for all codes constructed as outlined in (Fig. 7), not just the Toric code. For this purpose we allow the Tanner graph of the CSS code to have column weights greater than two; in other words, we treat it as a hyper graph. In that case, the graph cartesian product is generalized to a hyper graph cartesian product, or simply the hyper graph product. The corresponding CSS code is called a hyper graph product code.
One can extend this pictorial description to observe that if T_1, T_2 are taken from an LDPC ensemble, then T too is part of an LDPC ensemble. Formally, one can define the hyper graph product codes as those described by a cartesian product of two Tanner graphs.

Definition 5.2   Hyper graph product
Let C_1 and C_2 be two classical codes described by the Tanner graphs T_1 and T_2 respectively, where T_1 = (C_1, V_1, E_1) and T_2 = (C_2, V_2, E_2). The hyper graph product of T_1 and T_2 is a Tanner graph T := T_1 × T_2, where T = (C, V, E) is such that the following holds.

C_X := C_1 × V_2   (5.1)
C_Z := V_1 × C_2   (5.2)
C := C_X ∪ C_Z   (5.3)
V := (V_1 × V_2) ∪ (C_1 × C_2)   (5.4)
E := {((c, y), (v, y)) : (c, v) ∈ E_1, y ∈ C_2 ∪ V_2} ∪ {((x, c), (x, v)) : (c, v) ∈ E_2, x ∈ C_1 ∪ V_1}   (5.5)

A hyper graph product code Q is defined as Q = CSS(C_X, C_Z), where C_X is the classical code described by the Tanner graph T_X := T_{C_X ∪ V} (the vertex induced subgraph of T with the vertex set C_X ∪ V) and C_Z is the classical code described by the Tanner graph T_Z := T_{C_Z ∪ V}.

The two blocks of the parity check matrix for Q are thus described by the Tanner graphs obtained as the vertex induced subgraphs of T on the vertex sets C_X ∪ V and C_Z ∪ V. For an illustrative diagram showing the vertex and edge sets of the product Tanner graph, see (Fig. 5 of [27]). What is essential to note here is that the two classical codes C_1, C_2, which can be thought of as the “input” classical codes used to produce the CSS code Q, are not the same as C_X, C_Z in (Def. 5.2). In particular, the codes C_1, C_2 can be of different lengths, whereas C_X and C_Z always have equal length. For the product Tanner graph in (Fig. 7c), the Tanner graphs T_X, T_Z describing C_X, C_Z respectively are specified by the following subgraphs.



Figure 8: (a) Tanner graph T_X describing C_X. (b) Tanner graph T_Z describing C_Z. The hyper graph product code Q is given by Q = CSS(C_X, C_Z).
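The following is a minimal sketch of (Def. 5.2) in code. The toy Tanner graphs T_1, T_2 and the helper biadjacency are assumptions made purely for illustration; the edge rule is the one of (Eq. 5.5), with every edge of E_1 copied once per node of T_2 and vice versa. Reading off the induced subgraphs on the two check sets gives exactly the two matrices of Figure 8.

import numpy as np
from itertools import product

# Assumed toy input Tanner graphs; edges are (check, bit) pairs.
C1, V1 = ["c0", "c1", "c2"], ["v0", "v1", "v2"]
E1 = {("c0", "v0"), ("c0", "v1"), ("c1", "v1"), ("c1", "v2"), ("c2", "v2"), ("c2", "v0")}
C2, V2 = ["d0"], ["w0", "w1"]
E2 = {("d0", "w0"), ("d0", "w1")}

# Node sets of the product T = T1 x T2, following (Eqs. 5.1 - 5.4).
CX = list(product(C1, V2))                            # X-type check nodes
CZ = list(product(V1, C2))                            # Z-type check nodes
Q = list(product(V1, V2)) + list(product(C1, C2))     # qubit (vertex) nodes

# Edge rule (Eq. 5.5): every edge of E1 is copied once per node of T2, and vice versa.
E = {((c, y), (v, y)) for (c, v) in E1 for y in C2 + V2} | \
    {((x, c), (x, v)) for (c, v) in E2 for x in C1 + V1}

def biadjacency(check_nodes, qubit_nodes, edges):
    """Parity check matrix read off the vertex induced subgraph on check_nodes and qubit_nodes."""
    M = np.zeros((len(check_nodes), len(qubit_nodes)), dtype=int)
    for i, c in enumerate(check_nodes):
        for j, q in enumerate(qubit_nodes):
            if (c, q) in edges or (q, c) in edges:
                M[i, j] = 1
    return M

HX = biadjacency(CX, Q, E)    # Tanner graph T_X, describing the classical code C_X
HZ = biadjacency(CZ, Q, E)    # Tanner graph T_Z, describing the classical code C_Z
print("H_X is", HX.shape, "and H_Z is", HZ.shape)
assert not (HX @ HZ.T % 2).any()   # every X-check meets every Z-check on an even number of qubits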

    5.2 Visualizing the graph product

In (Def. 5.2) we saw explicit edge rules for the hyper graph product T, which indeed specify the graph completely, but they do not give a good visual picture of the graphical structure of T. Let us now describe a method of constructing a Tanner graph T starting from two Tanner graphs T_1, T_2, and argue along the way that the graph T constructed this way is indeed the hyper graph product of T_1, T_2 according to its definition in (Def. 5.2). Throughout this section, let us assume that T_1 = (C_1, V_1, E_1) and T_2 = (C_2, V_2, E_2) are two Tanner graphs describing the classical codes C_1 and C_2 respectively, and that T = (C, V, E) is a Tanner graph describing a quantum CSS code Q. Let us assume an ordering on the nodes of T_1 and T_2 respectively as

C_1 ∪ V_1 = {x_1, x_2, x_3, . . . , x_{m_1+n_1}} ,   (5.6)
C_2 ∪ V_2 = {y_1, y_2, y_3, . . . , y_{m_2+n_2}} .   (5.7)

Construct the graph T by the following method. For each i from 1 to m_2 + n_2, create a copy of T_1 and call this copy K_i (see for instance Fig. 7b). To see the connection with the definition in (Eq. 5.4), relabel the vertices of K_i with the rules below.

x_j → (c^1_j, v^2_i)   if x_j ∈ C_1 , y_i ∈ V_2   (5.8)
x_j → (c^1_j, c^2_i)   if x_j ∈ C_1 , y_i ∈ C_2   (5.9)
x_j → (v^1_j, v^2_i)   if x_j ∈ V_1 , y_i ∈ V_2   (5.10)
x_j → (v^1_j, c^2_i)   if x_j ∈ V_1 , y_i ∈ C_2   (5.11)

Associate every node of the type in (Eqs. 5.9, 5.10) to a qubit, every node of the type in (Eq. 5.8) to an X-type stabilizer generator, and every node of the type in (Eq. 5.11) to a Z-type stabilizer generator.



The above method creates the same set of nodes in T as the hyper graph product of T_1, T_2 must possess, according to (Eq. 5.4). Moreover, since we have m_2 + n_2 disconnected copies of the Tanner graph T_1, the edges of the graph ∪_{i=1}^{m_2+n_2} K_i are simply the edges within each copy of T_1. With the new labelling proposed in (Eqs. 5.8 to 5.11), the edges (in the disjoint union graph of all copies of T_1) can be listed as

Ẽ = ∪_{i=1}^{n_2+m_2} Ẽ_i ,   where   Ẽ_i = {((c^1_k, c^2_i), (v^1_r, c^2_i)) : (c^1_k, v^1_r) ∈ E_1} ∪ {((v^1_k, v^2_i), (c^1_r, v^2_i)) : (c^1_r, v^1_k) ∈ E_1} ,   (5.12)

where the first set applies when y_i ∈ C_2 and the second when y_i ∈ V_2. Let us proceed to the second step of the construction. Draw edges between the same-index nodes of different copies, as if they were the nodes of T_2. In other words, for each edge (y_{j1}, y_{j2}) ∈ E_2, draw an edge between the node x_i of the copy K_{j1} and the node x_i of the copy K_{j2}.

That is all there is to the construction. To establish that the graph T constructed this way is indeed the hyper graph product of T_1, T_2, it only remains to show that the set of edges in T is the same as the set in (Eq. 5.5). In the labelling of vertices proposed in (Eqs. 5.8 to 5.11), the second step of our construction produces the following set of new edges.

Ẽ′ = {((v^1_i, c^2_j), (v^1_i, v^2_j)) : (c^2_j, v^2_j) ∈ E_2} ∪ {((c^1_i, v^2_j), (c^1_i, c^2_j)) : (c^2_j, v^2_j) ∈ E_2}   (5.13)

Combining the edges in Ẽ′ with those in Ẽ from (Eq. 5.12) readily implies that E = Ẽ ∪ Ẽ′ is precisely the edge set of T_1 × T_2 in (Eq. 5.5). This shows that the construction for T described here indeed yields the Tanner graph corresponding to the cartesian hyper graph product T_1 × T_2. Kitaev's toric code in (Fig. 6) can be realized [27] as the hyper graph product of two cycle Tanner graphs T_1, T_2, corresponding to two classical repetition codes as in (Fig. 1). A small programmatic sketch of this copy-and-connect construction is given below.
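Below is a minimal sketch of the copy-and-connect recipe just described, on assumed toy graphs. It checks two facts used in the sequel: every edge of the product joins a check node to a qubit node, and degrees add, deg_T(x, y) = deg_T1(x) + deg_T2(y), anticipating (Lemma 5.1).

from itertools import product
from collections import Counter

# Two assumed toy Tanner graphs; edges are (check, bit) pairs.
C1, V1 = ["c0", "c1"], ["v0", "v1", "v2"]
E1 = {("c0", "v0"), ("c0", "v1"), ("c1", "v1"), ("c1", "v2")}
C2, V2 = ["d0"], ["w0", "w1"]
E2 = {("d0", "w0"), ("d0", "w1")}

# Step 1: one copy K_i of T1 for every node y_i of T2; inside a copy, every edge of T1
# is reproduced with y_i attached as the second coordinate (cf. Eq. 5.12).
within = {((c, y), (v, y)) for (c, v) in E1 for y in C2 + V2}

# Step 2: for every edge (c2, v2) of T2, connect the same T1-node x across the two
# corresponding copies (cf. Eq. 5.13).
across = {((x, c2), (x, v2)) for (c2, v2) in E2 for x in C1 + V1}

T_edges = within | across

# Node types of the product: qubits are (v1, v2) and (c1, c2); checks are the mixed pairs.
qubits = set(product(V1, V2)) | set(product(C1, C2))

# Every edge of T joins a check node to a qubit node, never two nodes of the same kind.
assert all(len({a, b} & qubits) == 1 for (a, b) in T_edges)

# Degrees add, as shown next in (Lemma 5.1): deg_T(x, y) = deg_T1(x) + deg_T2(y).
deg1 = Counter(n for e in E1 for n in e)
deg2 = Counter(n for e in E2 for n in e)
degT = Counter(n for e in T_edges for n in e)
assert all(degT[x, y] == deg1[x] + deg2[y] for x in C1 + V1 for y in C2 + V2)
print(len(qubits), "qubits,", len(T_edges), "edges; bipartiteness and degree checks pass")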

    5.3 Properties of hyper graph product codes

We will now analyze the properties of the CSS code Q that is specified by the hyper graph product T := T_1 × T_2. The properties we are interested in are the number of qubits encoded in Q and its code distance, both of which are summarized in (Lemma. 4.1). Before that, however, we must first prove that Q is indeed a CSS stabilizer code. Remember that the Tanner graph by itself does not guarantee the commuting nature of the generators, which are identified with the check nodes of T. Hence, we will first show that Q as defined in (Def. 5.2) is indeed a valid stabilizer code. A demonstration of this fact is also presented in [27]. The following properties of the hyper graph product will be useful.

Lemma 5.1   Let T_1 and T_2 be two Tanner graphs describing classical codes C_1 and C_2, and let their hyper graph product be the Tanner graph T = (C, V, E), which describes a quantum code Q. Then Q has the following properties.

1. The Pauli operators corresponding to any two nodes in C mutually commute.

2. The degree of every node in the Tanner graph T can be bounded above by a constant if the degree of every node in the Tanner graphs T_1 and T_2 can also be bounded above by a constant.

Proof :   1. Let us consider c_1 ∈ C_1, v_1 ∈ V_1, c_2 ∈ C_2 and v_2 ∈ V_2. Using (Eq. 5.8), we find that all X-type checks have labels of the form (c_1, v_2), and from (Eq. 5.11) we find that all Z-type checks have labels of the form (v_1, c_2), in the construction of T in (Sec. 5.2).



Two X-type checks (or two Z-type checks) always commute, irrespective of the qubits in their neighbourhood.
It only remains to establish that the operators corresponding to any two checks of the form (c_1, v_2) and (v_1, c_2) commute too, i.e., that the number of qubit vertices in T which are in the common neighbourhood of (c_1, v_2) and (v_1, c_2) is always even. Suppose there is an edge (c_1, v_1) ∈ E_1; then we have edges in T connecting (c_1, v_2) to (v_1, v_2) and (c_1, c_2) to (v_1, c_2). If T_2 has no edge (c_2, v_2), then there are no qubits that are simultaneously checked by both (c_1, v_2) and (v_1, c_2), and hence the operators corresponding to these two checks trivially commute. Otherwise, there are two additional edges, connecting (c_1, v_2) to (c_1, c_2) and (v_1, v_2) to (v_1, c_2). Consequently, the checks (c_1, v_2) and (v_1, c_2) have exactly two qubit vertices, (v_1, v_2) and (c_1, c_2), in their common neighbourhood. In this case again, an even number of anti-commuting locations implies that the corresponding operators mutually commute. See the special case of this fact in (Fig. 7d).

2. Let us suppose c_1 ∈ C_1, v_1 ∈ V_1, c_2 ∈ C_2, v_2 ∈ V_2, and let deg(·) denote their respective degrees in their respective Tanner graphs (T_1 and T_2). Vertices in T are labelled by pairs such as (v_1, v_2) and (c_1, c_2). All the edges on (v_1, v_2) can be divided into those within a copy of T_1 and those across different copies of T_1. Each copy of T_1 has deg(v_1) edges incident on v_1. An edge from (v_1, v_2) is drawn to another copy of T_1 if and only if that copy contains a vertex of the form (v_1, y) where y and v_2 share an edge in T_2. So, the number of edges from (v_1, v_2) that go between different copies of T_1 is deg(v_2). Therefore, the degree of the vertex (v_1, v_2) in T is just deg(v_1) + deg(v_2). The following results are analogously obtained.

deg(c_1, v_2) = deg(c_1) + deg(v_2)   (5.14)
deg(v_1, c_2) = deg(v_1) + deg(c_2)   (5.15)
deg(c_1, c_2) = deg(c_1) + deg(c_2)   (5.16)

Hence the lemma follows.

    The above properties will now imply the following conditions for the quantum code described

    by T .

Corollary 5.1   Let T_1, T_2 and T be as in (Lemma. 5.1), with T := T_1 × T_2. Then the code Q specified by T satisfies the following.

1. Q is a CSS quantum stabilizer code.

2. Q belongs to a quantum LDPC family (Def. 5.1) if both C_1 and C_2 belong to classical LDPC families (Def. 3.1).

Finally, we comment on the code properties mentioned in (Lemma. 4.1) for the quantum stabilizer code Q that is described by the Tanner graph T in the above corollary.
Lemma 5.2   The code described by the Tanner graph in (Cor. 5.1) satisfies dim(Q) = k_1 k_2 + k_1^T k_2^T, where k_i = dim(C_i) and k_i^T = dim(C_i^T).



Proof :   An expression relating the dimension of Q = CSS(C_X, C_Z) to the dimensions of the classical codes C_X and C_Z is already provided in (Eq. 4.1). It only remains to compute the dimensions of the classical codes C_X and C_Z respectively. Let us begin with C_X, which is described by T_X, the vertex induced subgraph of T with the vertex set C_X ∪ V, defined in (Def. 5.2). When the nodes of T are embedded on a grid, the subgraph T_X has the general form shown below.

Figure 9: The edges are not drawn, as that would cost us generality. One way to interpret this graph is to compare it to (Fig. 7c), with the outer labels of the nodes represented as the second coordinate of the tuple and the inner labels as the first coordinate of the tuple. The first label is the label of a node in T_1 and the second is the label of a node in T_2.

The above is just a Tanner graph for a classical code whose dimension is simply the number of linearly independent assignments to the vertices, i.e., to the nodes of the types (v_1, v_2) and (c_1, c_2). However, the assignments along a row and along a column are not independent. Hence let us consider its transposed version, as shown below.



Figure 10: We have retained the same graph as drawn in (Fig. 9), but relabelled the (v_1, v_2) and (c_1, c_2) nodes with one marker and the remaining nodes as check nodes. Following this, we have expressed it as a transpose (see Lemma. 2.1) of a Tanner graph, obtained by simply exchanging the check nodes with the vertex nodes. Any assignment to the vertex nodes (in this transposed graph) must be such that it simultaneously satisfies the constraints inherited from both T_1 and T_2.

In the transposed graph, it is clear that any assignment to the vertex nodes must satisfy the constraints coming from T_1 along one direction of the grid (indicated by a red dashed line) and the constraints coming from T_2 along the other (indicated by a blue dashed line). Hence every possible assignment to the vertices can be put in one-to-one correspondence with m_1 × n_2 binary matrices whose columns are codewords of C_1^T and whose rows are codewords of C_2. The dimension of the space^8 of such matrices is just the product of the dimensions of C_1^T and C_2, which is k_1^T k_2. Similarly, we find that the dimension of the space consisting of all assignments to the Tanner graph T_Z is k_1 k_2^T.

dim(C_X^T) = k_1^T k_2   (5.17)
dim(C_Z^T) = k_1 k_2^T   (5.18)

Using the relation between the dimensions of the transpose code and the code itself, as mentioned in (Lemma. 2.1), along with the fact that the number of check nodes in T_X is m_1 n_2 (in T_Z it is n_1 m_2) and the number of bit (vertex) nodes is n_1 n_2 + m_1 m_2, we find that

dim(C_X) = n_1 n_2 + m_1 m_2 − m_1 n_2 + k_1^T k_2 ,   dim(C_Z) = n_1 n_2 + m_1 m_2 − n_1 m_2 + k_1 k_2^T .   (5.19)

Finally, using (Lemma. 4.1), which relates k_Q to the dimensions of the two classical codes C_Z, C_X, we find

dim(Q) = n_1 n_2 + m_1 m_2 − m_1 n_2 + k_1^T k_2 + n_1 n_2 + m_1 m_2 − n_1 m_2 + k_1 k_2^T − n_1 n_2 − m_1 m_2   (5.20)
       = (n_1 − m_1)(n_2 − m_2) + k_1 k_2^T + k_1^T k_2   (5.21)
       = (k_1 − k_1^T)(k_2 − k_2^T) + k_1 k_2^T + k_1^T k_2   (5.22)
       = k_1 k_2 + k_1^T k_2^T ,   (5.23)

where we have used the expression in (Lemma. 2.1) relating the dimensions of the code and its transpose to arrive at (Eq. 5.22) from (Eq. 5.21). The lemma now follows.

8In [27], this set of assignments is said to form a product code C_1 ⊗ C_2; however, we do not need the concept of the product code explicitly here.

The rate of Q in the above lemma suggests that if C_1 and C_2 are two finite rate codes used to construct the CSS code Q, then Q will have finite rate too. In the case of the Kitaev toric code family, the two classical codes C_1 = C_2 are repetition codes described by cycle Tanner graphs, and their transpose codes C_1^T, C_2^T are repetition codes as well, so each of the four codes has dimension one. As a result, we have k_Q = 2 for all codes in the Toric code family, which is consistent with [32].
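The dimension formula can also be verified numerically for small inputs. The sketch below, anticipating the explicit block form of the check matrices given later in (Eq. 5.26), computes k_Q = n_Q − rank(H_X) − rank(H_Z) over GF(2) and compares it with k_1 k_2 + k_1^T k_2^T; the cycle (repetition code) inputs and all helper functions are illustrative assumptions.

import numpy as np

def gf2_rank(M):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    M, rank = M.copy() % 2, 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

def cycle_check_matrix(n):
    """Checks of the length-n repetition code on a cycle: check i compares bits i and i+1."""
    H = np.zeros((n, n), dtype=int)
    for i in range(n):
        H[i, i] = H[i, (i + 1) % n] = 1
    return H

def hypergraph_product_checks(H1, H2):
    """X- and Z-type check matrices of the hyper graph product (block form of Eq. 5.26)."""
    (r1, n1), (r2, n2) = H1.shape, H2.shape
    HX = np.hstack([np.kron(H1, np.eye(n2, dtype=int)), np.kron(np.eye(r1, dtype=int), H2.T)])
    HZ = np.hstack([np.kron(np.eye(n1, dtype=int), H2), np.kron(H1.T, np.eye(r2, dtype=int))])
    return HX, HZ

for H1, H2 in [(cycle_check_matrix(3), cycle_check_matrix(3)),
               (cycle_check_matrix(3), cycle_check_matrix(5))]:
    (r1, n1), (r2, n2) = H1.shape, H2.shape
    k1, k2 = n1 - gf2_rank(H1), n2 - gf2_rank(H2)          # dim(C_1), dim(C_2)
    k1T, k2T = r1 - gf2_rank(H1), r2 - gf2_rank(H2)        # dim(C_1^T), dim(C_2^T), cf. (Lemma 2.1)
    HX, HZ = hypergraph_product_checks(H1, H2)
    nQ = n1 * n2 + r1 * r2
    kQ = nQ - gf2_rank(HX) - gf2_rank(HZ)                  # number of logical qubits
    assert not (HX @ HZ.T % 2).any()                       # the checks commute
    assert kQ == k1 * k2 + k1T * k2T                       # Lemma 5.2
    print(f"[[{nQ}, {kQ}]] code from [{n1}, {k1}] and [{n2}, {k2}] classical inputs")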

We now move on to analyzing the distance of the hyper graph product CSS code Q. The distance of a CSS code Q = CSS(C_X, C_Z) is bounded below by the minimum of the distances of the two constituent classical codes C_X and C_Z. However, this naive bound does not lead to a good estimate of the distance of a code in an LDPC family. This is because of the additional requirement that C_Z^⊥ ⊆ C_X: the least weight codeword in C_X cannot have weight larger than the least weight of a codeword in C_Z^⊥, and the latter is a constant because the rows of the parity check matrix of C_Z (which are words of the dual code) have constant weight. Therefore, the naive bound would only allow us to conclude that the distance of the LDPC family of hyper graph product codes is at least some constant. In [27] there is a method to obtain a better estimate, which is to use the formula for the distance mentioned in (Lemma. 4.1). This definition ensures that the distance of a quantum code corresponds to the least weight operator (least weight codeword in C_X) that is not one of the stabilizers itself (not in the dual C_Z^⊥). Let us use this definition to formulate a better estimate of the distance of the hyper graph product LDPC code, as follows.

Lemma 5.3   The code described by the Tanner graph in (Cor. 5.1) satisfies d(Q) = Ω(√n).
Proof :   The minimum distance of Q is expressed in terms of the minimum distances of the classical codes C_1, C_2, C_1^T, C_2^T in (Eq. 4.1). We begin with the claim that d_X ≥ min{d_1, d_2^T}, i.e., the least (Hamming) weight of a nonzero assignment that satisfies all the checks in T_X, and is not merely a combination of the Z-type generators (cf. Lemma. 4.1), is at least min{d_1, d_2^T}. To validate this claim, we will show that any nonzero assignment of lower weight cannot satisfy all the checks in T_X. Suppose there exists such an assignment, with support on the vertices in a region R_1, as shown below.



Figure 11: The vertices coloured red belong to R_1; they are referred to as the support of R_1.

Let R_1 ⊆ {(u, v) : u ∈ V_1, v ∈ V_2} ∪ {(c, d) : c ∈ C_1, d ∈ C_2} be this support, with #R_1 < min(d_1, d_2^T). Let V_1', C_1' (resp. V_2', C_2') denote the sets of first (resp. second) coordinates appearing in R_1, let T_1' be the vertex induced subgraph of T_1 on the vertex set V_1' ∪ C_1', and let T_2' be the vertex induced subgraph of T_2 on the vertex set V_2' ∪ C_2'. This is the smallest rectangle in (Fig. 11) that encloses the support of R_1. Moreover, let k_1' and k_2' be the number of basis vectors of C_1 and C_2 whose support is contained entirely within V_1' and V_2' respectively. Similarly, let k_1'^T and k_2'^T be the number of basis vectors of C_1^T and C_2^T whose support is contained entirely within C_1' and C_2' respectively. Note that all nonzero codewords of C_1 and of C_2^T have weight at least min(d_1, d_2^T), by the definition of the distance. Now #R_1 < min(d_1, d_2^T) implies #V_1', #C_2' < min(d_1, d_2^T), and hence k_1' = k_2'^T = 0. The assignment in R_1 therefore cannot satisfy all the checks in T_X; hence a contradiction. This implies that d_X is at least min{d_1, d_2^T}. A similar argument shows that d_Z is at least min{d_1^T, d_2}.
In [27] it is also demonstrated that min{d_1, d_2, d_1^T, d_2^T} is an upper bound for the distance. Since good classical codes have a distance that is linear in their length n, and the length N of Q is proportional to n², we find that d(Q) = Ω(n) = Ω(√N). A brute-force check of the distance for the smallest member of the Toric code family is sketched below.
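The sketch below enumerates low-weight vectors that satisfy one set of checks but lie outside the row space of the other, which is the definition of the distance used in (Lemma. 4.1). The lattice labelling and helper functions are illustrative assumptions, and the exhaustive search is only feasible for very small codes such as the [[18, 2, 3]] member of the family.

import numpy as np
from itertools import combinations

def toric_code_checks(L):
    """Star (X) and plaquette (Z) checks of the L x L toric code; qubits on the 2*L*L edges."""
    n = 2 * L * L
    h = lambda i, j: (i % L) * L + (j % L)              # horizontal edge index
    v = lambda i, j: L * L + (i % L) * L + (j % L)      # vertical edge index
    HX, HZ = np.zeros((L * L, n), dtype=int), np.zeros((L * L, n), dtype=int)
    for i in range(L):
        for j in range(L):
            HX[i * L + j, [h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)]] = 1
            HZ[i * L + j, [h(i, j), h(i + 1, j), v(i, j), v(i, j + 1)]] = 1
    return HX, HZ

def gf2_rank(M):
    """Rank of a binary matrix over GF(2)."""
    M, rank = M.copy() % 2, 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

def css_distance(HX, HZ, max_w=4):
    """Least weight of a vector satisfying one set of checks while lying outside the row
    space of the opposite checks (cf. Lemma 4.1); brute force, tiny codes only."""
    n = HX.shape[1]
    def one_sided(H_commute, H_stab):
        base = gf2_rank(H_stab)
        for w in range(1, max_w + 1):
            for supp in combinations(range(n), w):
                x = np.zeros(n, dtype=int)
                x[list(supp)] = 1
                if not (H_commute @ x % 2).any() and gf2_rank(np.vstack([H_stab, x])) > base:
                    return w
        return max_w + 1
    return min(one_sided(HZ, HX), one_sided(HX, HZ))

HX, HZ = toric_code_checks(3)
print("brute-force distance of the [[18, 2]] toric code:", css_distance(HX, HZ))   # expect 3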

We will close this discussion by mentioning the explicit form of the parity check matrix of the resulting CSS code Q. Note that until now we have only described its Tanner graph; this is sufficient to specify the CSS code, but we will nevertheless state its parity check matrix too. The adjacency matrix of a graph is, strictly speaking, a square matrix, and in the case of a bipartite graph, such as a Tanner graph described by a parity check matrix H, the corresponding adjacency matrix A is simply

A = ( 0     H
      H^T   0 ) .   (5.24)



It turns out that the adjacency matrix A of a hyper graph product T := T_1 × T_2 is related to the adjacency matrices A_1, A_2 of T_1, T_2 respectively by [33]

A = A_1 ⊗ 1l + 1l ⊗ A_2 .   (5.25)

Combining this fact with (Eq. 5.24) and (Eq. 4.6), it is derived in [27] that the parity check matrix of the CSS code described by T is given by

H_Q = ( H_X   0
        0     H_Z ) ,   with   H_X = ( H_1 ⊗ 1l_{n_2} | 1l_{r_1} ⊗ H_2^T ) ,   H_Z = ( 1l_{n_1} ⊗ H_2 | H_1^T ⊗ 1l_{r_2} ) ,   (5.26)

where H_1, H_2 are the r_1 × n_1 and r_2 × n_2 parity check matrices corresponding to the Tanner graphs T_1, T_2 respectively, and 1l_m denotes the m × m identity matrix. With the above formula, it can be appreciated that the input classical codes C_1 and C_2 indeed do not straightforwardly make up the parity check matrix of Q appearing in (Eq. 4.6).
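Here is a minimal sketch of (Eq. 5.26). Assuming a small, arbitrarily chosen sparse classical check matrix used for both inputs, it builds H_X and H_Z, verifies the CSS condition H_X H_Z^T = 0 (mod 2), and checks that each product check touches at most (maximum row weight of H_1) plus (maximum column weight of H_2) qubits, which is the sparsity statement behind (Cor. 5.1).

import numpy as np

def hypergraph_product_checks(H1, H2):
    """X- and Z-type blocks of the parity check matrix in (Eq. 5.26)."""
    (r1, n1), (r2, n2) = H1.shape, H2.shape
    HX = np.hstack([np.kron(H1, np.eye(n2, dtype=int)), np.kron(np.eye(r1, dtype=int), H2.T)])
    HZ = np.hstack([np.kron(np.eye(n1, dtype=int), H2), np.kron(H1.T, np.eye(r2, dtype=int))])
    return HX, HZ

# A small sparse classical check matrix (every bit in 2 checks, every check on 3 bits),
# used here for both inputs H1 and H2.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]], dtype=int)

HX, HZ = hypergraph_product_checks(H, H)
assert not (HX @ HZ.T % 2).any()      # X- and Z-checks commute for any choice of H1, H2

# Sparsity is inherited: a product check touches at most (max row weight of H1) plus
# (max column weight of H2) qubits, so classical LDPC inputs give a quantum LDPC family.
bound = H.sum(axis=1).max() + H.sum(axis=0).max()
assert HX.sum(axis=1).max() <= bound and HZ.sum(axis=1).max() <= bound
print("n =", HX.shape[1], "qubits, maximum check weight =",
      int(max(HX.sum(axis=1).max(), HZ.sum(axis=1).max())))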

    6 Other constructions

    There are a few other constructions of   LDPC   ensembles of  CSS  stabilizer codes starting with two

    classical codes from classical  LDPC  code ensembles. We will briefly outline some of them.

    6.1 Homological product codes

It is clear from the form of the parity check matrix of a CSS stabilizer code in (Eq. 4.7) that the two sub-matrices describing the X-type and Z-type checks, denoted H_1 and H_2 respectively, must satisfy H_1 · H_2^T = 0. The correspondence of this requirement with properties of certain operators known as boundary operators in homology theory was used by Sergey Bravyi in [28] to introduce a type of LDPC code ensemble called Homological Product codes. A boundary operator δ is represented by a square matrix, in this case with binary entries, and it has the property δ² = 0. This immediately suggests that one can assign

H_1 = δ ,   H_2 = δ^T   (6.1)

for the CSS condition in (Eq. 4.7) to hold. This is tantamount to considering the rows of δ to represent the binary encoding of the X-type stabilizer generators and the columns of δ to represent an encoding of the Z-type stabilizer generators. Moreover, if δ_1 and δ_2 are boundary operators, then so is ∆ = δ_1 ⊗ 1l + 1l ⊗ δ_2 (over GF(2)). The key question is how to construct a boundary operator δ starting from the (r + n) × (r + n) adjacency matrix A of the Tanner graph of a classical linear code, which is addressed in [28, 35]. Then a straightforward application of (Eqs. 6.1, 4.6) yields the corresponding quantum CSS code. For details and a derivation of the various code parameters of the resulting CSS code, refer to [28]. A small illustration of this boundary-operator viewpoint is sketched below.
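The sketch below illustrates the boundary-operator viewpoint. The specific choice δ = [[0, H], [0, 0]], built from an arbitrary classical check matrix H, is only one simple (assumed) way of obtaining δ² = 0; it is not necessarily the construction of [28, 35], but it suffices to verify (Eq. 6.1) and the fact that ∆ = δ_1 ⊗ 1l + 1l ⊗ δ_2 is again a boundary operator over GF(2).

import numpy as np

# An arbitrary small classical parity check matrix.
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 1, 0, 0]], dtype=int)
r, n = H.shape

# One simple way to obtain a binary boundary operator: let delta act from the check block
# onto the bit block only. Then delta * delta = 0 automatically.
delta = np.zeros((r + n, r + n), dtype=int)
delta[:r, r:] = H
assert not (delta @ delta % 2).any()

# The assignment of (Eq. 6.1): the CSS condition of (Eq. 4.7) becomes delta^2 = 0.
H1, H2 = delta, delta.T
assert not (H1 @ H2.T % 2).any()

# A tensor-product combination of two boundary operators is again a boundary operator mod 2.
I = np.eye(r + n, dtype=int)
Delta = (np.kron(delta, I) + np.kron(I, delta)) % 2
assert not (Delta @ Delta % 2).any()
print("delta^2 = 0 and Delta^2 = 0 hold for the", r + n, "x", r + n, "boundary operator")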

    6.2 Mckay codes

This is a different method of constructing an LDPC ensemble of CSS stabilizer codes; however, it still relies on the dual containing nature of the parity check matrix of a CSS code,
