[ieee icc/supercomm'94 - 1994 international conference on communications - new orleans, la, usa...



Page 1: [IEEE ICC/SUPERCOMM'94 - 1994 International Conference on Communications - New Orleans, LA, USA (1-5 May 1994)] Proceedings of ICC/SUPERCOMM'94 - 1994 International Conference on Communications

METHODS FOR SYNCHRONIZING (d, k)-CONSTRAINED SEQUENCES

Mario Blaum*, Jehoshua Bruck*, C. Michael Melas*, Henk C. A. van Tilborg†

*IBM Almaden Research Center, San Jose, CA 95120, USA

†Eindhoven University of Technology, Eindhoven, The Netherlands

Abstract

Given a (1,7) code, a loss of synchronization may occur by insertion or deletion of a 0 or a 1. That event is catastrophic, i.e., it causes an unlimited number of errors in the data from the moment synchronization is lost. We introduce two methods for recovering from insertions or deletions of symbols. The first one allows for identification of up to 3 insertions and/or deletions in a given block, permitting quick synchronization recovery. This method is based on variable length codes. The second method is of block type, and allows for detection of large numbers of insertions and deletions.

Both methods can be extended to general (d, k) constrained codes.

Keywords: Magnetic Recording, Run-length Constrained Sequences, Synchronization, Insertions, Deletions, Modulation Codes.

1 Introduction

A typical encoding configuration for a magnetic or optical recording channel consists of encoding the information bits with an error-correcting code [4] followed by a (d, k) modulation code [5]. The error-correcting code is selected according to the statistics of errors produced by the channel. The choice of the (d, k) constraints for the modulation code depends on the type of signal detection used. The number d indicates the minimum number of 0's between two consecutive 1's, and its purpose is to reduce intersymbol interference. The number k indicates the maximum number of 0's between two consecutive 1's. Its choice determines the self-clocking properties of the sequences. The modulated sequence is transmitted through a noisy channel and then demodulated. The demodulated sequence goes to a decoder, which attempts to correct possible errors. The output of the decoder is taken as an estimate of the transmitted sequence.

One of the problems with this approach is that the demodulator propagates errors: a single error may become a burst. Hence, the error-correcting code must be a burst-correcting code, even when noise in the channel is dominated by random errors. It should be noted that, in general, more redundancy is needed to correct t bursts than to correct t random errors.

The most widely used codes for burst correction are Reed-Solomon codes [4]. Reed-Solomon codes are actually byte-correcting codes which are interleaved in order to achieve burst correction. In most applications, the size of a byte is 8 bits, so usually Reed-Solomon codes over $GF(2^8) = GF(256)$ are considered. By interleaving a t-byte correcting Reed-Solomon code over $GF(2^v)$ to depth m, it is possible to correct up to t bursts of length up to $v(m - 1) + 1$ bits each.

Popular modulation codes in magnetic recording are the rate 1/2 (2,7) [2] and the rate 2/3 (1,7) [1] codes. An important characteristic of the (2,7) and the (1,7) codes is that they both have limited error propagation. Specifically, a bit error in the (2,7) code propagates at most over 4 bits after demodulation, while a bit error in the (1,7) code propagates at most over 5 bits. Hence, if the channel produces only random errors, no more than a doubly interleaved Reed-Solomon code is needed for error correction.

0-7803-1825-0/94 $4.00 © 1994 IEEE

So, Reed-Solomon codes can easily handle the most common types of errors: random errors and peak shifts. A random error can be of two types: a 0 becomes a 1, denoted 0→1, or a 1 becomes a 0, denoted 1→0. Peak shifts are also of two types: 01→10 or 10→01.
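As a quick check of the statement that double interleaving suffices for random errors, the burst-length bound quoted in the Introduction can be evaluated for bytes of v = 8 bits and interleaving depth m = 2. The numbers 4 and 5 are the propagation lengths given in the text; the comparison itself is only an illustrative sketch.

```latex
v(m-1)+1 \;=\; 8\,(2-1)+1 \;=\; 9 \;\ge\; 5 \;\ge\; 4 .
```

Under this bound, a propagated burst of at most 5 bits (the (1,7) code) or 4 bits (the (2,7) code) fits within the 9 bits covered at depth 2, whereas depth m = 1 would only cover a burst of $v \cdot 0 + 1 = 1$ bit.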

However, there are other types of errors that cause a catastrophic failure due to loss of synchronization: deletion of a symbol (0 or 1) and insertion of a symbol (0 or 1). Although deletions and insertions are not as common as the other types of error, it is important to have a method to detect them and differentiate them from regular errors. If we are able to determine how many insertions or deletions occurred in an interval, then by inserting or deleting the proper number of symbols we will be able to restore synchronization and limit the damage. The interval in question may have many errors after demodulating, but its length will not exceed a certain bound. Hence, we are going to have a burst error that will either be corrected by the outer error-correcting code or, if uncorrectable, will at least have a limited length.

The following example illustrates the catastrophic nature of an insertion or a deletion when the original data are retrieved.

Example 1 Assume that we want to transmit the following information string of length 100 bits:

A = 111100111010001001111100101011
    110011011110001111000011001101
    110000011110111001101101000000
    1010101111

Encoding the string above into a (1,7) modulated string of length 150, using, for instance, the scheme described in [1], we obtain

B = 010100000010101000100010100010
    000101010100100100000010101010
    000100010100000010010101010101
    010000010010010000100101000001
    000101001010010010100100100100

Now, assume that B is transmitted but C is received, where C is given by

C = 010100000010101000100010100010
    000101011001001000000101010100
    001000101000000100101010101010
    100000100100100001001010000010
    00101001010010010100100100100

We observe that at bit 39, a 0 has been deleted from B, causing a loss of synchronization of one bit to the left (moreover, in this particular example, the loss of synchronization breaks the d constraint, since we have two consecutive ones). When we demodulate C and exclusive-OR the result with the original string A, we obtain the following error vector:

E = 000000000000000000000001111100
    001111111000110000101111111110
    001010111011100101111101101011
    1111110000

We observe that starting at bit 24, we have an unbounded string of errors. In other words, a failure that involves a single deletion or insertion of a symbol is catastrophic.
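The deletion in Example 1 can be replayed with a few lines of Python. This is only a sketch: the helper names `satisfies_dk` and `delete_bit` are hypothetical, and the (1,7) encoder and demodulator of [1] are not reproduced, so the code only shows how B turns into C and why C violates the d constraint.

```python
def satisfies_dk(bits: str, d: int, k: int) -> bool:
    """True if every run of 0's between two consecutive 1's has length in [d, k]
    and the leading and trailing runs of 0's do not exceed k."""
    runs = bits.split("1")
    interior = runs[1:-1]  # runs enclosed by two 1's
    if any(not (d <= len(run) <= k) for run in interior):
        return False
    return len(runs[0]) <= k and len(runs[-1]) <= k


def delete_bit(bits: str, position: int) -> str:
    """Delete the bit at a 1-indexed position, as counted in Example 1."""
    return bits[:position - 1] + bits[position:]


# The (1,7) string B of Example 1 (150 bits).
B = (
    "010100000010101000100010100010"
    "000101010100100100000010101010"
    "000100010100000010010101010101"
    "010000010010010000100101000001"
    "000101001010010010100100100100"
)

C = delete_bit(B, 39)          # the received string of Example 1
print(len(C))                  # 149
print(satisfies_dk(B, 1, 7))   # True
print(satisfies_dk(C, 1, 7))   # False: C contains two consecutive 1's
```

Every bit after position 39 is shifted one place to the left, which is why the demodulated output never recovers without an explicit resynchronization step.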

The purpose of this paper is to overcome the situation described in Example 1, by providing two methods that allow identification of a certain number of insertions and deletions of a symbol for (d, k) constrained sequences. The methods can be extended to other types of constrained sequences. The first method involves arithmetic modulo k − d + 1 and variable length codes at the binary level. It can detect up to a total of three insertions and/or deletions of a symbol for a (d, k) code. The second method is based on inserting appropriate synchronization vectors at fixed locations in the (d, k) sequence. In some cases, it can detect large numbers of insertions and deletions.

In general, we will work with (1,7)-constrained sequences, although the methods can be easily extended to more general (d, k) sequences, as well as to other types of constrained sequences. In the next section, we describe the method based on arithmetic modulo k − d + 1 and variable length codes at the binary level, while in Section 3, we describe the block coding method.

2 Method based on arithmetic modulo k − d + 1

From now on, consider (1,7) sequences. We make the following 1-1 mapping between a (1,7) sequence and symbols in $\mathbb{Z}_7$ (i.e., the set of integers modulo 7): to each run of 0's, we associate the number of zeros minus one.

Example 2 Assume that we have the (1,7) sequence

001000001000101000100001.

The corresponding representation with symbols in $\mathbb{Z}_7$ is

1 4 2 0 2 3.

If we denote by L the length of the binary string, by C the length of the 7-ary string and by S the sum of the symbols in the 7-ary string, these three parameters are related by the following equation:

$L = S + 2C.$  (1)

In Example 2, we see that L = 24, S = 12 and C = 6, satisfying Equation (1).

The correspondence described above for (1,7) sequences can be easily extended to general (d, k) constrained sequences. In effect, each run of zeros is associated with the number of zeros minus d. This way, we associate each string with the set $\mathbb{Z}_{k-d+1}$ (i.e., the set of integers modulo k − d + 1).

Assume that we have a (1,7) binary constrained sequence. When a bit is affected by an error, it will have an effect on the representation over $\mathbb{Z}_7$ that will depend on the type of error considered. Table 1 gives the possible types of binary errors and their effects on the 7-ary representation. We are assuming that symbol $a_i$ has been affected by the error but $a_{i-1}$ has not.

Error type       Binary   7-ary                                                        Remarks
No errors                 ... a_{i-1}   a_i                 a_{i+1}       a_{i+2} ...
Peak shift       10→01    ... a_{i-1}   a_i + 1             a_{i+1} - 1   a_{i+2} ...
Peak shift       01→10    ... a_{i-1}   a_i - 1             a_{i+1} + 1   a_{i+2} ...
Random error     0→1      ... a_{i-1}   u   v               a_{i+1}       a_{i+2} ...   u + v = a_i - 2
Random error     1→0      ... a_{i-1}   a_i + a_{i+1} + 2   a_{i+2}       a_{i+3} ...
Deletion of 1             ... a_{i-1}   a_i + a_{i+1} + 1   a_{i+2}       a_{i+3} ...
Deletion of 0             ... a_{i-1}   a_i - 1             a_{i+1}       a_{i+2} ...
Insertion of 1            ... a_{i-1}   u   v               a_{i+1}       a_{i+2} ...   u + v = a_i - 1
Insertion of 0            ... a_{i-1}   a_i + 1             a_{i+1}       a_{i+2} ...

Table 1: Types of binary errors and their effect on the 7-ary representation

The next subsection describes the encoding.

2.1 Encoding

At the 7-ary level, we encode the information using an [n, n − 2] block code, where n ≥ 7. The first and the last symbols in a block are redundant, while the middle n − 2 symbols carry the information. We require that in each block, the sum of the symbols modulo 7 is 0. The last symbol in a block and the first symbol in the next block are chosen in such a way that their sum is equal to 6. Thus, we are inserting exactly 10 binary symbols between blocks in the binary sequence. It is important to have a fixed amount of redundancy while attempting to recover synchronization. Finally, we set the initial condition $a_0 = 0$. So, if we have a 7-ary sequence

$a_1, a_2, \ldots, a_{n-2}, \; a_{n+1}, a_{n+2}, \ldots, a_{2n-2}, \; a_{2n+1}, \ldots,$

the redundant symbols

$a_0, \; a_{n-1}, a_n, \; a_{2n-1}, a_{2n}, \ldots$

are obtained as follows:

$a_{ln+n-1} = -\sum_{i=0}^{n-2} a_{ln+i}, \qquad a_{(l+1)n} = 6 - a_{ln+n-1}, \qquad l \ge 0,$

where the sums are taken over $\mathbb{Z}_7$ (in the general case, they are taken over $\mathbb{Z}_{k-d+1}$).
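The run-length mapping and the block encoding described above can be sketched in Python. The function names `to_symbols`, `to_bits` and `encode_blocks` are hypothetical, and the encoder is an assumption reconstructed from the stated rules (block sum 0 modulo 7, boundary symbols summing to 6, $a_0 = 0$); it is checked here against the strings D and E of Example 3.

```python
def to_symbols(bits, d=1):
    """Map each run of 0's terminated by a 1 to (number of zeros - d)."""
    if not bits.endswith("1"):
        raise ValueError("the binary sequence must end with a 1")
    return [len(run) - d for run in bits[:-1].split("1")]


def to_bits(symbols, d=1):
    """Inverse mapping: symbol a becomes a + d zeros followed by a 1."""
    return "".join("0" * (a + d) + "1" for a in symbols)


def encode_blocks(info, n, q=7):
    """[n, n-2] encoding: the first and last symbol of each block are redundant.

    The last symbol makes the block sum 0 mod q; the first symbol of the next
    block is chosen so that it sums to q - 1 with that last symbol.
    """
    if len(info) % (n - 2):
        raise ValueError("information length must be a multiple of n - 2")
    out = []
    first = 0  # initial condition a_0 = 0
    for start in range(0, len(info), n - 2):
        body = info[start:start + n - 2]
        last = -(first + sum(body)) % q
        out += [first, *body, last]
        first = (q - 1) - last
    return out


# Example 2: the relation L = S + 2C.
example2 = "001000001000101000100001"
symbols = to_symbols(example2)
print(symbols)                                            # [1, 4, 2, 0, 2, 3]
print(len(example2) == sum(symbols) + 2 * len(symbols))   # True

# Example 3: D padded with one 0, encoded with blocks of n = 10.
D = [int(c) for c in ("0050022023000115000320510000004113104"
                      "2010110111" "0")]
E = encode_blocks(D, n=10)
print(E[:10])                                             # [0, 0, 0, 5, 0, 0, 2, 2, 0, 5]
```

Because a boundary pair always sums to 6, it occupies 6 + 2·2 = 10 bits by Equation (1), which matches the fixed redundancy of 10 binary symbols between blocks.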

We illustrate the encoding process in the following example.

Example 3 Consider the binary string A in Example 1, and its (1,7) modulated version B. Writing B in 7-ary representation, we obtain

D = 0050022023000115000320510000004113104
    2010110111

The length of D is 47, so we add a 0 to it in order to have six 7-ary blocks of length 8 each. Applying the encoding procedure to D with respect to the blocks of length 8, we obtain

E = 0 0 0 5 0 0 2 2 0 5 1 2 3 0 0 0 1 1 5 1
    5 0 0 0 3 2 0 5 1 5 1 0 0 0 0 0 0 4 1 1
    5 1 3 1 0 4 2 0 1 4 2 0 1 1 0 1 1 1 0 0

Writing E in binary representation, we obtain

F = 0101010000001010100010001010000001
    0010001000010101010010010000001001
    0000001010101000010001010000001001
    0000001001010101010101000001001001
    0000001001000010010100000100010100
    1000001000101001001010010010010101

The next subsection describes the procedure for recovery of synchronization, which is the main result in this section.

2.2 Recovery of Synchronization

Assume that we receive the 7-ary sequence $b_0, b_1, b_2, \ldots$, and that errors have occurred, including possible insertions and/or deletions of symbols. We will be able to recover synchronization with high probability under the following conditions (that are determined by the error statistics of the channel):

1. At most 3 errors in at most $\lambda$ consecutive 7-ary blocks of length n have occurred, say in blocks m, m + 1, ..., m + r, where $r \le \lambda - 1$ (in the general case, we assume at most $\lfloor (k - d)/2 \rfloor$ errors).

2. After the last block in error, say block m + r, there are at least s error-free blocks.

3. The length n of each block is at least 7 (in general, k − d + 1).

Assume also that these errors do not affect the block sum modulo 7 in the first m blocks, but the (m + 1)th block is affected. We are assuming that the blocks have length n, so according to the notation of Subsection 2.1, this means

$\sum_{i=0}^{n-1} b_{ln+i} = 0$ for $0 \le l \le m - 1$, and $\sum_{i=0}^{n-1} b_{mn+i} \ne 0.$

We need now to perform three steps:

1. Reestablish synchronization at the 7-ary level, i.e., look for the smallest number N, $(m + j)n - 3 \le N \le (m + j)n + 3$ and $1 \le j \le \lambda - 1$, such that

$\sum_{i=0}^{n-1} b_{N+ln+i} = 0, \quad 0 \le l \le s - 1,$  (2)

and

$b_{N+ln-1} + b_{N+ln} = 6, \quad 1 \le l \le s,$  (3)

where $\lambda$ and s were defined above and are established using the error statistics of the channel. Of course, the 2s equations (2) and (3) can be satisfied even in the presence of errors, but the probability that this will occur is $(1/7)^{2s}$, a small number if s is big enough.

2. Find the number of deletions minus the number of insertions that occurred in the binary interval corresponding to the 7-ary interval $b_{mn}, b_{mn+1}, \ldots, b_{N-1}$ (this process will be described below and is one of the more important features of the present method).

3. Reestablish synchronization at the binary level by deleting the redundant symbols in the unaffected interval, and deleting the redundant symbols in the affected interval plus the number of deletions minus the number of insertions. The symbols in the affected interval can be set equal to 0, so an erasure is declared there. When demodulating, an erasure will also affect a string in the demodulated binary sequence. However, for the error-correcting code it is easier to correct erasures than to correct errors, and the error-correcting power of the code is enhanced.

Error type         Add to (length, sum)
Peak shift 10→01   (0, 0)
Peak shift 01→10   (0, 0)
0→1                (1, −2)
1→0                (−1, 2)
Deletion of 1      (−1, 1)
Deletion of 0      (0, −1)
Insertion of 1     (1, −1)
Insertion of 0     (0, 1)

Table 2: Types of errors and their effect on the length and the sum
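Step 1 of the procedure above can be sketched as a small search. The sketch assumes that condition (2) asks the s blocks starting at N to have zero sum modulo 7 and that condition (3) asks each block boundary pair after N to sum to 6; the function names and the `max_j` parameter (the upper bound on j) are hypothetical. The data are the received 7-ary symbols of Example 4.

```python
def first_bad_block(b, n, q=7):
    """Index m of the first length-n block whose sum is non-zero modulo q."""
    for m in range(len(b) // n):
        if sum(b[m * n:(m + 1) * n]) % q != 0:
            return m
    return None


def find_resync_point(b, m, n, max_j, s, q=7, slack=3):
    """Smallest N within +-slack of a boundary (m + j) n, 1 <= j <= max_j, such
    that the s blocks starting at N have zero sum mod q and each boundary pair
    b[N + l n - 1] + b[N + l n], 1 <= l <= s, equals q - 1."""
    for j in range(1, max_j + 1):
        centre = (m + j) * n
        for N in range(centre - slack, centre + slack + 1):
            if N < 0 or N + s * n >= len(b):
                continue  # not enough received symbols to test this N
            sums_ok = all(sum(b[N + l * n:N + (l + 1) * n]) % q == 0
                          for l in range(s))
            joins_ok = all(b[N + l * n - 1] + b[N + l * n] == q - 1
                           for l in range(1, s + 1))
            if sums_ok and joins_ok:
                return N
    return None


# Received 7-ary string of Example 4 (59 symbols, b_0 ... b_58).
received = [0,0,0,5,0,0,2,2,0,5, 1,2,3,0,0,0,1,1,5,1,
            5,0,0,-1,3,2,0,5,7, 1,0,0,0,0,0,0,4,1,1,
            5,1,3,1,0,4,2,0,1,4, 2,0,1,1,0,1,1,1,0,0]

m = first_bad_block(received, n=10)
N = find_resync_point(received, m, n=10, max_j=1, s=2)
print(m, N)   # 2 29
```

The candidates N = 27 and N = 28 are rejected because their first block sums are 17 and 13, which are not multiples of 7.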

The various errors will affect two parameters: the length of the affected interval (the length of the original transmitted interval is $(r + 1)n$) and the sum of the entries modulo 7 in the interval (the sum of the transmitted entries is $\sum_{i=0}^{(r+1)n-1} a_{mn+i} = 0$ in $\mathbb{Z}_7$).

Table 2, based on the results of Table 1, gives the effect of the different types of errors on the length and the sum. For instance, if we write that error 0→1 corresponds to (1, −2), it means that the length $(r + 1)n$ is increased by 1 modulo n and the sum modulo 7 of the entries $a_{mn}, a_{mn+1}, \ldots, a_{(m+r)n+n-1}$ is decreased by 2.

Since peak shifts will not have a global influence over the length and the sum of entries, they will be ignored.

Let $\beta$ be the difference between the number of 0→1 errors and the number of 1→0 errors, $\gamma$ the difference between the number of deletions of 1 and the number of insertions of 1, and $\delta$ the difference between the number of deletions of 0 and the number of insertions of 0. In order to recover synchronization, we need to find $\gamma + \delta$, the total difference between the number of deletions and insertions.

If the total length of the affected 7-ary interval is $\ell = N - mn$, and the total sum modulo 7 of the entries in that interval is $S = \sum_{i=mn}^{N-1} b_i$, then, according to Table 2,

$\beta(1, -2) + \gamma(-1, 1) + \delta(0, -1) \equiv (\ell, S) \pmod{(n, 7)}.$

Writing each coordinate separately, we obtain

$\beta - \gamma \equiv \ell \equiv N \pmod{n}$  (4)

$-2\beta + \gamma - \delta \equiv S \pmod{7}$  (5)

Since we are assuming that the number of errors is at most 3, $\beta - \gamma$ cannot exceed 3 in absolute value, i.e., $-3 \le \beta - \gamma \le 3$. Hence, $\beta - \gamma = i$, where $-3 \le i \le 3$ and $i \equiv N \pmod{n}$.

Using this relation in (5) to eliminate $\beta$, we obtain

$\gamma + \delta \equiv -2i - S \pmod{7}.$  (6)

Since, in particular, $-3 \le \gamma + \delta \le 3$, there is a unique j such that $-3 \le -2i - S + 7j \le 3$. Hence,

$\gamma + \delta = -2i - S + 7j.$  (7)
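Equations (4) to (7) reduce to two centered modular reductions, which the following sketch makes explicit. The function names `centered` and `net_deletions` are hypothetical; the first call uses the values N = 29, n = 10 and S = 21 that appear in Example 4.

```python
def centered(x, modulus):
    """Representative of x modulo `modulus` with the smallest absolute value
    (for modulus 7 this is the range -3..3)."""
    r = x % modulus
    return r - modulus if r > modulus // 2 else r


def net_deletions(N, n, S, max_errors=3, q=7):
    """Number of deletions minus number of insertions (gamma + delta).

    i = beta - gamma is the centered residue of N modulo n, as in (4);
    gamma + delta is the centered residue of -2 i - S modulo q, as in (6), (7).
    """
    i = centered(N, n)
    if abs(i) > max_errors:
        raise ValueError("length change exceeds the assumed number of errors")
    return centered(-2 * i - S, q)


print(net_deletions(N=29, n=10, S=21))   # 2: two net deletions, as in Example 4
print(net_deletions(N=30, n=10, S=-1))   # 1: a single deleted 0
print(net_deletions(N=31, n=10, S=-1))   # -1: a single inserted 1
print(net_deletions(N=31, n=10, S=-2))   # 0: a 0->1 random error only
```

The last three calls feed in the (length, sum) pairs of Table 2 for a single error, so a random error correctly yields a net count of zero.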

The following example illustrates the process of recovery of synchronization.

Example 4 Assume that the string E of Example 3 is transmitted but the following string is received:

F = 0 0 0 5 0 0 2 2 0 5 1 2 3 0 0 0 1 1 5 1
    5 0 0 -1 3 2 0 5 7 1 0 0 0 0 0 0 4 1 1
    5 1 3 1 0 4 2 0 1 4 2 0 1 1 0 1 1 1 0 0

In binary representation, the received string is

0101010000001010100010001010000
0010010001000010101010010010000
0010010000001010110000100010100
0000100000000100101010101010100
0001001001000000100100001001010
0000100010100100000100010100100
1010010010010101

Now, let us apply the steps of recovery of synchronization to F. The received symbols are $F = b_0, b_1, \ldots, b_{58}$. Recall that n, the size of a block, is equal to 10. Notice that

$\sum_{i=0}^{9} b_i \equiv 0, \quad \sum_{i=10}^{19} b_i \equiv 0, \quad \sum_{i=20}^{29} b_i = 22 \not\equiv 0 \pmod{7},$

so m = 2.

Now, we have to find the N that retrieves synchronization according to the procedure described above. In this example, we take $\lambda = 2$ and s = 2. We easily verify that N = 29 satisfies (2) and (3) for $\lambda = 2$ and s = 2.

Now that we have found N, we have to find the difference $\gamma + \delta$ between the number of deletions and the number of insertions.

Notice that, since $N = 29 \equiv -1 \pmod{10}$, $i = -1$. Also, since the affected interval in 7-ary is [20,28], we obtain

$S = \sum_{i=20}^{28} b_i = 21 \equiv 0 \pmod{7}.$

Therefore, $-2i - S \equiv 2 \pmod{7}$, and according to (6) and (7), the difference between the number of deletions and the number of insertions in the affected interval is 2. Hence, we assume that 2 deletions occurred in the 7-ary interval [20,28]. Using (1), we see that in the binary representation the affected interval is [69,108]. Now, symbol 19 in the 7-ary representation is a 1 and is presumed to be correct, so the next symbol has to be a 5, which corresponds to 7 binary digits that have to be deleted from the binary interval [69,108]. Similarly, symbol 29 in the 7-ary sequence is 1, so we assume that the previous symbol was a 5, i.e., we also have to delete 7 binary symbols in interval [69,108]. Since we also have to insert 2 symbols, due to the 2 deletions in the interval, we have to delete a total of 12 binary symbols. We write 0's for the bits in the affected binary interval, so we have a total of 28 zeros in that interval. After deleting the redundant symbols in the unaffected intervals, we end with the following string:

010100000010101000100010100010
000101010100100100000010000000
000000000000000000000101010101
010000010010010000100101000001
000101001010010010100100100101

Demodulating this string, we obtain the following output:

111100111010001001111100101011
110111111111111111111111001101
110000011110111001101101000000
1010101111

If we exclusive-OR this output with the original input A, we obtain

000000000000000000000000000000
000100100001110000111100000000
000000000000000000000000000000
0000000000

As we can see, the size of the burst is limited and synchronization has been restored. The reader is encouraged to compare this example with Example 1, in which the loss of a single bit caused a catastrophic failure.

We end this section with the following remarks:

1. The safety parameter s that makes $(1/7)^{2s}$ sufficiently small needs ns consecutive error-free 7-ary symbols. If that is too much, there are two possibilities:

(a) Add another equation to each block, like the first momentum.

(b) Decrease the block length.

2. Often more can be said about the errors. Sometimes they can even be corrected. For instance, if the gap has length $\ell = 2n$, the sum in that gap is congruent to 0 modulo 7 and the sum in the first block is congruent to 1 modulo 7 (resp. to −1 modulo 7), then we conclude that there was a peak shift at the border between the two blocks of length n.

3. Without the possible insertions of 0's and 1's at the binary level, up to 6 errors can be detected for n ≥ 13, since $\gamma$ and $\delta$ can no longer be negative and thus $\gamma + \delta$ belongs to {0, 1, 2, 3, 4, 5, 6}.

3 Method based on insertion of a fixed sequence

Consider a (1,7) sequence. We divide the sequence into blocks of predetermined length. This length is determined according to the statistics of errors of the channel. At the end of each block, we add the following vector:

$\underline{r} = x\,0\,1\,0\,0\,0\,0\,0\,0\,0\,0\,1\,0\,y,$  (8)

where x is the complement of the previous bit and y is the complement of the next bit. As we can see, $\underline{r}$ has length 14. So, a block is composed of m bits: the first m − 14 bits correspond to the original (1,7) sequence, and the last 14 bits correspond to $\underline{r}$ as defined in (8). Notice that for a general (d, k) code, we define $\underline{r}$ as the vector of length k + 2d + 5 given by

$\underline{r} = x\,\underbrace{0\,0 \ldots 0}_{d}\,1\,\underbrace{0\,0 \ldots 0}_{k+1}\,1\,\underbrace{0\,0 \ldots 0}_{d}\,y,$  (9)

where x is the complement of the previous bit and y is the complement of the next bit. As we can see, the idea is that there is a unique run of 8 (in general, k + 1) zeros in the absence of errors at the end of each block. The location of this run is easily identifiable. If a limited number of insertions and/or deletions has occurred, then, by determining how many locations to the right or to the left the pattern with 8 zeros has been shifted, we can tell the difference between the number of deletions and the number of insertions in a block.

The next example illustrates the encoding process described above.

Example 5 Assume that we have the random sequence of length 64 (8 bytes)

A = 00001100101110001111101011011101
    00100111011100001000101010110001

Encoding this sequence into a (1,7) code as in Example 1, and dividing the (1,7) sequence into blocks of length 16, we obtain

B = 0100100101010101 0010100001010000 0100100101010000
    0010101000100000 1000001001010001 0100100100101010

Concatenating the vector $\underline{r}$ as described in (8) to the end of each block of length 16, we obtain

C = 0100100101010101 00100000000101
    0010100001010000 10100000000101
    0100100101010000 10100000000101
    0010101000100000 10100000000100
    1000001001010001 00100000000101
    0100100100101010 10100000000101
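The construction of the synchronization vector in (8) and (9) can be sketched as follows. The function names are hypothetical, and the treatment of the final block is an assumption: with no following block, the next bit is taken to be 0 (so y = 1), which is what the last vector in Example 5 shows.

```python
def sync_vector(prev_bit, next_bit, d=1, k=7):
    """Vector of (9): x, d zeros, 1, k + 1 zeros, 1, d zeros, y, where x and y
    are the complements of the previous and the next bit."""
    x = "1" if prev_bit == "0" else "0"
    y = "1" if next_bit == "0" else "0"
    return x + "0" * d + "1" + "0" * (k + 1) + "1" + "0" * d + y


def add_sync(sequence, block_len, d=1, k=7, final_next_bit="0"):
    """Append a synchronization vector to every block of `block_len` bits."""
    blocks = [sequence[i:i + block_len]
              for i in range(0, len(sequence), block_len)]
    out = []
    for idx, block in enumerate(blocks):
        # The bit following the vector is the first bit of the next block;
        # for the last block we fall back on the assumed `final_next_bit`.
        nxt = blocks[idx + 1][0] if idx + 1 < len(blocks) else final_next_bit
        out.append(block + sync_vector(block[-1], nxt, d, k))
    return "".join(out)


# The (1,7) sequence B of Example 5, six blocks of 16 bits.
B5 = ("0100100101010101" "0010100001010000" "0100100101010000"
      "0010101000100000" "1000001001010001" "0100100100101010")

C5 = add_sync(B5, block_len=16)
print(len(sync_vector("0", "0")))   # 14, i.e. k + 2d + 5 for (d, k) = (1, 7)
print(C5[:30])                      # 010010010101010100100000000101
```

Choosing x and y as complements keeps the run of k + 1 zeros isolated, so the only violation of the k constraint in an error-free block is the intended one.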

There are several other possible patterns for the vector $\underline{r}$. We can also write 7 zeros instead of 8 in (8), i.e.,

$\underline{r} = x\,0\,1\,0\,0\,0\,0\,0\,0\,0\,1\,0\,y.$  (10)

In this case, $\underline{r}$ has length 13 instead of 14. More importantly, the (1,7) constraints are not broken in the encoding. However, there is a tradeoff. The code that uses $\underline{r}$ as defined by (8) can recover from a number of insertions or deletions equal to roughly half the block length. The code that uses $\underline{r}$ as defined by (10) can recover from at most 6 insertions or deletions. The reason a run of 7 zeros works nearly as well as a run of 8 zeros is that runs of 7 zeros appear quite infrequently: their relative frequency with respect to other runs (provided random data before encoding) is 0.77% [3].

Other possibilities for $\underline{r}$ involve eliminating the bits x and y. In that case, (8) gives

$\underline{r} = 0\,1\,0\,0\,0\,0\,0\,0\,0\,0\,1\,0,$  (11)

while (10) gives

$\underline{r} = 0\,1\,0\,0\,0\,0\,0\,0\,0\,1\,0.$  (12)

The process of recovery of synchronization works similarly with the different definitions of $\underline{r}$. Although in the preferred embodiment we use the definition given by (8), implementations based on the other definitions of $\underline{r}$ are straightforward. The next subsection describes the process of recovery of synchronization.

3.1 Recovery of Synchronization

In this section, we denote a (1,7) (or (d, k)) sequence by $\underline{u} = \underline{u}_1, \underline{u}_2, \ldots, \underline{u}_i, \ldots$, where $\underline{u}_i$ is a run of zeros followed by a 1. By $u_i$, we denote the actual length of the run.

Example 6 Assume that we have the sequence

$\underline{u}$ = 00100010100000001000100100001

With the notation above, $\underline{u}$ can be represented as

$\underline{u} = \underline{u}_1, \underline{u}_2, \ldots, \underline{u}_7,$

where $\underline{u}_1$ = 001, $\underline{u}_2$ = 0001, $\underline{u}_3$ = 01, $\underline{u}_4$ = 00000001, $\underline{u}_5$ = 0001, $\underline{u}_6$ = 001 and $\underline{u}_7$ = 00001.

Notice that $u_1 = 3$, $u_2 = 4$, $u_3 = 2$, $u_4 = 8$, $u_5 = 4$, $u_6 = 3$ and $u_7 = 5$.
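The decomposition used in Example 6 is easy to reproduce. The following sketch uses a hypothetical helper `split_runs`; any zeros after the last 1 do not form a complete run and are returned separately.

```python
def split_runs(bits):
    """Split a binary string into runs, each a run of 0's followed by one 1.

    Returns (runs, tail), where tail holds the zeros after the last 1.
    """
    pieces = bits.split("1")
    tail = pieces.pop()                  # zeros not yet terminated by a 1
    runs = [zeros + "1" for zeros in pieces]
    return runs, tail


runs, tail = split_runs("00100010100000001000100100001")
print(runs)                      # ['001', '0001', '01', '00000001', '0001', '001', '00001']
print([len(r) for r in runs])    # [3, 4, 2, 8, 4, 3, 5]
```

The list of lengths is exactly the sequence $u_1, \ldots, u_7$ of Example 6, and a length of 9 would flag a run of 8 zeros.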

For encoding, we use the vector $\underline{r}$ defined by (8), although the algorithm works similarly with other definitions of $\underline{r}$. The idea of the algorithm for recovery of synchronization is to search for the runs of 8 zeros, i.e., for the $\underline{u}_i$'s with $u_i = 9$. If no insertions or deletions have occurred in the current block and the run of 8 zeros corresponding to $\underline{r}$ is error free, then this run of 8 zeros will start at location −10 (mod m) (recall that m is the length of a block). However, if, say, l deletions have occurred, where l ≤ 6, the run of zeros will start at location −10 − l (mod m). Similarly, if l insertions have occurred, the run of zeros will start at location −10 + l (mod m).

So, the algorithm essentially determines where the run of 8 zeros starts, say at L, and computes the number −10 − L modulo m. Let us call this number l. If l is zero, the algorithm decides that no deletions or insertions have occurred, so it takes no action and searches for the next run of 8 zeros. If l is not zero but its absolute value is greater than 6, it ignores the event and continues to move forward. However, if 1 ≤ l ≤ 6, it concludes that l deletions have occurred, so a process for recovery of synchronization is required. Since we do not know where the deletions have occurred, l symbols are inserted in the middle of the current block, i.e., at location (m − 14)/2. Similarly, if −6 ≤ l ≤ −1, this means that |l| insertions have occurred, so, in order to restore synchronization, we delete |l| symbols from the middle of the block. Of course, errors will propagate by doing this, but the damage will be limited in general to half a block.

Next we describe the algorithm formally.

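Algorithm 3.1 Let $\underline{u}_1, \underline{u}_2, \ldots, \underline{u}_i, \ldots$ be a received sequence. Then:

1. Set $\underline{u}$ to the empty sequence, $S \leftarrow 0$ and $i \leftarrow 1$.

2. Input $\underline{u}_i$.

3. If $u_i = 9$ go to step 5.

4. Set $\underline{u} \leftarrow (\underline{u}, \underline{u}_i)$, $S \leftarrow S + u_i$, $i \leftarrow i + 1$ and go to step 2.

5. Set $Y \leftarrow -S - 11 \pmod{m}$, taking the representative of smallest absolute value.

6. If $Y = 0$ or $|Y| > 6$ go to step 4.

7. If $Y < 0$ go to step 10.

8. Insert $Y$ symbols starting at location $S - \lfloor (m - 14)/2 \rfloor$ of $\underline{u}$.

9. Set $\underline{u} \leftarrow (\underline{u}, \underline{u}_i)$, $S \leftarrow S + u_i + Y$, $i \leftarrow i + 1$ and go to step 2.

10. Delete $|Y|$ symbols starting at location $S - \lfloor (m - 14)/2 \rfloor$ of $\underline{u}$ and go to step 9.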
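A compact Python rendering of the ten steps may help in following them. It is a sketch under stated assumptions: the function name `resynchronize` is hypothetical, inserted symbols are 0's (the text of Example 7 suggests "say, a 0"), Y is reduced to the representative of smallest absolute value, and only the (1,7) vector of (8) is handled. The input is the received string D of Example 7.

```python
def resynchronize(received, m, max_shift=6):
    """Sketch of Algorithm 3.1 for the 14-bit vector (8) and block length m."""
    half = (m - 14) // 2
    out = ""      # recovered sequence built so far
    S = 0         # number of bits accounted for so far (always len(out))
    pieces = received.split("1")
    tail = pieces.pop()               # zeros after the last 1, not a full run
    for zeros in pieces:
        run = zeros + "1"
        if len(run) == 9:             # run of 8 zeros: synchronization mark
            Y = (-S - 11) % m
            if Y > m // 2:
                Y -= m                # representative of smallest absolute value
            p = max(S - half, 0)      # middle of the data part of the block
            if 0 < Y <= max_shift:    # Y deletions: insert Y filler zeros
                out = out[:p] + "0" * Y + out[p:]
                S += Y
            elif -max_shift <= Y < 0:  # |Y| insertions: delete |Y| symbols
                out = out[:p] + out[p - Y:]
                S += Y
        out += run
        S += len(run)
    return out + tail


# Received string D of Example 7 (m = 44, three blocks).
D7 = ("0100100101010101001010000110010100"
      "0000001010010001001010100000010100"
      "0001010100000000101000010000010010"
      "100010100100100101000000001010")

fixed = resynchronize(D7, m=44)[:3 * 44]   # drop the symbol beyond 3 blocks
print(fixed[:29])   # 01001001010101010001010000110
```

Truncating to three blocks mirrors the remark in Example 7 that the last symbol is omitted after the insertion.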

The following example illustrates Algorithm 3.1.

Example 7 Assume that we have the string of data of length 60

A = 000011001011100011111010110111
    010010011101110000100010101011

Encoding A into a (1,7) sequence of length 90, we obtain

B = 010010010101010100101000010100
    000100100101010000001010100010
    000010000010010100010100100100

Adding to each block of length 30 the vector $\underline{r}$ as defined in (8), we obtain

C = 010010010101010100101000010100 10100000000101
    000100100101010000001010100010 10100000000101
    000010000010010100010100100100 10100000000101

Therefore, the length of each block is m = 44. Now, assume that we receive the following string:

D = 0100100101010101001010000110010100
    0000001010010001001010100000010100
    0001010100000000101000010000010010
    100010100100100101000000001010

Notice that the vector of run lengths of D is given by

G = 2 3 3 2 2 2 2 3 2 5 1 3 2 9 2 3 4 3 2 2 7 2 6 2 2 9 2 5 6 3 2 4 2 3 3 3 2 9 2.

Observing G, we find that the first run of 8 zeros (i.e., $u_i = 9$) occurs at S = 32 and i = 13. Notice that $Y = -S - 11 \equiv 1 \pmod{44}$; then, since Y > 0, the algorithm inserts 1 symbol (say, a 0) in location S − 15 = 17 of D. With this insertion, D becomes (omitting now the last symbol)

D = 01001001010101010001010000110
    01010000000010100100010010101
    00000010100000101010000000010
    10000100000100101000101001001
    0010100000000101                  (13)

Now, S is set to S + 9 + Y = 42, i to 14, and the process continues. The next run of 8 zeros is encountered at S = 77 and i = 25. Now, $Y = -S - 11 \equiv 0 \pmod{44}$, so the algorithm keeps updating S and i normally and does not make any change to D. Similarly, the next run of 8 zeros is encountered at S = 121 and i = 37. Again, Y = 0, therefore no further change to D is made and the version of D given by (13) is the output of the algorithm.

In order to complete the recovery of synchronization, the redundant 14 bits at the end of each block of length 44 are eliminated from D given by (13). Hence, we obtain the (1,7) sequence

B1 = 010010010101010100010100001100
     001000100101010000001010000010
     000010000010010100010100100100

Demodulating B1 using the procedure described in [1], we obtain

A1 = 000011001000100110011010110111
     010111011101110000100010101111

If we exclusive-OR A1 with the original information string A, we obtain the error string

A ⊕ A1 = 000000000011000101100000000000
         000101000000000000000000000100

As we can see, we have a burst of limited length plus a couple of random errors. These errors are easily handled by an adequate error-correcting code. If a loss of synchronization is left unchecked, we have seen that an unbounded string of errors is produced.

Algorithm 3.1 can recover from a loss of synchronization of up to 6 symbols per block. If more than 6 symbols have been inserted or deleted, the algorithm may be further enhanced (at the price of, perhaps, more complexity). Assume that, after some blocks have been affected by errors, there are at least s blocks that are error free. If at a certain point |Y| > 6 but, after this, for s consecutive blocks the runs of 8 zeros are separated by exactly m bits, then the decoder concludes that |Y| bits have been inserted or deleted (according to the sign of Y). Obviously, for this implementation, it is better to use a run of 8 zeros than a run of 7 zeros, since runs of 7 zeros may appear without errors being involved.

Simulations of Algorithm 3.1 have been made and it seems to be robust. The algorithm survived the test of severe patterns of errors and deletions, probably much worse than those likely to be encountered in practice. Using runs of 7 zeros, the algorithm appears to work as well as using runs of 8 zeros.

References

[1] R. Adler, M. Hassner and J. Moussouris, "Method and Apparatus for Generating a Noiseless Sliding Block Code for a (1,7) Channel with Rate 2/3," U.S. Patent 4,413,251, 1982.

[2] P. Franaszek, "Run-Length-Limited Variable Length Coding with Error Propagation Limitation," U.S. Patent 3,689,899, 1972.

[3] T. D. Howell, "Statistical Properties of Selected Recording Codes," IBM Research Report, RJ 5814 (58544).

[4] F. J. MacWilliams and N. J. A. Sloane, "The Theory of Error-Correcting Codes," Amsterdam, The Netherlands: North-Holland, 1977.

[5] P. H. Siegel, "Recording Codes for Digital Magnetic Storage," IEEE Trans. on Magnetics, Sept. 1985, pp. 1344-1349.