


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 34, NO. 4, JULY 1988 893



Correction to “A Heuristic Algorithm for the Construction of a Code with Limited Word Length”

JAN L. P. DE LAMEILLIEURE

I. INTRODUCTION

In the above correspondence¹ a method is presented for the construction of a variable-length code with a maximum word-length constraint. The algorithm is partially based on the algorithm of Murakami et al. [1], and is extended with heuristic choices for the bifurcations. In [2], Lu explains why step 1) taken by Murakami, which is used in the contribution¹ as condition (6), does not guarantee an optimum code when a word-length restriction is imposed. This step 1), or condition (6), does not take the bounded maximum word length into account. Therefore, in this contribution, the following adaptation of condition (6) into the composite condition (6') ((6a') or (6b')) is proposed for the case when there is a word-length constraint, where condition (6a') is (the same notation as in¹ is used for the general case of Q-ary codes)

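$$
\left[\sum_{i=k+1}^{k+((N-k-1)\,\mathrm{DIV}\,(m\,\mathrm{DIV}\,Q))} p(i) < p(k)\right]
\ \mathrm{AND}\
\left[\left\{\left(((N-k-1)\,\mathrm{DIV}\,(m\,\mathrm{DIV}\,Q)) + (Q-3)\right)\mathrm{DIV}\,(Q-1)\right\} + \lambda < b\right]
\tag{6a'}
$$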

and condition (6b') is

$$
\dots\ (k + m - \dots + (Q-3))\ \mathrm{DIV}\ (Q-1)) + \lambda < b]. \tag{6b'}
$$

Here, (x DIV y) denotes the largest integer smaller than or equal to x/y. With this composite condition (6'), the possible state transitions from the state (k, λ) are investigated.¹ If condition (6') is true for a state (k, λ), then message k can be assigned the length λ.
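As an illustration, condition (6a') can be evaluated mechanically once its two parts are read as they are described in Section II: a probability test on the r most probable remaining messages, and a length test on the deepest code tree that r messages can form. The following Python sketch is only that, a sketch under assumptions. The function names, the 0-based list `p` of probabilities sorted in non-increasing order, the restriction to k < N and m >= Q, and the strict inequality in the probability test are choices made here and are not taken from the correspondence.

```python
from typing import Sequence


def div(x: int, y: int) -> int:
    """(x DIV y): the largest integer smaller than or equal to x / y."""
    return x // y


def condition_6a(p: Sequence[float], k: int, lam: int, m: int, Q: int, b: int) -> bool:
    """Evaluate condition (6a') for the state (k, lam).

    Assumptions of this sketch (not taken verbatim from the source):
    p[0] >= p[1] >= ... holds p(1), ..., p(N); k is 1-based with k < N;
    m >= Q, so that (m DIV Q) >= 1 prefixes of length lam remain.
    """
    N = len(p)
    if not (1 <= k < N) or m < Q:
        raise ValueError("sketch only covers 1 <= k < N and m >= Q")
    # r: most messages the least-populated prefix of length lam can carry.
    r = div(N - k - 1, div(m, Q))
    # Upper bound on that prefix's probability: p(k+1) + ... + p(k+r).
    prob_bound = sum(p[k:k + r])
    # Longest word length of a code tree rooted in that prefix.
    subtree_len = div(r + (Q - 3), Q - 1)
    return prob_bound < p[k - 1] and subtree_len + lam < b


# Arbitrary illustrative values, not data from the correspondence.
p = [0.40, 0.20, 0.15, 0.10, 0.08, 0.07]
print(condition_6a(p, k=1, lam=2, m=4, Q=2, b=5))  # True
print(condition_6a(p, k=1, lam=2, m=4, Q=2, b=3))  # False: length test fails
print(condition_6a(p, k=1, lam=2, m=2, Q=2, b=5))  # False: probability test fails
```

The second call shows the effect of the added length test: the probability test alone would accept the state, but the word-length bound b = 3 rules it out.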

Manuscript received July 7, 1987. The author was with the Electronics Laboratory, University of Ghent, B-9000 Ghent, Belgium. He is now with the Communication Engineering Laboratory at the same university.

IEEE Log Number 8823500.

¹J. L. P. De Lameillieure, IEEE Trans. Inform. Theory, vol. IT-33, pp. 438-443, May 1987.

0018-9448/88/0700-0893$01.00 ©1988 IEEE

II. PROOF OF CONDITION (6a')

In condition (6a'), first a probability upper bound for the prefix of length λ with the smallest probability is calculated for the case that l(k) would be λ + 1. A probability upper bound for the least probable prefix of length λ is obtained by considering the prefix of length λ with the fewest messages for which it is a prefix. That prefix has, in the optimum bounded maximum word-length code, at most r = ((N - k - 1) DIV (m DIV Q)) messages for which it is a prefix: in the state (k, λ), there are still (m DIV Q) prefixes of length λ to be assigned and, as the hypothesis of l(k) = λ + 1 is considered, there is still one prefix of length λ + 1 to be assigned; if the prefix of length λ + 1 is assigned to one of the messages of the set {k + 1, k + 2, ..., N}, there remain N - k - 1 messages for the (m DIV Q) prefixes of length λ, and at least one of the prefixes will have at most ((N - k - 1) DIV (m DIV Q)) messages for which it is a prefix. The assumption that the r messages are the r most probable messages of the set {k + 1, k + 2, ..., N} delivers a probability upper bound for the least probable prefix of length λ.

After this calculation of the upper bound of the prefix probability, we derive the maximum word length of the code tree having its root in the prefix for which the probability upper bound is calculated. The maximum word length is {[r + (Q - 3)] DIV (Q - 1)}, and it is obtained by constructing the code tree as is shown in Fig. 1 for ternary codes. If the prefix probability upper bound does not exceed p(k), and if the code tree in this prefix is shorter than the word length bound, then the word length of message k in the optimum code with maximum word length constraint must not be longer than λ.
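The expression {[r + (Q - 3)] DIV (Q - 1)} can be checked against an explicit construction. Since Fig. 1 is not reproduced here, the sketch below assumes a particular tree shape: every internal node on the longest path has Q - 1 leaf children and one internal child, and the deepest internal node takes the remaining leaves (between 2 and Q of them). Under that assumption, which is a reading of the text rather than something the source states, the depth obtained by construction agrees with the formula. The function names are hypothetical.

```python
def max_depth_formula(r: int, Q: int) -> int:
    """Maximum word length {[r + (Q - 3)] DIV (Q - 1)} as given in the text."""
    return (r + (Q - 3)) // (Q - 1)


def max_depth_by_construction(r: int, Q: int) -> int:
    """Depth of the deepest Q-ary tree with r leaves under the assumed shape.

    Assumed shape (not confirmed by the source, Fig. 1 is missing): each
    internal node keeps Q - 1 leaves and passes the rest to a single internal
    child; the last internal node holds the remaining 2..Q leaves.
    """
    if r <= 1:
        return 0  # a single message is the root itself
    depth = 0
    remaining = r
    while remaining > Q:
        remaining -= Q - 1  # Q - 1 leaves stay at this level
        depth += 1
    return depth + 1  # the last node places its leaves one level deeper


# Ternary case, r = 2..7
print([max_depth_formula(r, 3) for r in range(2, 8)])  # [1, 1, 2, 2, 3, 3]

# The formula and the construction agree on a range of small cases.
assert all(
    max_depth_formula(r, Q) == max_depth_by_construction(r, Q)
    for Q in range(2, 8)
    for r in range(1, 61)
)
```

The length condition in (6a') follows from this depth. If the prefix of length λ is exchanged with a codeword of length λ + 1, its code tree moves one level down, so its longest word goes from λ + d to λ + d + 1, where d is the depth computed above. Requiring λ + d + 1 not to exceed b is the same as d + λ < b.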

Fig. 1. Ternary code tree with longest possible maximum word length and with its root in the prefix of length λ, for which the probability upper bound has been calculated.

If l(k) were λ + 1, there would exist a prefix of length λ with highest possible probability smaller than p(k). Because of the length condition in (6a'), this prefix and the codeword of message k could be exchanged, resulting in a better code. Therefore, l(k) must not be longer than λ, and may be assigned the length λ, if (6a') is true.

III. PROOF OF CONDITION (6b')

In condition (6b') the worst-case maximum word length is considered for l(k) = λ + 1. This maximum word length is derived from a code tree constructed as indicated in Fig. 2 for ternary codes. If the worst-case maximum word length for l(k) = λ + 1 is shorter than the word length bound and Σ p(i) < p(k), then the message k can be assigned the word length λ. If l(k) were longer than λ, the code would contain a prefix of length λ with a smaller probability than p(k). Because of the code-length condition in (6b'), this prefix and the codeword of message k could be exchanged, resulting in a better code. That explains why in the optimum code that satisfies (6b'), l(k) must not be longer than λ.

Fig. 2. Ternary code tree with longest possible maximum word length for the state (k, λ) and for l(k) = λ + 1.

IV. NUMERICAL RESULTS

The code lengths in the examples of the correspondence¹ do not change due to the adaptation of condition (6) into (6'). Because condition (6') is weaker than condition (6), the number of recursive calls to the investigation procedures has changed. The new Tables III and IV are listed below. They do not change the conclusions on the heuristic power of the mean code-length estimations.

TABLE III
NUMBER OF RECURSIVE CALLS TO THE INVESTIGATION PROCEDURE IN THE BINARY EXAMPLE OF TABLE I

          1        2       3       4       5
a   5443440   117278   31516   65306   31516
b   5443440    41635   24411   37509   24411
c         -        -    9661   26426    9661

TABLE IV
NUMBER OF RECURSIVE CALLS TO THE INVESTIGATION PROCEDURE IN THE EXAMPLE OF TABLE II WITH b = 7

        1      2      3      4      5
a   39313   6145   4134   4886   4134
b   39373   1530   1265   1446   1265
c       -      -    644    747    406

a: In a bifurcation, transition (1) is investigated before transition (2).
b: In a bifurcation, transition (2) is investigated before transition (1).
c: In a bifurcation, the transition order is heuristically determined.
1: No heuristic information is used at all: l_av.est = 0.
2: l_av.est = l_av.est,0.
3: l_av.est = max(l_av.est,0, l_av.est,1).
4: l_av.est = max(l_av.est,0, l_av.est,2).
5: l_av.est = max(l_av.est,0, l_av.est,1, l_av.est,2).

REFERENCES

[1] H. Murakami, S. Matsumoto, and H. Yamamoto, “Algorithm for construction of variable length code with limited maximum word length,” IEEE Trans. Commun., vol. COM-32, pp. 1157-1159, Oct. 1984.

[2] Chung H. Lu, “Comment on ‘Algorithm for construction of variable length code with limited maximum word length’,” IEEE Trans. Commun., vol. 36, pp. 373-375, Mar. 1988.