The Complexity of Decoders - Part II: Computational Work and Decoding Time

JOHN E. SAVAGE, MEMBER, IEEE

Manuscript received August 1, 1969; revised February 2, 1970. This work was supported by the National Science Foundation under Grant NSF GK-3302 and by the National Aeronautics and Space Administration under Grant NGR 40-002-082. The author is with the Division of Engineering and the Center for Computer and Information Sciences, Brown University, Providence, R.I.

Abstract--The computational work and the time required to decode with reliability E at code rate R on noisy channels are defined, and bounds on the size of these measures are developed. A number of ad hoc decoding procedures are ranked on the basis of the computational work they require.

I. INTRODUCTION

THE CENTRAL result of information theory is the demonstration that reliable communication of any degree is possible on noisy channels at all rates less than channel capacity. In this paper we show that reliable communication can be costly and that the cost increases with increasing reliability and increasing code rate and is largest at channel capacity.

In an earlier paper [1] we modeled decoders by combinational machines and sequential machines constructed with two-input binary logic elements and individually accessed binary memory cells (the input to each cell is the output of another cell or of a logic element). Classes of decoding rules, including block and tree decoding rules, were then examined using as a measure of decoder complexity the sum (called logic complexity) of the number of logic elements and memory cells in the decoder. We demonstrated that each class of rules has a logic complexity associated with it which is such that, for large block length, almost all decoders in a class have a logic complexity that exceeds this quantity. Thus, random selection of a decoder will require a complexity of this size with probability near 1. It is interesting to note that the logic complexity of the class of block decoders grows exponentially with block length.

While these results can serve to direct a search for a good class of decoders, they are inadequate since they do not provide bounds on complexity that hold for all decoders. This is not true of the two measures of complexity studied in this paper.

The two measures of the complexity of decoding examined here are called computational work and decoding time. The machine models discussed above are assumed here also. Suppose that a decoder can be described by L interacting sequential machines {S_i, 1 ≤ i ≤ L}, containing {X_i, 1 ≤ i ≤ L} logic elements and executing {T_i, 1 ≤ i ≤ L} clock cycles, respectively, to decode one received word from a block code. Then, the computational work χ performed by that decoder is defined by

    χ = Σ_{i=1}^{L} X_i T_i.    (1)

This quantity can be interpreted as the number of logical operations performed by the decoder to process a received word, since S_i performs X_i logical operations in one cycle and X_i T_i operations in T_i cycles. We show that to obtain reliability E = -log_2 Pe, where Pe is the average probability of error, at rate R on a completely connected discrete memoryless channel (DMC) requires a computational work satisfying, to within a constant factor,

    χ ≥ E/E_L(R)    (2)

for large E, where E_L(R) is a lower-bound exponent such as the sphere-packing exponent.
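To make these definitions concrete, the following minimal sketch (not from the paper; the machine sizes and the exponent value are hypothetical) evaluates the computational work of (1) for a decoder described by L interacting machines, together with the order-of-magnitude lower bound suggested by (2).

```python
# Illustrative sketch only; the figures below are hypothetical, not taken from the paper.

def computational_work(machines):
    """Computational work chi = sum of X_i * T_i over the L interacting
    sequential machines, as in (1); machines is a list of (X_i, T_i) pairs."""
    return sum(X * T for X, T in machines)

def work_lower_bound(E, E_L):
    """Order-of-magnitude lower bound on chi suggested by (2):
    roughly E / E_L(R) for large reliability E = -log2 Pe."""
    return E / E_L

# A decoder built from three machines, given as (logic elements, clock cycles per word):
machines = [(200, 50), (35, 400), (1000, 10)]
print(computational_work(machines))          # 200*50 + 35*400 + 1000*10 = 34000
print(work_lower_bound(E=100.0, E_L=0.25))   # at least 400.0 logical operations
```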
The bound holds with or without feedback as long as the appropriate E_L(R) is used, and it applies only at strictly nonzero rates since, when R = 0, we can do without a decoder and set χ = 0. At code rates approaching zero with increasing E, we show the existence of a decoder with χ that grows as E log E. At code rates bounded away from zero, the Ziv iterative coding methods [2], [3], Forney's concatenated coding [4], and sequential decoding are examined; concatenated coding is found to require the smallest work to achieve large E, whereas sequential decoding is shown to require the most work. With concatenated coding χ grows as E^2, whereas χ grows exponentially with E for sequential decoding.

Computational work measures the complexity of a decoding rule or procedure and not the complexity of a decoder. It is important to note that a machine with few logic elements can do a lot of work if used many times. Consequently, computational work allows one to see the interdependence of machine complexity and running time, and as such it provides useful information about the decoding process.

The second measure of decoding complexity studied here is decoding time τ. Decoding time is measured as the number of levels of logic through which decoder inputs must percolate to reach the decoder outputs. This measure is important in high-speed data applications where the duration of decoder inputs is a small multiple of the switching time of logic elements. In such applications, a large decoding time will require the parallel operation of several decoders and can be an important determinant of decoder cost. We show that τ must grow at least logarithmically with E and exhibit a low-rate decoding procedure for which τ grows as the square of the logarithm of E. Most decoders require a decoding time that is linear in E and, in fact, no decoding procedure for decoding codes of fixed rate and increasing block length has been found that has a τ growing less than linearly in E. Decoding time is studied using the important methods of Winograd [5].

SECTION II

A. Computational Work

Canonical forms for decoders must be assumed if reliable comparisons of the complexities of decoding procedures are to be made. Consequently, we assume that a decoder is modeled either by a combinational machine or by a sequential machine constructed of logic elements from a fixed set Ω and of individually accessed memory cells of fixed storage capacity. Let the input and output alphabets of the logic elements in Ω and of the memory cells be Σ_d = {0, 1, ..., d - 1}. Then, a combinational machine with m inputs and p outputs realizes some function f_c: (Σ_d)^m → (Σ_d)^p. Also, a sequential machine S with input alphabet I = (Σ_d)^r and output alphabet J = (Σ_d)^s is defined by the 4-tuple S = (S, δ, γ, s_0), where S is the state set (which is a collection of t-tuples over Σ_d), s_0 ∈ S is the initial state, δ is the next-state function δ: S × I → S, and γ is the output function γ: S × I → J. There is a natural function f_T associated with a sequential machine that maps the initial state and the sequence of T inputs into the sequence of T outputs, f_T: {s_0} × (I)^T → (J)^T.
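The sequential-machine model can be exercised directly. The toy machine below is a sketch of the formalism only (a hypothetical parity-tracking machine, not an example from the paper); it computes f_T by iterating the next-state and output functions over T inputs.

```python
# Illustrative sketch of the sequential-machine model (hypothetical toy machine).

def run_sequential_machine(delta, gamma, s0, inputs):
    """Compute f_T: {s0} x I^T -> J^T by iterating the next-state function
    delta: S x I -> S and the output function gamma: S x I -> J."""
    state, outputs = s0, []
    for x in inputs:
        outputs.append(gamma(state, x))
        state = delta(state, x)
    return outputs

# Toy binary machine whose state is the running parity of the inputs seen so far.
delta = lambda s, x: s ^ x     # next-state function
gamma = lambda s, x: s ^ x     # output: parity including the current input
print(run_sequential_machine(delta, gamma, 0, [1, 0, 1, 1]))   # [1, 1, 0, 1]
```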
A decoding function characterizes the action of a block decoder on received words. Given a function f: (Σ_d)^n → (Σ_d)^l, it can be used as a decoding function for block codes, since it partitions the input space into disjoint sets on the basis of outputs and these sets can be used as decoding regions. A decoding function that has M points in its range makes M decisions. We shall assume that a decoder makes M decisions, where M is the number of codewords in the corresponding code. Two additional remarks are in order here. First, we allow any representation for points in the range of the decoding function. Second, any type of storage that does not consist of an array of individually accessed memory cells can be replaced by equivalent circuits realized with the logic elements and memory cells assumed here. This point will be discussed again later.

Consider a sequential machine S composed of the interconnection of the sequential machines S_1, S_2, ..., S_L, or S = S_1 × S_2 × ... × S_L. Let each machine be modeled as above and let S_i have X_i logic elements and execute T_i cycles to complete its portion of the computation done by S. Then, S executes T = max_i T_i cycles and receives n inputs from Σ_d, say, some of which may appear in the initial states of one or more machines. If S produces l outputs from Σ_d, then it is said to compute f_S: (Σ_d)^n → (Σ_d)^l.

Definition: The computational work χ performed by S = S_1 × S_2 × ... × S_L to compute f_S is defined by

    χ = Σ_{i=1}^{L} X_i T_i.    (3)

This leads to the following.

Theorem 1: The computational work χ given by (3) must satisfy

    χ = Σ_{i=1}^{L} X_i T_i ≥ C_Ω(f_S)    (4)

where C_Ω(f_S) is the minimum number of logic elements required to realize f_S with a combinational machine constructed with logic elements from Ω.

Comment: Theorem 1 implies that a minimum number of logical operations, or an amount of computational work, is required to compute a given function. It is interesting to consider C_Ω(f_n) for linear functions f_n: (Σ_2)^n → (Σ_2)^n. Using a variant of Theorem 2 of [1], it can be shown that almost all such functions for large n have C_Ω(f_n) proportional to n^2/log n^2.

Proof: Fig. 1 shows a model for a sequential machine S in which a unit L holds the logic and the unit to the right holds the state of the machine. Let y_1, y_2, ..., y_T be the sequence of inputs to this machine and let z_1, ..., z_T be the sequence of outputs. Since the memory cells are accessed individually, the combinational machine shown in Fig. 2 can be constructed, which computes the same function f_T computed by S. This combinational machine has XT logic elements if X is the number of such elements contained in L. If machines S_1, S_2, ..., S_L are interconnected to create S, each such machine can be "stretched" in this way into a combinational machine with X_i T_i logic elements. Thus, the stretched version of S contains χ elements, and this cannot be smaller than the minimum number of logic elements required to realize the function associated with S by a combinational machine. Q.E.D.

Fig. 1. Model for sequential machine.
Fig. 2. Equivalent combinational machine.

Tape and core memories are not of the type assumed for our machine models. However, models for these storage types can be created with the standard machines assumed here, and the result is that we have to add to X_i a quantity P_i that is the number of logic elements in the model for the storage associated with S_i. For core and tape memories, P_i will grow at least linearly with the number of cores or tape squares in the memory unit. Thus, P_i will be significant when X_i is small, and we can refer to P_i as the computing power of storage. (Note that P_i is zero for an array of individually accessed cells since no computing is done by such an array.)
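Returning to the proof of Theorem 1, the "stretching" construction can be mimicked in a few lines: unrolling each sequential machine over its T_i cycles yields a combinational network with exactly X_i T_i copies of its logic, and the total element count is the work χ of (3). The sketch below is illustrative only; the machine sizes are hypothetical.

```python
# Illustrative sketch: unroll sequential machines into a combinational equivalent,
# as in the proof of Theorem 1 (hypothetical machine sizes).

def unrolled_element_count(machines):
    """Each machine with X_i logic elements run for T_i cycles is 'stretched'
    into T_i copies of its logic unit, giving X_i * T_i combinational elements."""
    total = 0
    for X, T in machines:
        copies = [X] * T          # one copy of the logic unit per clock cycle
        total += sum(copies)      # X * T elements contributed by this machine
    return total

machines = [(200, 50), (35, 400), (1000, 10)]
# The unrolled network has exactly as many elements as the work chi of (3):
assert unrolled_element_count(machines) == sum(X * T for X, T in machines)
```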
Hereafter we shall let Ω be the set of two-input binary logic elements. As Muller [7] has observed, if one wishes to deal with a different set Ω', each element in Ω can be replaced by a small number of logic elements from Ω', with a consequent multiplicative effect on C_Ω(f) and χ. We also assume that the memory cells are binary.

B. Lower Bounds on Computational Work

In this section we derive bounds on the minimum computational work needed to support a reliability E at rate R on a DMC. To begin, let Pe''(χ, R) be the minimum attainable average probability of error with equiprobable codewords from a block code of rate R when the decoder is a sequential machine that performs computational work χ. Let Pe'(χ, R) be similarly defined for combinational machines.

Theorem 2:

    Pe''(χ, R) ≥ Pe'(χ, R).    (5)

Proof: Let W be the block code and f the decoding function that achieves Pe''(χ, R) with computational work χ. Since χ ≥ C_Ω(f) from (4), we can realize the same probability of error with a combinational machine with no more work. Thus, (5) follows. Q.E.D.

Fig. 3. Combinational decoder.

Consider now the combinational decoder of Fig. 3 for a binary block code of length n and rate R used on the binary symmetric channel (BSC) with crossover probability p. This machine has n binary inputs and k binary outputs with k ≥ nR if the decoder must be equipped to make exactly 2^{nR} decisions. Assume that some binary output function, say z_r, is identically equal to some binary input, say y_j. Then, if no errors occur in transmission of codeword w_m, y_j is the jth digit of w_m, as is z_r. Consequently, if a single error occurs in the transmission of w_m (or any codeword) at the jth position, the decoder output will change, that is, a decoding error will occur. The probability of a transmission error on the BSC in a given digit of a codeword is p.

Lemma 1: On the BSC, Pe'(χ, R) ≥ p if any decoder output is equal to an input.

On a DMC other than the BSC, we assume that each channel output letter is encoded into a string of binary digits. If the channel has J outputs, encode each letter into ⌈log_2 J⌉ binary digits(1) so that no position in the binary representation of letters contains the same digit for all letters. Again, let z_r equal y_j, the jth binary input of the combinational decoder. Let y_j be in the representation of some channel output letter, assume that the DMC is completely connected, that is, that all transitions between inputs and outputs have nonzero probability of occurring, and let P_min be the smallest transition probability. Let codeword w_m be transmitted and let y_j be the jth decoder input; some transition at that letter corresponds to no decoding error, but some other transition will result in a decoding error. This argument applies whether or not feedback from receiver to transmitter exists.

(1) This is the smallest integer greater than or equal to log_2 J.

Lemma 1': On a completely connected DMC with smallest transition probability P_min, Pe'(χ, R) ≥ P_min (with or without feedback) if any decoder output is equal to an input.

The implication of these two lemmas is that at least one logic element must be interposed between input and output terminals if Pe'(χ, R) < P_min on a completely connected DMC.
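A small simulation illustrates Lemma 1 (a sketch, not from the paper; the code and parameters below are made up): if one decoder output is wired directly to a channel input digit, every crossover in that digit changes the output, so the error rate cannot fall below the crossover probability p.

```python
# Illustrative Monte Carlo sketch of Lemma 1 (hypothetical code and parameters).
import random

random.seed(1)
p = 0.05                       # BSC crossover probability
n, trials = 15, 20000
codeword = [0] * n             # transmit the all-zero codeword for simplicity

errors = 0
for _ in range(trials):
    received = [bit ^ (random.random() < p) for bit in codeword]
    # "Decoder" whose output z_r is wired to input y_j (here j = 0):
    z = received[0]
    # Any crossover in position 0 flips z and hence causes a decoding error.
    errors += (z != codeword[0])

print(errors / trials)         # close to p = 0.05, never systematically below it
```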
As stated earlier, at least nR distinct, nonconstant, and unequal binary outputs are needed to specify each of the 2^{nR} decisions the decoder makes in decoding a code of block length n and rate R. Thus, for such a decoder at least nR logic elements are needed if Pe'(χ, R) < P_min. We need only lower bound n to have a bound on computational work.

Lower bounds to probability of error on the DMC for all codes of block length n, rate R, and equiprobable codewords have the form [8]-[10]

    Pe(n, R) ≥ exp_2 {-n[E_L(R - O_1(n)) + O_2(n)]}    (6)

where O_1(n) and O_2(n) approach zero with increasing n. For all rates less than channel capacity C, we can take E_L(R) to be the sphere-packing exponent or, at low rates, can improve on this by using the straight-line bound, both of which are found in the paper by Shannon et al. [8]. The two functions O_1(n) and O_2(n) have similar dependence on n for both exponents, and their relative sizes are such that [O_1(n)]^2/O_2(n) approaches zero with increasing n. An additional fact that will be important is that the sphere-packing exponent E_sp(R) behaves as a(C - R)^2 for R near C. Also, both E_sp(R) and the straight-line exponent E_sl(R) decrease with increasing R, as indicated in Fig. 4.

Fig. 4. Probability-of-error exponents.

Now let Pe''(χ, R) < P_min on a completely connected DMC. Then Pe'(χ, R) < P_min and χ ≥ nR for the smallest integer n that ensures that Pe(n, R) < Pe''(χ, R). Thus, χ ≥ n_0 R where n_0 is the smallest integer satisfying

    n_0[E_L(R - O_1(n_0)) + O_2(n_0)] ≥ E''(χ, R) = -log_2 Pe''(χ, R).    (7)

Here E''(χ, R) is called the coding reliability.

Theorem 4: For all block decoders operating on completely connected DMCs, a coding reliability E can be obtained with equiprobable codewords at code rate R, 0 < R < C, only when the decoder computational work χ satisfies

    χ ≥ R[E/E_L(R)]    (8)

for large E. With feedback, E_L(R) is the exponent of any lower bound to probability of error.

Proof: For R < C and E large, we can see from (7), and from the forms of O_1(n) and O_2(n) for the sphere-packing exponent [9], which decrease essentially as 1/√n, in particular

    O_2(n) = √(8/n) log_2 (e^2/P_min) + (log_2 8)/n,    (9)

that n_0 must be very large. Thus, R - O_1(n_0) will be less than C and we can neglect O_1(n) and O_2(n) to give (8) from (6) and (7). Q.E.D.

This bound approaches zero with decreasing R. We can improve on this behavior with the following bounding argument. Assume that a combinational decoder with χ logic elements achieves the minimum probability of error Pe'(χ, R) at rate R with a code of length n. Assume, however, that the decoder outputs depend on only r of the n input letters, i.e., on r of the sequences of ⌈log_2 J⌉ binary digits. Then, the code can be shortened to length r and the rate increased to R' = (n/r)R without affecting the probability of error. If a reliability E'(χ, R) is to be maintained, we must have from (6)

    r[E_L(R' - O_1(r)) + O_2(r)] ≥ E'(χ, R).    (10)

Since R' ≥ R and E_L(R) is monotone decreasing with increasing R, we can replace R' by R in (10) to give (7). Thus, r ≥ n_0. Note that these arguments apply when feedback is allowed as long as a bound of the form (6) exists.

Now, if the outputs of the decoder depend on at least n_0 input letters and if each such letter must be connected to an output through at least one logic element, then at least n_0/2 two-input logic elements are required, assuming that Pe''(χ, R) < P_min. The next theorem follows.

Theorem 4': A coding reliability E can be obtained with equiprobable codewords at rate R, 0 < R < C, on a completely connected DMC only with a computational work χ that satisfies

    χ ≥ E/[2 E_L(R)]    (11)

for large E. This result also applies with feedback. Note that we replace 2 by s for a logic set Ω containing s-input logic elements.

The bounds of (8) and (11) are sketched in Fig. 5.

Fig. 5. Lower bounds to computational complexity.
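As a numerical illustration, the two lower bounds can be combined as χ ≥ max{R E/E_L(R), E/[2 E_L(R)]}. The sketch below is illustrative only: the exponent function E_L is an assumed stand-in (chosen merely to decrease in R and to vanish quadratically at capacity, as E_sp(R) does near C), not a genuine channel computation.

```python
# Illustrative sketch: evaluate the lower bounds (8) and (11) for assumed values.
# E_L below is a made-up stand-in for a sphere-packing-type exponent, chosen only
# so that it decreases in R and vanishes quadratically at capacity C.

C = 1.0                                   # assumed channel capacity (bits/use)
def E_L(R, a=0.5):
    return a * (C - R) ** 2               # behaves as a(C - R)^2 near C

def work_lower_bound(E, R):
    bound_8  = R * E / E_L(R)             # Theorem 4:  chi >= R E / E_L(R)
    bound_11 = E / (2.0 * E_L(R))         # Theorem 4': chi >= E / (2 E_L(R))
    return max(bound_8, bound_11)

E = 50.0                                  # target reliability, E = -log2 Pe
for R in (0.1, 0.4, 0.8):
    print(R, round(work_lower_bound(E, R), 1))
# (11) dominates for R < 1/2; (8) dominates for R > 1/2.
```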
It should be clear that (11) is the superior bound at small rates and that it may be replaced by (8) at large rates. The behavior of both bounds near channel capacity is important since both become very large at capacity. From (6) and (7) and the related discussion, we have for n_0 at R = C

    n_0 O_2(n_0) ≥ E''(χ, R).    (12)

Also, from (9) we have

    √(8 n_0) log_2 (e^2/P_min) ≥ E''(χ, R) - 3    (13)

so that n_0, and with it the bound on computational work χ, changes from linear to quadratic dependence on the reliability E as the rate approaches capacity.

Quite a different kind of growth applies to the binary symmetric channel and perhaps to other channels as well. On the BSC [11],

    n O_2(n) = log_2 √(n(1 - p)/p)    (14)

so that n_0 now grows exponentially with E.

Corollary: Under the conditions of Theorem 4, χ grows at least quadratically with E at R = C on all completely connected DMCs, and

    χ ≥ p^2 2^{2E} / [16(1 - p)]    (16)

on the BSC at R = C.

On the BSC, then, the lower bound to χ behaves as (p/Pe)^2 near capacity. This implies that only a small computational work is needed if p ≈ Pe, but that χ grows very quickly as the ratio p/Pe increases. This is consistent with the intuition that decoding that achieves low error rates near channel capacity is both time consuming and costly.

We now show that the dependence of the lower bound in (11) on the reliability E cannot be substantially improved at low rates. We do this by constructing a decoder for a code with some fixed number M of codewords and demonstrating that the computational work of this decoder grows with E as E log E. The class of binary maximal-length sequence codes [9], [20] is especially simple, since each of the n codewords of length n = 2^m - 1 in the code is a cyclic shift of a maximal-length sequence and can be generated by a linear shift register with m = log_2 (n + 1) stages. We pick any M ≤ n codewords from such a code and observe that the minimum distance of the code is (n + 1)/2. Consequently, with such a code any pattern of t = ⌊(n + 1)/4⌋ or fewer errors in transmission of a codeword over a BSC can be corrected.(2) Using standard Chernov bounding techniques, the probability of incomplete decoding or of a decoding error can be shown to decrease exponentially with n, for large n, if the BSC crossover probability p satisfies p < 1/4. Then, the block length and the reliability are related by n ≤ E/E(p), where E(p) = -(1/4) log_2 [p(1 - p)^3] - H(1/4).

(2) ⌊x⌋ is the largest integer contained in x.

Consider a decoder that generates each codeword and computes the Hamming distance between each and the received sequence. M shift registers, each with a number of logic elements proportional to log_2 n, and M counters counting to t, each with a number of logic elements proportional to log_2 n, will be used, as well as a counter to mark the end of a block. Let each counter produce b = ⌈log_2 (M + 1)⌉ zeros if its count exceeds t and a unique nonzero b-tuple identifying the counter otherwise. Let the decoder output be created by the logical OR, bit by bit, of the M b-tuples produced by the counters. The logic required here will grow as Mb. Therefore, a decoder can be constructed with X logic elements, X ≤ AM log_2 n, for large n, with A a constant. Since the number of decoding cycles is T = n, the decoder does computational work bounded by

    χ = XT ≤ AMn log_2 n ≤ AM(E/E(p)) log_2 (E/E(p))

for large E. Thus, we demonstrate that the rate of growth of the lower bound (11) with reliability cannot be substantially improved for decoders of codes with a fixed number of codewords (and rate approaching zero with increasing E).
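The low-rate decoder just described can be prototyped directly. The sketch below is illustrative only (a small hypothetical example with m = 4, so n = 15, and M = 4 codewords): it forms cyclic shifts of a maximal-length sequence and decodes by comparing the received word against each codeword with threshold t = ⌊(n + 1)/4⌋.

```python
# Illustrative sketch of the low-rate decoder of Section II-B (hypothetical parameters).

def lfsr_msequence(taps, m):
    """One period (length 2^m - 1) of a maximal-length sequence from a simple
    Fibonacci LFSR; the taps below realize the primitive recurrence x^4 + x^3 + 1."""
    state = [1] * m
    out = []
    for _ in range(2 ** m - 1):
        out.append(state[-1])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = [fb] + state[:-1]
    return out

m = 4
seq = lfsr_msequence(taps=(0, 3), m=m)     # n = 2^m - 1 = 15
n = len(seq)
t = (n + 1) // 4                           # guaranteed error-correcting radius
M = 4                                      # keep only M codewords (rate -> 0 with E)
codewords = [seq[i:] + seq[:i] for i in range(M)]   # M cyclic shifts of the m-sequence

def decode(received):
    """Index of the unique codeword within Hamming distance t, or None (incomplete)."""
    for i, c in enumerate(codewords):
        if sum(a != b for a, b in zip(c, received)) <= t:
            return i
    return None

corrupted = codewords[2][:]
corrupted[1] ^= 1
corrupted[7] ^= 1                          # two errors, well within t = 4
print(decode(corrupted))                   # -> 2
```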
At this point it is instructive to recall some of the major results of [1]. In Theorem 3 of [1], lower bounds are derived to the number of logic elements required in combinational decoders realizing almost all decoding rules in each of several classes. These bounds, Theorem 1, and the lower bound to probability of error of (6) establish the following theorem.

Theorem 5: Consider the class of block decoders that achieve reliability E at rate R, 0 < R < C, with equiprobable codewords on a DMC with alphabets of size B. The fraction of these decoders that require a computational work χ satisfying

    χ ≥ (1 - ε) R 2^{(E/E_L(R)) log_2 B} / log_2 B    (17)

approaches one with increasing E. Similarly, for large E, almost all bounded-distance decoders of systematic binary parity-check codes require a computational work χ which, on the BSC, satisfies

    χ ≥ (1 - ε) [(E/E_L(R))^2 R(1 - R)] / log_2 [(E/E_L(R))^2 R(1 - R)].    (18)

Equation (18) is interesting because it implies that random selection of a bounded-distance decoder for parity-check codes will yield, with probability close to 1, a decoder with computational work that grows at least as fast with E as E^2/log E^2. This is to be compared with the bounds of Theorems 4 and 4', which grow linearly with E. On the other hand, random selection of a block decoder is very likely to lead to a decoder that requires an enormous computational work!

C. Work Required by Several Decoding Procedures

In two recent papers, Ziv [2], [3] has introduced an iterative coding scheme and a modification of it that achieve a small probability of error with decoders of small complexity. The first Ziv scheme [2] has an inner coding step that converts a noisy DMC into a less noisy channel, followed by another stage of coding that performs error detection only. The inputs and outputs of this second stage are scrambled and descrambled to effectively create a channel that is a noisy binary erasure channel. The erasure-correcting method of Epstein is then used to correct erasures while neglecting errors. The second Ziv scheme [3] replaces the inner code and decoder by the complete first scheme to reduce the overall decoder complexity.

The computational work performed by the second Ziv scheme is calculated as the sum of the computational works of each of the various stages. In fact, this is done by conceptually creating combinational decoder equivalents for each stage and adding together the number of logic elements used. The reader can verify that the Ziv iterative method has a computational work χ_Z that grows as the fifth power of his parameter ν. Fig. 6, depicting the first scheme, may help in this verification. Notice that the innermost decoder has block length n_1, complexity proportional to 2^{n_1}, and is repeated ν^2/n_1 times. The second-stage decoder is repeated ν times, has block length ν, and performs a parity check that requires a number of logic elements proportional to ν^2. The third-stage decoder is repeated νR_1 times and uses approximately ν^4 logic elements to solve for the unknowns (erasures) in a block of ν digits.

Fig. 6. Combinational model of the Ziv decoder (first, second, and third stages with scrambling and descrambling).
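The stage-by-stage accounting just described can be tabulated mechanically. In the sketch below (illustrative only; the constants of proportionality are set to 1 and n_1, ν, and R_1 are hypothetical values), each stage contributes (repetitions) × (logic elements per use), and the third stage's ν·ν^4 term dominates, giving the ν^5 growth quoted above.

```python
# Illustrative sketch of the stage-work accounting for the Ziv scheme
# (proportionality constants set to 1; n1 and nu are hypothetical values).

def ziv_work(nu, n1, R1=0.5):
    stages = [
        # (repetitions per block,  logic elements per use)
        (nu * nu / n1, 2 ** n1),   # innermost decoder: block length n1
        (nu,           nu ** 2),   # second stage: parity check on a block of nu digits
        (nu * R1,      nu ** 4),   # third stage: solves for erasures in a block of nu digits
    ]
    return sum(reps * elems for reps, elems in stages)

for nu in (32, 64, 128):
    w = ziv_work(nu, n1=8)
    print(nu, w, w / nu ** 5)      # the ratio w / nu^5 settles near R1 = 0.5
```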
Thus, the computational work required by the second Ziv scheme is bounded by

    χ_Z ≤ Aν^5    (19)

for large ν, with A some positive constant. Ziv also demonstrates that this scheme achieves a block probability of error bounded by

    Pe ≤ β 2^{-ν[1 - (R/C)^{1/3}]}    (20)

for large ν and R < C, where C is the DMC channel capacity. Consequently, we have the following theorem.

Theorem 6: On a DMC with capacity C, a rate R < C and reliability E can be achieved with a computational work χ which, for large E, satisfies

    χ ≤ A(E/[1 - (R/C)^{1/3}])^5    (21)

where A is some constant of the Ziv decoding scheme.

Forney [4] has introduced a two-stage coding method called concatenated coding. He considers an inner code with q codewords, q a power of a prime, and then uses as an outer code a Reed-Solomon code [12] of block length n and minimum distance d over GF(q), n ≤ q - 1. The decoder for the inner code need not require more than approximately q logic elements for realization as a combinational decoder, but it must be used n times. Then the computational work of the inner decoder will be proportional, at worst, to nq. As Berlekamp has shown [13], a Reed-Solomon code can be decoded with a number of logic elements and a number of cycles each proportional to n log n. Thus, the Forney method requires a computational work χ_F bounded by

    χ_F ≤ B(n log n)^2    (22)

for large n, when the Reed-Solomon code is primitive (n = q - 1). Here B is a constant of the decoding procedure.

Forney has given a bound to the probability of error Pe for "errors-only" decoding of concatenated codes that has the correct form but whose proof is incomplete. A complete proof is given in [14] and the bound is

    Pe ≤ 2^{-[N_0 E_c(R)/2]}    (23)

where N_0 is the overall block length, given by

    N_0 = n log_2 q / R_0 = n log_2 (n + 1) / R_0,    (24)

and R_0 is the rate of the inner code. Also, r is the rate of the Reed-Solomon code, R = rR_0 is the overall rate, and E_c(R) is given by

    E_c(R) = max_{rR_0 = R} E(R_0)(1 - r)    (25)

where E(R) is the best upper-bound block exponent.

Theorem 7: Errors-only decoders of concatenated codes [4] of rate R do computational work χ satisfying

    χ ≤ B[2R_0 E/E_c(R)]^2    (26)

for large E on an arbitrary DMC, where R_0 is the optimizing value in (25) and B is a constant of the decoding procedure. Equation (26) follows from (22)-(24) after observing that

    N_0 R_0 ≥ 2R_0 E/E_c(R)    (27)

is the solution to (23).

The important point about this result is that it implies that the computational work need not grow faster than E^2. This is a considerably faster rate of growth than that of the lower bounds of Theorems 4 and 4', and it is greater than, but close to, the rate of growth of the almost-all bound for parity-check codes given by (18), which grows as E^2/log E^2.
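As a rough numerical sketch (illustrative only; the values of B, R_0, and E_c(R) below are assumed placeholders, not computed from any channel), the block-length requirement (27) makes the errors-only concatenated-decoding work of (26) grow as E^2.

```python
# Illustrative sketch of the concatenated-coding work estimate (assumed parameter values).

def concatenated_work(E, R0=0.5, Ec=0.2, B=1.0):
    """Work estimate for errors-only concatenated decoding, per (22), (26), (27):
    n log2 n must be about N0 * R0 = 2 R0 E / Ec(R), and chi_F <= B (n log2 n)^2."""
    n_log_n = 2.0 * R0 * E / Ec        # required value of n log2(n + 1), from (27)
    return B * n_log_n ** 2            # quadratic growth in E, as in (26)

for E in (20, 40, 80):
    print(E, concatenated_work(E))     # the work roughly quadruples when E doubles
```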
It is interesting to consider the computational work that is done by sequential decoders [15], [16]. Here the performance is measured primarily by the probability of buffer overflow, since the undetected error probability can be, and usually is, made much smaller. Sequential decoders decode convolutional codes that are usually truncated so that they form large block codes of block length n. If the buffer size is B and the speed factor is SF, various bounds [17], [18] give for the probability of an overflow during n transmissions, P_bf(n),

    P_bf(n) = 1 / (Dn[B·SF]^{α(R)})    (28)

where D is a small constant and α(R) is the Pareto exponent.

The sequential decoding machine consists of two principal components, a logic unit and a buffer. The logic unit executes the various steps required by the decoding algorithm and contains an encoder. The encoder has a number of logic elements that grows linearly with the constraint length ν, while the remaining logic is essentially independent of ν. The total number of cycles completed by the decoder will exceed n, and the number of logic elements will exceed B, since each of the B branches in the buffer must be available to the decoder logic unit. The computational work done by any sequential decoder, χ_SD, is therefore bounded by

    χ_SD ≥ nB.    (29)

Now, substituting for B through (28), we have

    χ_SD ≥ (n/SF)[2^{E_bf}/(Dn)]^{1/α(R)}    (30)

where the exponent E_bf is defined as the negative logarithm to the base 2 of P_bf(n). Here the speed factor is limited by the switching time of logic elements and can be considered a constant. Expression (30) is minimized under variation of n by making n as small as possible. Here n cannot be too small, for otherwise the undetected error rate will exceed the rate of buffer overflows. Hence, let n = n_1, its smallest value. Then χ_SD grows exponentially with E_bf for large values of E_bf. This exponential rate of growth, however, may not be visible for modest values of E_bf and, in fact, sequential decoders now find application to many important coding problems such as satellite communication. Nevertheless, for large E_bf, sequential decoding is nonoptimal when optimality is measured in terms of computational work.
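The exponential growth just asserted can be seen numerically. In the sketch below (illustrative only; D, SF, α(R), and n are assumed values), the buffer size forced by (28) for a target overflow exponent E_bf makes the work bound of (29)-(30) grow exponentially with E_bf.

```python
# Illustrative sketch of the sequential-decoding work bound (assumed parameters).

def buffer_size(E_bf, n=1000, D=1.0, SF=1.0, alpha=1.0):
    """Smallest buffer size B for which (28) gives P_bf(n) <= 2^(-E_bf)."""
    return (2.0 ** E_bf / (D * n)) ** (1.0 / alpha) / SF

def sequential_work(E_bf, n=1000, **kw):
    """Work bound (29)-(30): chi_SD >= n * B."""
    return n * buffer_size(E_bf, n=n, **kw)

for E_bf in (20, 30, 40):
    print(E_bf, f"{sequential_work(E_bf):.3e}")   # grows by a factor 2^10 per step of 10
```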
SECTION III

A. Decoding Time

A nonzero amount of time is required for signals to propagate through logic elements. In this section, we assume that each logic element introduces one unit of delay, and we derive bounds on the total delay τ (or the number of levels of logic) required to achieve a reliability E at code rate R. We shall derive a lower bound to the minimum delay, present an almost-all bound to delay for several decoder classes, and exhibit a decoder that has small total delay.

Again assume that decoders are modeled by combinational or sequential machines constructed with two-input binary logic elements and binary memory cells. Let l be the maximum number of logic levels between the inputs and outputs of the logic unit of a sequential machine. If the machine executes T cycles to compute a function f, the decoding function, say, then the time spent computing f, namely τ, is given by τ = Tl.

Theorem 8: The time in which f is computed by a sequential machine satisfies

    τ = Tl ≥ D_Ω(f)    (31)

where D_Ω(f) is the minimum time in which f can be computed by a combinational machine using elements from Ω.

B. Bounds on Decoding Time

Consider the combinational decoder of Fig. 3. Let z_i be an arbitrary nonconstant output and let z_i depend on K of the n letter inputs. (It is then independent of the remaining n - K letter inputs.) Assume that the channel is a completely connected DMC and that each of its J outputs is encoded into ⌈log_2 J⌉ binary digits.(3) Also, let P_min be the smallest transition probability of this channel. Then, the smallest attainable probability of error Pe must satisfy

    Pe ≥ P_min^K.    (32)

To see this, observe that a received sequence that is decoded correctly produces one value of z_i, but that there is some pattern of K digits in the received sequence which, if changed, results in a different value for z_i and hence in a decoding error. And every pattern of K received digits has a probability greater than or equal to P_min^K of being received.

(3) Thus, z_i depends on at least one binary digit in each of K encoded channel letters.

Theorem 9: Every combinational decoder achieving a reliability E on a completely connected DMC with smallest transition probability P_min must have minimum delay τ that satisfies (for 2-input logic elements)

    τ ≥ ⌈log_2 (E/(-log_2 P_min))⌉.    (33)

Proof: From any given output, at most two inputs can be reached through one level of logic and at most 2^τ inputs through τ levels of logic. We set 2^τ ≥ K and solve for K from (32) to establish the inequality of (33). Q.E.D.

This bounding argument is the starting point in the work of Winograd [5], [19] on the time required to multiply and add. In [19], Winograd presents an upper bound to the number of combinational machines with n inputs that have τ or fewer levels of logic. If z_i is an output of the machine, it could be calculated by one of at most 3^{2^τ - 1} n^{2^τ} trees of depth τ, because at most 2^τ - 1 positions in a tree are available to the three types of logic elements used here and the n inputs can be attached to at most 2^τ inputs of a tree. If the machine has k outputs, then at most (3^{2^τ - 1} n^{2^τ})^k different machines with delay τ exist.

In Theorem 3 of [1], the number of distinct minimum-distance combinational decoders of binary parity-check (BPC) codes was shown to be exp_2 n^2 R(1 - R). If we let k = nR and choose τ so that

    (3^{2^τ - 1} n^{2^τ})^k = (2^{n^2 R(1 - R)})^{1 - ε}    (34)

with 0 < ε < 1, then, for large n, almost all minimum-distance combinational decoders of BPC codes will require a τ that is larger than the solution of (34). Solving for τ, we have the following.

Theorem 10: Almost all minimum-distance decoders of BPC codes of block length n and rate R require a computation time τ that satisfies

    τ ≥ ⌈log_2 [(1 - ε)(1 - R)n / log_2 n]⌉    (35)

for large n when 0 < ε < 1.

Since BPC codes exist for the BSC for which n grows linearly with E, the BPC minimum-distance decoders are likely to be as good with respect to computation time as the best decoder. When this bound is applied to the class of all block decoders, however, the bound grows linearly with E and not logarithmically.
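The fan-in argument behind Theorem 9 is easy to evaluate numerically; the sketch below (the reliability and P_min are assumed, illustrative values) computes the least number of inputs K forced by (32) and the resulting delay bound (33).

```python
# Illustrative sketch of the delay lower bound of Theorem 9 (assumed parameters).

import math

def delay_lower_bound(E, P_min):
    """tau >= ceil(log2(E / (-log2 P_min))), from (32)-(33):
    2^(-E) >= P_min^K forces K >= E / (-log2 P_min), and 2^tau >= K."""
    K = E / (-math.log2(P_min))            # least number of inputs an output must see
    return math.ceil(math.log2(K))

print(delay_lower_bound(E=100.0, P_min=0.01))   # K is about 15.1, so tau >= 4
```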
Our next objective is to demonstrate the existence of a decoder for which the number of levels of logic does not grow too rapidly with reliability E. To do this, we reexamine the code and decoder introduced at the end of Section II-B. The code was a subcode of a binary maximal-length code that contains a fixed number M (M ≤ n) of codewords of length n and corrects t or fewer errors, where t = ⌊(n + 1)/4⌋. Assume that n parallel BSCs are used for transmission and that the decoder is a combinational machine that acts on all received digits simultaneously. Form the term-by-term sum modulo 2 of each of the M codewords with the received sequence and apply the resultants to n-input combinational counters that drive circuits producing b = ⌈log_2 (M + 1)⌉ zeros if a count exceeds t and a unique identifying b-tuple otherwise. Feed the b-tuples to b OR gates with M inputs each; the output of these b gates will be a b-tuple identifying the single codeword at Hamming distance t or less from the received sequence, or the zero b-tuple signaling that no codeword satisfies this condition. The b OR gates can each be realized with ⌈log_2 M⌉ levels of 2-input logic elements, or a number of levels that grows at worst as log n.

The circuits that produce the b-tuples can be realized with b 2-input AND gates, that is, with one level of logic, as long as a binary signal is available that reflects whether or not a count exceeds t. In the next paragraph we argue that the counters require a number of levels of logic that grows as (log_2 n)^2. Finally, the modulo-2 sums of codewords and the received sequence can be realized with a single level of logic consisting of 2-input modulo-2 adders. Consequently, the total number of levels of logic grows as (log_2 n)^2.

Let the binary counter be designed so that it has m inputs, for m a power of 2, say m = 2^k, and assume that its output is a k-tuple representing in binary radix notation the number of 1's among the m binary inputs. This circuit can be realized with two counters, each with m/2 inputs, and a number of binary full adders (two binary inputs and a "sum" and a "carry" output). The full adders are used to add the numbers of 1's, 2's, 4's, etc., with carry, at the outputs of the two m/2-input counters. For example, the two 4's must be added together with the 4's carry, and this can be done with 3 full adders. Thus, the sum of the outputs of the m/2-input counters can be formed with a total of 3k - 2 full adders. (The 1's can be added with one full adder.) When the 4's carry is added to the 4's at the counter output, 2 additional levels of full adders are introduced. Hence, the total number of levels of full adders required to sum the counter outputs is 2k - 1 (the 1's can be added with one level). Let τ(m) be the number of levels of full adders required to count the number of 1's among m = 2^k binary digits. Then, we have shown that

    τ(m) = 2(log_2 m) - 1 + τ(m/2).    (36)

Since τ(2) = 1, the solution to this recursion relation is

    τ(m) = Σ_{j=2}^{k} (2j - 1) + 1 = (k + 2)(k - 1) - (k - 1) + 1 = k^2 = (log_2 m)^2.    (37)

Each full adder can be realized with one level of logic consisting of a 2-input EXCLUSIVE-OR gate to form the "sum" and an AND gate to form the "carry." Therefore, an n-input binary counter can be realized with exactly (⌈log_2 n⌉)^2 levels of logic, since zeros can be introduced to increase n to the next power of 2. The counter outputs can be combined to produce a binary signal that has value 1 when t or fewer errors occur and 0 otherwise. Using 2-input AND and OR gates, the outputs can be combined with ⌈log_2 (log_2 [(n + 1)/8])⌉ + 2 levels of logic when n + 1 is a power of 2, as is the case for our code.

It is clear that the decoder described above uses a number of levels of logic bounded by A(log_2 n)^2 for a constant A. Since the probability of error that can be obtained on the BSC decreases exponentially with n, we can relate n and the reliability E by n ≤ E/E(p), where E(p) has been given before and p is the BSC crossover probability, p < 1/4.

Theorem 11: There exists a code of rate R ~ 1/E, and a decoder for it on the BSC with crossover probability p, that achieves reliability E with a computation time τ bounded by

    τ ≤ A[log_2 (E/E(p))]^2    (38)

for A a constant, where E(p) = -(1/4) log_2 [p(1 - p)^3] - H(1/4).
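The recursion (36) is easy to check numerically; the sketch below (illustrative only) iterates it and confirms the closed form τ(m) = (log_2 m)^2 of (37).

```python
# Illustrative check of the counter-depth recursion (36) and its solution (37).

import math

def counter_levels(m):
    """Levels of full adders needed to count the 1's among m = 2^k inputs,
    per the recursion tau(m) = 2 log2(m) - 1 + tau(m/2), with tau(2) = 1."""
    if m == 2:
        return 1
    return 2 * int(math.log2(m)) - 1 + counter_levels(m // 2)

for k in range(1, 8):
    m = 2 ** k
    assert counter_levels(m) == k ** 2     # matches (log2 m)^2, as in (37)
print("recursion (36) matches the closed form (37)")
```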
Any Boolean function of n variables can be realized (in sum-of-products form) with a computation time that grows almost linearly with n. Therefore, any block code of length n can be decoded in a time proportional to E. This is important because all attempts to find a decoder with a τ that is not linear in E and that also decodes codes of fixed rate have been unsuccessful. The question of how closely the bound of Theorem 9 can be approached for codes of fixed rate remains open.

IV. CONCLUSIONS

We have introduced two new measures of decoding complexity, namely, computational work χ and decoding time τ. Lower bounds to χ have been developed that grow linearly with the reliability E and become large as the code rate approaches channel capacity. The existence of a decoder for very low rate codes has been demonstrated that does a work that grows as E log E. At all rates less than channel capacity, concatenated coding and decoding achieves reliability E with a computational work that grows as E^2. Thus, we know that the bounds on χ cannot be substantially improved at rates near zero, and there is reason to believe that the bounds can be improved at larger rates. It is now possible to rank decoding procedures on the basis of the computational work they require to reach reliability E, for large E. We have shown, for example, that sequential decoding is far inferior to concatenated decoding and the Ziv iterative decoding procedures at large E.

The time τ required to decode has also been examined, and we have shown that τ must grow at least logarithmically with E. Also, a low-rate decoding procedure has been given that decodes in a time that grows as the square of the lower bound. Thus, the dependence of τ on E is not clear even for low-rate coding.

Computational work and decoding time are both measures of the complexity of the decoding process and are not measures of the complexity of a decoding machine. The bounds on computational work and decoding time need to be tightened. Additionally, it would be instructive to relate computational work and decoding time more closely to parameters of well-known codes and standard decoding rules for them. Such relations might be of more direct value in decoder design.

REFERENCES

[1] J. E. Savage, "The complexity of decoders - Pt. I: Decoder classes," IEEE Trans. Inform. Theory, vol. IT-15, pp. 689-695, November 1969.
[2] J. Ziv, "Asymptotic performance and complexity of a coding scheme for memoryless channels," IEEE Trans. Inform. Theory, vol. IT-13, pp. 356-359, July 1967.
[3] J. Ziv, "Further results on the asymptotic complexity of an iterative coding scheme," IEEE Trans. Inform. Theory, vol. IT-12, pp. 168-171, April 1966.
[4] G. D. Forney, Concatenated Codes. Cambridge, Mass.: M.I.T. Press, 1966, ch. 4.
[5] S. Winograd, "On the time required to perform addition," J. ACM, vol. 12, no. 2, pp. 277-285, 1965.
[6] J. E. Savage, "Some comments on the computation time and complexity of algorithms," Proc. Princeton Conf. Inform. Sciences and Systems, 1969.
[7] D. E. Muller, "Complexity in electronic switching circuits," IRE Trans. Electron. Comput., vol. EC-5, pp. 15-19, March 1956.
[8] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, "Lower bounds to error probability for coding on discrete memoryless channels," Inform. and Control, vol. 10, pp. 65-103 and 522-552, 1967.
[9] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968, ch. 5.
[10] F. Jelinek, Probabilistic Information Theory. New York: McGraw-Hill, 1968.
[11] R. G. Gallager, op. cit., p. 164.
[12] I. S. Reed and G. Solomon, "Polynomial codes over certain finite fields," J. SIAM, vol. 8, pp. 300-304, 1960.
[13] E. R. Berlekamp, Algebraic Coding Theory. New York: McGraw-Hill, 1968, ch. 7.
[14] J. E. Savage, "A note on the performance of concatenated coding," IEEE Trans. Inform. Theory (Correspondence), vol. IT-16, pp. 512-513, July 1970.
[15] J. E. Savage, "Progress in sequential decoding," in Advances in Communication Systems, vol. 3, A. V. Balakrishnan, Ed. New York: Academic Press, 1968.
[16] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965, ch. 6.
[17] J. E. Savage, "Sequential decoding - the computation problem," Bell Syst. Tech. J., vol. 45, no. 1, pp. 149-175, 1966.
[18] I. M. Jacobs and E. R. Berlekamp, "A lower bound to the distribution of computation for sequential decoding," IEEE Trans. Inform. Theory, vol. IT-13, pp. 167-174, April 1967.
[19] S. Winograd, "On the time required to perform multiplication," J. ACM, vol. 14, no. 4, pp. 793-802, 1967.
[20] W. W. Peterson, Error-Correcting Codes. Cambridge, Mass.: M.I.T. Press, and New York: Wiley, 1961.