IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-17, NO. 1, JANUARY 1971
The Complexity of Decoders - Part II: Computational Work and Decoding Time

JOHN E. SAVAGE, MEMBER, IEEE
Abstract--The computational work and the time required to decode with reliability E at code rate R on noisy channels are defined, and bounds on the size of these measures are developed. A number of ad hoc decoding procedures are ranked on the basis of the computational work they require.
I. INTRODUCTION
THE CENTRAL result of information theory is the
demonstration that reliable communication of any
degree is possible on noisy channels at all rates less than
channel capacity. In this paper we show that reliable
communication can be costly and that the cost increases
with increasing reliability and increasing code rate and is
largest at channel capacity.
In an earlier paper [1] we modeled decoders by combinational machines and sequential machines constructed
with two-input binary logic elements and individually
accessed binary memory cells (the input to each cell is the
output of another cell or of a logic element). Classes of
decoding rules, including block and tree decoding rules,
were then examined using as a measure of decoder complexity the sum (called logic complexity) of the number of
logic elements and memory cells in the decoder. We demonstrated that each class of rules has a logic complexity
associated with it, which is such that for large block length
almost all decoders in a class have a logic complexity that
exceeds this quantity. Thus, random selection of a decoder
will require a complexity of this size with a probability
near 1. It is interesting to note that the logic complexity
of the class of block decoders grows exponentially with
block length. While these results can serve to direct a search
for a good class of decoders, they are inadequate since they
do not provide bounds on complexity that hold for all
decoders. This is not true of the two measures of complexity
studied in this paper.
The two measures of the complexity of decoding examined here are called computational work and decoding time. The machine models discussed above are assumed here also. Suppose that a decoder can be described by L interacting sequential machines {S_i, 1 ≤ i ≤ L}, containing {X_i, 1 ≤ i ≤ L} logic elements and executing {T_i, 1 ≤ i ≤ L} clock cycles, respectively, to decode one received word from a block code. Then, the computational work performed by that decoder, χ, is defined by
Manuscript received August 1, 1969; revised February 2, 1970. This
work was supported by the National Science Foundation under Grant
NSF GK-3302 and by the National Aeronautics and Space Administration under Grant NGR 40-002-082.
The author is with the Division of Engineering and Center for Computer
and Information Sciences, Brown University, Providence, R.I.
χ = Σ_{i=1}^{L} X_i T_i.    (1)
This quantity can be interpreted as the number of logical operations performed by the decoder to process a received word, since S_i performs X_i logical operations in one cycle and X_i T_i operations in T_i cycles. We show that to obtain reliability E = −log₂ Pe, where Pe is the average probability of error, at rate R on a completely connected discrete memoryless channel (DMC) requires a computational work χ satisfying

χ ≳ E/E_L(R)    (2)

for large E, where E_L(R) is a lower-bound exponent such as the sphere-packing exponent. The bound holds with or without feedback as long as the appropriate E_L(R) is used, and it applies only for strictly nonzero rates since, when R = 0, we can do without a decoder and set χ = 0.
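As a concrete reading of definition (1), the sketch below computes χ for a decoder described by three interacting machines; the machine sizes and cycle counts are hypothetical, invented for illustration only.

```python
# Computational work chi = sum of X_i * T_i over the L interacting
# sequential machines that make up the decoder (equation (1)).
machines = [
    (120, 7),   # hypothetical machine S_1: X_1 = 120 logic elements, T_1 = 7 cycles
    (45, 63),   # hypothetical machine S_2
    (300, 1),   # hypothetical machine S_3: a purely combinational stage
]

chi = sum(x * t for x, t in machines)
print(chi)  # 120*7 + 45*63 + 300*1 = 3975
```

A machine with few logic elements but many cycles can thus contribute as much work as a large machine used once, which is the interdependence the measure is meant to expose.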
At code rates approaching zero with increasing E, we show the existence of a decoder whose χ grows as E log E. At code rates bounded away from zero, the Ziv iterative coding methods [2], [3], Forney's concatenated coding [4], and sequential decoding are examined; concatenated coding is found to require the smallest work to achieve large E, whereas sequential decoding is shown to require the most. With concatenated coding χ grows as E², whereas χ grows exponentially with E for sequential decoding.
Computational work measures the complexity of a decoding rule or procedure and not the complexity of a decoder. It is important to note that a machine with few logic elements can do a lot of work if used many times. Consequently, computational work allows one to see the interdependence of machine complexity and running time, and as such it provides useful information about the decoding process.
The second measure of decoding complexity studied
here is decoding time z. Decoding time is measured as the
number of levels of logic through which decoder inputs
must percolate to reach the decoder outputs. This measure
is important in high-speed data applications where the
duration of decoder inputs is a small multiple of the switching time of logic elements. In such applications, a large
decoding time will require the parallel operation of several
decoders and can be an important determinant of decoder
cost.
We show that z must grow at least logarithmically with
E and exhibit a low-rate decoding procedure for which z
grows as the square of the logarithm of E. Most decoders
require a decoding time that is linear in E and, in fact, no decoding procedure for decoding codes of fixed rate and increasing block length has been found that has a z growing less than linearly in E. Decoding time is studied using the important methods of Winograd [5].

SECTION II

A. Computational Work

Canonical forms for decoders must be assumed if reliable comparisons of the complexities of decoding procedures are to be made. Consequently, we assume that a decoder is modeled either by a combinational machine or a sequential machine constructed of logic elements from a fixed set Ω and individually accessed memory cells of fixed storage capacity.

Let the input and output alphabets of the logic elements in Ω and the memory cells be Σ_d = {0, 1, ..., d − 1}. Then, a combinational machine with m inputs and p outputs realizes some function f_c: (Σ_d)^m → (Σ_d)^p. Also, a sequential machine S with input alphabet I = (Σ_d)^r and output alphabet J = (Σ_d)^s is defined by the 4-tuple S = (S, δ, γ, s₀), where S is the state set (which is a collection of t-tuples over Σ_d), s₀ ∈ S is the initial state, δ is the next-state function δ: S × I → S, and γ is the output function γ: S × I → J. There is a natural function f_T associated with a sequential machine that maps the set of initial states and the set of T inputs into the set of T outputs, f_T: {s₀} × (I)^T → (J)^T.

A decoding function characterizes the action of a block decoder on received words. Given a function f: (Σ_d)^n → (Σ_d)^l, it can be used as a decoding function for block codes since it partitions the input space into disjoint sets on the basis of outputs, and these sets can be used as decoding regions. A decoding function that has M points in its range makes M decisions. We shall assume that a decoder makes M decisions, where M is the number of codewords in the corresponding code.

Two additional remarks are in order here. First, we allow any representation for points in the range of the decoding function. Second, any type of storage that does not consist of an array of individually accessed memory can be replaced by equivalent circuits realized with the logic elements and memory cells assumed here. This point will be discussed again later.

Consider a sequential machine S composed of the interconnection of the sequential machines S₁, S₂, ..., S_L, or S = S₁ × S₂ × ... × S_L. Let each machine be modeled as above and let S_i have X_i logic elements and execute T_i cycles to complete its portion of the computation done by S. Then, S executes T = max_i T_i cycles and receives n inputs from Σ_d, say, some of which may appear in the initial states of one or more machines. If S produces l outputs from Σ_d, then it is said to compute f_s: (Σ_d)^n → (Σ_d)^l.

Definition: The computational work χ performed by S = S₁ × S₂ × ... × S_L to compute f_s is defined by

χ = Σ_{i=1}^{L} X_i T_i.    (3)

This leads to the following.

Theorem 1: The computational work χ given by (3) must satisfy

χ = Σ_{i=1}^{L} X_i T_i ≥ C_Ω(f_s)    (4)

where C_Ω(f_s) is the minimum number of logic elements required to realize f_s with a combinational machine constructed with logic elements from Ω.
Comment: Theorem 1 implies that a minimum number of logical operations, or an amount of computational work, is required to compute a given function. It is interesting to consider C_Ω(f_s) for linear functions f_s: (Σ₂)^n → (Σ₂)^n. Using a variant of Theorem 2 of [1], it can be shown that almost all such functions for large n have C_Ω(f_s) proportional to n²/log n².
Proof: Fig. 1 shows a model for a sequential machine S in which L holds the logic and the unit to the right holds the state of the machine. Let y₁, y₂, ..., y_T be the sequence of inputs to this machine and let z₁, ..., z_T be the sequence of outputs. Since the memory cells are accessed individually, the combinational machine shown in Fig. 2 can be constructed, which computes the same function f_T computed by S. This combinational machine has XT logic elements if X is the number of such elements contained in L.

If machines S₁, S₂, ..., S_L are interconnected to create S, each such machine can be "stretched" into a combinational machine with X_i T_i logic elements. Thus, the stretched version of S contains χ elements, and this cannot be smaller than the minimum number of logic elements required to realize the function associated with S by a combinational machine. Q.E.D.
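The "stretching" step can be illustrated on a toy machine, here a one-gate parity accumulator that is a hypothetical stand-in far simpler than any decoder: unrolling it over T cycles yields a combinational computation with X·T gate uses and identical outputs.

```python
# Sketch of the proof of Theorem 1: a sequential machine run for T cycles can
# be "stretched" into a combinational machine by replicating its logic once
# per cycle. The machine below has X = 1 logic element (one XOR gate).
X = 1

def sequential(inputs, s0=0):
    # the sequential machine: one XOR gate reused every cycle, state s in a cell
    s, outputs = s0, []
    for y in inputs:
        s = s ^ y
        outputs.append(s)
    return outputs

def stretched(inputs, s0=0):
    # the unrolled combinational version: T copies of the XOR gate, no memory cells
    s, outputs, gates_used = s0, [], 0
    for y in inputs:
        s = s ^ y              # copy i of the single gate
        gates_used += X
        outputs.append(s)
    return outputs, gates_used

ins = [1, 0, 1, 1]
out_seq = sequential(ins)
out_comb, gates = stretched(ins)
print(out_seq == out_comb, gates)   # True, gates = X*T = 4
```

The gate count of the unrolled machine, X·T, therefore upper-bounds nothing smaller than the combinational complexity C_Ω(f_T), which is the inequality (4).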
Tape and core memories are not of the type assumed for
our machine models. However, models for these storage
types can be created with the standard machines assumed
here and the result is that we have to add to Xi a quantity
Pi that is the number of logic elements in the model for the
storage associated with Si. For core and tape memories,
Pi will grow at least linearly with the number of cores or
tape squares in the memory unit. Thus, Pi will be significant when Xi is small and we can refer to Pi as the computing
power of storage. (Note that Pi is zero for an array of individually accessed cells since no computing is done by such
an array.)
Hereafter we shall let Ω be the set of two-input binary logic elements. As Muller [7] has observed, if one wishes to deal with a different set Ω′, each element in Ω can be replaced by a small number of logic elements from Ω′, with a consequent multiplicative effect on C_Ω(f) and χ. We also assume that the memory cells are binary.
B. Lower Bounds on Computational Work
In this section we derive bounds on the minimum computational work needed to support a reliability E at rate R on a DMC. To begin, let Pe^s(χ, R) be the minimum attainable average probability of error with equiprobable codewords from a block code of rate R when the decoder is a
Fig. 1. Model for sequential machine.

Fig. 2. Equivalent combinational machine.
sequential machine that performs computational work χ. Let Pe^c(χ, R) be similarly defined for combinational machines.

Theorem 2:

Pe^s(χ, R) ≥ Pe^c(χ, R).    (5)

Proof: Let W be the block code and f the decoding function that achieves Pe^s(χ, R) with computational work χ. Since χ ≥ C_Ω(f) from (4), we can realize the same probability of error with a combinational machine with no more work. Thus, (5) follows. Q.E.D.
Consider now the combinational decoder of Fig. 3 for a binary block code of length n and rate R used on the binary symmetric channel (BSC) with crossover probability p. This machine has n binary inputs and k binary outputs with k ≥ nR, if the decoder must be equipped to make exactly 2^{nR} decisions.
Assume that some binary output function, say z_r, is identically equal to some binary input, say y_j. Then, if no errors occur in transmission of a codeword w, y_j is the jth digit of w, as is z_r. Consequently, if a single error occurs in the transmission of w (or any codeword) at the jth position, the decoder output will change and a decoding error will occur. The probability of a transmission error on the BSC in a given digit of a codeword is p.
Lemma 1: On the BSC, Pe^s(χ, R) ≥ p if any decoder output is equal to an input.
On a DMC other than the BSC, we assume that each channel output letter is encoded into a string of binary digits. If the channel has J outputs, encode each letter into ⌈log₂ J⌉ binary digits¹ so that no position in the binary representation of letters contains the same digit for all letters. Again, let z_r equal y_j, the jth binary input of the combinational decoder. Let y_j be in the representation of the jth channel output letter and assume that the DMC is completely connected, that is, that all transitions between inputs and outputs have nonzero probability of occurring, and let P_min be the smallest transition probability. Let codeword w be transmitted and y_j the jth decoder input where a transition occurs that corresponds to no decoding error. Some other transition, however, will result in a decoding error. This argument applies whether or not feedback from receiver to transmitter exists.

¹ This is the smallest integer greater than or equal to log₂ J.

Fig. 3. Combinational decoder.
Lemma 1′: On a completely connected DMC with smallest transition probability P_min, Pe^c(χ, R) ≥ P_min (with or without feedback) if any decoder output is equal to an input.
The implication of these two lemmas is that at least one logic element must be interposed between input and output terminals if Pe^c(χ, R) < P_min on a completely connected DMC. As stated earlier, at least nR distinct, nonconstant, nonequal binary outputs are needed to specify each of the 2^{nR} decisions the decoder makes in decoding a code of block length n and rate R. Thus, for such a decoder at least nR logic elements are needed if Pe^s(χ, R) < P_min. We need only lower bound n to have a bound on computational work.
Lower bounds to probability of error on the DMC for all codes of block length n, rate R, and equiprobable codewords have the form [8]-[10]

Pe(n, R) ≥ exp₂ {−n[E_L(R − O₁(n)) + O₂(n)]}    (6)

where O₁(n) and O₂(n) approach zero with increasing n. For all rates less than channel capacity C, we can take E_L(R) to be the sphere-packing exponent or at low rates can improve on this by using the straight-line bound, both of which are found in the paper by Shannon et al. [8]. The two functions O₁(n) and O₂(n) have similar dependence on n for both exponents, and their relative sizes are such that [O₁(n)]²/O₂(n) approaches zero with increasing n. An additional fact that will be important is that the sphere-packing exponent E_sp(R) behaves as a(C − R)² for R near C. Also, both E_sp(R) and the straight-line exponent E_sl(R) decrease with increasing R, as indicated in Fig. 4.

Now let Pe^s(χ, R) < P_min on a completely connected DMC. Then, Pe^c(χ, R) < P_min and χ ≥ nR for the smallest integer n that ensures that Pe(n, R) ≤ Pe^s(χ, R). Thus, χ ≥ n₀R where n₀ is the smallest integer satisfying

n₀[E_L(R − O₁(n₀)) + O₂(n₀)] ≥ E^s(χ, R) = −log₂ Pe^s(χ, R).    (7)

Here E^s(χ, R) is called the coding reliability.
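The definition of n₀ in (7) can be sketched numerically. The exponent E_L and the correction terms O₁, O₂ below are illustrative stand-ins chosen only for their qualitative shape (the a(C − R)² behavior near capacity is the one cited in the text), not the true sphere-packing forms.

```python
import math

# Find the smallest block length n with n * [E_L(R - O1(n)) + O2(n)] >= E,
# as in (7). All three functions below are toy stand-ins for illustration.
def E_L(R, C=1.0, a=2.0):
    # toy lower-bound exponent with the a*(C - R)^2 behavior near capacity
    return a * max(C - R, 0.0) ** 2

def O1(n):
    return math.log2(n) / n        # illustrative o(1) correction

def O2(n):
    return 1.0 / math.sqrt(n)      # illustrative o(1) correction

def n0(E, R):
    n = 1
    while n * (E_L(R - O1(n)) + O2(n)) < E:
        n += 1
    return n

E, R = 20.0, 0.5
n = n0(E, R)
print(n, n * R)    # n0, and the work bound chi >= n0 * R from the text
```

With these stand-ins, χ ≥ n₀R grows roughly linearly in E at fixed R, which is the behavior of the bounds (8) and (11) away from capacity.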
Fig. 4. Probability of error exponents.
Theorem 4: For all block decoders operating on completely connected DMCs, a coding reliability E can be obtained with equiprobable codewords at code rate R, 0 < R < C, only when the decoder computational work χ satisfies

χ ≥ ER/E_L(R)    (8)

for large E. With feedback, E_L(R) is the exponent on any lower bound to probability of error.
Proof: For R < C and E large, we can see from (7), and from the knowledge that O₁(n) and O₂(n) have the following forms for the sphere-packing exponent [9],

O₁(n) = 2/n + (log₂ n)/n,    O₂(n) = 1/n + √(8/n) log₂ (e²/P_min),    (9)

that n₀ must be very large. Thus, R − O₁(n₀) will be less than C, and we can neglect O₁(n) and O₂(n) to give (8) from (6) and (7). Q.E.D.
This bound approaches zero with decreasing R. We can improve on this behavior with the following bounding argument. Assume that a combinational decoder with χ logic elements achieves the minimum probability of error Pe^c(χ, R) at rate R with a code of length n. Assume, however, that the decoder outputs depend on only r of the n input letters, i.e., on r of the sequences of ⌈log₂ J⌉ binary digits. Then, the code can be shortened to length r and the rate increased to R′ = (n/r)R without affecting the probability of error. If a reliability E^s(χ, R) is to be maintained, we must have from (6)

r[E_L(R′ − O₁(r)) + O₂(r)] ≥ E^s(χ, R).    (10)

Since R′ ≥ R and E_L(R) is monotone decreasing in increasing R, we can replace R′ by R in (10) to give (7). Thus, r ≥ n₀. Note that these arguments apply when feedback is allowed as long as a bound of the form (6) exists.

Now if the outputs of the decoder depend on at least n₀ input letters, and if each must be connected to an output through at least one logic element, then at least n₀/2 two-input logic elements are required, assuming that Pe^s(χ, R) < P_min. The next theorem follows.

Theorem 4′: A coding reliability E can be obtained with equiprobable codewords at rate R, 0 < R < C, on a completely connected DMC only with a computational work χ that satisfies

χ ≥ E/[2E_L(R)]    (11)

for large E. This result also applies with feedback. Note that we replace 2 by s for a logic set Ω containing s-input logic elements.

Fig. 5. Lower bounds to computational complexity.

The bounds of (8) and (11) are sketched in Fig. 5, and it should be clear that (11) is the superior bound at small rates and may be replaced by (8) at large rates. The behavior of both bounds near channel capacity is important since both become very large at capacity. From (6) and (7) and the related discussion, we have for n₀ at R = C

n₀O₂(n₀) ≥ E^s(χ, R).    (12)

Also, from (9) we have

√(8n₀) log₂ (e²/P_min) ≥ E^s(χ, R) − 3    (13)

so that n₀ and the bound on computational work χ change from linear to quadratic dependence on reliability E as the rate approaches capacity. Quite a different kind of growth applies to the binary symmetric channel and perhaps other channels as well. On the BSC [11],

n O₂(n) = log₂ [√n (1 − p)/p]

so that n₀ now grows exponentially with E.

Corollary: Under the conditions of Theorem 4, χ grows at least quadratically with E at R = C on all completely connected DMCs, and

χ ≥ p² 2^{2E} / [16(1 − p)]    (16)

on the BSC at R = C.

On the BSC, then, the lower bound to χ behaves as
(p/Pe)² near capacity. This implies that only a small computational work is needed if p ≈ Pe, but that χ grows very quickly as the ratio p/Pe increases. This is consistent with the intuition that decoding that achieves low error rates near channel capacity is both time consuming and costly.
We now show that the dependence of the lower bound in (11) on the reliability E cannot be substantially improved at low rates. We do this by constructing a decoder for a code with some fixed number M of codewords and demonstrate that the computational complexity of this decoder grows with E as E log E. The class of binary maximal-length sequence codes [9], [19] is especially simple since each of the n codewords of length n = 2^m − 1 in the code is a cyclic shift of a maximal-length sequence and can be generated by a linear shift register with m = log₂ (n + 1) stages. We pick any M ≤ n codewords from such a code and observe that the minimum distance of the code is (n + 1)/2. Consequently, with such a code any pattern of t = ⌊(n + 1)/4⌋ or fewer errors in transmission of a codeword over a BSC can be corrected.² Using standard Chernoff bounding techniques, the probability of incomplete decoding or a decoding error can be shown to decrease exponentially with n, for large n, if the BSC crossover probability p satisfies p < 1/4. Then, the block length and reliability are related by n ≤ E/E(p) where E(p) = −(1/4) log₂ p(1 − p)³ − H(1/4).
Consider a decoder that generates each codeword and computes the Hamming distance between each and the received sequence. M shift registers with a number of logic elements proportional to log₂ n, and M counters counting to t, each with a number of logic elements proportional to log₂ n, will be used, as well as a counter to mark the end of a block. Let each counter produce b = ⌈log₂ (M + 1)⌉ zeros if the count exceeds t, and a unique nonzero b-tuple identifying the counter otherwise. Let the decoder output be created by the logical OR, bit by bit, of the M b-tuples produced by the counters. The logic required here will grow as Mb. Therefore, a decoder can be constructed with X logic elements, X ≤ AM log₂ n, for large n, with A a constant. Since the number of decoding cycles is T = n, we have that the decoder does computational work bounded by

χ = XT ≤ AMn log₂ n ≤ A′ME log₂ E

for large E, with A′ a constant depending on p. Thus, we demonstrate that the rate of growth of the lower bound (11) with reliability cannot be substantially improved for decoders of codes with a fixed number of codewords (and rate approaching zero with increasing E).
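The counter-based decoder just described is, functionally, a minimum-distance decoder with an incomplete-decoding output. The sketch below mirrors its behavior with a hypothetical 7-bit code standing in for a maximal-length code: one "counter" per codeword, an identifying index if some distance stays within t, and None (an erasure) otherwise.

```python
# Sketch of the low-rate decoder described above: compare the received word
# with each of the M codewords and report the index of a codeword within
# Hamming distance t, or None if every "counter" exceeds t.
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def decode(received, codewords, t):
    for i, c in enumerate(codewords):
        if hamming(received, c) <= t:   # the role of one counter counting to t
            return i                    # the unique nonzero b-tuple identifying counter i
    return None                         # all counters exceeded t: incomplete decoding

codewords = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 0, 0],   # hypothetical codewords, pairwise distance >= 4
    [0, 1, 1, 0, 1, 1, 0],
]
received = [1, 0, 1, 0, 1, 0, 0]         # codeword 1 with one transmission error
print(decode(received, codewords, t=1))  # -> 1
```

With M fixed and block length n growing, the M distance comparisons are exactly the X·T ≈ Mn log₂ n work counted in the text.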
At this point it is instructive to recall some of the major results of [1]. In Theorem 3 of [1], lower bounds are derived to the number of logic elements required in combinational decoders realizing almost all decoding rules in each of several classes. These bounds, Theorem 1, and the lower bound to probability of error of (6) establish the following theorem.

² ⌊x⌋ is the largest integer contained in x.
Theorem 5: Consider the class of block decoders that achieve reliability E at rate R, 0 < R < C, with equiprobable codewords on a DMC with alphabets of size B. The fraction of these decoders that require a computational work χ

χ ≥ (1 − ε)(R/log₂ B) 2^{(E log₂ B)/E_L(R)}    (17)

approaches one with increasing E. Similarly, for large E, almost all bounded-distance decoders of systematic binary parity-check codes require a computational work χ, which on the BSC satisfies

χ ≥ (1 − ε)(E/E_L(R))² R(1 − R) / log₂ [(E/E_L(R))² R(1 − R)].    (18)
Equation (18) is interesting because it implies that random selection of a bounded-distance decoder for parity-check codes will yield, with probability close to 1, a decoder with computational complexity that grows at least as fast with E as E²/log E². This is to be compared with the bounds of Theorems 4 and 4′, which grow linearly with E. On the other hand, random selection of a block decoder is very likely to lead to a decoder that requires an enormous computational work!
C. Work Required by Several Decoding Procedures
In two recent papers, Ziv [2], [3] has introduced an iterative coding scheme and a modification of it that achieve a small probability of error with decoders of small complexity. The first Ziv scheme [2] has an inner coding step that
converts a noisy DMC into a less noisy channel followed
by another stage of coding that performs error detection
only. The inputs and outputs of this second stage are
scrambled and descrambled to effectively create a channel
that is a noisy binary erasure channel. The erasure-correcting
method of Epstein is then used to correct erasures while
neglecting errors. The second Ziv scheme [3] replaces the
inner code and decoder by the complete first scheme to
reduce the overall decoder complexity.
The computational work performed by the second Ziv scheme is calculated as the sum of the computational works of each of the various stages. In fact, this is done by conceptually creating combinational decoder equivalents for each stage and adding together the number of logic elements used. The reader can verify that the Ziv iterative method has a computational complexity χ_Z that grows as the fifth power of his parameter v. Fig. 6, depicting the first scheme, may help in this verification. Notice that the innermost decoder has block length n₁, complexity proportional to 2^{n₁}, and is repeated v²/n₁ times. The second-stage decoder is repeated v times, has block length v, and performs a parity check that requires a number of logic elements proportional to v². The third-stage decoder is repeated vR₁ times and uses approximately v⁴ logic elements to solve for the unknowns (erasures) in a block of v digits.
Thus, the computational work required by the second Ziv scheme is bounded by

χ_Z ≤ Av⁵    (19)

for large v, with A some positive constant. Ziv also demonstrates that this scheme achieves a block probability of error bounded by

Pe ≤ B 2^{−[1 − (R/C)^{1/3}]v}    (20)

for large v and R < C, the DMC channel capacity. Consequently, we have the following theorem.

Theorem 6: On a DMC with capacity C, a rate R < C and reliability E can be achieved with a computational work χ, which for large E satisfies

χ ≤ A(E/[1 − (R/C)^{1/3}])⁵    (21)

where A is some constant of the Ziv decoding scheme.

Fig. 6. Combinational model of Ziv decoder.

Forney [4] has introduced a two-stage coding method called concatenated coding. He considers an inner code with q codewords, q a power of a prime, and then uses as an outer code a Reed-Solomon code [12] of block length n, minimum distance d over GF(q), n ≤ q − 1. The decoder for the inner code need not require more than approximately q logic elements for realization as a combinational decoder, but it must be used n times. Then the computational work of the inner decoder will be proportional, at worst, to nq. As Berlekamp has shown [13], a Reed-Solomon code can be decoded with a number of logic elements and a number of cycles each proportional to n log n. Thus, the Forney method requires a computational work χ_F bounded by

χ_F ≤ B(n log n)²    (22)

for large n, when the Reed-Solomon code is primitive (n = q − 1). Here B is a constant of the decoding procedure.

Forney has given a bound to the probability of error Pe for "errors-only" decoding of concatenated codes that has the correct form but whose proof is incomplete. A complete proof is given in [14] and the bound is

Pe ≤ 2^{−N₀E_C(R)}    (23)

where N₀ is the overall block length given by

N₀ = n log₂ q / R₁ = n log₂ (n + 1) / R₁    (24)

and R₁ is the rate of the inner code. Also, r is the rate of the Reed-Solomon code, R = rR₁ is the overall rate, and E_C(R) is given by

E_C(R) = max_{rR₁ = R} E(R₁)(1 − r)    (25)

where E(R) is the best upper bound block exponent.

Theorem 7: Errors-only decoders of concatenated codes [4] of rate R do computational work χ satisfying

χ_F ≤ B[2R₁E/E_C(R)]²    (26)

for large E on an arbitrary DMC, where R₁ is the optimizing value of (25) and B is a constant of the decoding procedure.

Equation (26) follows from (22)-(24) after observing that

N₀R₁ ≤ 2R₁E/E_C(R)    (27)

is the solution to (23). The important point about this result is that it implies that the computational work need not grow faster than E². This is a considerably faster rate of growth than that of the lower bounds of Theorems 4 and 4′, and greater than but close to the rate of growth of the almost-all bound for parity-check codes given by (18), which grows as E²/log E².

It is interesting to consider the computational work that is done by sequential decoders [15], [16]. Here the performance is measured primarily by the probability of buffer overflow, since the undetected error probability can be and usually is made much smaller. Sequential decoders decode convolutional codes that are usually truncated so that they form large block codes of block length n. If the buffer size is B and the speed factor is SF, various bounds [17], [18] give for the probability of an overflow during n transmissions, P_BO(n),

P_BO(n) = Dn/[B(SF)]^{α(R)}    (28)

where D is a small constant and α(R) is the Pareto exponent.

The sequential decoding machine consists of two principal components, a logic unit and a buffer. The logic unit executes the various steps required by the decoding algorithm and contains an encoder. The encoder has a number of logic elements that grows linearly with constraint length v, while the remaining logic is essentially independent of v. The total number of cycles completed by the decoder will exceed n and the number of logic elements will exceed B, since each of the B branches in the buffer must be available to the decoder logic unit. The computational work done by any sequential decoder, χ_SD, is bounded by

χ_SD ≥ nB.    (29)

Now substituting for B through (28) we have

χ_SD ≥ n[Dn]^{1/α(R)} 2^{E_BO/α(R)}/SF    (30)

where the exponent E_BO is defined as the negative logarithm to the base 2 of P_BO(n). Here the speed factor is limited by the switching time of logic elements and can be considered a constant. Expression (30) is minimized under variation of n by making n as small as possible. Here n cannot be too small; otherwise the undetected error rate will exceed the rate of buffer overflows. Hence, let n = n₁, its smallest value. Then, χ_SD grows exponentially with E_BO for large values of E_BO. This exponential rate of growth, however, may not be visible for modest values of E_BO and, in fact, sequential decoders now find application to many important coding problems such as satellite communication. Nevertheless, for large E_BO, sequential decoding is nonoptimal when optimality is measured in terms of computational work.
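The contrast between Theorem 7 and expression (30) can be made concrete with a small numerical sketch. The constants below (B, D, n, SF, α, and the factor c) are arbitrary placeholders; only the E² versus 2^E growth rates come from the text.

```python
# Contrast of growth rates: concatenated coding's work grows as E^2
# (Theorem 7), while sequential decoding's work grows exponentially with
# E_BO (expression (30)). All constants are illustrative placeholders.
def work_concatenated(E, B=1.0, c=2.0):
    return B * (c * E) ** 2                                     # ~ E^2

def work_sequential(E, n=100.0, D=1.0, SF=1.0, alpha=1.0):
    # alpha plays the role of the Pareto exponent in (30)
    return n * (D * n) ** (1 / alpha) * 2 ** (E / alpha) / SF   # ~ 2^E

for E in (10, 20, 40):
    print(E, work_concatenated(E), work_sequential(E))
```

Doubling E multiplies the concatenated-coding work by a constant factor but squares the dominant term of the sequential-decoding work, which is the sense in which sequential decoding is nonoptimal for large reliability.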
SECTION III
A. Decoding Time
A nonzero amount of time is required for signals to
propagate through logic elements. In this section, we assume
that each logic unit introduces one unit of delay and we
derive bounds on the total delay z (or the number of levels
of logic) required to achieve a reliability E at code rate R.
We shall derive a lower bound to the minimum delay,
present an almost-all bound to delay for several decoder
classes, and exhibit a decoder that has small total delay.
Again assume that decoders are modeled by combinational or sequential machines constructed with two-input binary logic elements and binary memory cells. Let l be the maximum number of logic levels between the inputs and outputs of the logic unit of a sequential machine. If the machine executes T cycles to compute a function f, the decoding function, say, then the time spent computing f, z, is given by z = Tl.

Theorem 8: The time in which f is computed by a sequential machine satisfies

z = Tl ≥ D*(f)    (31)

where D*(f) is the minimum time in which f can be computed by a combinational machine using elements from Ω.
B. Bounds on Decoding Time
Consider the combinational decoder of Fig. 3. Let z_i be an arbitrary nonconstant output and let z_i depend on K of the n letter inputs. (It is then independent of the remaining n − K letter inputs.) Assume that the channel is a completely connected DMC and that each of its J outputs is encoded into ⌈log₂ J⌉ binary digits.³ Also, let P_min be the smallest transition probability of this channel. Then, the smallest attainable probability of error Pe must satisfy

Pe ≥ P_min^K.    (32)

To see this, observe that a received sequence that is decoded correctly produces one value of z_i, but that there is some pattern of K digits in the received sequence which, if changed, results in a different value for z_i and a decoding error. And every pattern of K received digits has a probability greater than or equal to P_min^K of being received.

³ Thus, z_i depends on at least one binary digit in each of K encoded channel letters.
Theorem 9: Every combinational decoder achieving a reliability E on a completely connected DMC with smallest transition probability P_min must have a minimum delay z that satisfies (for 2-input logic elements)

z ≥ ⌈log₂ (E/(−log₂ P_min))⌉.    (33)

Proof: From any given output, at most two inputs can be reached with one level of logic, and at most 2^z inputs with z levels of logic. We set 2^z ≥ K and solve for K from (32) to establish the inequality of (33). Q.E.D.
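The fan-in argument of the proof can be sketched as follows; min_depth is a hypothetical helper that simply combines (32) and (33).

```python
import math

# Fan-in argument behind Theorem 9: with two-input gates, an output reaches
# at most 2^z inputs through z levels of logic. By (32), an output must
# depend on K >= E / (-log2 P_min) inputs, so z >= ceil(log2 K) as in (33).
def min_depth(E, p_min):
    K = E / -math.log2(p_min)          # inputs an output must depend on
    return math.ceil(math.log2(K))     # levels of two-input logic needed

print(min_depth(E=100.0, p_min=0.01))  # K = 100/6.64... ~ 15.05, so depth 4
```

The bound grows only logarithmically in E, which is why it is so far below the linear-in-E decoding times of most known procedures.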
This bounding argument is the starting point in the work of Winograd [5], [19] on the time required to multiply and add. In [19], Winograd presents an upper bound to the number of combinational machines with n inputs that have z or fewer levels of logic. If z_i is an output of the machine, it could be calculated by one of at most 3^{2^z − 1} n^{2^z} trees of depth z, because at most 2^z − 1 positions in a tree are available to the three types of logic elements used here, and the n inputs can be attached to at most 2^z inputs of a tree. If the machine has k outputs, then at most (3^{2^z − 1} n^{2^z})^k different machines with delay z exist.
In Theorem 3 of [1], the number of distinct minimum-distance combinational decoders of binary parity-check (BPC) codes was shown to be exp₂ n²R(1 − R). If we let k = nR and choose τ so that

(3^(2^τ − 1) n^(2^τ))^(nR) = (2^(n²R(1 − R)))^(1 − ε)   (34)

with 0 < ε < 1, then for large n almost all minimum-distance combinational decoders of BPC codes will require a τ that is larger than the solution of (34). Solving for τ, we have the following.
Theorem 10: Almost all minimum-distance decoders of BPC codes of block length n and rate R require a computation time τ that satisfies

τ ≥ ⌈log₂ [(1 − ε)(1 − R)n/log₂ n]⌉   (35)

for large n when 0 < ε < 1.
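The bound of (35) is likewise easy to evaluate numerically; the parameters in this sketch are illustrative.

```python
import math

def md_time_bound(n, R, eps):
    """Theorem 10 bound: tau >= ceil(log2((1 - eps)(1 - R) n / log2 n))
    for almost all minimum-distance decoders of BPC codes of block
    length n and rate R, with 0 < eps < 1."""
    return math.ceil(math.log2((1 - eps) * (1 - R) * n / math.log2(n)))

print(md_time_bound(1024, 0.5, 0.1))   # -> 6
```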
Since BPC codes exist for the BSC for which n grows
linearly with E, the BPC minimum-distance decoders are
likely to be as good with respect to computation time as
the best decoder. When this bound is applied to the class
of all block decoders, however, the bound grows linearly with
E and not logarithmically.
Our next objective is to demonstrate the existence of a decoder for which the number of levels of logic does not grow too rapidly with reliability E. To do this, we reexamine the code and decoder introduced at the end of Section II-B. The code was a subcode of a binary maximal-length code that contains a fixed number M (≤ n) of codewords of length n and corrects t or fewer errors, where t = ⌊(n + 1)/4⌋. Assume that n parallel BSCs are used for transmission and that the decoder is a combinational machine that acts on all received digits simultaneously. Form the term-by-term sum modulo 2 of the M codewords with the received
sequence and apply the resultants to n-input combinational counters that drive circuits which produce b = ⌈log₂ (M + 1)⌉ zeros if a count exceeds t and a unique identifying b-tuple otherwise. Feed the b-tuples to b OR gates with M inputs; the output of these b gates will be a b-tuple identifying the single codeword at Hamming distance t or less from the received sequence, or the zero b-tuple signaling that no codeword satisfies this condition.
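The circuit just described can be mirrored in software. The following sketch abstracts the gate-level structure into arithmetic; the codewords, t, and received word are illustrative. It returns the identifying b-tuple of the unique codeword within distance t, or the zero b-tuple if none exists.

```python
import math

def decode(received, codewords, t):
    """Minimum-distance decoding as described: XOR the received word with
    each codeword, count the 1's, and OR together the identifying
    b-tuples of all codewords within Hamming distance t (at most one)."""
    M = len(codewords)
    b = math.ceil(math.log2(M + 1))          # width of identifying tuple
    result = (0,) * b                        # b OR-gate outputs, initially 0
    for i, c in enumerate(codewords, start=1):
        count = sum(r ^ x for r, x in zip(received, c))  # Hamming distance
        if count <= t:
            ident = tuple((i >> j) & 1 for j in reversed(range(b)))
            result = tuple(a | d for a, d in zip(result, ident))
    return result

codewords = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(decode((1, 1, 0, 1), codewords, t=1))   # -> (1, 0): codeword 2 found
print(decode((1, 1, 0, 0), codewords, t=1))   # -> (0, 0): no codeword close
```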
The b OR gates can each be realized with ⌈log₂ M⌉ levels of 2-input logic elements, a number of levels that grows at worst as log n. The circuits that produce b-tuples can be realized with b 2-input AND gates, i.e., with one level of logic, as long as a binary signal is available that reflects whether a count exceeds t or not. In the next paragraph we argue that the counters require a number of levels of logic that grows as (log₂ n)². Finally, the modulo-2 sums of codewords and the received sequence can be realized with a single level of logic consisting of 2-input modulo-2 adders. Consequently, the total number of levels of logic grows as (log₂ n)².
Let the binary counter be designed so that it has m inputs for m a power of 2, say m = 2^k, and assume that its output is a k-tuple representing in binary radix notation the number of 1's among the m binary inputs. This circuit can be realized with two counters, each with m/2 inputs, and a number of binary full adders (two binary inputs and a "sum" and a "carry" output). The full adders are used to add the number of 1's, 2's, 4's, etc., with carry, at the outputs of the two m/2-input counters. For example, the two 4's must be added together with the 4 carry, and this can be done with 3 full adders. Thus, the sum of the outputs of the m/2-input counters can be formed with a total of 3k − 2 full adders. (The 1's can be added with one full adder.) When the 4 carry is added to the 4's at the counter output, 2 additional levels of full adders are introduced. Hence, the total number of levels of full adders required to sum the counter outputs is 2k − 1 (the 1's can be added with one level). Let τ(m) be the number of levels of full adders required to count the number of 1's among m = 2^k binary digits. Then we have shown that

τ(m) = 2(log₂ m) − 1 + τ(m/2).   (36)
Since τ(2) = 1, the solution to this recursion relation is

τ(m) = Σ_{j=2}^{k} (2j − 1) + 1 = (k + 2)(k − 1) − (k − 1) + 1   (37)

so that

τ(m) = k² = (log₂ m)².
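The recursion (36) and the closed form (37) can be checked against each other directly; this sketch assumes m is a power of 2, as in the construction.

```python
def counter_levels(m):
    """Levels of full adders needed to count the 1's among m = 2^k inputs,
    computed from the recursion tau(m) = 2 log2(m) - 1 + tau(m/2),
    with tau(2) = 1 (eq. (36))."""
    if m == 2:
        return 1
    k = m.bit_length() - 1          # k = log2(m) for m a power of 2
    return 2 * k - 1 + counter_levels(m // 2)

# Closed form (37): tau(m) = (log2 m)^2
assert all(counter_levels(2 ** k) == k ** 2 for k in range(1, 11))
print(counter_levels(1024))   # -> 100
```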
Each full adder can be realized with one level of logic consisting of a 2-input EXCLUSIVE-OR gate to form the "sum" and an AND gate to form the "carry." Therefore, an n-input binary counter can be realized with exactly (⌈log₂ n⌉)² levels of logic, since zeros can be introduced to increase n to the next power of 2. The counter outputs can be combined to produce a binary signal that has value 1 when t or fewer errors occur and 0 otherwise. Using 2-input AND and OR gates, the outputs can be combined with ⌈log₂ (log₂ [(n + 1)/8])⌉ + 2 levels of logic when n + 1 is a power of 2, as is the case for our code.
It is clear that the decoder described above uses a number of levels of logic bounded by A(log₂ n)² for some constant A. Since the probability of error that can be obtained on the BSC decreases exponentially with n, we can relate n and reliability E by n ≤ E/E(p), where E(p) has been given before and p is the BSC crossover probability, p < 1/4.

Theorem 11: There exists a code of rate R ~ 1/E and a decoder for it on the BSC with crossover probability p that achieves reliability E with a computation time τ bounded by

τ ≤ A(log₂ (E/E(p)))²   (38)

for A a constant, where

E(p) = −(1/4) log₂ p(1 − p)³ − H(1/4).
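Taking E(p) = −(1/4) log₂ [p(1 − p)³] − H(1/4) (a reconstruction of the garbled exponent; as a sanity check it vanishes at p = 1/4, consistent with t = (n + 1)/4), the exponent and the bound of (38) can be computed numerically. The constant A and the sample p are illustrative.

```python
import math

def H(x):
    """Binary entropy function H(x) = -x log2 x - (1 - x) log2 (1 - x)."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def E_exponent(p):
    """E(p) = -(1/4) log2 [p (1 - p)^3] - H(1/4), for 0 < p < 1/4."""
    return -0.25 * math.log2(p * (1 - p) ** 3) - H(0.25)

def time_bound(E, p, A=1.0):
    """Theorem 11 bound: tau <= A (log2 (E / E(p)))^2."""
    return A * math.log2(E / E_exponent(p)) ** 2

print(round(E_exponent(0.01), 3))   # -> 0.861
print(round(E_exponent(0.25), 3))   # -> 0.0 (exponent vanishes at p = 1/4)
```

Since the bound grows only as the square of log₂ E, this low-rate decoder's delay is within a square of the logarithmic lower bound of Theorem 9.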
Any Boolean function of n variables can be realized (in sum-of-products form) with a computation time that grows almost linearly with n. Therefore, any block code of length n can be decoded in a time proportional to E. This is important because all attempts to find a decoder with τ that is not linear in E and that also decodes codes of fixed rate have been unsuccessful. The question of how closely the bound of Theorem 9 can be approached for codes of fixed rate remains open.
IV. CONCLUSIONS
We have introduced two new measures of decoding complexity, namely, computational work χ and decoding time τ. Lower bounds to χ have been developed that grow linearly with reliability E and become large as the code rate approaches channel capacity. The existence of a decoder for very-low-rate codes has been demonstrated that does a work that grows as E log E. At all rates less than channel capacity, concatenated coding and decoding achieves reliability E with a computational work that grows as E². Thus, we know that the bounds on χ cannot be substantially improved at rates near zero, and there is reason to believe that the bounds can be improved at larger rates.
It is now possible to rank decoding procedures on the basis of the computational work they require to reach reliability E, for large E. We have shown, for example, that sequential decoding is far inferior to concatenated decoding and to Ziv's iterative decoding procedures at large E.
The time τ required to decode has also been examined, and we have shown that τ must grow at least logarithmically with E. Also, a low-rate decoding procedure has been given that decodes in a time that grows as the square of the lower bound. Thus, the dependence of τ on E is not clear even for low-rate coding.
Computational work and decoding time are both measures of the complexity of the decoding process and are not measures of the complexity of a decoding machine.
The bounds on computational work and decoding time
need to be tightened. Additionally, it would be instructive
to relate computational work and decoding time more
closely to parameters of well-known codes and standard
decoding rules for them. Such relations might be of more
direct value in decoder design.
REFERENCES
[1] J. E. Savage, "The complexity of decoders-Pt. I: Decoder classes," IEEE Trans. Inform. Theory, vol. IT-15, pp. 689-695, November 1969.
[2] J. Ziv, "Asymptotic performance and complexity of a coding scheme for memoryless channels," IEEE Trans. Inform. Theory, vol. IT-13, pp. 356-359, July 1967.
[3] ----, "Further results on the asymptotic complexity of an iterative coding scheme," IEEE Trans. Inform. Theory, vol. IT-12, pp. 168-171, April 1966.
[4] G. D. Forney, Concatenated Codes. Cambridge, Mass.: M.I.T. Press, 1966, ch. 4.
[5] S. Winograd, "On the time required to perform addition," J. ACM, vol. 12, no. 2, pp. 277-285, 1965.
[6] J. E. Savage, "Some comments on the computation time and complexity of algorithms," Proc. Princeton Conf. Inform. Sciences and Systems, 1969.
[7] D. E. Muller, "Complexity in electronic switching circuits," IRE Trans. Electron. Comput., vol. EC-5, pp. 15-19, March 1956.
[8] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, "Lower bounds to error probability for coding on discrete memoryless channels," Inform. and Control, vol. 10, pp. 65-103, 522-552, 1967.
[9] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968, ch. 5.
[10] F. Jelinek, Probabilistic Information Theory. New York: McGraw-Hill, 1968.
[11] R. G. Gallager, op. cit., p. 164.
[12] I. S. Reed and G. Solomon, "Polynomial codes over certain finite fields," J. SIAM, vol. 8, pp. 300-304, 1960.
[13] E. R. Berlekamp, Algebraic Coding Theory. New York: McGraw-Hill, 1968, ch. 7.
[14] J. E. Savage, "A note on the performance of concatenated coding," IEEE Trans. Inform. Theory (Correspondence), vol. IT-16, pp. 512-513, July 1970.
[15] ----, "Progress in sequential decoding," in Advances in Communication Systems, vol. 3, A. V. Balakrishnan, Ed. New York: Academic Press, 1968.
[16] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965, ch. 6.
[17] J. E. Savage, "Sequential decoding-The computation problem," Bell Syst. Tech. J., vol. 45, no. 1, pp. 149-175, 1966.
[18] I. M. Jacobs and E. R. Berlekamp, "A lower bound to the distribution of computation for sequential decoding," IEEE Trans. Inform. Theory, vol. IT-13, pp. 167-174, April 1967.
[19] S. Winograd, "On the time required to perform multiplication," J. ACM, vol. 14, no. 4, pp. 793-802, 1967.
[20] W. W. Peterson, Error-Correcting Codes. Cambridge, Mass.: M.I.T. Press, and New York: Wiley, 1961.
Cyclic and Multiresidue Codes for Arithmetic Operations

THAMMAVARAPU R. N. RAO, MEMBER, IEEE, AND OSCAR N. GARCIA, MEMBER, IEEE
Abstract--In this paper, the cyclic nature of AN codes is defined after a brief summary of previous work in this area is given. New results are shown in the determination of the range for single-error-correcting AN codes when A is the product of two odd primes p_1 and p_2, given the orders of 2 modulo p_1 and modulo p_2.
The second part of the paper treats a more practical class of arithmetic codes known as separate codes. A generalized separate code, called a multiresidue code, is one in which a number N is represented by its residues modulo m_1, ..., m_k, where the m_i are pairwise relatively prime integers. For each AN code, where A is composite, a multiresidue code can be derived having error-correction properties analogous to those of the AN code. Under certain natural constraints, multiresidue codes of large distance and large range (i.e., large values of N) can be implemented. This leads to possible realization of practical single- and/or multiple-error-correcting arithmetic units.
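The separate-checking idea behind residue codes can be sketched as follows; the moduli and operands here are illustrative, not taken from the paper.

```python
def to_residues(N, moduli):
    """Represent N by its residues modulo pairwise relatively prime m_i."""
    return tuple(N % m for m in moduli)

def check_add(a, b, s, moduli):
    """Separate check of an addition a + b = s: the residues of s must
    equal the modular sums of the residues of a and b, channel by
    channel, so the checker never touches the main adder's result path."""
    return to_residues(s, moduli) == tuple(
        (a % m + b % m) % m for m in moduli
    )

moduli = (7, 9)                          # pairwise relatively prime moduli
assert check_add(25, 38, 63, moduli)     # correct sum passes the check
assert not check_add(25, 38, 67, moduli) # an arithmetic error of 4 is caught
```

An error whose magnitude is divisible by every m_i would escape detection, which is why the range of N and the choice of moduli must be considered jointly, as the paper's constraints make precise.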
Manuscript received June 2, 1969. This work was supported in part by
Research Grants NSF GK-1543, NSF GK-25278, and NCR-21-002-229.
T. R. N. Rao is with the Department of Electrical Engineering,
University of Maryland, College Park, Md. 20742.
O. N. Garcia was with the University of Maryland, College Park. He is
now with the Department of Electrical and Electronic Systems, University
of South Florida, Tampa, Fla.
I. BACKGROUND REMARKS
THE CLASS of codes known as AN codes is considered
useful in monitoring errors in arithmetic operations as
well as in communication. First Diamond [3] and later
Brown [2] developed these codes and discussed several
examples. Other researchers [1], [4], [5] extended and
proved some important theorems. Massey [6] presented a
firm mathematical foundation and an excellent survey of
the early work on these codes. Previously, Peterson [13] had
shown that the only possible way to check addition with a
separate checker was by means of a residue code. Garner [7]
established the algebraic structure of the separate- and
nonseparate-type residue codes as machine number systems.
Independently, Mandelbaum [4] and Barrows [l] found a
new class of AN codes with large minimum distance and
mentioned the cyclic property of these codes. Mandelbaum
also studied them as burst-error codes. Rao [8] extended the
use of separate residue codes to check errors in operations
such as complement, shift, rotate, etc. In a more recent
paper [9], a bi-residue code capable of correcting single