194 Sorting in Linear Time: 8.2 Counting Sort
8.1-2
Obtain asymptotically tight bounds on $\lg(n!)$ without using Stirling's approximation.
Instead, evaluate the summation $\sum_{k=1}^{n} \lg k$ using techniques from
Section A.2.
8.1-3
Show that there is no comparison sort whose running time is linear for at least half
of the $n!$ inputs of length n. What about a fraction of $1/n$ of the inputs of length n?
What about a fraction $1/2^n$?
8.1-4
Suppose that you are given a sequence of n elements to sort. The input sequence
consists of $n/k$ subsequences, each containing k elements. The elements in a given
subsequence are all smaller than the elements in the succeeding subsequence and
larger than the elements in the preceding subsequence. Thus, all that is needed to
sort the whole sequence of length n is to sort the k elements in each of the $n/k$
subsequences. Show an $\Omega(n \lg k)$ lower bound on the number of comparisons
needed to solve this variant of the sorting problem. (Hint: It is not rigorous to
simply combine the lower bounds for the individual subsequences.)
8.2 Counting sort
Counting sort assumes that each of the n input elements is an integer in the range
0 to k, for some integer k. When $k = O(n)$, the sort runs in $\Theta(n)$ time.
Counting sort determines, for each input element x, the number of elements less
than x. It uses this information to place element x directly into its position in the
output array. For example, if 17 elements are less than x, then x belongs in output
position 18. We must modify this scheme slightly to handle the situation in which
several elements have the same value, since we do not want to put them all in the
same position.
In the code for counting sort, we assume that the input is an array $A[1..n]$, and
thus $A.length = n$. We require two other arrays: the array $B[1..n]$ holds the
sorted output, and the array $C[0..k]$ provides temporary working storage.
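The procedure that the surrounding text walks through can be sketched in Python as follows. This is a minimal sketch, not the book's pseudocode: the function name `counting_sort` is an assumption, and because Python lists are 0-indexed, the output slot is `c[x] - 1` rather than the 1-based position used in the text. The comments map each loop to the line ranges the text cites.

```python
def counting_sort(a, k):
    """Return a sorted copy of a, whose elements are integers in 0..k."""
    n = len(a)
    b = [None] * n       # output array (B in the text, 0-indexed here)
    c = [0] * (k + 1)    # working storage C[0..k], all zeros (lines 2-3)

    # Count occurrences: afterwards c[i] = number of elements equal to i
    # (lines 4-5 in the text).
    for x in a:
        c[x] += 1

    # Running sum: afterwards c[i] = number of elements <= i (lines 7-8).
    for i in range(1, k + 1):
        c[i] += c[i - 1]

    # Place elements, scanning the input from last to first (lines 10-12).
    for x in reversed(a):
        b[c[x] - 1] = x  # 1-based position c[x] is 0-based index c[x] - 1
        c[x] -= 1
    return b
```

For instance, `counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5)` returns `[0, 0, 2, 2, 3, 3, 3, 5]`, the final array shown in Figure 8.2.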
[Figure 8.2: six panels, (a)–(f), showing the arrays A, B, and C at successive stages.]
Figure 8.2 The operation of COUNTING-SORT on an input array $A[1..8]$, where each element
of A is a nonnegative integer no larger than $k = 5$. (a) The array
$A = \langle 2, 5, 3, 0, 2, 3, 0, 3 \rangle$ and the auxiliary array
$C = \langle 2, 0, 2, 3, 0, 1 \rangle$ after line 5. (b) The array
$C = \langle 2, 2, 4, 7, 7, 8 \rangle$ after line 8. (c)–(e) The output array B and the
auxiliary array C after one, two, and three iterations of the loop in lines 10–12,
respectively. Only the lightly shaded elements of array B have been filled in. (f) The
final sorted output array $B = \langle 0, 0, 2, 2, 3, 3, 3, 5 \rangle$.
Figure 8.2 illustrates counting sort. After the for loop of lines 2–3 initializes the
array C to all zeros, the for loop of lines 4–5 inspects each input element. If the
value of an input element is i, we increment $C[i]$. Thus, after line 5, $C[i]$ holds
the number of input elements equal to i for each integer $i = 0, 1, \ldots, k$. Lines 7–8
determine for each $i = 0, 1, \ldots, k$ how many input elements are less than or equal
to i by keeping a running sum of the array C.
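To make the two counting passes concrete, here is a small sketch that runs them on the input array from Figure 8.2 with k = 5. The variable names `counts` and `at_most` are illustrative choices, and the running sum uses the standard library's `itertools.accumulate` in place of an explicit loop.

```python
from itertools import accumulate

a = [2, 5, 3, 0, 2, 3, 0, 3]   # input array A from Figure 8.2
k = 5

counts = [0] * (k + 1)
for x in a:
    counts[x] += 1             # counts[i] = number of elements equal to i

# Running sum: at_most[i] = number of elements less than or equal to i.
at_most = list(accumulate(counts))

print(counts)    # [2, 0, 2, 3, 0, 1]
print(at_most)   # [2, 2, 4, 7, 7, 8]
```

The two printed lists match the contents of C in panels (a) and (b) of the figure.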
Finally, the for loop of lines 10–12 places each element $A[j]$ into its correct
sorted position in the output array B. If all n elements are distinct, then when we
first enter line 10, for each $A[j]$, the value $C[A[j]]$ is the correct final position
of $A[j]$ in the output array, since there are $C[A[j]]$ elements less than or equal
to $A[j]$. Because the elements might not be distinct, we decrement $C[A[j]]$ each
time we place a value $A[j]$ into the B array. Decrementing $C[A[j]]$ causes the
next input element with a value equal to $A[j]$, if one exists, to go to the position
immediately before $A[j]$ in the output array.
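The placement loop can be traced in the same spirit. The sketch below starts from the running sums for the Figure 8.2 input and records each placement. The names are hypothetical, and the output list carries one unused padding slot at index 0 so that positions are 1-based, as in the text.

```python
a = [2, 5, 3, 0, 2, 3, 0, 3]    # input array A from Figure 8.2
c = [2, 2, 4, 7, 7, 8]          # c[i] = number of elements <= i
b = [None] * (len(a) + 1)       # b[1..n]; index 0 is unused padding

placements = []                 # (value, 1-based position), in order placed
for x in reversed(a):           # scan A from its last element to its first
    b[c[x]] = x                 # c[x] is the final position of this x
    placements.append((x, c[x]))
    c[x] -= 1                   # the next equal value goes one slot earlier

print(placements[:3])   # [(3, 7), (0, 2), (3, 6)]
print(b[1:])            # [0, 0, 2, 2, 3, 3, 3, 5]
```

The first three placements correspond to panels (c) through (e) of the figure: the second 3 encountered lands at position 6, immediately before the 3 already placed at position 7.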
How much time does counting sort require? The for loop of lines 2–3 takes
time $\Theta(k)$, the for loop of lines 4–5 takes time $\Theta(n)$, the for loop of lines 7–8 takes
time $\Theta(k)$, and the for loop of lines 10–12 takes time $\Theta(n)$. Thus, the overall time
is $\Theta(k + n)$. In practice, we usually use counting sort when we have $k = O(n)$, in
which case the running time is $\Theta(n)$.
Counting sort beats the lower bound of $\Omega(n \lg n)$ proved in Section 8.1 because
it is not a comparison sort. In fact, no comparisons between input elements occur
anywhere in the code. Instead, counting sort uses the actual values of the elements
to index into an array. The $\Omega(n \lg n)$ lower bound for sorting does not apply when
we depart from the comparison sort model.
An important property of counting sort is that it is stable: numbers with the same
value appear in the output array in the same order as they do in the input array. That
is, it breaks ties between two numbers by the rule that whichever number appears
first in the input array appears first in the output array. Normally, the property of
stability is important only when satellite data are carried around with the element
being sorted. Counting sort’s stability is important for another reason: counting
sort is often used as a subroutine in radix sort. As we shall see in the next section,
in order for radix sort to work correctly, counting sort must be stable.
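Stability is easiest to see when satellite data are attached to the keys. The following sketch sorts (key, payload) records by key using the same three passes; the function name `counting_sort_records` and the sample records are made up for illustration.

```python
def counting_sort_records(records, k):
    """Stably sort (key, payload) pairs by integer key in 0..k."""
    c = [0] * (k + 1)
    for key, _ in records:
        c[key] += 1                  # c[i] = number of records with key i
    for i in range(1, k + 1):
        c[i] += c[i - 1]             # c[i] = number of records with key <= i

    out = [None] * len(records)
    for rec in reversed(records):    # last-to-first scan preserves input order
        key = rec[0]
        out[c[key] - 1] = rec        # 0-based index of the 1-based position c[key]
        c[key] -= 1
    return out

records = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
print(counting_sort_records(records, 1))
# [(0, 'b'), (0, 'd'), (1, 'a'), (1, 'c')]
```

Records with equal keys keep their input order: "b" stays ahead of "d", and "a" stays ahead of "c".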
Exercises
8.2-1
Using Figure 8.2 as a model, illustrate the operation of COUNTING-SORT on the
array $A = \langle 6, 0, 2, 0, 1, 3, 4, 6, 1, 3, 2 \rangle$.
8.2-2
Prove that COUNTING-SORT is stable.
8.2-3
Suppose that we were to rewrite the for loop header in line 10 of
COUNTING-SORT as
10 for j = 1 to A.length
Show that the algorithm still works properly. Is the modified algorithm stable?