Sorting in Linear Time: Counting-Sort
Example:
n=6, k=3
Index: 1 2 3 4 5 6
A =    3 2 3 1 3 1
C = 2 1 3   (counts)
C = 2 3 6   (cumulative counts)
j=6, A[6]=1, C[1]=2, B[2] ← 1
    B = _ 1 _ _ _ _     C = 1 3 6
j=5, A[5]=3, C[3]=6, B[6] ← 3
    B = _ 1 _ _ _ 3     C = 1 3 5
j=4, A[4]=1, C[1]=1, B[1] ← 1
    B = 1 1 _ _ _ 3     C = 0 3 5
j=3, A[3]=3, C[3]=5, B[5] ← 3
    B = 1 1 _ _ 3 3     C = 0 3 4
j=2, A[2]=2, C[2]=3, B[3] ← 2
    B = 1 1 2 _ 3 3     C = 0 2 4
j=1, A[1]=3, C[3]=4, B[4] ← 3
    B = 1 1 2 3 3 3     C = 0 2 3
For more examples you can use the following Counting-Sort animation.
Practical Session #11 Sorting in Linear Time
the worst-case.
Counting sort is stable: two elements with the same key value appear in the output
in the same order as they appeared in the input. Stability is important when there is
additional data besides the key.
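The stability property can be illustrated with a minimal Python sketch. It assumes keys are integers in 1..k, as in the example above; the function name `counting_sort` and the `key` parameter are illustrative choices, not part of the original pseudocode.

```python
def counting_sort(items, k, key=lambda x: x):
    """Stable counting sort of items whose keys are integers in 1..k."""
    count = [0] * (k + 1)
    for item in items:
        count[key(item)] += 1          # count[i] = number of keys equal to i
    for i in range(1, k + 1):
        count[i] += count[i - 1]       # count[i] = number of keys <= i
    out = [None] * len(items)
    for item in reversed(items):       # right-to-left scan keeps equal keys in input order
        count[key(item)] -= 1          # convert the 1-based position to a 0-based index
        out[count[key(item)]] = item
    return out


records = [(3, "a"), (2, "b"), (3, "c"), (1, "d"), (3, "e"), (1, "f")]
print(counting_sort([3, 2, 3, 1, 3, 1], 3))
print(counting_sort(records, 3, key=lambda r: r[0]))
```

In the second call the records with equal keys keep their input order ("a", "c", "e" for key 3), which is what makes counting sort usable as the per-digit pass of radix sort.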
Radix-Sort
A stable sort algorithm for sorting elements with d digits, where digits are in base b,
i.e., in range [0,b-1]. The algorithm uses a stable sort algorithm to sort the keys by each
digit, starting with the least significant digit (the rightmost one), which the
algorithm treats as digit 1.
Radix-Sort(A[1..n], d)
    for i ← 1 to d
        Use a stable sort (like counting-sort) to sort A on digit i
Run-time Complexity:
Assuming the stable sort runs in O(n+b) (such as counting sort) the running time is
O(d(n+b)) = O(dn+db).
If d is constant and b=O(n), the running time is O(n).
Example
7 numbers with 3 digits. n=7, d=3,b=10 (decimal)
329, 457, 657, 839, 436, 720, 355
720, 355, 436, 457, 657, 329, 839 sorted by digit 1
720, 329, 436, 839, 355, 457, 657 sorted by digit 2 (and 1)
329, 355, 436, 457, 657, 720, 839 sorted
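The passes in this example can be reproduced with a short Python sketch. As an assumption for brevity, each stable per-digit pass is done by appending to b bucket lists (O(n+b) per pass) rather than by the array-based counting sort; the function name `radix_sort` is illustrative.

```python
def radix_sort(nums, d, b=10):
    """LSD radix sort of non-negative integers with at most d digits in base b."""
    for i in range(d):                          # i = 0 is the least significant digit
        buckets = [[] for _ in range(b)]
        for x in nums:
            buckets[(x // b**i) % b].append(x)  # appending preserves order, so the pass is stable
        nums = [x for bucket in buckets for x in bucket]
    return nums


data = [329, 457, 657, 839, 436, 720, 355]
print(radix_sort(data, 1))   # after sorting by digit 1
print(radix_sort(data, 2))   # after sorting by digits 1 and 2
print(radix_sort(data, 3))   # fully sorted
```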
For more examples you can use the following Radix-Sort animation.
Bucket-Sort
Assumption: Input array elements are uniformly distributed over the interval
[0,1).
The idea of bucket-sort is to divide the interval [0,1) into n equal-sized subintervals
(buckets), and then distribute the n input numbers into the buckets. To produce the
output, we sort the elements in each bucket and then create a sorted list by
going through the buckets in order.
Bucket-sort(A)
    n ← length(A)
    for i ← 1 to n do
        B[⌊n·A[i]⌋] ← A[i]    // insert A[i] into the list of its bucket
    for i ← 0 to n-1 do
        sort B[i] with insertion sort
    concatenate the lists B[0], B[1], ..., B[n-1] together in order
Run-time Complexity:
Assuming the inputs are uniformly distributed over [0,1), we expect O(1) elements in
each bucket (average case), thus sorting them takes O(1) expected time.
We insert n elements into n buckets in O(n) and we concatenate the
lists in O(n). Total expected run time: O(n).
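A minimal Python sketch of bucket sort under the same assumption (inputs in [0,1)) is given below. The helper names are illustrative, and the `min(n - 1, ...)` clamp is an added guard against floating-point rounding that the pseudocode does not need.

```python
def insertion_sort(lst):
    """In-place insertion sort; fast on the short lists expected in each bucket."""
    for i in range(1, len(lst)):
        x = lst[i]
        j = i - 1
        while j >= 0 and lst[j] > x:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = x


def bucket_sort(a):
    """Bucket sort for numbers in [0, 1), using n buckets for n inputs."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # floor(n * x) is the bucket index; the clamp guards against rounding up to n
        buckets[min(n - 1, int(n * x))].append(x)
    for bucket in buckets:
        insertion_sort(bucket)
    return [x for bucket in buckets for x in bucket]


sample = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(sample))
```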
Solution
Preprocess(A, k)
    for i ← 0 to k
        C[i] ← 0                  // C[0] = 0, so that Range also works for a = 1
    for j ← 1 to n
        C[A[j]] ← C[A[j]] + 1     // C[i] = the number of appearances of i in A.
    for i ← 2 to k
        C[i] ← C[i] + C[i-1]      // C[i] = the number of elements in A that are ≤ i
    return(C)
end
Range(C, a, b)
    return(C[b] - C[a-1])         // the number of elements in A in the range [a, b]
end
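The two routines above can be sketched in Python as follows. This assumes the keys are integers in 1..k and that queries satisfy 1 ≤ a ≤ b ≤ k; the name `count_in_range` is an illustrative stand-in for Range.

```python
def preprocess(a, k):
    """Return c where c[i] = number of elements of a that are <= i (keys in 1..k)."""
    c = [0] * (k + 1)            # c[0] stays 0, so a query with a = 1 works
    for x in a:
        c[x] += 1                # c[i] = number of appearances of i
    for i in range(1, k + 1):
        c[i] += c[i - 1]         # prefix sums: c[i] = number of elements <= i
    return c


def count_in_range(c, a, b):
    """Number of elements with value in [a, b], answered in O(1)."""
    return c[b] - c[a - 1]


c = preprocess([3, 2, 3, 1, 3, 1], 3)
print(c)
print(count_in_range(c, 2, 3))
```

Preprocessing costs O(n+k) once, after which every range query is a single subtraction.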
Design an algorithm for sorting n data items with keys in the range [x,x+d) that runs in
expected O(n) time if the items are uniformly distributed over [x,x+d), and runs in O(n log n)
in the worst distribution case.
Solution 1:
Use bucket sort over the range [x,x+d) with the following changes:
1. The elements in each bucket are stored in an AVL tree (instead of a linked list).
2. In the last stage, concatenate the inorder traversals of all the buckets one after another.
Note: the bucket distribution function will be B[⌊n(A[i] − x) / d⌋] ← A[i]
Time Complexity:
Let n_i be the number of elements in the tree in bucket i.
1. Inserting the n elements into the buckets takes O(n_1 log n_1 + n_2 log n_2 + ... + n_n log n_n).
When the keys are uniformly distributed, n_i = O(1) is expected for every i, hence
O(n_1 log n_1 + n_2 log n_2 + ... + n_n log n_n) ≤ c(n_1 + n_2 + ... + n_n) = cn, where c is a constant.
In the worst distribution case, n_i ≤ n for every i, so:
O(n_1 log n_1 + n_2 log n_2 + ... + n_n log n_n) ≤ O(n_1 log n + n_2 log n + ... + n_n log n) =
O((n_1 + n_2 + ... + n_n) log n) = O(n log n)
2. Inorder traversals of all buckets take O(n_1 + n_2 + ... + n_n) = O(n)
3. Concatenation of all inorder traversal lists takes O(n)
The algorithm runs in expected O(n) time for uniformly distributed keys and in O(n log n) in the
worst distribution case.
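The idea of Solution 1 can be sketched in Python under one stated substitution: Python's standard library has no AVL tree, so each bucket is a plain list sorted with `list.sort()`, which is O(m log m) in the worst case for a bucket of size m and therefore gives the same bounds as the AVL insertions plus inorder traversal. The function name `robust_bucket_sort` is hypothetical.

```python
def robust_bucket_sort(a, x, d):
    """Sort keys in [x, x+d): expected O(n) for uniform keys, O(n log n) worst case."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for v in a:
        # bucket distribution function floor(n * (v - x) / d), clamped against rounding
        idx = min(n - 1, int(n * (v - x) / d))
        buckets[idx].append(v)
    out = []
    for bucket in buckets:
        bucket.sort()            # stands in for AVL insertions followed by an inorder traversal
        out.extend(bucket)       # concatenate the buckets in order
    return out


uniform = [5.0, 5.9, 5.1, 5.5, 5.05]
skewed = [7.01, 7.03, 7.02, 7.0]      # all keys land in the same bucket
print(robust_bucket_sort(uniform, 5.0, 1.0))
print(robust_bucket_sort(skewed, 7.0, 100.0))
```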
Solution 2:
Execute in parallel the following two algorithms:
1. Original bucket sort
2. Any sort algorithm that takes O(nlogn)
Stop when one of the algorithms has stopped and return the sorted elements.
Solution:
A comparison-based algorithm takes O(n log n).
Counting-sort: k = n^3, so O(n + n^3) = O(n^3)
Radix-sort: b = n, d = 4, so O(d(b+n)) = O(4(n+n)) = O(n)
Why is that?
Use radix-sort after preprocessing:
1. Convert all numbers to base n in O(n) total time using mod and div operations:
x = [x_3, x_2, x_1, x_0], where x_0 = x mod n and x_i = ⌊x / n^i⌋ mod n
2. Call radix-sort on the transformed numbers. O(n)
All the numbers are in the range 1 to n^3, therefore we have at most 4 digits for each
number. The running time of the suggested algorithm: d = 4, b = n, so O(4(n+n)) = O(n).
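A short Python sketch of this approach follows, assuming the input is a list of n integers in the range 1 to n^3. The names `digits_base_n` and `sort_up_to_n_cubed` are illustrative, and the base-n digits are extracted on the fly in each pass instead of being stored in a separate preprocessing step.

```python
def digits_base_n(x, n):
    """Return [x3, x2, x1, x0], the four base-n digits of x (assumes x < n**4)."""
    return [(x // n**i) % n for i in (3, 2, 1, 0)]


def sort_up_to_n_cubed(nums):
    """Radix sort with b = n and d = 4 for n integers in the range 1..n**3."""
    n = len(nums)
    if n < 2:
        return list(nums)                 # base n is only meaningful for n >= 2
    for i in range(4):                    # four passes, least significant digit first
        buckets = [[] for _ in range(n)]
        for x in nums:
            buckets[(x // n**i) % n].append(x)   # stable pass on digit i
        nums = [x for bucket in buckets for x in bucket]
    return nums


values = [125, 1, 64, 27, 8]              # n = 5, values in 1..125
print(digits_base_n(125, 5))
print(sort_up_to_n_cubed(values))
```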
Suggest an algorithm for sorting all sets S1, S2,...,Sm in O(n) time complexity and O(n) space
(memory) complexity.
Note: The output is m sorted sets and not one merged sorted set.
Solution:
Given n English words (assume they are all in lower case), not necessarily of the same
length, suggest an algorithm for sorting the words in lexicographic order in O(n) time.
Solution:
Reduction:
Assume you have two problems P1 and P2 and the algorithm for solving P2 is known.
A reduction from P1 to P2 is a way to solve problem P1 by using the algorithm for solving P2.
P1:  x → f(x) → [algorithm for P2] → y → f^-1(y)
Example:
Lexicographical-sort input: {blue, red, green}
Radix-sort input: { (2,12,21,5,0), (18,5,4,0,0), (7,18,5,5,14) }
Radix-sort output: { (2,12,21,5,0), (7,18,5,5,14), (18,5,4,0,0) }
Lexicographical-sort output: {blue, green, red}
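The reduction in this example can be sketched in Python. The sketch assumes, as the example suggests, that each letter maps to 1..26 and that shorter words are padded with 0 up to the longest word; `encode` plays the role of f, `decode` the role of f^-1, and all function names are illustrative.

```python
def encode(word, width):
    """f: map a lower-case word to a tuple of codes 1..26, right-padded with 0."""
    codes = [ord(ch) - ord("a") + 1 for ch in word]
    return tuple(codes + [0] * (width - len(word)))


def decode(codes):
    """f^-1: drop the padding zeros and map the codes back to letters."""
    return "".join(chr(c + ord("a") - 1) for c in codes if c != 0)


def lexicographic_sort(words):
    """Sort words by reducing the problem to radix sort with base b = 27."""
    if not words:
        return []
    width = max(len(w) for w in words)
    keys = [encode(w, width) for w in words]
    for pos in range(width - 1, -1, -1):      # last letter is the least significant digit
        buckets = [[] for _ in range(27)]
        for key in keys:
            buckets[key[pos]].append(key)     # stable pass on position pos
        keys = [key for bucket in buckets for key in bucket]
    return [decode(key) for key in keys]


print(encode("blue", 5))
print(lexicographic_sort(["blue", "red", "green"]))
```

Because the padding value 0 is smaller than every letter code, a word that is a prefix of another word sorts before it.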
Given an array A of n positive integers, all of which except ten are in the
range [10, 10n], design an algorithm that sorts A in O(n) time in the worst case.
Solution:
1. Traverse the array and move the 10 numbers that are not in the range into an auxiliary array B.
//O(n) time
2. Counting sort on the array A. //O((n-10) + 10n) = O(n) time
3. Any sort on the array B. //O(1) time, since B has constant size
4. Merge A and B. //O(n) time
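The four steps can be sketched in Python as follows. The function name `sort_with_outliers` is hypothetical; the counting sort here simply expands the counts (enough for plain integers), and `heapq.merge` from the standard library performs the linear-time merge of the two sorted lists.

```python
import heapq


def sort_with_outliers(a):
    """Sort n positive integers where all but a constant number lie in [10, 10n]."""
    n = len(a)
    lo, hi = 10, 10 * n
    inside, outside = [], []
    for x in a:                                   # step 1: split off the out-of-range numbers
        (inside if lo <= x <= hi else outside).append(x)
    count = [0] * max(0, hi - lo + 1)             # step 2: counting sort over [10, 10n]
    for x in inside:
        count[x - lo] += 1
    sorted_inside = []
    for offset, c in enumerate(count):
        sorted_inside.extend([offset + lo] * c)
    outside.sort()                                # step 3: O(1) when there are only ten outliers
    return list(heapq.merge(sorted_inside, outside))   # step 4: merge two sorted lists


mixed = [15, 1000, 3, 40, 10, 99999, 22, 10, 7]
print(sort_with_outliers(mixed))
```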