Topic: Heap Sort Algorithm: Shashank Dwivedi United Institute of Technology
SHASHANK DWIVEDI
UNITED INSTITUTE OF TECHNOLOGY
Outline
` Heaps
` Maintaining the heap property
` Building a heap
` The heapsort algorithm
` Priority queues
The purpose of this chapter
` In this chapter, we introduce the heapsort algorithm.
` Its worst‐case running time is O(n lg n).
` It is an in‐place sorting algorithm: only a constant number of array
elements are stored outside the input array at any time.
` Thus, it requires only O(1) additional memory.
Heaps
` The (binary) heap data structure is an array object that can be
viewed as a nearly complete binary tree.
` A binary tree with n nodes and depth k is complete iff its nodes
correspond to the nodes numbered from 1 to n in the full binary
tree of depth k.
[Figure: a max‐heap viewed as a binary tree with nodes numbered 1 to 10, and as the array A[1..10] = 16 14 10 8 7 9 3 2 4 1]
Binary tree representations
[Figure: a full binary tree with nodes numbered 1 to 15, and a complete binary tree with nodes numbered 1 to 10]
Attributes of a Heap
` An array A that represents a heap has two attributes:
` length[A]: the number of elements in the array.
` heap‐size[A]: the number of elements in the heap stored with
array A.
` length[A] ≥ heap‐size[A]
[Figure: the heap A[1..10] = 16 14 10 8 7 9 3 2 4 1 as a tree and as an array, with length[A] = heap‐size[A] = 10]
Basic procedures (1/2)
` If a complete binary tree with n nodes is represented
sequentially, then for any node with index i, 1 ≤ i ≤ n, we have
` A[1] is the root of the tree
` the parent PARENT(i) is at ⌊i/2⌋ if i ≠ 1
` the left child LEFT(i) is at 2i
` the right child RIGHT(i) is at 2i+1
[Figure: the heap A[1..10] = 16 14 10 8 7 9 3 2 4 1 drawn as a tree, each node labeled with its array index]
Basic procedures (2/2)
` The LEFT procedure can compute 2i in one instruction by simply
shifting the binary representation of i left one bit position.
` Similarly, the RIGHT procedure can quickly compute 2i+1 by
shifting the binary representation of i left one bit position and
adding in a 1 as the low‐order bit.
` The PARENT procedure can compute ⌊i/2⌋ by shifting i right one
bit position.
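The shift tricks above can be sketched in Python as follows. This is a minimal illustration that assumes the 1‐based node numbering used in these slides; the function names `parent`, `left`, and `right` are chosen here to mirror the pseudocode and are not part of any library.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (1-based): floor(i / 2) via a right shift."""
    return i >> 1


def left(i: int) -> int:
    """Index of the left child of node i (1-based): 2i via a left shift."""
    return i << 1


def right(i: int) -> int:
    """Index of the right child of node i (1-based): 2i + 1, a shift plus low-order bit."""
    return (i << 1) | 1


# Node 4 of a 10-node heap has parent 2 and children 8 and 9.
print(parent(4), left(4), right(4))  # 2 8 9
```

With 0‐based arrays, as in most Python code, the equivalent formulas are (i − 1) // 2, 2i + 1, and 2i + 2.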
Heap properties
` There are two kinds of binary heaps: max‐heaps and min‐heaps.
` In a max‐heap, the max‐heap property is that for every node i
other than the root,
A[PARENT(i)] ≥ A[i].
` the largest element in a max‐heap is stored at the root
` the subtree rooted at a node contains values no larger than that
contained at the node itself
` In a min‐heap, the min‐heap property is that for every node i
other than the root,
A[PARENT(i)] ≤ A[i].
` the smallest element in a min‐heap is at the root
` the subtree rooted at a node contains values no smaller than that
contained at the node itself
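The two heap properties can be checked directly from their definitions. The sketch below is illustrative only: it assumes a 0‐based Python list (so the parent of index i is (i − 1) // 2), and the helper names `is_max_heap` and `is_min_heap` are hypothetical.

```python
from typing import Sequence


def is_max_heap(a: Sequence[int]) -> bool:
    """True if every non-root element is no larger than its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


def is_min_heap(a: Sequence[int]) -> bool:
    """True if every non-root element is no smaller than its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # True
print(is_max_heap([16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))  # False: 4 < 14
```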
Max and min heaps
[Figure: three example max‐heaps, with roots 14, 9, and 30, and three example min‐heaps, with roots 2, 10, and 11]
The height of a heap
` The height of a node in a heap is the number of edges on the
longest simple downward path from the node to a leaf.
` The height of the heap is the height of its root, which is Θ(lg n)
for an n‐element heap.
` For example, in the 10‐node heap 16 14 10 8 7 9 3 2 4 1:
` the height of node 2 is 2
` the height of the heap is 3
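The heights in the example can be computed from the indices alone. This is a small sketch under the assumption of 1‐based node numbering; `heap_height` and `node_height` are hypothetical helper names. It relies on the fact that a heap is filled left to right, so the longest downward path from any node follows left children.

```python
def heap_height(n: int) -> int:
    """Height of an n-node heap, floor(lg n), for n >= 1."""
    return n.bit_length() - 1


def node_height(i: int, n: int) -> int:
    """Height of node i (1-based) in an n-node heap."""
    h = 0
    while 2 * i <= n:  # while node i still has a left child
        i *= 2
        h += 1
    return h


print(node_height(2, 10))  # 2
print(heap_height(10))     # 3
```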
The remainder of this chapter
` We present some basic procedures in the remainder
of this chapter.
` The MAX‐HEAPIFY procedure, which runs in O(lgn) time, is the
key to maintaining the max‐heap property.
` The BUILD‐MAX‐HEAP procedure, which runs in O(n) time,
produces a max‐heap from an unordered input array.
` The HEAPSORT procedure, which runs in O(nlgn) time, sorts an
array in place.
` The MAX‐HEAP‐INSERT, HEAP‐EXTRACT‐MAX, HEAP‐INCREASE‐KEY,
and HEAP‐MAXIMUM procedures, which run in O(lgn) time,
allow the heap data structure to be used as a priority queue.
The MAX‐HEAPIFY procedure (1/2)
` MAX‐HEAPIFY is an important subroutine for manipulating
max‐heaps.
` Input: an array A and an index i
` Output: the subtree rooted at index i becomes a max‐heap
` Assume: the binary trees rooted at LEFT(i) and RIGHT(i) are
max‐heaps, but A[i] may be smaller than its children
` Method: let the value at A[i] “float down” in the max‐heap
[Figure: MAX‐HEAPIFY(A, 2) applied to A = 16 4 10 14 7 9 3 2 8 1 yields A = 16 14 10 8 7 9 3 2 4 1]
The MAX‐HEAPIFY procedure (2/2)
MAX‐HEAPIFY(A, i)
1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ heap‐size[A] and A[l] > A[i]
4.    then largest ← l
5.    else largest ← i
6. if r ≤ heap‐size[A] and A[r] > A[largest]
7.    then largest ← r
8. if largest ≠ i
9.    then exchange A[i] ↔ A[largest]
10.        MAX‐HEAPIFY(A, largest)
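A runnable counterpart of MAX‐HEAPIFY might look like the sketch below. It is not a literal transcription: it assumes a 0‐based Python list, so the children of index i are 2i + 1 and 2i + 2, and it passes the heap size as an explicit `heap_size` argument instead of storing it as an attribute of the array.

```python
from typing import List


def max_heapify(a: List[int], i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at index i is a max-heap.

    Assumes the subtrees rooted at both children of i are already max-heaps.
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


heap = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(heap, 1, len(heap))  # index 1 is node 2 in 1-based numbering
print(heap)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```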
An example of MAX‐HEAPIFY procedure
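[Figure: MAX‐HEAPIFY(A, 2) on A = 16 4 10 14 7 9 3 2 8 1. With i = 2, A[2] = 4 is exchanged with A[4] = 14; with i = 4, A[4] = 4 is exchanged with A[9] = 8; with i = 9, node 9 is a leaf and the procedure stops, leaving A = 16 14 10 8 7 9 3 2 4 1.]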
The time complexity
` It takes Θ(1) time to fix up the relationships among the
elements A[i], A[LEFT(i)], and A[RIGHT(i)].
` In addition, MAX‐HEAPIFY may recurse on the subtree rooted at one
of the children of node i.
` Each child’s subtree has size at most 2n/3
` the worst case occurs when the last row of the tree is exactly half full
` The running time of MAX‐HEAPIFY therefore satisfies
T(n) ≤ T(2n/3) + Θ(1),
whose solution is T(n) = O(lg n)
` by case 2 of the master theorem
` Alternatively, we can characterize the running time of
MAX‐HEAPIFY on a node of height h as O(h).
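For readers who want the master‐theorem step spelled out, the following worked derivation may help. It uses the standard form T(n) = aT(n/b) + f(n); the symbols a, b, and f are the usual master‐theorem parameters and do not appear on the slide itself.

```latex
% Recurrence for MAX-HEAPIFY: T(n) <= T(2n/3) + Theta(1)
% Master theorem parameters: a = 1, b = 3/2, f(n) = Theta(1)
\begin{align*}
n^{\log_b a} &= n^{\log_{3/2} 1} = n^{0} = 1, \\
f(n) &= \Theta(1) = \Theta\!\left(n^{\log_b a}\right)
  \quad\text{(case 2 applies)}, \\
T(n) &= O\!\left(n^{\log_b a} \lg n\right) = O(\lg n).
\end{align*}
```

The result is stated with O rather than Θ because the recurrence is only an upper bound: MAX‐HEAPIFY may stop early when the node is already in place.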
Building a Heap
` We can use the MAX‐HEAPIFY procedure to convert an array
A[1..n] into a max‐heap in a bottom‐up manner.
` The elements in the subarray A[(⌊n/2⌋+1)…n ] are all leaves of
the tree, and so each is a 1‐element heap.
` The procedure BUILD‐MAX‐HEAP goes through the remaining
nodes of the tree and runs MAX‐HEAPIFY on each one.
BUILD‐MAX‐HEAP(A)
1. heap‐size[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1
3. do MAX‐HEAPIFY(A,i)
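The bottom‐up construction can be sketched in Python as below. This is an illustration rather than a literal transcription: it assumes a 0‐based list, in which the last internal node sits at index n // 2 − 1, and it passes the heap size to `max_heapify` as an explicit argument.

```python
from typing import List


def max_heapify(a: List[int], i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at index i is a max-heap."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: List[int]) -> None:
    """Turn an arbitrary list into a max-heap in place."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(data)
print(data)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```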
An example
[Figure: BUILD‐MAX‐HEAP on A[1..10] = 4 1 3 2 16 9 10 14 8 7. The loop starts at i = 5 and calls MAX‐HEAPIFY(A, i) for i = 5, 4, 3, 2, 1; after MAX‐HEAPIFY(A, 1) the array is a max‐heap.]
Correctness (1/2)
` To show why BUILD‐MAX‐HEAP works correctly, we use the
following loop invariant:
` At the start of each iteration of the for loop of lines 2‐3, each
node i+1, i+2, …, n is the root of a max‐heap.
BUILD‐MAX‐HEAP(A)
1. heap‐size[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1
3. do MAX‐HEAPIFY(A,i)
` We need to show that
` this invariant is true prior to the first loop iteration
` each iteration of the loop maintains the invariant
` the invariant provides a useful property to show correctness
when the loop terminates.
Correctness (2/2)
` Initialization: Prior to the first iteration of the loop, i = ⌊n/2⌋.
Each node ⌊n/2⌋+1, ⌊n/2⌋+2, …, n is a leaf and is thus
the root of a trivial max‐heap.
` Maintenance: The children of node i are numbered higher than i,
so by the loop invariant they are both roots of
max‐heaps. This is precisely the condition required
for the call MAX‐HEAPIFY(A, i) to make node i a
max‐heap root. Moreover, the MAX‐HEAPIFY call
preserves the property that nodes i + 1, i + 2, …, n
are all roots of max‐heaps.
` Termination: At termination, i = 0. By the loop invariant, each node
1, 2, …, n is the root of a max‐heap.
In particular, node 1 is.
Time complexity (1/2)
` Analysis 1:
` Each call to MAX‐HEAPIFY costs O(lg n), and there are O(n) such
calls.
` Thus, the running time is O(n lg n). This upper bound, though
correct, is not asymptotically tight.
` Analysis 2:
` An n‐element heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉
nodes of any height h.
` The time required by MAX‐HEAPIFY when called on a node of
height h is O(h).
` The total cost is
\[ \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right). \]
Time complexity (2/2)
` The last summation is bounded by the infinite series
\[ \sum_{h=0}^{\infty} \frac{h}{2^{h}} = \frac{1/2}{(1 - 1/2)^{2}} = 2. \]
` Thus, the running time of BUILD‐MAX‐HEAP can be bounded as
\[ \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right) = O(n). \]
` We can build a max‐heap from an unordered array in linear
time.
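The closed form for the series follows from differentiating the geometric series. The short derivation below fills in that step; the variable x is introduced here only for the derivation and is set to 1/2 at the end.

```latex
\begin{align*}
\sum_{h=0}^{\infty} x^{h} &= \frac{1}{1-x} && \text{for } |x| < 1, \\
\sum_{h=0}^{\infty} h\,x^{h-1} &= \frac{1}{(1-x)^{2}} && \text{(differentiate both sides)}, \\
\sum_{h=0}^{\infty} h\,x^{h} &= \frac{x}{(1-x)^{2}} && \text{(multiply by } x\text{)}, \\
\sum_{h=0}^{\infty} \frac{h}{2^{h}} &= \frac{1/2}{(1-1/2)^{2}} = 2 && \text{(set } x = 1/2\text{)}.
\end{align*}
```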
The heapsort algorithm
` Since the maximum element of the array is stored at the root
A[1], we can put it into its final position by exchanging it with A[n].
` If we now “discard” A[n] by decrementing heap‐size[A], we observe
that A[1..(n − 1)] can easily be made into a max‐heap.
` The children of the root remain max‐heaps, but the new
root element A[1] may violate the max‐heap property, so we
restore it by calling MAX‐HEAPIFY(A, 1).
HEAPSORT(A)
1. BUILD‐MAX‐HEAP(A)
2. for i ← length[A] downto 2
3.    do exchange A[1] ↔ A[i]
4.       heap‐size[A] ← heap‐size[A] − 1
5.       MAX‐HEAPIFY(A, 1)
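Putting the pieces together, a Python sketch of HEAPSORT could look like this. It assumes a 0‐based list and represents heap‐size by the loop variable `end`, which is a choice made for this sketch rather than something taken from the slides.

```python
from typing import List


def max_heapify(a: List[int], i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at index i is a max-heap."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: List[int]) -> None:
    """Turn an arbitrary list into a max-heap in place."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heapsort(a: List[int]) -> None:
    """Sort a in place in ascending order."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]  # move the current maximum to its final slot
        max_heapify(a, 0, end)       # the heap now occupies a[0:end]


data = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heapsort(data)
print(data)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Only a constant number of extra variables is used besides the input list, which matches the in‐place claim, although the recursive `max_heapify` does use O(lg n) call‐stack space.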
An example
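[Figure: HEAPSORT on the max‐heap A[1..10] = 16 14 10 8 7 9 3 2 4 1. Each iteration exchanges A[1] with A[i], shrinks the heap by one, and calls MAX‐HEAPIFY(A, 1); the final sorted array is A = 1 2 3 4 7 8 9 10 14 16.]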
Heap implementation of priority queues
` Heaps efficiently implement priority queues.
` There are two kinds of priority queues: max‐priority queues
and min‐priority queues.
` We will focus here on how to implement max‐priority queues,
which are in turn based on max‐heaps.
` A priority queue is a data structure for maintaining a set S of
elements, each with an associated value called a key.
Priority queues
` A max‐priority queue supports the following operations.
` INSERT(S, x): inserts the element x into the set S.
` MAXIMUM(S): returns the element of S with the largest key.
` EXTRACT‐MAX(S): removes and returns the element of S with
the largest key.
` INCREASE‐KEY(S, x, k): increases the value of element x’s key to the
new value k. Assume k ≥ x’s current key value.
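To see the interface in use before implementing it by hand, here is a small sketch built on Python's standard `heapq` module. Note that `heapq` provides a min‐heap, so this sketch stores negated keys to obtain max‐priority behaviour; that is a swapped‐in technique, not the max‐heap approach of these slides. The class name `MaxPriorityQueue` is hypothetical, keys are assumed to be numbers, and INCREASE‐KEY is omitted because `heapq` has no direct equivalent.

```python
import heapq
from typing import List


class MaxPriorityQueue:
    """Max-priority queue of numeric keys, backed by heapq with negated keys."""

    def __init__(self) -> None:
        self._heap: List[float] = []

    def insert(self, key: float) -> None:
        heapq.heappush(self._heap, -key)

    def maximum(self) -> float:
        return -self._heap[0]

    def extract_max(self) -> float:
        return -heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


q = MaxPriorityQueue()
for key in (3, 16, 7):
    q.insert(key)
print(q.maximum())      # 16
print(q.extract_max())  # 16
print(q.maximum())      # 7
```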
Finding the maximum element
` MAXIMUM(S): returns the element of S with the largest key.
` Getting the maximum element is easy: it’s the root.
HEAP‐MAXIMUM(A)
1. return A[1]
` The running time of HEAP‐MAXIMUM is Θ(1).
Extracting max element
` EXTRACT‐MAX(S): removes and returns the element of S with
the largest key.
HEAP‐EXTRACT‐MAX(A)
1. if heap‐size[A] < 1
2. then error “heap underflow”
3. max ← A[1]
4. A[1] ← A[heap‐size[A]]
5. heap‐size[A] ← heap‐size[A] − 1
6. MAX‐HEAPIFY(A, 1)
7. return max
` Analysis: constant time assignments + time for MAX‐HEAPIFY.
` The running time of HEAP‐EXTRACT‐MAX is O(lgn).
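A Python sketch of HEAP‐EXTRACT‐MAX follows. It assumes a 0‐based list whose length plays the role of heap‐size, so shrinking the heap is done with `list.pop()`; raising `IndexError` for the underflow case is a choice made for this sketch.

```python
from typing import List


def max_heapify(a: List[int], i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at index i is a max-heap."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def heap_extract_max(a: List[int]) -> int:
    """Remove and return the largest element of the max-heap a."""
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]
    last = a.pop()  # shrink the heap by one element
    if a:
        a[0] = last  # move the last leaf to the root, then restore the heap
        max_heapify(a, 0, len(a))
    return maximum


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(heap_extract_max(heap))  # 16
print(heap)                    # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```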
Increasing key value
` INCREASE‐KEY(S, x, k): increases the value of element x’s key to k.
Assume k ≥ x’s current key value.
HEAP‐INCREASE‐KEY(A, i, key)
1. if key < A[i]
2.    then error “new key is smaller than current key”
3. A[i] ← key
4. while i > 1 and A[PARENT(i)] < A[i]
5.    do exchange A[i] ↔ A[PARENT(i)]
6.       i ← PARENT(i)
` Analysis: the path traced from the updated node to the root
has length O(lg n).
` The running time is O(lg n).
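HEAP‐INCREASE‐KEY can be sketched in Python as below. The sketch assumes a 0‐based list, so the root is index 0 and the parent of index i is (i − 1) // 2; raising `ValueError` for a smaller key is a choice made here.

```python
from typing import List


def heap_increase_key(a: List[int], i: int, key: int) -> None:
    """Raise a[i] to key and float it up until the max-heap property holds."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        parent = (i - 1) // 2
        a[i], a[parent] = a[parent], a[i]
        i = parent


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_increase_key(heap, 8, 15)  # index 8 is node 9 in 1-based numbering
print(heap)  # [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
```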
An example of increasing key value
[Figure: HEAP‐INCREASE‐KEY on the max‐heap A[1..10] = 16 14 10 8 7 9 3 2 4 1. The key of node i = 9 is increased from 4 to 15, then exchanged with its parent 8 at node 4 and with 14 at node 2, giving A = 16 15 10 14 7 9 3 2 8 1.]
Inserting into the heap
` INSERT(S, x): inserts the element x into the set S.
MAX‐HEAP‐INSERT(A, key)
1. heap‐size[A] ← heap‐size[A] + 1
2. A[heap‐size[A]] ← −∞
3. HEAP‐INCREASE‐KEY(A, heap‐size[A], key)
` Analysis: constant time assignments + time for HEAP‐INCREASE‐KEY.
` The running time is O(lg n).
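A Python sketch of MAX‐HEAP‐INSERT follows. It assumes a 0‐based list whose length serves as heap‐size, uses `-math.inf` as the −∞ sentinel, and includes its own copy of `heap_increase_key` so that the block runs on its own.

```python
import math
from typing import List


def heap_increase_key(a: List[float], i: int, key: float) -> None:
    """Raise a[i] to key and float it up until the max-heap property holds."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        parent = (i - 1) // 2
        a[i], a[parent] = a[parent], a[i]
        i = parent


def max_heap_insert(a: List[float], key: float) -> None:
    """Insert key into the max-heap a."""
    a.append(-math.inf)  # grow the heap with a sentinel leaf
    heap_increase_key(a, len(a) - 1, key)


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
max_heap_insert(heap, 15)
print(heap)  # [16, 15, 10, 8, 14, 9, 3, 2, 4, 1, 7]
```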