Questions 11
At the start, the array entries may contain garbage, and initializing the entire array is impractical because of its size. Describe a scheme for implementing a direct-address dictionary on a huge array. Each stored object should use O(1) space; the operations SEARCH, INSERT, and DELETE should take O(1) time each; and the initialization of the data structure should take O(1) time. (Hint: Use an additional stack, whose size is the number of keys actually stored in the dictionary, to help determine whether a given entry in the huge array is valid or not.)

11.2-1 Suppose we use a hash function h to hash n distinct keys into an array T of length m. Assuming simple uniform hashing, what is the expected number of collisions? More precisely, what is the expected cardinality of $\{\{k, l\} : k \ne l \text{ and } h(k) = h(l)\}$?

11.2-4 Suggest how storage for elements can be allocated and deallocated within the hash table itself by linking all unused slots into a free list. Assume that one slot can store a flag and either one element plus a pointer or two pointers. All dictionary and free-list operations should run in O(1) expected time. Does the free list need to be doubly linked, or does a singly linked free list suffice?

11.3-3 Consider a version of the division method in which h(k) = k mod m, where $m = 2^p - 1$ and k is a character string interpreted in radix $2^p$. Show that if string x can be derived from string y by permuting its characters, then x and y hash to the same value. Give an example of an application in which this property would be undesirable in a hash function.

11.3-5 $\star$ Define a family H of hash functions from a finite set U to a finite set B to be $\epsilon$-universal if for all pairs of distinct elements k and l in U, $\Pr\{h(k) = h(l)\} \le \epsilon$, where the probability is taken over the drawing of hash function h at random from the family H. Show that an $\epsilon$-universal family of hash functions must have $\epsilon \ge \frac{1}{|B|} - \frac{1}{|U|}$.
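The property claimed in exercise 11.3-3 can be checked numerically before attempting the proof. The sketch below is only an illustration under stated assumptions: it fixes p = 8 (so the radix is 256 and m = 255) and treats each ASCII character as one radix-256 digit. The names `string_hash` and `permutations_collide` are hypothetical helpers, not part of the exercise.

```python
from itertools import permutations

P = 8            # assumption: 8-bit characters, so the radix is 2**P = 256
M = 2**P - 1     # modulus m = 2^p - 1 = 255

def string_hash(s):
    """Interpret s as a radix-2^P integer and reduce it mod M (division method)."""
    k = 0
    for byte in s.encode("ascii"):
        k = k * 2**P + byte   # append the next radix-2^P digit
    return k % M

def permutations_collide(s):
    """Return True if every permutation of s hashes to the same value."""
    return len({string_hash("".join(p)) for p in permutations(s)}) == 1
```

For example, `string_hash("stop") == string_hash("pots")` holds, and `permutations_collide("hash")` returns True. A numerical check of this kind is evidence, not a proof; the exercise still asks for the argument that works for every p.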
11-1 Longest-probe bound for hashing
A hash table of size m is used to store n items, with $n \le m/2$. Open addressing is used for collision resolution.

a. Assuming uniform hashing, show that for i = 1, 2, ..., n, the probability that the ith insertion requires strictly more than k probes is at most $2^{-k}$.

b. Show that for i = 1, 2, ..., n, the probability that the ith insertion requires more than 2 lg n probes is at most $1/n^2$.

Let the random variable $X_i$ denote the number of probes required by the ith insertion. You have shown in part (b) that $\Pr\{X_i > 2 \lg n\} \le 1/n^2$. Let the random variable $X = \max_{1 \le i \le n} X_i$ denote the maximum number of probes required by any of the n insertions.
d. Show that the expected length E[X] of the longest probe sequence is O(lg n).

11-2 Slot-size bound for chaining
Suppose that we have a hash table with n slots, with collisions resolved by chaining, and suppose that n keys are inserted into the table. Each key is equally likely to be hashed to each slot. Let M be the maximum number of keys in any slot after all the keys have been inserted. Your mission is to prove an O(lg n / lg lg n) upper bound on E[M], the expected value of M.

a. Argue that the probability $Q_k$ that exactly k keys hash to a particular slot is given by

$$Q_k = \left(\frac{1}{n}\right)^k \left(1 - \frac{1}{n}\right)^{n-k} \binom{n}{k}.$$

b. Let $P_k$ be the probability that M = k, that is, the probability that the slot containing the most keys contains k keys. Show that $P_k \le n Q_k$.
11-3 Quadratic probing
Suppose that we are given a key k to search for in a hash table with positions 0, 1, ..., m − 1, and suppose that we have a hash function h mapping the key space into the set {0, 1, ..., m − 1}. The search scheme is as follows.

1. Compute the value $i \leftarrow h(k)$, and set $j \leftarrow 0$.
2. Probe in position i for the desired key k. If you find it, or if this position is empty, terminate the search.
3. Set $j \leftarrow (j + 1) \bmod m$ and $i \leftarrow (i + j) \bmod m$, and return to step 2.

Assume that m is a power of 2.

a. Show that this scheme is an instance of the general quadratic probing scheme by exhibiting the appropriate constants $c_1$ and $c_2$ for equation (11.5).

b. Prove that this algorithm examines every table position in the worst case.
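The three-step search scheme of problem 11-3 can be traced with a short script, which may help in forming a conjecture for part (b). This is a minimal sketch, not a proof: it lists the positions probed in the worst case (the key is absent and no empty slot is met), with `start` standing in for h(k). The function names `probe_sequence` and `visits_every_slot` are assumptions introduced here for illustration.

```python
def probe_sequence(start, m):
    """Return the first m positions examined by the scheme for a table of size m."""
    i, j = start % m, 0          # step 1: i <- h(k), j <- 0
    positions = []
    for _ in range(m):
        positions.append(i)      # step 2: probe position i
        j = (j + 1) % m          # step 3: j <- (j + 1) mod m
        i = (i + j) % m          #         i <- (i + j) mod m
    return positions

def visits_every_slot(m):
    """Check that, from every starting slot, the first m probes cover all m positions."""
    return all(
        sorted(probe_sequence(start, m)) == list(range(m))
        for start in range(m)
    )
```

With m = 8 and a starting slot of 0, `probe_sequence(0, 8)` yields 0, 1, 3, 6, 2, 7, 5, 4, which covers every position. By contrast, `visits_every_slot(6)` is False, which suggests why the problem assumes that m is a power of 2.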