
Bda Experiment 4: Roll No. A-52 Name: Janmejay Patil Class: BE-A Batch: A3 Date of Experiment: Date of Submission Grade


BDA EXPERIMENT 4

Roll No. A-52 Name: JANMEJAY PATIL


Class: BE-A Batch: A3
Date of Experiment: Date of Submission
Grade :

B.1. DGIM algorithm:

Write a program by considering any stream to implement the DGIM algorithm.

DGIM algorithm (Datar-Gionis-Indyk-Motwani Algorithm)

Designed to estimate the number of 1’s in the most recent N bits of a binary stream. The
algorithm uses O(log²N) bits to represent a window of N bits and estimates the number of
1’s in the window with an error of no more than 50%.

In other words, the answer is approximate, but it is never off by more than half of the
true count.

In the DGIM algorithm, each arriving bit has a timestamp that records the position at
which it arrives: the first bit has timestamp 1, the second bit has timestamp 2, and so on.
Timestamps are stored modulo the window size N (N is usually taken to be a power of 2).
The window is divided into buckets consisting of 1’s and 0’s.
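
As a small illustration of the timestamping described above, the sketch below numbers each arriving bit from 1 and reduces the position modulo N, so that a timestamp fits in about log₂N bits. The names `timestamps_mod_n`, `N` and `stream`, and the sample stream itself, are illustrative assumptions and not part of the experiment's program.

```python
import math


def timestamps_mod_n(stream, n):
    """Pair each bit with its 1-based arrival position reduced modulo n."""
    return [(pos % n, bit) for pos, bit in enumerate(stream, start=1)]


N = 8  # window size, assumed here to be a power of 2
stream = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]  # hypothetical input stream

stamped = timestamps_mod_n(stream, N)
bits_per_timestamp = math.ceil(math.log2(N))  # enough bits for values 0..N-1

print(stamped)
print("bits per timestamp:", bits_per_timestamp)
```

Because timestamps wrap around after N arrivals, two bits N positions apart share a timestamp; this is harmless as long as buckets older than the window are discarded.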

RULES FOR FORMING THE BUCKETS:

➢ The right end of a bucket must always be a 1 (a 0 at the right end is ignored).
E.g. 1001011 → a bucket of size 4, having four 1’s and ending with a 1 on its
right.
➢ Every bucket must contain at least one 1; otherwise no bucket can be formed.
➢ The size of every bucket (its number of 1’s) must be a power of 2.
➢ Bucket sizes cannot decrease as we move to the left (sizes are in increasing
order towards the left).
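
The rules above can be sketched as a small update routine. This is the textbook form of DGIM, in which every arriving 1 starts a bucket of size 1 and at most two buckets of any one size are kept; that limit comes from the standard description of the algorithm rather than from the list above. The names `add_bit` and `buckets`, and the sample stream, are hypothetical.

```python
def add_bit(buckets, timestamp, bit, window):
    """Update a DGIM bucket list (newest first) with one arriving bit.

    Each bucket is (end_timestamp, size), where size is a power of 2 and
    end_timestamp is the position of the bucket's most recent 1.
    """
    # Drop the oldest bucket once its most recent 1 has left the window.
    if buckets and buckets[-1][0] <= timestamp - window:
        buckets.pop()
    if bit != 1:
        return buckets  # a 0 never starts a bucket
    buckets.insert(0, (timestamp, 1))
    # Whenever three buckets share a size, merge the two oldest of them.
    # The merged bucket may create a third bucket of the next size up.
    i = 0
    while i + 2 < len(buckets) and buckets[i][1] == buckets[i + 2][1]:
        newer, older = buckets[i + 1], buckets[i + 2]
        buckets[i + 1:i + 3] = [(newer[0], newer[1] + older[1])]
        i += 1
    return buckets


buckets = []
for t, b in enumerate([1, 0, 1, 1, 0, 1, 1, 0, 1, 1], start=1):
    add_bit(buckets, t, b, window=10)
print(buckets)
```

Keeping the list newest first means bucket sizes never decrease from left to right in the list, which mirrors the rule that sizes grow towards the older end of the window.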
B.2. Input and Output:

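inp = list(map(int, input("Enter Elements : ").split()))

print("Length of Input: ", len(inp))

# Each bucket is (start_index, end_index, size), ordered oldest to newest.
bucket_list = []

# Number of buckets currently held for each size.
bucket_size_count = {}

def checker():
    # When three buckets share a size, merge the two oldest into one bucket
    # of double the size. A merge can create a third bucket of the next
    # size, so keep checking upwards.
    size = 2
    while bucket_size_count.get(size, 0) > 2:
        # Same-size buckets are adjacent; find the oldest one.
        idx = next(j for j, b in enumerate(bucket_list) if b[2] == size)
        s1, e1, size1 = bucket_list[idx]
        s2, e2, size2 = bucket_list[idx + 1]
        bucket_list[idx:idx + 2] = [(s1, e2, size1 * 2)]
        bucket_size_count[size] -= 2
        bucket_size_count[size * 2] = bucket_size_count.get(size * 2, 0) + 1
        size *= 2

start_index = 0
end_index = 0
pair = 0  # 1 while a first 1 is waiting for its partner

# Pair up consecutive 1's into base buckets of size 2.
# A trailing unpaired 1 is not placed in any bucket.
for i in range(len(inp)):
    bit = inp[i]
    if bit == 1:
        if pair == 1:
            end_index = i
            pair = 0
            bucket_list.append((start_index, end_index, 2))
            if 2 in bucket_size_count:
                bucket_size_count[2] += 1
            else:
                bucket_size_count[2] = 1
            checker()
        else:
            start_index = i
            pair = 1

print(bucket_list)

starts = []
ends = []
for s, e, size in bucket_list:
    starts.append(s)
    ends.append(e)

print("Buckets are: ", end="")
for i in range(len(inp)):
    bit = inp[i]
    if i in starts:
        print(" ", bit, end="")
    elif i in ends:
        print(bit, end=" ")
    else:
        print(bit, end=" ")

print("\nNo. of buckets: ", len(bucket_list))

k = int(input("\nEnter k : "))
length = len(inp)
bound1 = length - k  # index of the oldest bit inside the last k bits

ones_count = 0
for s, e, size in bucket_list[::-1]:
    if e < bound1:
        # Bucket lies entirely before the window; so do all older ones.
        break
    elif s < bound1 <= e:
        # Oldest bucket only partly inside the window: count half of it.
        ones_count += size // 2
    else:
        ones_count += size

print("Number of 1's in Last", k, "bits are ", ones_count)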

OUTPUT
B.3. Observations and learning:

Advantages

➢ Stores only O(log²N) bits
➢ O(log N) counts of log₂N bits each
➢ Easy update as more bits enter
➢ Error in count no greater than the number of 1’s in the unknown area.
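
To put rough numbers on the storage claim, the sketch below assumes a simple accounting: bucket sizes 1, 2, 4, ... up to N, at most two buckets per size, and one timestamp of about log₂N bits per bucket. The function name `dgim_bits` and this exact accounting are illustrative assumptions, intended only to show the order of magnitude.

```python
import math


def dgim_bits(n):
    """Rough upper bound on the bits stored for a window of n bits."""
    sizes = math.floor(math.log2(n)) + 1       # bucket sizes 1, 2, 4, ..., <= n
    max_buckets = 2 * sizes                    # assumed: at most two per size
    bits_per_bucket = math.ceil(math.log2(n))  # one timestamp modulo n
    return max_buckets * bits_per_bucket


for n in (1024, 2**20):
    print(n, "window bits ->", dgim_bits(n), "bits of buckets")
```

The count grows with the square of log₂N, so even a window of about a million bits needs well under a thousand bits of bucket data in this accounting.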

Drawbacks

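➢ The answer is only an estimate. The uncertainty comes from the oldest bucket that
partly overlaps the window (the unknown region), and the error is no more than 50%.
➢ If buckets summarized fixed-length blocks of the stream instead, all the 1s could lie
in the unknown area at the end, and the error would be unbounded. DGIM avoids this
by building each bucket from a fixed number of 1s.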
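
The error caused by the unknown region can be seen on a small case. The sketch below estimates the number of 1s in the last k bits by counting every bucket whose most recent 1 is inside the window in full, except the oldest such bucket, which counts as half. The stream, the bucket list (worked out by hand from the textbook DGIM rules with buckets of size 1 upwards) and the name `estimate_ones` are assumptions made for illustration.

```python
def estimate_ones(buckets, now, k):
    """Estimate the 1s among the last k bits from (end_timestamp, size) buckets.

    Buckets are listed newest first. Every bucket whose newest 1 lies in the
    window counts in full, except the oldest such bucket, which counts half.
    """
    in_window = [size for end, size in buckets if end > now - k]
    if not in_window:
        return 0
    return sum(in_window[:-1]) + in_window[-1] / 2


# Hypothetical stream with timestamps 1..10 and a bucket list consistent with it.
stream = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
buckets = [(10, 1), (9, 2), (6, 4)]
now = len(stream)

for k in (3, 5, 8):
    est = estimate_ones(buckets, now, k)
    true = sum(stream[-k:])
    print(f"k={k}: estimate={est}, true={true}, error={abs(est - true) / true:.0%}")
```

In each of these queries the gap between the estimate and the true count comes entirely from the oldest bucket that overlaps the window.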

B.4. Conclusion:

Hence, we have successfully written a program that implements the DGIM algorithm.
