The document discusses various data structures for representing sets and algorithms for performing set operations on those data structures. It describes representing sets as linked lists, trees, hash tables, and bit vectors. For linked lists, it provides algorithms for union, intersection, difference, equality testing, and other set operations. It also discusses how bit vectors can be used to efficiently represent the presence or absence of elements in a set and perform operations using bitwise logic.
4. Simplest and most straightforward representation.
Best suited to dynamic storage allocation.
It allows multiplicity of elements, i.e., a bag structure.
All operations can be implemented easily, and their
performance is comparable to that of the other
representations.
Ex: the set S = {5, 6, 9, 3, 2, 7, 1} stored as a linked list is
S -> 5 -> 6 -> 9 -> 3 -> 2 -> 7 -> 1 -> NULL
6. ALGORITHM: UNION_LIST_SETS(Si, Sj; S)
Input: Si and Sj are the headers of two singly linked lists
representing two distinct sets.
Output: S is the union of Si and Sj.
Data structure: Linked list representation of a set.
7. STEPS
/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL
/* Copy the entire list Si into S */
3. ptri = Si.LINK
4. While (ptri != NULL) do
   1. DATA = ptri.DATA
   2. INSERT_SL_FRONT(S, DATA)
   3. ptri = ptri.LINK
5. EndWhile
/* Add each element of Sj to S if it is not in Si */
6. ptrj = Sj.LINK
7. While (ptrj != NULL) do
   1. ptri = Si.LINK
   2. While (ptri != NULL) and (ptri.DATA != ptrj.DATA) do
      1. ptri = ptri.LINK
   3. EndWhile
   4. If (ptri = NULL) then   // the element was not found in Si
      1. INSERT_SL_FRONT(S, ptrj.DATA)
   5. EndIf
   6. ptrj = ptrj.LINK
8. EndWhile
9. Return(S)
10. Stop
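The union steps can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the `Node` class and the helpers `make_set`, `to_list` and `contains` are hypothetical names introduced here, and `Node(data, link)` stands in for GETNODE and INSERT_SL_FRONT.

```python
class Node:
    """One node of a singly linked list; the header node carries no data."""
    def __init__(self, data=None, link=None):
        self.data = data
        self.link = link

def make_set(values):
    """Build a header-node list holding `values` in the given order."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)  # insert at the front
    return header

def to_list(header):
    """Collect the stored elements into a Python list."""
    out, p = [], header.link
    while p is not None:
        out.append(p.data)
        p = p.link
    return out

def contains(header, value):
    """Linear search; the None test comes first so p.data is always valid."""
    p = header.link
    while p is not None and p.data != value:
        p = p.link
    return p is not None

def union_list_sets(si, sj):
    s = Node()
    p = si.link
    while p is not None:              # copy the whole of Si into S
        s.link = Node(p.data, s.link)
        p = p.link
    p = sj.link
    while p is not None:              # add the elements of Sj missing from Si
        if not contains(si, p.data):
            s.link = Node(p.data, s.link)
        p = p.link
    return s

s = union_list_sets(make_set([5, 6, 9]), make_set([9, 3, 2]))
print(sorted(to_list(s)))  # [2, 3, 5, 6, 9]
```

Because every element of Sj is searched for in Si, the running time is proportional to the product of the two list lengths.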
10. ALGORITHM: INTERSECTION_LIST_SETS(Si, Sj; S)
Input: Si and Sj are the headers of two singly linked lists
representing two distinct sets.
Output: S is the intersection of Si and Sj.
Data structure: Linked list representation of a set.
11. STEPS:
/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL
/* Search the list Sj for each element in Si */
3. ptri = Si.LINK
4. While (ptri != NULL) do
   1. ptrj = Sj.LINK
   2. While (ptrj != NULL) and (ptrj.DATA != ptri.DATA) do
      1. ptrj = ptrj.LINK
   3. EndWhile
   4. If (ptrj != NULL) then   // the element was found in Sj
      1. INSERT_SL_FRONT(S, ptrj.DATA)
   5. EndIf
   6. ptri = ptri.LINK   // move to the next element of Si
5. EndWhile
6. Return(S)
7. Stop
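A possible Python rendering of the intersection steps is sketched below. The names `Node`, `from_values` and `values_of` are assumptions made for this illustration. The sketch tests for the end of Sj before reading its data field and advances the Si pointer to the next node at the end of each pass.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link

def from_values(values):
    """Build a header-node list from a Python sequence."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header

def values_of(header):
    """Yield the stored elements from first to last."""
    p = header.link
    while p is not None:
        yield p.data
        p = p.link

def intersection_list_sets(si, sj):
    s = Node()
    pi = si.link
    while pi is not None:
        pj = sj.link
        # Test for None first so pj.data is never read past the end of Sj.
        while pj is not None and pj.data != pi.data:
            pj = pj.link
        if pj is not None:            # the element of Si was found in Sj
            s.link = Node(pj.data, s.link)
        pi = pi.link                  # advance within Si
    return s

common = intersection_list_sets(from_values([5, 6, 9, 3]), from_values([3, 7, 5]))
print(sorted(values_of(common)))  # [3, 5]
```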
14. ALGORITHM: DIFFERENCE_LIST_SETS(Si, Sj; S)
Input: Si and Sj are the headers of two singly linked lists
representing two distinct sets.
Output: S is the difference of Si and Sj.
Data structure: Linked list representation of a set.
15. STEPS:
/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL
/* Get S', the intersection of Si and Sj */
3. S' = INTERSECTION_LIST_SETS(Si, Sj)
/* Copy the entire list Si into S */
4. ptri = Si.LINK
5. While (ptri != NULL) do
   1. INSERT_SL_FRONT(S, ptri.DATA)
   2. ptri = ptri.LINK
6. EndWhile
/* For each element in S', delete it from S if it is there */
7. ptr = S'.LINK
8. While (ptr != NULL) do
   1. DELETE_SL_ANY(S, ptr.DATA)
   2. ptr = ptr.LINK
9. EndWhile
10. Return(S)
11. Stop
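The sketch below computes the same difference in Python, but as a single-pass variant rather than the copy-then-delete steps listed above: an element of Si is inserted into S only when it is not found in Sj, so no intermediate intersection list is built. `Node`, `build` and `items` are hypothetical helper names.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link

def build(values):
    """Build a header-node list from a Python sequence."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header

def items(header):
    """Collect the stored elements into a Python list."""
    out, p = [], header.link
    while p is not None:
        out.append(p.data)
        p = p.link
    return out

def difference_list_sets(si, sj):
    s = Node()
    pi = si.link
    while pi is not None:
        pj = sj.link
        while pj is not None and pj.data != pi.data:
            pj = pj.link
        if pj is None:                # not in Sj, so it belongs to Si - Sj
            s.link = Node(pi.data, s.link)
        pi = pi.link
    return s

si = build([5, 6, 9, 3])
sj = build([3, 7, 5])
print(sorted(items(difference_list_sets(si, sj))))  # [6, 9]
```

Neither input list is modified, so Si and Sj can be reused by later operations.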
17. ALGORITHM: EQUALITY_LIST_SETS(Si, Sj)
Input: Si and Sj are the headers of two singly linked lists
representing two distinct sets.
Output: Return TRUE if the two sets Si and Sj are equal, else
FALSE.
Data structure: Linked list representation of a set.
18. STEPS
1. li = 0, lj = 0
2. ptr = Si.LINK   // to count the elements of Si
3. While (ptr != NULL) do
   1. li = li + 1
   2. ptr = ptr.LINK
4. EndWhile
5. ptr = Sj.LINK   // to count the elements of Sj
6. While (ptr != NULL) do
   1. lj = lj + 1
   2. ptr = ptr.LINK
7. EndWhile
8. If (li != lj) then   // sets of different sizes cannot be equal
   1. Return(FALSE)
9. EndIf
/* Compare the elements in Si and Sj */
10. ptri = Si.LINK, flag = TRUE
11. While (ptri != NULL) and (flag = TRUE) do
   1. ptrj = Sj.LINK
   2. While (ptrj != NULL) and (ptrj.DATA != ptri.DATA) do
      1. ptrj = ptrj.LINK
   3. EndWhile
   4. ptri = ptri.LINK
   5. If (ptrj = NULL) then   // an element of Si is missing from Sj
      1. flag = FALSE
   6. EndIf
12. EndWhile
13. Return(flag)
14. Stop
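The equality test can be sketched in Python as below, assuming (as the algorithm does) that neither list contains duplicate elements. The names `Node`, `build` and `length` are hypothetical.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link

def build(values):
    """Build a header-node list from a Python sequence."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header

def length(header):
    """Count the nodes that follow the header."""
    n, p = 0, header.link
    while p is not None:
        n += 1
        p = p.link
    return n

def equality_list_sets(si, sj):
    # Sets of different sizes cannot be equal.
    if length(si) != length(sj):
        return False
    pi = si.link
    while pi is not None:
        pj = sj.link
        while pj is not None and pj.data != pi.data:
            pj = pj.link
        if pj is None:                # an element of Si is missing from Sj
            return False
        pi = pi.link
    return True

print(equality_list_sets(build([1, 2, 3]), build([3, 1, 2])))  # True
print(equality_list_sets(build([1, 2, 4]), build([3, 1, 2])))  # False
```

With equal sizes and no duplicates, checking that every element of Si occurs in Sj is enough; the reverse check is not needed.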
21. ► Here a tree is used to represent one set, so all the
elements of a set share the same root.
► Each element in a set has a pointer to its parent.
► Let us consider the sets S1 = {1, 3, 5, 7, 9, 11, 13},
S2 = {2, 4, 8} and S3 = {6}.
[Figure: three trees, one per set. S1 has root 1, with 3, 5, 7, 9 on the second level and 11, 13 on the third level. S2 has root 2 with children 4 and 8. S3 is the single node 6.]
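A minimal Python sketch of the parent-pointer idea follows. The dictionary `parent` and the functions `find_root` and `same_set` are names introduced here for illustration. The parents of 11 and 13 are an assumption, since the figure does not show which second-level node they hang from.

```python
# Each element maps to its parent; a root maps to itself.
# Assumption: 11 and 13 are children of 3 (not recoverable from the figure).
parent = {
    1: 1, 3: 1, 5: 1, 7: 1, 9: 1, 11: 3, 13: 3,   # S1, rooted at 1
    2: 2, 4: 2, 8: 2,                             # S2, rooted at 2
    6: 6,                                         # S3, rooted at 6
}

def find_root(x):
    """Follow parent pointers until the root of x's tree is reached."""
    while parent[x] != x:
        x = parent[x]
    return x

def same_set(a, b):
    """Two elements belong to the same set exactly when they share a root."""
    return find_root(a) == find_root(b)

print(find_root(13))    # 1
print(same_set(4, 8))   # True
print(same_set(6, 2))   # False
```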
25. Here the elements of the collection are separated into a
number of buckets.
Each bucket can hold an arbitrary number of elements.
Consider the set S = {2, 5, 7, 16, 17, 23, 34, 42}.
A hash table with 4 buckets and a hash function H(x)
can place each element of S into one of the
four buckets:
Bucket 1: 16
Bucket 2: 5, 17
Bucket 3: 2, 34, 42
Bucket 4: 7, 23
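The bucket layout can be reproduced with a short Python sketch. The slide does not state H(x); the function `h` below, x mod 4 shifted to bucket numbers 1 to 4, is an assumption chosen because it yields exactly the bucket contents listed. `build_table` and `member` are hypothetical helper names.

```python
NUM_BUCKETS = 4

def h(x):
    # Assumed hash function: x mod 4, shifted to bucket numbers 1..4.
    return x % NUM_BUCKETS + 1

def build_table(elements):
    """Distribute the elements over the buckets, skipping duplicates."""
    table = {b: [] for b in range(1, NUM_BUCKETS + 1)}
    for x in elements:
        if x not in table[h(x)]:
            table[h(x)].append(x)
    return table

def member(table, x):
    """Only the one bucket selected by h(x) has to be searched."""
    return x in table[h(x)]

table = build_table([2, 5, 7, 16, 17, 23, 34, 42])
for bucket, contents in table.items():
    print("Bucket", bucket, contents)
# Bucket 1 [16]
# Bucket 2 [5, 17]
# Bucket 3 [2, 34, 42]
# Bucket 4 [7, 23]
```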
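26. Two sets Si and Sj stored in hash tables with the same buckets (-- marks an empty bucket):
Bucket   Set Si    Set Sj
1        A B C     B N
2        --        Q
3        D E       E S
4        F         --
5        G         V U T
6        --        --
7        H I J     I
8        K L       X K Z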
27. UNION: S = Si ∪ Sj
Bucket   S
1        A B C N
2        Q
3        D E S
4        F
5        G V U T
6        --
7        H I J
8        K L X Z
28. INTERSECTION AND DIFFERENCE
Bucket   Si ∩ Sj   Si - Sj
1        B         A C
2        --        --
3        E         D
4        --        F
5        --        G
6        --        --
7        I         H J
8        K         L
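Because both tables use the same hash function, an element can only ever meet its counterpart in the bucket with the same index, so each operation can be carried out bucket by bucket. The sketch below illustrates this in Python under two assumptions: each bucket is modelled with Python's built-in `set` for brevity, and bucket 5 of Sj is read as V, U, T. The function name `bucketwise` is hypothetical.

```python
def bucketwise(op, ti, tj):
    """Apply a set operation to each pair of same-numbered buckets."""
    return [op(bi, bj) for bi, bj in zip(ti, tj)]

# Buckets 1..8 of Si and Sj, read from the figure (an empty set is "--").
si = [set("ABC"), set(), set("DE"), set("F"),
      set("G"), set(), set("HIJ"), set("KL")]
sj = [set("BN"), set("Q"), set("ES"), set(),
      set("VUT"), set(), set("I"), set("XKZ")]

union = bucketwise(set.union, si, sj)
intersection = bucketwise(set.intersection, si, sj)
difference = bucketwise(set.difference, si, sj)

print(sorted(union[0]))         # ['A', 'B', 'C', 'N']
print(sorted(intersection[7]))  # ['K']
print(sorted(difference[6]))    # ['H', 'J']
```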
30. VARIATION OF SETS
• Maintaining the actual data values
• Maintaining an indication of the presence or absence of data
31. A set recording which cricketers are aged 35 or
less is given below:
{0,0,0,0,1,1,1,1,0,1,1}
Here 1 indicates the presence of a record with an
age less than or equal to 35, and
0 indicates the absence of such a record.
Since only the presence or absence of an element has to be
indicated, a 0 or 1 per element is enough, which
saves storage space.
The data structure used for this purpose is known as a bit array.
A bit array is simply an array containing the values 0 or
1 (binary).
32. • It is very easy to implement set operations on the bit
array data structure.
• The operations are well defined only if the bit arrays
representing the two sets under operation are of the
same size.
33. To obtain the union of sets Si and Sj, the bit-wise
OR operation can be used.
Si and Sj are given below:
Si = 1001011001
Sj = 0011100100
Si ∪ Sj = 1011111101
34. ALGORITHM: UNION_BIT_SETS(Si, Sj; S)
Input: Si and Sj are two bit arrays corresponding to two
sets.
Output: A bit array S that is the result of Si ∪ Sj.
Data structure: Bit vector representation of a set.
35. STEPS:
1. li = LENGTH(Si)   // Size of Si
2. lj = LENGTH(Sj)   // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for union"
   2. Exit
4. EndIf
/* Loop over the underlying bit arrays and apply bit-wise OR to their constituent data */
5. For i = 1 to li do
   1. S[i] = Si[i] OR Sj[i]
6. EndFor
7. Return(S)
8. Stop
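A minimal Python sketch of the union algorithm, with each bit array held as a list of 0/1 integers. The helper `bits`, which converts a string of digits to such a list, is a hypothetical convenience, and the incompatibility message is raised as a `ValueError` instead of being printed.

```python
def bits(text):
    """Convert a string such as "1001" to the list [1, 0, 0, 1]."""
    return [int(ch) for ch in text]

def union_bit_sets(si, sj):
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for union")
    # Bit-wise OR of the corresponding positions.
    return [a | b for a, b in zip(si, sj)]

s = union_bit_sets(bits("1001011001"), bits("0011100100"))
print("".join(map(str, s)))  # 1011111101
```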
36. To obtain the intersection of sets Si and Sj, the bit-wise
AND operation can be used.
Si and Sj are given below:
Si = 1001011001
Sj = 0011100100
Si ∩ Sj = 0001000000
37. ALGORITHM: INTERSECTION_BIT_SETS(Si, Sj; S)
Input: Si and Sj are two bit arrays corresponding to two
sets.
Output: A bit array S that is the result of Si ∩ Sj.
Data structure: Bit vector representation of a set.
38. STEPS:
1. li = LENGTH(Si)   // Size of Si
2. lj = LENGTH(Sj)   // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for intersection"
   2. Exit
4. EndIf
/* Loop over the underlying bit arrays and apply bit-wise AND to their constituent data */
5. For i = 1 to li do
   1. S[i] = Si[i] AND Sj[i]
6. EndFor
7. Return(S)
8. Stop
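The same intersection can also be computed without an explicit loop by packing each bit array into one integer, so that a single `&` covers every position at once. The sketch below is one way to do this in Python; it takes the bit arrays as strings of 0s and 1s, which is a representation chosen here and not one prescribed by the algorithm.

```python
def intersection_bit_sets(si, sj):
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for intersection")
    # Parse both strings as base-2 integers and AND them in one step.
    result = int(si, 2) & int(sj, 2)
    # Format the result back to a bit string, padded to the original width.
    return format(result, "0{}b".format(len(si)))

print(intersection_bit_sets("1001011001", "0011100100"))  # 0001000000
```

The zero padding matters: without it, leading 0 bits would be dropped and the positions would no longer line up with the elements.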
39. The difference of Si from Sj is the set of values
that appear in Si but not in Sj. It can be
obtained with a bit-wise AND of Si and the inverse of Sj.
Si and Sj are given below:
Si = 1001011001
Sj = 0011100100
Sj' = 1100011011
S = Si - Sj = Si ∩ Sj' = 1000011001
40. ALGORITHM: DIFFERENCE_BIT_SETS(Si, Sj; S)
Input: Si and Sj are two bit arrays corresponding to two
sets.
Output: A bit array S that is the result of Si - Sj.
Data structure: Bit vector representation of a set.
41. STEPS:
1. li = LENGTH(Si)   // Size of Si
2. lj = LENGTH(Sj)   // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for difference"
   2. Exit
4. EndIf
/* Find the inverse (NOT) of Sj in a temporary array T, so that Sj itself is not altered */
5. For i = 1 to li do
   1. T[i] = NOT Sj[i]
6. EndFor
/* Loop over the underlying bit arrays and apply bit-wise AND */
7. For i = 1 to li do
   1. S[i] = Si[i] AND T[i]
8. EndFor
9. Return(S)
10. Stop
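A short Python sketch of the difference, with bit arrays as lists of 0/1 integers. For a single bit `b`, NOT is written as `1 - b`. The inversion and the AND are fused into one pass, so the input `sj` is left unchanged.

```python
def difference_bit_sets(si, sj):
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for difference")
    # Keep a bit of Si only where the corresponding bit of Sj is 0.
    return [a & (1 - b) for a, b in zip(si, sj)]

si = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
sj = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
print(difference_bit_sets(si, sj))  # [1, 0, 0, 0, 0, 1, 1, 0, 0, 1]
```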
42. The equality operation determines whether two
sets Si and Sj are equal.
This is done by comparing the bit values of the two bit
arrays pair-wise, position by position.
43. ALGORITHM: EQUALITY_BIT_SETS(Si, Sj)
Input: Si and Sj are two bit arrays corresponding to two
sets.
Output: Return TRUE if they are equal, else FALSE.
Data structure: Bit vector representation of a set.
44. STEPS:
1. li = LENGTH(Si)   // Size of Si
2. lj = LENGTH(Sj)   // Size of Sj
3. If (li != lj) then
   1. Return(FALSE)   // return with failure
4. EndIf
/* Loop over the underlying bit arrays and compare */
5. For i = 1 to li do
   1. If (Si[i] != Sj[i]) then
      1. Return(FALSE)   // return with failure
   2. EndIf
6. EndFor
/* Otherwise the two sets are equal */
7. Return(TRUE)
8. Stop
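The comparison can be sketched in Python as below, again with bit arrays as lists of 0/1 integers. The loop returns at the first mismatch, as the steps describe.

```python
def equality_bit_sets(si, sj):
    if len(si) != len(sj):
        return False                  # different sizes: not equal
    for a, b in zip(si, sj):
        if a != b:
            return False              # first differing position decides
    return True

print(equality_bit_sets([1, 0, 1], [1, 0, 1]))  # True
print(equality_bit_sets([1, 0, 1], [1, 1, 1]))  # False
```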
46. Let us consider a technique for the storage and retrieval
of information using bit strings.
A bit string is a sequence of bits, that is, a string of 0's
and 1's; for example, 1000110011 is a bit string.
Let us now see how information can be stored
and retrieved using bit strings.
Let us assume a simple database that stores the
information of 10 students.
In the sample database we have assumed the
information structure stated below:
47. A SAMPLE DATABASE WITH 10 RECORDS
NAME  REG NO  SEX  DISCIPLINE  MODULE  CATEGORY  ADDRESS
AAA   A1      M    CS          C       SC        ---
BBB   A2      M    CE          P       GN        ---
CCC   A3      F    ME          D       GN        ---
DDD   A4      F    EC          D       GN        ---
EEE   A5      M    EE          P       ST        ---
FFF   A6      M    AE          C       SC        ---
GGG   A7      F    ME          C       ST        ---
HHH   A8      M    CE          D       GN        ---
III   A9      F    CS          P       SC        ---
JJJ   A10     M    AE          P       ST        ---
48. Name : String of characters of length 25.
Regn No : Alphanumeric string of length 15.
Sex : A single-character value coded as
F=Female M=Male
Discipline : Two-character value coded as
AE-Agricultural Engineering
CE-Civil Engineering
CS-Computer Science and Engineering
EC-Electrical and Communication Engineering
EE-Electrical Engineering
ME-Mechanical Engineering
Module : One-character value coded as
C=Certificate P=Diploma D=Degree
Category : Two-character value coded as
GN=General SC=Scheduled Caste
ST=Scheduled Tribe OC=Other Category
Address : Alphanumeric string of length 50.
49. Length of each bit string = number of records (here 10).
To store a particular column we require a set of bit arrays, each
holding one bit string.
The number of bit arrays is determined by the number of different
values that the field may take.
For example:
Sex : 2 (M or F)
Discipline : 6 (six different branches)
Module : 3 (three different streams)
Category : 4 (four different categories)
Altogether, 15 bit arrays, each of length 10, are required in this case
to store the information.
Hence, in the 'i'th position of a bit string, a '1' means
the presence and a '0' means the absence of that attribute value for the
'i'th record.
50. ARRAY  BIT STRING
M      1100110101
F      0011001010
AE     0000010001
CE     0100000100
CS     1000000010
EC     0001000000
EE     0000100000
ME     0010001000
C      1000011000
P      0100100011
D      0011000100
GN     0111000100
SC     1000010010
ST     0000101001
OC     0000000000
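The bit strings can be derived mechanically from the ten sample records. The Python sketch below does this; the names `records`, `columns`, `bit_string` and `bit_arrays` are introduced here for illustration, and only the four coded fields are included. Position i of each string (counting from the left) corresponds to record i.

```python
# (name, sex, discipline, module, category) for the ten sample records.
records = [
    ("AAA", "M", "CS", "C", "SC"), ("BBB", "M", "CE", "P", "GN"),
    ("CCC", "F", "ME", "D", "GN"), ("DDD", "F", "EC", "D", "GN"),
    ("EEE", "M", "EE", "P", "ST"), ("FFF", "M", "AE", "C", "SC"),
    ("GGG", "F", "ME", "C", "ST"), ("HHH", "M", "CE", "D", "GN"),
    ("III", "F", "CS", "P", "SC"), ("JJJ", "M", "AE", "P", "ST"),
]

# Column index in a record, paired with the codes that field may take.
columns = [
    (1, ["M", "F"]),
    (2, ["AE", "CE", "CS", "EC", "EE", "ME"]),
    (3, ["C", "P", "D"]),
    (4, ["GN", "SC", "ST", "OC"]),
]

def bit_string(column, value):
    """'1' where the record holds `value` in `column`, else '0'."""
    return "".join("1" if rec[column] == value else "0" for rec in records)

bit_arrays = {value: bit_string(column, value)
              for column, values in columns for value in values}

print(len(bit_arrays))   # 15
print(bit_arrays["M"])   # 1100110101
print(bit_arrays["ME"])  # 0010001000
print(bit_arrays["C"])   # 1000011000
```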
51. How many students are there in the computer science and
engineering discipline?
To retrieve this information, only the bit array CS needs to
be searched for the number of 1's in it.
Who are the female students in the CS discipline?
For this information compute F AND CS, i.e.
[0 0 1 1 0 0 1 0 1 0] AND [1 0 0 0 0 0 0 0 1 0] =
[0 0 0 0 0 0 0 0 1 0]
Thus it gives the 9th record only.
How many students of the General category are there in
the Diploma or Degree module?
GN AND [P OR D]
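The three queries can be worked through in Python by packing each bit string into an integer, as sketched below. `WIDTH` and `records_of` are hypothetical names; the leftmost bit is taken to be record 1, matching the bit strings of the sample database.

```python
WIDTH = 10  # number of records

# Bit strings for the arrays used by the queries, parsed as base-2 integers.
F, CS, GN, P, D = (int(s, 2) for s in (
    "0011001010", "1000000010", "0111000100", "0100100011", "0011000100"))

def records_of(bits):
    """1-based record numbers whose bit is set (leftmost bit = record 1)."""
    return [i + 1 for i in range(WIDTH) if (bits >> (WIDTH - 1 - i)) & 1]

# How many students are in the CS discipline? Count the 1's in CS.
print(bin(CS).count("1"))         # 2

# Who are the female students in CS? F AND CS.
print(records_of(F & CS))         # [9]

# General category students in the Diploma or Degree module: GN AND (P OR D).
print(records_of(GN & (P | D)))   # [2, 3, 4, 8]
```

The last query therefore counts four students.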
52. Efficient from the storage point of view.
If v = number of bit arrays
r = number of records
Total bits needed = v*r
In our example, 15*10 = 150 bits.
In contrast, with the conventional method we
need 10 bytes each for Sex and Module and 20 bytes
each for Discipline and Category, a total of 60 bytes = 480
bits.
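The storage comparison can be checked with a few lines of Python. The per-field sizes come from the field definitions (one character for Sex and Module, two for Discipline and Category), assuming one byte per character; the variable names are introduced here.

```python
v, r = 15, 10                       # bit arrays, records
bit_array_bits = v * r              # one bit per (array, record) pair

# Conventional storage: characters per record for the four coded fields,
# assuming 1 byte (8 bits) per character.
bytes_per_record = 1 + 2 + 1 + 2    # Sex, Discipline, Module, Category
conventional_bits = bytes_per_record * r * 8

print(bit_array_bits)      # 150
print(conventional_bits)   # 480
```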
53. From the computation point of view this technique is
efficient because no searching is involved.
The required records can be found through logical operations
such as AND, OR and NOT, which gives fast computation.
One drawback of this technique is that it cannot
store every kind of information efficiently. For example, for fields
where all or nearly all the values are different, such as name,
regno and address, this technique is inefficient.