Unit 1 DS PDF
The intimate relationship between data and programs can be traced to the
beginning of computing. In any area of application, the input data, the internally
stored data and the output data may each have a unique structure. A data structure is
a representation of the logical relationships existing between individual elements of
data. In other words, a data structure is a way of organizing all data items that
considers not only the elements stored but also their relationships to each other.
Data structures are the building blocks of a program, and they affect the design of
both the structural and functional aspects of a program. Hence the selection of a
particular data structure depends on the following two things.
1. The data structures must be rich enough in structure to reflect the relationship
existing between the data.
2. The structure should be simple so that we can process data effectively whenever
required.
Primitive data structures are basic structures that are directly operated upon by
machine instructions. In general, they have different representations on different
computers. Examples are integers, floating point numbers, character constants, string
constants and pointers.
Non-primitive data structures are more sophisticated data structures. They are derived
from the primitive data structures. The non-primitive data structures emphasize the
structuring of a group of homogeneous (same type) or heterogeneous (different type)
data items. Arrays, lists and files are examples.
int a[10] ;
Here int specifies the data type of the elements the array stores, "a" is the name
of the array, and the number inside the square brackets is the number of
elements the array can store; this is also called the size or length of the array.
2. Lists - A list (linear linked list) can be defined as a collection of a variable number
of data items. Lists are the most commonly used non-primitive data structures. An
element of a list must contain at least two fields, one for storing data or information
and the other for storing the address of the next element. For storing addresses we have a
special data type called a pointer, hence the second field of a list element must be of
pointer type. Technically each such element is referred to as a node.
2.1 Linear data structures – Data structures in which the data elements are
organised in some sequence are called linear data structures. Operations on such a data
structure are performed in sequence. Stacks, queues and arrays are examples of linear data
structures.
2.1.2 Queues - Queues are first in first out type of Data Structures (i.e., FIFO).
In a queue new elements are added to the queue from one end called REAR
end, and the elements are always removed from other end called the FRONT
end.
2.2 Non Linear data structures – Data structures in which the data
elements are not organised in a sequence, but are arranged in some arbitrary manner
without any sequence, are called non-linear data structures. Graphs and trees are examples
of non-linear data structures.
2.2.1 Trees - A tree can be defined as a finite set of data items (nodes). A tree is a
non-primitive, non-linear data structure in which data items are
arranged hierarchically rather than in a single sequence. Trees represent the hierarchical
relationship between various elements.
The tree structure organizes the data into branches, which relate the information.
1. Simple Graph
2. Directed Graph
3. Non-directed Graph
4. Connected Graph
5. Non-connected Graph
6. Multi-Graph
1. CREATE - This operation results in reserving memory for the program elements.
This can be done by declaration statement. The creation of data structure may take
place either during compile time or during run time.
2. SELECTION - This operation deals with accessing a particular data item within a data
structure.
3. DESTROY or DELETE - This operation releases the memory space allocated for the
specified data structure. The malloc() and free() functions of the C language are used for
the create and destroy operations respectively.
4. UPDATION - As the name implies, this operation updates or modifies the data in the
data structure. For example, new data may be entered or previously stored data may be
deleted.
1. SEARCHING – The searching operation finds the presence of the desired data item in
the list of data items. It may also find the locations of all elements that satisfy certain
conditions.
2. SORTING – Sorting is the process of arranging all data items in a data structure in
a particular order say for example, either in ascending order or in descending order.
Algorithm preliminaries
Program Design
There are various ways by which we can specify a program design. These are -
1. Use of Pseudocode
2. Representation of Flowchart
Suppose we want to find out the time taken by following program statement
x = x+1
Determining the amount of time required by the above statement in terms of clock
time is not possible because it depends on factors that are always dynamic.
These factors vary from machine to machine. Hence it is not possible to
find out the exact figure. Hence the performance is measured in
terms of frequency count.
void fun()
{
    int a=10;
    a++;
    printf ("%d", a);
}
The frequency count of the above program is 2.
Complexities
S(P) = C + Sp
where C is a constant, i.e. the fixed part, and it denotes the space of inputs and outputs.
This is the amount of space taken by instructions, variables and identifiers.
Sp is the space dependent upon instance characteristics. This is the variable part,
whose space requirement depends on the particular problem instance.
There are two types of components that contribute to the space complexity:
1. The variables whose size is dependent upon the particular problem instance
being solved. Control statements (such as for, do, while, choice) are used to
solve such instances.
2. The recursion stack for handling recursive calls.
Asymptotic Notations
To choose the best algorithm, we need to check the efficiency of each algorithm. The
efficiency can be measured by computing the time complexity of each algorithm.
Asymptotic notation is a shorthand way to represent the time complexity. Using
asymptotic notations we can give time complexity as "fastest possible", "slowest
possible" or "average time". The notations Ω (Omega), Θ (Theta) and O (Big
O) are called asymptotic notations.
1. Big oh Notation
Definition - Let F(n) and g(n) be two non-negative functions. Let n0 and c be two
constants such that n0 denotes some value of the input size and c > 0. If
F(n) <= c*g(n) for all n > n0, then we can write F(n) є O(g(n)).
Example - Consider F(n) = 2n + 2 and g(n) = n^2.
If n = 1 then
F(n) = 2(1) + 2
= 4
g(n) = (1)^2
g(n) = 1
i.e. F(n) > g(n)
If n = 2 then
F(n) = 2(2) + 2
= 6
g(n) = (2)^2
g(n) = 4
i.e. F(n) > g(n)
If n = 3 then
F(n) = 2(3) + 2
= 8
g(n) = (3)^2
g(n) = 9
i.e. F(n) < g(n)
Thus for n > 2 we get F(n) < c * g(n) with c = 1. It can be represented as 2n + 2 є O(n^2)
2. Omega Notation
Omega notation is denoted by "Ω". This notation is used to represent the lower
bound of an algorithm's running time. Using omega notation we can denote the shortest
amount of time taken by an algorithm.
Definition - F(n) є Ω(g(n)) if there exist a constant c > 0 and a value n0 such that
F(n) >= c*g(n) for all n > n0.
Example - Consider F(n) = 2n^2 + 5 and g(n) = 7n.
If n = 0 then
F(n) = 2(0)^2 + 5
= 5
g(n) = 7(0)
= 0
i.e. F(n) > g(n)
If n = 3 then
F(n) = 2(3)^2 + 5
= 18 + 5
= 23
g(n) = 7(3)
= 21
i.e. F(n) > g(n)
Thus for n >= 3 we get F(n) > c * g(n) with c = 1. It can be represented as 2n^2 + 5 є Ω(n)
3. Θ Notation
The theta notation is denoted by Θ. With this notation the running time lies between
an upper bound and a lower bound.
Definition - Let F(n) and g(n) be two non-negative functions. If there are two positive
constants c1 and c2 such that c1 g(n) <= F(n) <= c2 g(n) for all sufficiently large n,
then we can say that F(n) є Θ(g(n)).
The theta notation is more precise than both the big oh and omega notations.
Order of Growth
A time-space trade-off is a situation where either space efficiency (memory
utilization) is achieved at the cost of time, or time efficiency (performance
efficiency) is achieved at the cost of memory.
Example 1 : Consider programs like compilers, in which a symbol table is used to
handle the variables and constants. If the entire symbol table is stored in the
program, then the time required for searching or storing a variable in the symbol
table is reduced but the memory requirement is higher. On the other hand, if
we do not store the symbol table in the program and simply compute the table
entries, then memory is reduced but the processing time is higher.
Example 2 : Suppose, in a file, we store uncompressed data; then reading the
data is efficient. If compressed data is stored instead, less space is used but
more time is required to read the data.
Example 3 : This is an example of reversing the order of the elements. That is, the
elements are stored in ascending order and we want them in descending
order. This can be done in two ways -
i. We use another array b[ ] in which the elements are arranged in descending order
by reading the array a[ ] in the reverse direction. This approach
increases the memory used but reduces the time.
ii. We apply some extra logic to the same array a[ ] to arrange the elements in
descending order. This approach reduces the memory used but increases the time of
execution.
Pointer
A pointer is a variable which holds the address of another variable. It is declared with
the symbol *. Consider the declaration, int i = 3 ;
The expression &i gives the address at which i is stored; '&' is called the 'address of'
operator.
The other pointer operator available in C is '*', called the 'value at address' operator. It
gives the value stored at a particular address. The 'value at address' operator is also
called the 'indirection' operator.
int *j;
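#include <stdio.h>
int main(void)
{
    int i = 3 ;
    int *j ;
    j = &i ;   /* j now holds the address of i */
    /* %p with a (void *) cast is the portable way to print an address */
    printf ( "\nAddress of i = %p", (void *) &i ) ;
    printf ( "\nAddress of i = %p", (void *) j ) ;
    printf ( "\nAddress of j = %p", (void *) &j ) ;
    printf ( "\nValue of j = %p", (void *) j ) ;
    printf ( "\nValue of i = %d", i ) ;
    printf ( "\nValue of i = %d", *( &i ) ) ;
    printf ( "\nValue of i = %d", *j ) ;
    return 0 ;
}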
Memory Allocations In C
int x, y ;
float a[5] ;
When the first statement is encountered, the compiler allocates two bytes to each of the
variables x and y (assuming a 16-bit compiler such as Turbo C, where an int takes two
bytes). The second statement results in the allocation of 20 bytes to
the array a (5 * 4, as there are five elements and each element of float type takes
four bytes).
The first problem with static memory allocation is that extra elements added beyond the
declared size of the array "a" are not allocated memory after the five elements, i.e., only
the first five elements are stored in consecutive memory locations that belong to the
array. Writing further elements goes past the end of the array (undefined behaviour in C),
so these extra elements would not be reliably available to the user; only the first
five values are accessible.
The second problem with static memory allocation is that if you store fewer
elements than the number of elements for which you have declared memory, then
the rest of the memory is wasted. This leads to inefficient use of memory.
The malloc( ) function allocates a block of memory in bytes. The user must
explicitly give the block size required. The malloc( ) function is like a
request to the system to allocate memory: if the request is granted (i.e.,
malloc( ) succeeds in allocating memory), it returns a pointer to
the first byte of that block. The type of the pointer it returns is void *, which
means that we can assign it to any type of pointer. However, if malloc( )
fails to allocate the required amount of memory, it returns NULL. The malloc( )
function is declared in the header file stdlib.h (Turbo C also provides it in alloc.h).
The syntax of this function is as follows:
void *malloc (size_t size);
Here size represents the size of memory required in bytes (i.e., the number of
contiguous memory locations to be allocated). Since malloc( ) returns a void
pointer, a cast operator is commonly written to convert the returned pointer to the
type we need (in C the conversion from void * is implicit, so the cast is optional;
C++ requires it). With the cast, the call takes the following form :
ptr_var = (type_cast *) malloc (size);
where ptr_var is the name of the pointer that holds the starting address of the allocated
memory block, type_cast is the data type into which the returned pointer (of type
void *) is to be converted, and size specifies the size of the allocated memory block in
bytes. Example
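#include<stdio.h>
#include<stdlib.h>
int main(void)
{
    int *ptr;
    int i, n, sum = 0;
    float avg ;
    printf("Enter the number of elements you want to store in the array") ;
    if (scanf("%d", &n) != 1 || n <= 0)
        return 1 ;
    ptr = (int *) malloc(n * sizeof(int)); /* Dynamic memory allocation */
    if (ptr == NULL) /* checking if request granted or not */
    {
        printf("Memory allocation failed") ;
        return 1 ;
    }
    /* The rest of the original listing is missing; what follows is an assumed completion. */
    for (i = 0; i < n; i++)
    {
        scanf("%d", &ptr[i]) ;
        sum = sum + ptr[i] ;
    }
    avg = (float) sum / n ;
    printf("Sum = %d, Average = %f", sum, avg) ;
    free(ptr) ;
    return 0 ;
}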
The calloc( ) function works exactly like the malloc( ) function except for the fact that it
needs two arguments as against the one argument required by malloc( ).
For example,
int *ptr;
ptr = (int *) calloc (10, 2);
Here 2 specifies the size in bytes of the data type for which we want the allocation to be
made, which in this case is 2 for integers (on a 16-bit compiler; sizeof(int) is the
portable way to write it). And 10 specifies the number of elements
for which the allocation is to be made. An argument passed to malloc( ) such as (n * 10) is a
single argument (don't be confused), because multiple arguments are always
separated by commas. The argument (n * 10) has no commas in it, hence it is
a single argument.
After the execution of the above statement a memory block of 20 bytes is allocated
to the requesting program and the address of its first byte is assigned to the pointer ptr.
Another minor difference is that the memory allocated by the malloc( ) function
contains garbage values, while the memory allocated by the calloc( ) function contains all
zeros.
The free( ) function is used to de-allocate memory previously allocated using the
malloc( ) or calloc( ) functions. The syntax of this function is : free (ptr_var) ;
where ptr_var is the pointer to which the address of the allocated memory block was
assigned. The free( ) function returns the allocated memory to the system
RAM.
The realloc( ) function is used to resize a memory block which is already allocated.
It is useful in two situations: when the allocated block is too small for the current
requirement, and when it is larger than required. The syntax of this function is :
ptr_var = realloc (ptr_var, new_size);
where ptr_var is the pointer holding the starting address of the already allocated
memory block, and new_size is the size in bytes you want the system to allocate
now; it may be smaller or greater than the size of the previously allocated memory
block, depending upon the
requirement. This function is also available in the header file <stdlib.h>. Example of
the functions malloc( ), realloc( ) and free( ).
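#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
int main(void)
{
    char *msg, *tmp ;
    msg = (char *) malloc(30 * sizeof(char));
    if (msg == NULL)
        return 1 ;
    strcpy (msg, "Hello") ;
    printf ("The message now is %s \n", msg) ;
    tmp = (char *) realloc (msg, 50) ;   /* grow the block to 50 bytes */
    if (tmp == NULL)   /* on failure the old block is still valid */
    {
        free(msg) ;
        return 1 ;
    }
    msg = tmp ;
    strcpy (msg, "good morning. . . .") ;
    printf ("\nThe message is now %s", msg) ;
    free(msg) ;
    getch( ) ;   /* Turbo C specific (conio.h): wait for a key press */
    return 0 ;
}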
int a[10] ;
int num[6] = { 2, 4, 12, 5, 45, 5 } ;
int n[ ] = { 2, 4, 12, 5, 45, 5 } ;
Here int specifies the data type of the elements the array stores, "a" is the name of the
array, and the number inside the square brackets is the number of
elements the array can store; this is also called the size or length of the array.
1. An individual element of an array can be accessed by specifying the name of the array
followed by the index or subscript inside square brackets. For example, to access the fifth
element of array a, we have to write : a[4]
2. The first element of the array has index zero [0]. It means the first element and last
element of a are specified as a[0] and a[9] respectively.
4. The number of elements that can be stored in an array, i.e. the size of the array or its
length, is given by the following equation: (upper bound – lower bound) + 1.
1. Creation of an array.
2. Traversing an array (accessing array elements)
3. Insertion of new elements.
4. Deletion of required elements.
5. Modification of an element.
6. Merging of arrays.
Types of Array
The details above describe 1-D arrays. One-dimensional arrays are declared as follows :
int a[10] ;
int num[6] = { 2, 4, 12, 5, 45, 5 } ;
int n[ ] = { 2, 4, 12, 5, 45, 5 } ;
It is also possible for arrays to have two or more dimensions. The two-dimensional
array is also called a matrix. Here is a sample program that stores the roll number and
marks obtained by a student side by side in a matrix.
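#include <stdio.h>
#include <conio.h>
int main(void)
{
    int stud[4][2] ;
    int i ;
    clrscr( ) ;   /* Turbo C specific (conio.h) */
    /* first part: read the roll no. and marks of each student */
    for ( i = 0 ; i <= 3 ; i++ )
    {
        printf ( "\n Enter roll no. and marks" ) ;
        scanf ( "%d %d", &stud[i][0], &stud[i][1] ) ;
    }
    /* second part: print the stored values */
    for ( i = 0 ; i <= 3 ; i++ )
        printf ( "\n%d %d", stud[i][0], stud[i][1] ) ;
    return 0 ;
}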
There are two parts to the program—in the first part through a for loop we read in
the values of roll no. and marks, whereas, in second part through another for loop
we print out these values.
In stud[i][0] and stud[i][1] the first subscript of the variable stud is the row number,
which changes for every student. The second subscript tells which of the two
columns we are talking about: the zeroth column, which contains the roll no., or the
first column, which contains the marks. Remember that the counting of rows and
columns begins with zero.
In our sample program the array elements have been stored rowwise and accessed
rowwise. However, you can access the array elements columnwise as well.
Traditionally, array elements are stored and accessed rowwise; therefore we
will also stick to the same strategy. Arrays can have more than two dimensions;
a three-dimensional array is initialised as
int arr[3][4][2] = {
    { { 2, 4 }, { 7, 8 }, { 3, 4 }, { 5, 6 } },
    { { 7, 6 }, { 3, 4 }, { 5, 3 }, { 2, 3 } },
    { { 8, 9 }, { 7, 2 }, { 3, 4 }, { 5, 1 } }
} ;
Arrays, like other simple variables, can be passed to a function. To pass an array, its
name is written inside the argument list in the call statement. Arrays are by default
passed to functions by the call by reference method, because the array name is itself a
pointer to the first memory location of the array. However, we can also pass individual
array elements through the call by value method. The following program illustrates how
to pass an array to a function.
#include<stdio.h>
#include<conio.h>
int funarray(int[ ], int) ;
int main(void)
{
    int a[10], i, sum = 0;
    clrscr( );   /* Turbo C specific (conio.h) */
    printf("Enter the array") ;
    for (i=0; i<=9; i++)
    {
        scanf("%d", &a[i]);
    }
    sum = funarray(a, 10) ;   /* pass the array name and the number of elements */
    printf ("The sum of array elements is %d", sum) ;
    getch( );
    return 0;
}
/* Returns the sum of the first n elements of p */
int funarray (int p[ ], int n)
{
    int s = 0, i;
    for (i=0; i<=n-1; i++)
    {
        s = s + p[i];
    }
    return(s) ;
}
When passing an array to a function, the array name alone (without a subscript) is used
as the argument in the function call.
Inserting an element at the end of an array can be easily done provided the memory
space allocated for the array is large enough to accommodate the additional element.
For inserting the element at a required position, elements must be moved downwards
to new locations to accommodate the new element and keep the order of the
elements.
Algorithm
1. [Initialize the value of i] Set i = len
2. Repeat for i = len down to pos
[shift the elements down by 1 position]
Set a[i+1] = a[i]
[End of loop]
3. [Insert the element at the required position]
Set a[pos] = num
4. [Reset len] Set len = len + 1
5. Display the new array
6. End
To insert an element in an array.
#include<stdio.h>
#include<conio.h>
For deleting an element from the array, the logic is straightforward. Deleting an
element at the end of an array presents no difficulties, but deleting an element
somewhere in the middle of the array requires shifting all the following elements to fill
the space emptied by the deletion. For example, if the element a[5] = 90 is deleted, the
elements following it are moved upward by one location, as shown in figure 19.
Algorithm for deletion of an element from the array
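The following algorithm deletes the element stored at position pos from a linear array
a and assigns it to a variable item. Delete (a, pos, n), where n is the length of the linear
array.
Algorithm
1. Set item = a[pos]
2. Repeat for j = pos to n - 2
[shift the elements up by 1 position]
Set a[j] = a[j+1]
[End of loop]
3. [Reset n] Set n = n - 1
4. End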
#include<stdio.h>
#include<conio.h>
void del (int a[ ], int pos, int n) ;
int main(void)
{
    int a[100], pos, i, n ;
    clrscr ( ) ;   /* Turbo C specific (conio.h) */
    printf ("How many elements in the array\n") ;
    scanf ("%d", &n) ;
    printf ("Enter the elements of the array\n") ;
    for (i=0; i <=n-1; i++)
        scanf ("%d", &a[i]) ;
    printf ("On which position element do you want delete\n") ;
    scanf ("%d", &pos) ;
    del (a, pos, n) ;
    getch( );
    return 0 ;
}
void del (int a[ ], int pos, int n)
{
    int i, j, item ;
    item = a [pos];   /* the element being deleted */
    for (j=pos; j<n-1; j++)   /* shift the following elements up by one */
    {
        a[j] = a [j+1];
    }
    n = n-1;
    printf ("Deleted element %d\nNew array\n", item) ;
    for (i=0 ; i <=n-1; i++)
        printf("%d\n", a[i]) ;
}
Prepared By: Prof. Dheresh Soni, SRHU - HSET
Traversing Of An Array
Traversing means to access all the elements of the array one-by-one, starting from the
first element (lower bound) up to the last element (upper bound) in the array.
Algorithm
Let LB be the lower bound and UB be the upper bound of linear array a.
1. [Initialize counter] Set i = LB
2. Repeat for i = LB to UB
[visit element] Display a[ i ]
[End of the loop]
3. Exit
#include<stdio.h>
#include<conio.h>
int main(void)
{
    int n, i, a[10] ;
    clrscr( );   /* Turbo C specific (conio.h) */
    printf ("Enter the length of the array") ;
    scanf ("%d", &n) ;   /* n must not exceed 10, the declared size of a */
    printf ("Enter the elements of the array") ;
    for(i=0; i<=n-1; i++)
    {
        scanf ("%d", &a[ i ] );
    }
    printf ("Traversing of the array") ;
    for (i=0; i<=n-1; i++)
        printf ("\n%d", a[ i ]) ;
    getch ( );
    return 0;
}
The storage can be clearly understood by arranging the array as a matrix.
The computer does not keep track of the address of every element of the array; rather, it
keeps a base address (i.e. the address of the first element in the array), and calculates the
address of the required element when needed. It calculates this (in row major
implementation) by the following relation :
Address of A[i][j] = B + W * [ n * (i - L1) + (j - L2) ]
where B is the base address of the array, W is the size of each array element, n is the
number of columns (i.e., U2 - L2 + 1), L1 is the lower bound of the rows and L2 is the
lower bound of the columns.
Important: Usually the number of rows and columns of a matrix is given directly (like
A[20][30] or A[40][60]), but if the matrix is given as A[L1.......U1, L2........U2], the
number of rows and columns is calculated as follows:
Number of rows (M) = (U1 – L1) + 1
Number of columns (N) = (U2 – L2) + 1
The rest of the process remains the same as per the requirement (Row Major Wise (U1 –
L1) or Column Major Wise (U2 – L2)).
Sparse Matrix
1. Storage: There are fewer non-zero elements than zeros, so less memory
is needed if only the non-zero elements are stored.
2. Computing time: Computing time can be saved by designing a data
structure that traverses only the non-zero elements. Example:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
1. Array representation
2. Linked list representation
Method 2: Using Linked Lists - In a linked list, each node has four fields. These four
fields are defined as:
Other representations: - As a dictionary, where row and column numbers are used
as keys and the values are the matrix entries. This method saves space but sequential
access of items is costly. As a list of lists: the idea is to make a list of rows, where every
item of the list contains the values of one row. We can keep the list items sorted by
column numbers.
Ordered List
An ordered list is a list where the items are held in some kind of sorted
order. The criterion by which the items are sorted is usually called the key. The
ordered list can be accessed at arbitrary locations within the structure, either for
lookup or for deletion. The ordered list is the most basic of the searchable containers. A
searchable container supports the following additional operations:
An ordered list is a container which holds a sequence of objects. Each object has a
unique position in the sequence. In addition to the basic selection of operations
supported by all searchable containers, ordered lists provide the following
operations: