Unit I
DATA STRUCTURES
A data structure is a set of domains D, a designated domain d ∈ D, a set of functions F and a set of axioms A.
The triplet (D, F, A) denotes the data structure d and it will usually be abbreviated by writing d.
Or
A data structure is an organized collection of data together with the operations allowed on that data.
A data structure is a way of collecting and organizing data so that we can perform operations on
the data in an effective way. Data structures are about representing data elements in terms of some relationship,
for better organization and storage.
Data Structures can be classified in many ways. We may classify Data Structures as:
1. Linear Data Structures
In a Linear Data Structure the data items are arranged in a linear sequence.
Ex: Array, stack, queue, linked list
An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain predefined task.
An algorithm is not the complete code or program; it is just the core logic (solution) of a problem, which can be
expressed either as an informal high-level description, as pseudocode, or using a flowchart.
An algorithm is said to be efficient and fast if it takes less time to execute and consumes less memory space.
The performance of an algorithm is measured on the basis of following properties :
1. Time Complexity
2. Space Complexity
Space Complexity is the amount of memory space required by the algorithm during the course of its execution. Space
complexity must be taken seriously for multi-user systems and in situations where limited memory is available.
An algorithm generally requires space for following components :
• Instruction Space: It is the space required to store the executable version of the program. This space is
fixed for a given program, but varies with the number of lines of code in the program.
• Data Space: It is the space required to store all the constant and variable values.
• Environment Space: It is the space required to store the environment information needed to resume a
suspended function.
Time Complexity is a way to represent the amount of time needed by the program to run to completion;
it signifies the total time required as a function of the input size. The
time complexity of algorithms is most commonly expressed using the big O notation.
Time complexity is most commonly estimated by counting the number of elementary operations performed by
the algorithm. Since an algorithm's performance may vary with different types of input data, we usually use
the worst-case time complexity of an algorithm, because that is the maximum time taken for any input of a
given size.
Example#1
for(i=0; i < N; i++)
{
statement;
}
The time complexity for the above algorithm will be Linear. The running time of the loop is directly
proportional to N. When N doubles, so does the running time.
Example#2
for(i=0; i < N; i++)
{
for(j=0; j < N;j++)
{
statement;
}
}
This time, the time complexity for the above code will be Quadratic. The running time of the two loops is
proportional to the square of N. When N doubles, the running time increases by a factor of four.
Example#3
while(low <= high)
{
mid = (low + high) / 2;
if (target < list[mid])
high = mid - 1;
else if (target > list[mid])
low = mid + 1;
else break;
}
This is an algorithm that repeatedly breaks a set of numbers into halves to search for a particular value (we will study this in
detail later). This algorithm has Logarithmic Time Complexity. The running time of the algorithm
is proportional to the number of times N can be divided by 2 (here N is high − low), because the algorithm
divides the working area in half with each iteration.
Taking the previous algorithm forward, the same halving logic is at the heart of Quick Sort. In Quick Sort we divide
the list into halves every time, but we repeat the iteration N times (where N is the size of the list). Hence the time
complexity will be N*log(N). The running time consists of N amounts of work spread over loops (iterative or recursive) that are logarithmic,
so the algorithm is a combination of linear and logarithmic.
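As a rough illustration (the value N = 16 and the variable names are only for this example), the small C program below counts how often the innermost statement runs when a loop of N iterations encloses a loop that halves its range each time; the count works out to N * log2(N):
#include <stdio.h>

int main(void)
{
    int N = 16, i, j, count = 0;
    for (i = 0; i < N; i++)            /* linear part: N iterations          */
        for (j = N; j > 1; j = j / 2)  /* logarithmic part: log2(N) halvings */
            count++;
    printf("%d\n", count);             /* prints 64, i.e. 16 * log2(16)      */
    return 0;
}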
Linear Search: The list is ordered from the smallest to the biggest. The easiest way to find our number is to
start at the beginning and compare our number to each number in the list. If we reach our target, then we are
done. This method of searching is called Linear Searching.
1. Set the current item to be the first item in the list.
2. If the current value is greater than the target then the target is not in the list and we stop.
3. If the current value matches the target then we declare victory and stop.
4. If the current value is less than the target then set the current item to be the next item and repeat
from 2.
Given a sorted array arr[] of n elements, write a function to search for a given element x in arr[]. A simple approach
is to do a linear search, i.e., start from the leftmost element of arr[] and one by one compare x with each
element of arr[]; if x matches an element, return the index. If x doesn't match any of the elements,
return -1.
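A minimal sketch of such a function in C (the name linear_search is an assumption):
/* returns the index of x in arr[], or -1 if x is not present */
int linear_search(int arr[], int n, int x)
{
    int i;
    for (i = 0; i < n; i++)
    {
        if (arr[i] == x)
            return i;      /* found: return the index */
    }
    return -1;             /* not found */
}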
Binary Search (Iterative Method): The drawbacks of sequential search can be eliminated if it is possible to eliminate
large portions of the list from consideration in subsequent iterations. The binary search method does just that: it
halves the size of the list to be searched in each iteration. Binary Search requires sorted data.
Algorithm:-
Input : Sorted list of size N, target value T
Output : Position I of T in the list (if T is present)
Begin
1. High = N
Low = 1
Found = false
2. While (Found is false and Low <= High)
Mid = (Low + High) / 2
If T == List[Mid]
I = Mid
Found = true
Else if T < List[Mid]
High = Mid - 1
Else
Low = Mid + 1
End.
Analysis:-
In general, the binary search method requires no more than ⌊log2 N⌋ + 1 comparisons.
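The pseudocode above uses positions 1 through N. A minimal iterative sketch in C, using 0-based array indices and returning -1 when the target is absent (the function name binary_search is an assumption):
/* returns the index of target in list[0..n-1], or -1 if it is not present */
int binary_search(int list[], int n, int target)
{
    int low = 0, high = n - 1, mid;
    while (low <= high)
    {
        mid = (low + high) / 2;
        if (target == list[mid])
            return mid;              /* found */
        else if (target < list[mid])
            high = mid - 1;          /* discard the upper half */
        else
            low = mid + 1;           /* discard the lower half */
    }
    return -1;                       /* target not in the list */
}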
Introduction to Sorting
Sorting is nothing but arranging data in sorted order; it can be in ascending or descending order. The term
Sorting comes into the picture together with the term Searching. There are many things in real life that we need to
search, like a particular record in a database, roll numbers in a merit list, a particular telephone number, a
particular page in a book, etc.
There are several sorting techniques here. The list is given below:
• Bubble sort
• Selection sort
• Insertion sort
• Quick sort
• Merge sort
• Radix sort
• Heap sort, etc.
Bubble sorting:
Bubble Sort is an algorithm which is used to sort N elements that are given in memory, e.g., an array with
N elements. Bubble Sort compares the elements one by one and sorts them based on their values.
It is called Bubble Sort because, with each iteration, the smaller elements in the list bubble up towards the first
place, just like a water bubble rises up to the water surface.
Sorting takes place by stepping through the data items one by one in pairs, comparing adjacent data items
and swapping each pair that is out of order.
Suppose there is a list of elements, say 44 33 77 11 55 66
Sample code is given below.
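This is a minimal bubble sort sketch in C, assuming an integer array a[] of n elements; in this common form the larger elements settle towards the end of the array on each pass (equivalently, smaller elements move towards the front), and the sorted array can be printed once the loops finish.
void bubble_sort(int a[], int n)
{
    int i, j, temp;
    for (i = 0; i < n - 1; i++)          /* one pass per element            */
    {
        for (j = 0; j < n - 1 - i; j++)  /* compare adjacent pairs          */
        {
            if (a[j] > a[j + 1])         /* swap the pair if out of order   */
            {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
    /* now you can print the sorted array */
}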
Selection sort:
Selection sort is conceptually the simplest sorting algorithm. The algorithm first finds the smallest
element in the array and exchanges it with the element in the first position, then finds the second
smallest element and exchanges it with the element in the second position, and continues in this way
until the entire array is sorted.
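A minimal selection sort sketch in C, assuming an integer array a[] of n elements (the function name selection_sort is an assumption):
void selection_sort(int a[], int n)
{
    int i, j, min, temp;
    for (i = 0; i < n - 1; i++)
    {
        min = i;                      /* index of the smallest element seen so far */
        for (j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        temp = a[i];                  /* exchange it with the element in position i */
        a[i] = a[min];
        a[min] = temp;
    }
}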
Insertion sort:
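Insertion sort builds the sorted part of the array one element at a time: each element is picked up and inserted into its correct position among the already sorted elements before it. A minimal sketch in C, assuming an integer array a[] of n elements (the function name insertion_sort is an assumption):
void insertion_sort(int a[], int n)
{
    int i, j, key;
    for (i = 1; i < n; i++)
    {
        key = a[i];                   /* element to be placed                     */
        j = i - 1;
        while (j >= 0 && a[j] > key)  /* shift larger elements one step right     */
        {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;               /* insert into its correct position         */
    }
}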
Quick sort:
Quick Sort, as the name suggests, sorts any list very quickly. Quick Sort is not a stable sort, but it is very fast
and requires very little additional space. It is based on the rule of Divide and Conquer (it is also called partition-
exchange sort). This algorithm divides the list into three main parts: elements less than the pivot element, the
pivot element itself, and elements greater than the pivot element. An implementation sketch in C:
/* a[] is the list, l and r are the first and last indices of the portion being
   sorted (the signature and the opening of the function are reconstructed) */
void sort(int a[], int l, int r)
{
    int pivot, left, right;
    pivot = a[l];                 /* first element chosen as the pivot */
    left = l;
    right = r;
    while (left < right)
    {
        /* scan from the right for an element smaller than the pivot */
        while ((a[right] >= pivot) && (left < right))
            right--;
        if (left != right)
        {
            a[left] = a[right];
            left++;
        }
        /* scan from the left for an element not smaller than the pivot */
        while ((a[left] < pivot) && (left < right))
            left++;
        if (left != right)
        {
            a[right] = a[left];
            right--;
        }
    }
    a[left] = pivot;              /* pivot placed in its final position */
    pivot = left;
    left = l;
    right = r;
    if (left < pivot)
        sort(a, left, pivot - 1); /* sort elements to the left of the pivot  */
    if (right > pivot)
        sort(a, pivot + 1, right);/* sort elements to the right of the pivot */
}
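Assuming the sort function above, the whole array is sorted by calling it on the full index range, for example with the sample list used earlier:
#include <stdio.h>

int main(void)
{
    int i, data[] = {44, 33, 77, 11, 55, 66};
    sort(data, 0, 5);                /* sort positions 0 through 5   */
    for (i = 0; i < 6; i++)
        printf("%d ", data[i]);      /* prints 11 33 44 55 66 77     */
    return 0;
}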
Merge sort:
Merge Sort follows the rule of Divide and Conquer. There are two steps in the process:
• We divide the list into two halves, each half again into two halves, and continue until no sub-list can be
divided further.
• Then we repeatedly merge two sorted sub-lists at a time into a single sorted list, until the whole list is combined.
void merge_sort ( int a[100], int top, int size, int bot )
{
int u,i,j,k;
int temp[100];
i= top;
j = size+1;
k = top;
while((i<=size)&&(j<=bot))
{
if(a[i]<=a[j])
{
temp[k]=a[i];
i++ ;
}
else
{
temp[k] = a[j];
j++;
}
k++;
}
if(i<=size)
{
while(i<=size)
{
temp[k] =a[i];
k++;
i++;
}
}
else
{
while(j<=bot)
{
temp[k]=a[j];
k++;
j++;
}
}
for(u=top;u<=bot;u++)
a[u] = temp[u];
return;
}
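The merge_sort routine above only merges two already-sorted halves, a[top..size] and a[size+1..bot]. A recursive driver that first splits and then merges, sketched here under the same conventions (the name merge_sort_rec is an assumption), would look like this:
void merge_sort_rec(int a[100], int top, int bot)
{
    int mid;
    if (top < bot)
    {
        mid = (top + bot) / 2;            /* split point                   */
        merge_sort_rec(a, top, mid);      /* sort the left half            */
        merge_sort_rec(a, mid + 1, bot);  /* sort the right half           */
        merge_sort(a, top, mid, bot);     /* merge the two sorted halves   */
    }
}
A full sort of an array of n elements is then merge_sort_rec(a, 0, n - 1).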
Here the worst case time complexity is O(n log n).
Radix sort:
Radix Sort is a clever and intuitive little sorting algorithm. Radix Sort puts the elements in order by working on
the digits of the numbers, one digit position at a time. An implementation sketch in C:
/* signature reconstructed: a[] holds n non-negative integers
   (the bucket size assumes n <= 100) */
void radix_sort(int a[], int n)
{
    int i, j, k, l, p, max, digits, div;
    int bucket[10][100], bcount[10];
    /* find the maximum value to know how many digit positions to process */
    max = a[0];
    for (i = 1; i < n; i++)
    {
        if (a[i] > max)
            max = a[i];
    }
    digits = 0;
    while (max != 0)
    {
        digits++;
        max /= 10;
    }
    div = 1;
    for (p = 0; p < digits; p++)         /* one pass per digit, least significant first */
    {
        for (k = 0; k < 10; k++)
            bcount[k] = 0;               /* initialize all buckets' counts to 0 */
        for (i = 0; i < n; i++)
        {
            l = (a[i] / div) % 10;       /* current digit selects the bucket    */
            bucket[l][bcount[l]++] = a[i];
        }
        i = 0;
        for (k = 0; k < 10; k++)         /* collect the buckets back into a[]   */
        {
            for (j = 0; j < bcount[k]; j++)
                a[i++] = bucket[k][j];
        }
        div *= 10;
    }
}
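Assuming the reconstructed signature above, the sample list used earlier can be sorted like this:
#include <stdio.h>

int main(void)
{
    int i, data[] = {44, 33, 77, 11, 55, 66};
    radix_sort(data, 6);
    for (i = 0; i < 6; i++)
        printf("%d ", data[i]);      /* prints 11 33 44 55 66 77 */
    return 0;
}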
Arrays
An array is a group of logically related data items of the same data type, addressed by a common name, with
all the items stored in contiguous memory locations.
The computer does not need to keep track of the address of every element of the array; it needs to
keep track only of the address of the first element (the base address). Using that address, the computer calculates
the address of any element of the array by the following formula:
Address of A[i] = Base Address + (i − LB) × size of one element, where LB is the lower bound of the array index.
This makes it possible to access the content of any specified location without scanning any other element of the array.
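For instance, the following small C program (the values are only for illustration) shows that the address of a[3] is the base address plus three element widths, which is exactly what the formula computes:
#include <stdio.h>

int main(void)
{
    int a[5] = {10, 20, 30, 40, 50};
    printf("%p\n", (void *)a);        /* base address                      */
    printf("%p\n", (void *)&a[3]);    /* base address + 3 * sizeof(int)    */
    printf("%d\n", *(a + 3));         /* prints 40, the same as a[3]       */
    return 0;
}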
Some of the basic operations that can be performed on arrays are: