
Research About Data Structures, Searching and Sorting Algorithms



Data structures and searching and sorting algorithms are essential for building scalable
systems. A data structure is a container that stores data in a specific layout. This “layout”
allows a data structure to be efficient in some operations and inefficient in others.

No matter what problem you are solving, in one way or another you have to deal with
data — whether it’s an employee’s salary, stock prices, a grocery list, or even a simple
telephone directory.

Based on different scenarios, data needs to be stored in a specific format. We have a
handful of data structures that cover our need to store data in different formats.

The most commonly used data structures are:

1. Arrays
2. Stacks
3. Queues
4. Linked Lists
5. Trees
6. Graphs
7. Tries (they are effectively trees, but it’s still good to call them out separately).
8. Hash Tables

Arrays

An array is the simplest and most widely used data structure. Other data structures, such as
stacks and queues, are often implemented on top of arrays.

Each data element is assigned a non-negative integer called the index, which corresponds
to the position of that item in the array. In most languages the first index is 0.

The following are the two types of arrays:

• One-dimensional arrays
• Multi-dimensional arrays (arrays within arrays)

Basic Operations on Arrays

• Insert: Inserts an element at a given index
• Get: Returns the element at a given index
• Delete: Deletes the element at a given index
• Size: Gets the total number of elements in the array
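
The four operations above can be sketched in Python, assuming a built-in list stands in for the array (a Python list is a dynamic array, so this is an illustration rather than a fixed-size array). The variable names are chosen only for this example.

```python
# A Python list stands in for an array in this sketch.
items = [10, 20, 30]

items.insert(1, 15)   # Insert: put 15 at index 1, shifting later elements right
second = items[1]     # Get: read the element at index 1
del items[0]          # Delete: remove the element at index 0, shifting later elements left
size = len(items)     # Size: number of elements currently stored

# A two-dimensional array written as a list of lists (an array within an array).
grid = [[1, 2, 3],
        [4, 5, 6]]
cell = grid[1][2]     # element in row 1, column 2
```

Get and Size take constant time, while Insert and Delete at an arbitrary index may shift every later element.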

Stacks
A stack is a linear data structure in which operations are performed in a particular order:
LIFO (Last In, First Out), which is also described as FILO (First In, Last Out).
The following four basic operations are performed on a stack:
• Push: Adds an item to the stack. If the stack is full, this is said to be an Overflow
condition.
• Pop: Removes an item from the stack. Items are popped in the reverse of the order in
which they were pushed. If the stack is empty, this is said to be an Underflow
condition.
• Peek or Top: Returns the top element of the stack.
• isEmpty: Returns true if the stack is empty, else false.

Time Complexities of operations on stack:

• push(), pop(), isEmpty() and peek() all take O(1) time. We do not run any loop in any
of these operations.
Implementation:
There are two ways to implement a stack:
• Using an array
• Using a linked list
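
As a minimal sketch of the array-based option, the class below wraps a Python list. The class name `ArrayStack` and the optional `capacity` parameter are assumptions made for this example, added only so that the overflow condition can be shown.

```python
class ArrayStack:
    """Stack backed by a Python list (illustrative sketch)."""

    def __init__(self, capacity=None):
        self._items = []
        self._capacity = capacity  # hypothetical limit, None means unbounded

    def is_empty(self):
        return not self._items

    def push(self, item):
        # Overflow: the stack already holds `capacity` items.
        if self._capacity is not None and len(self._items) >= self._capacity:
            raise OverflowError("stack is full")
        self._items.append(item)

    def pop(self):
        # Underflow: there is nothing to remove.
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty stack")
        return self._items[-1]
```

None of the methods contains a loop, which matches the O(1) cost stated above (for a list-backed stack, `append` is O(1) amortized).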

Queues
A queue is a linear data structure in which operations are performed in First In, First Out
(FIFO) order.

The difference between stacks and queues is in removal. In a stack we remove the most
recently added item; in a queue we remove the least recently added item.
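
The difference in removal order can be seen in a short sketch. It assumes Python's `collections.deque` for the queue and a plain list for the stack; the variable names are illustrative.

```python
from collections import deque

queue = deque()
queue.append("a")        # enqueue at the back
queue.append("b")
queue.append("c")
first = queue.popleft()  # dequeue from the front: the least recently added item

stack = ["a", "b", "c"]
last = stack.pop()       # a stack removes the most recently added item instead
```

Here `first` is "a" and `last` is "c", even though both containers received the same items in the same order.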

Linked Lists
A linked list is a linear data structure in which the elements are not stored at contiguous
memory locations. Instead, the elements are linked to one another using pointers.

In simple words, a linked list consists of nodes where each node contains a data field and a
reference (link) to the next node in the list.
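
A minimal sketch of a singly linked list follows. The names `Node`, `LinkedList`, and `push_front` are assumptions for this example, not part of any standard library.

```python
class Node:
    """One list node: a data field plus a link to the next node."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None  # an empty list has no first node

    def push_front(self, data):
        # The new node points at the old head and becomes the new head.
        self.head = Node(data, self.head)

    def __iter__(self):
        # Follow the links from the head until the end of the list.
        node = self.head
        while node is not None:
            yield node.data
            node = node.next


numbers = LinkedList()
for value in (3, 2, 1):
    numbers.push_front(value)
```

Iterating over `numbers` visits 1, 2, 3, because each new value was linked in front of the previous head.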
Graphs
A graph is a set of nodes that are connected to each other in the form of a network. Nodes
are also called vertices. A pair (x, y) is called an edge, which indicates that vertex x is
connected to vertex y. An edge may carry a weight (cost), showing how much it costs to
traverse from vertex x to vertex y.

Types of Graphs:
• Undirected Graph
• Directed Graph
In a programming language, graphs can be represented in two forms:
• Adjacency Matrix
• Adjacency List
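
Both representations can be built from the same edge list. The sketch below assumes a small undirected, unweighted graph with vertices numbered 0 to 3; the edge list itself is made up for illustration.

```python
# Hypothetical undirected graph with 4 vertices.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: matrix[x][y] is 1 when an edge connects x and y.
matrix = [[0] * n for _ in range(n)]

# Adjacency list: each vertex maps to the list of its neighbours.
adjacency = {v: [] for v in range(n)}

for x, y in edges:
    matrix[x][y] = matrix[y][x] = 1  # undirected, so store both directions
    adjacency[x].append(y)
    adjacency[y].append(x)
```

The matrix always uses n × n cells, while the list stores only the edges that exist, which is why adjacency lists are usually preferred for sparse graphs.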
Common graph traversal algorithms:
• Breadth First Search
• Depth First Search
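
The two traversals can be sketched as follows, assuming the graph is given as an adjacency list in a Python dict. The sample graph and the function names `bfs` and `dfs` are illustrative.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency list.
graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}


def bfs(graph, start):
    """Visit vertices level by level, using a queue."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def dfs(graph, start, seen=None):
    """Follow one path as deep as possible before backtracking (recursive)."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for neighbour in graph[start]:
        if neighbour not in seen:
            order.extend(dfs(graph, neighbour, seen))
    return order
```

On this graph, `bfs(graph, 0)` returns `[0, 1, 2, 3]` and `dfs(graph, 0)` returns `[0, 1, 3, 2]`: breadth-first search finishes both neighbours of 0 before reaching 3, while depth-first search goes through 1 to 3 before returning to 2.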

Trees
A tree is a hierarchical data structure consisting of vertices (nodes) and edges that connect
them. Trees are similar to graphs, but the key point that differentiates a tree from a general
graph is that a cycle cannot exist in a tree.

The following are the types of trees:

• N-ary Tree
• Balanced Tree
• Binary Tree
• Binary Search Tree
• AVL Tree
• Red Black Tree
• 2–3 Tree
Out of the above, Binary Tree and Binary Search Tree are the most commonly used trees.
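
As an illustration of the binary search tree, the sketch below keeps smaller keys in the left subtree and larger keys in the right subtree. The names `TreeNode`, `insert`, `contains`, and `in_order` are assumptions for this example, and no balancing is attempted.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the subtree rooted at root and return the subtree root."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def contains(root, key):
    """Walk down from the root, going left or right at each node."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def in_order(root):
    """Left subtree, then node, then right subtree: yields keys in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


root = None
for key in (8, 3, 10, 1, 6):
    root = insert(root, key)
```

An in-order walk of this tree gives `[1, 3, 6, 8, 10]`, which is why a binary search tree can double as a sorted container.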
Trie
A trie, also known as a prefix tree, is a tree-like data structure that is quite efficient for
solving problems related to strings. It provides fast retrieval, and is mostly used for
searching words in a dictionary, providing auto suggestions in a search engine, and even for
IP routing.
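
The dictionary lookup and auto-suggestion uses mentioned above can be sketched with a small trie. The class name `Trie` and its methods are assumptions for this example; each node stores its children in a dict keyed by character.

```python
class Trie:
    def __init__(self):
        self.children = {}    # character -> child Trie node
        self.is_word = False  # True when a stored word ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            if ch not in node.children:
                node.children[ch] = Trie()
            node = node.children[ch]
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def suggestions(self, prefix):
        """All stored words starting with prefix, in sorted order."""
        start = self._find(prefix)
        if start is None:
            return []
        results = []
        pending = [(start, prefix)]
        while pending:
            node, text = pending.pop()
            if node.is_word:
                results.append(text)
            for ch, child in node.children.items():
                pending.append((child, text + ch))
        return sorted(results)


trie = Trie()
for word in ("car", "card", "care", "cat", "dog"):
    trie.insert(word)
```

With these words stored, `trie.suggestions("car")` returns `["car", "card", "care"]`, and `trie.contains("ca")` is false because "ca" is only a prefix.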
Hash Table
Hashing is a process that identifies each object by a “key” and stores the object at an index
calculated from that key. The object is therefore stored in the form of a “key-value” pair,
and the collection of such items is called a “dictionary.” Each object can be looked up
using its key. There are different data structures based on hashing, but the most
commonly used data structure is the hash table.
Hash tables are generally implemented using arrays.
The performance of a hashing data structure depends upon these three factors:
• Hash Function
• Size of the Hash Table
• Collision Handling Method
Each key is mapped to a slot in the underlying array. The index of that slot is calculated
by applying a hash function to the key.
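
A minimal sketch of a hash table follows, touching the three factors named above. It assumes Python's built-in `hash` as the hash function, a fixed table size, and separate chaining (a list per slot) as the collision handling method; the class and method names are illustrative.

```python
class HashTable:
    def __init__(self, size=8):
        # The underlying array: one bucket (list of pairs) per slot.
        self._buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Hash function result reduced to a valid array index.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # new key, or a collision in this slot

    def get(self, key, default=None):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


prices = HashTable()
prices.put("apple", 3)
prices.put("pear", 5)
prices.put("apple", 4)  # replaces the earlier value for "apple"
```

This sketch never resizes the array, so lookups slow down as buckets grow; real hash tables usually enlarge the table once it becomes too full.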

Search algorithms form an important part of many programs. Some searches involve looking
for an entry in a database, such as looking up your record in the IRS database. Other search
algorithms trawl through a virtual space, such as those hunting for the best chess moves.
Although programmers can choose from numerous search types, they select the algorithm
that best matches the size and structure of the database to provide a user-friendly
experience.
Linear Search
The linear search is the algorithm of choice for short lists, because it’s simple and requires
minimal code to implement. The linear search algorithm looks at the first list item to see
whether you are searching for it and, if so, you are finished. If not, it looks at the next item
and on through each entry in the list.
How does it work?
Linear search is the basic search algorithm used in data structures. It is also called
sequential search. Linear search is used to find a particular element in an array. Unlike
binary search, it does not require the array to be arranged in any order (ascending or
descending).

Linear search is rarely used in practice on large collections because other approaches, such
as binary search and hash tables, allow significantly faster searching than linear search.
Time complexity
The time complexity of linear search is O(n), since in the worst case every element must be examined.
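
The procedure described above can be sketched as a single loop. The function name `linear_search` and the convention of returning -1 when nothing is found are assumptions for this example.

```python
def linear_search(items, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    for index, item in enumerate(items):
        if item == target:
            return index  # found: stop looking
    return -1             # every element was examined without a match
```

For example, `linear_search([7, 2, 9, 4], 9)` returns 2, and the list does not need to be sorted.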

Binary Search
Binary Search is one of the most fundamental and useful algorithms in Computer Science.
It describes the process of searching for a specific value in an ordered collection.
Binary search is a popular algorithm for large databases with records ordered by numerical
key.

Binary Search is generally composed of 3 main sections:

1. Pre-processing: Sort the collection if it is unsorted.
2. Binary Search: Use a loop or recursion to divide the search space in half after each
comparison.
3. Post-processing: Determine viable candidates in the remaining space.
How does it work?
In its simplest form, Binary Search operates on a contiguous sorted sequence with a
specified left and right index. This range is called the Search Space. Binary Search
maintains the left, right, and middle indices of the search space and compares the search
target (or applies the search condition) to the middle value. If the values are unequal or the
condition is unsatisfied, the half in which the target cannot lie is eliminated and the search
continues on the remaining half. If the search space becomes empty, the condition cannot
be fulfilled and the target is not found.
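
An iterative version of this procedure might look like the sketch below. It assumes the input is already sorted in ascending order; the function name `binary_search` and the -1 return value for a missing target are choices made for this example.

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    left, right = 0, len(sorted_items) - 1
    while left <= right:                  # the search space is not empty
        mid = (left + right) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            left = mid + 1                # target can only be in the right half
        else:
            right = mid - 1               # target can only be in the left half
    return -1
```

For example, `binary_search([1, 3, 5, 7, 9, 11], 7)` returns 3 after inspecting the values 5, 9, and 7.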
Into Binary Search
Given a sample array, we first find the midpoint and compare it with the search value. If the
midpoint holds the search value, the search is over after one comparison, so the best case
is O(1).
If it does not, we continue the search in the half that can still contain the value. Each
comparison halves the search space, so the worst case needs on the order of log(n)
comparisons, i.e. O(log n), compared with O(n) for linear search.

The time complexity of Binary Search can be written as the recurrence

T(n) = T(n/2) + c

where c is the constant cost of one comparison.
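
The recurrence can be unrolled to obtain the logarithmic bound. The sketch below assumes n is a power of two and that T(1) is a constant.

```latex
\begin{aligned}
T(n) &= T(n/2) + c \\
     &= T(n/4) + 2c \\
     &= T(n/8) + 3c \\
     &\;\;\vdots \\
     &= T(n/2^{k}) + kc .
\end{aligned}
\qquad
\text{Setting } n/2^{k} = 1 \text{ gives } k = \log_2 n, \text{ so }
T(n) = T(1) + c \log_2 n = O(\log n).
```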

Ordering the elements of a list is a problem that occurs in many contexts.

Bubble Sort
Bubble sort starts by comparing the first two elements of an array and swapping them if
necessary. For example, to sort the array in ascending order, the two elements are swapped
if the first is greater than the second, and left in place if the first is smaller than the
second. Next, the second and third elements are compared and swapped if necessary, and
this process goes on until the last and second-to-last elements have been compared. This
completes the first pass of bubble sort.
To carry out the bubble sort, we perform the basic operation, that is, interchanging a larger
element with a smaller one following it, starting at the beginning of the list, for a full pass.
We iterate this procedure until the sort is complete.
Bubble sort is one of the least efficient sorting algorithms, despite its simplicity. While it is
asymptotically equivalent to other simple quadratic algorithms such as insertion sort, it
requires O(n²) swaps in the worst case.

Time Complexity
• Worst and Average Case Time Complexity: O(n²). The worst case occurs when the array
is reverse sorted.
• Best Case Time Complexity: O(n). The best case occurs when the array is already sorted,
provided the algorithm stops after a pass that makes no swaps.
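
A sketch of bubble sort with the early-exit check follows; the `swapped` flag is what gives the O(n) best case on already sorted input. The function name `bubble_sort` is an assumption, and the function sorts a copy so the caller's list is left unchanged.

```python
def bubble_sort(items):
    """Return a new list with the elements of items in ascending order."""
    data = list(items)
    # After each pass the largest remaining element sits at position `end`.
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if data[i] > data[i + 1]:
                # Interchange a larger element with the smaller one following it.
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break  # a full pass without swaps means the list is sorted
    return data
```

For example, `bubble_sort([5, 1, 4, 2, 8])` returns `[1, 2, 4, 5, 8]`.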

Insertion Sort
Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one
item at a time. It is much less efficient on large lists than more advanced algorithms such
as quicksort, heapsort, or merge sort. However, insertion sort has advantages: it is simple
to implement, efficient on small or nearly sorted inputs, stable, and sorts in place.

How it works
To sort a list with n elements, the insertion sort begins with the second element. The
insertion sort compares this second element with the first element and inserts it before the
first element if it does not exceed the first element and after the first element if it exceeds
the first element. At this point, the first two elements are in the correct order. The third
element is then compared with the first element, and if it is larger than the first element, it
is compared with the second element; it is inserted into the correct position among the first
three elements. The same step is repeated for each remaining element until the whole list
is sorted.

Time Complexity: O(n²)
Auxiliary Space: O(1)
Boundary Cases: Insertion sort takes maximum time when the elements are sorted in reverse
order, and minimum time (on the order of n) when the elements are already sorted.
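
A minimal in-place sketch of insertion sort is shown below. It scans the sorted prefix from right to left, which is a common variant of the description above; the function name `insertion_sort` is an assumption for this example.

```python
def insertion_sort(data):
    """Sort the list data in place in ascending order and return it."""
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        # Shift larger elements of the sorted prefix one position to the right.
        while j >= 0 and data[j] > current:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = current  # insert into the gap that was opened
    return data
```

On already sorted input the inner loop never runs, giving the order-of-n best case, and only a constant number of extra variables is used, matching the O(1) auxiliary space noted above.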
