Data structure

A data structure is a specialized format for organizing, managing, and
storing data efficiently. It defines the way data is arranged in memory,
allowing for optimized access and modification. Data structures are
fundamental to designing efficient algorithms and software applications.

Common Types of Data Structures:

1. Arrays:
A collection of elements identified by index or key. They have
fixed size and store elements of the same type.
2. Linked Lists:
A sequence of elements, where each element points to the next
one, allowing for dynamic memory allocation.
3. Stacks:
A linear data structure following the Last-In-First-Out (LIFO)
principle. Elements are added and removed from the top.
4. Queues:
A linear data structure following the First-In-First-Out (FIFO)
principle. Elements are added at the rear and removed from the
front.
5. Trees:
A hierarchical structure consisting of nodes, with each node
containing a value and pointers to its child nodes. Examples include
binary trees and AVL trees.
6. Graphs:
A collection of nodes (vertices) and edges, representing
relationships between pairs of nodes. They can be directed or
undirected.
7. Hash Tables:
A data structure that maps keys to values, using a hash function
to index an array of buckets or slots.
8. Heaps:
A specialized tree-based data structure that satisfies the heap
property, useful for priority queues.
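
To make a few of these types concrete, here is a minimal Java sketch (an
addition for illustration, not part of the original notes) using the standard
library: an array, a stack and a queue via ArrayDeque, and a hash table via
HashMap.

java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class DataStructureDemo {
    public static void main(String[] args) {
        // Array: fixed size, direct access by index
        int[] numbers = {10, 20, 30};
        System.out.println(numbers[1]); // 20

        // Stack (LIFO): push and pop from the top
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // 2 (last in, first out)

        // Queue (FIFO): add at the rear, remove from the front
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(1);
        queue.add(2);
        System.out.println(queue.remove()); // 1 (first in, first out)

        // Hash table: maps keys to values via a hash function
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Alice", 30);
        System.out.println(ages.get("Alice")); // 30
    }
}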

Significance of Data Structures:

• Efficiency:
Proper use of data structures can significantly improve the
efficiency of algorithms.
• Scalability:
They help in managing large amounts of data efficiently.
• Maintainability:
Well-organized data structures lead to more readable and
maintainable code.
• Memory Management:
They optimize memory usage and access patterns.

Algorithm
An algorithm is a step-by-step procedure or a set of rules designed
to solve a specific problem or perform a task. It serves as a blueprint for
carrying out computations, data processing, and automated reasoning
tasks. In computing, algorithms are essential for writing programs that
perform various functions efficiently and correctly.

Key Characteristics of Algorithms:

1. Finite Steps:
An algorithm must have a clear starting point and a definite end
after a finite number of steps.
2. Well-Defined Instructions:
Each step or operation in the algorithm must be precisely defined.
3. Input:
Algorithms take zero or more inputs to provide the required
information for processing.
4. Output:
An algorithm produces one or more outputs as a result of the
processing.
5. Effectiveness:
Every step in the algorithm must be simple enough to be carried
out, in principle, by a human using only pencil and paper.

Analysis of an algorithm
The analysis of an algorithm involves evaluating its performance,
typically in terms of time and space complexity. It helps in understanding
how efficiently an algorithm runs and how much memory it consumes,
depending on the size of the input.

Complexity analysis
Complexity analysis evaluates how efficient an algorithm is in terms
of time (how long it takes) and space (how much memory it uses). This
helps in understanding how an algorithm performs as the size of the input
grows, ensuring that we choose the best one for a given problem.

Time complexity
Time complexity is a measure of the amount of time an algorithm
takes to complete as a function of the size of the input. It provides a way to
evaluate the efficiency of an algorithm and to compare the performance of
different algorithms. Time complexity is expressed using Big O notation, which
describes the upper bound of the time required for an algorithm to run as a function of
the input size.

Common Time Complexities:

1. O(1) - Constant Time:
a. The time taken is constant and does not depend on the size of
the input.
b. Example: Accessing an element in an array by index.
2. O(log n) - Logarithmic Time:
a. The time taken increases logarithmically as the input size
increases.
b. Example: Binary search in a sorted array.
3. O(n) - Linear Time:
a. The time taken increases linearly with the input size.
b. Example: Iterating through an array.
4. O(n log n) - Linearithmic Time:
a. A combination of linear and logarithmic time complexities.
b. Example: Efficient sorting algorithms like Merge Sort and Quick
Sort.
5. O(n^2) - Quadratic Time:
a. The time taken increases quadratically with the input size.
b. Example: Bubble Sort, Insertion Sort, and Selection Sort.
6. O(2^n) - Exponential Time:
a. The time taken doubles with each additional element in the
input.
b. Example: Solving the Traveling Salesman Problem using brute
force.
7. O(n!) - Factorial Time:
a. The time taken grows factorially with the input size.
b. Example: Generating all permutations of a set.
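
As an illustration (an added sketch, not from the original notes), the two
methods below place a linear O(n) scan next to a quadratic O(n^2) pair
check; the nested loop in the second performs roughly n * (n - 1) / 2
comparisons.

java
public class ComplexityDemo {
    // O(n): one pass over the array (assumes arr is non-empty)
    static int max(int[] arr) {
        int best = arr[0];
        for (int x : arr) {
            if (x > best) best = x;
        }
        return best;
    }

    // O(n^2): compares every pair of elements
    static boolean hasDuplicate(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[i] == arr[j]) return true;
            }
        }
        return false;
    }
}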

Space complexity
Space complexity measures the amount of memory an algorithm uses in
relation to the input size. It includes the memory required to store the input
data (input space) and the extra memory used by the algorithm (auxiliary
space).

Key Aspects of Space Complexity:

1. Input Space:
a. The memory required to store the input data.
b. Depends on the size and type of the input.
2. Auxiliary Space:
a. The extra memory an algorithm uses aside from the input data.
b. Includes temporary variables, data structures, and function
call stacks.

Common Space Complexities:

• O(1) - Constant Space:
o The algorithm uses a fixed amount of memory, regardless of
input size.
o Example: Swapping two variables.
• O(n) - Linear Space:
o The memory usage grows linearly with the input size.
o Example: Creating a copy of an array.
• O(n^2) - Quadratic Space:
o The memory usage grows quadratically with the input size.
o Example: A 2D matrix representation of input data.
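
For illustration (an added sketch, not part of the original notes), the
methods below contrast O(1) and O(n) auxiliary space: reversing in place
uses only a few variables, while copying allocates a new array the size of
the input.

java
import java.util.Arrays;

public class SpaceDemo {
    // O(1) auxiliary space: reverses the array in place
    static void reverseInPlace(int[] arr) {
        for (int i = 0, j = arr.length - 1; i < j; i++, j--) {
            int temp = arr[i]; // constant extra memory
            arr[i] = arr[j];
            arr[j] = temp;
        }
    }

    // O(n) auxiliary space: allocates a copy the size of the input
    static int[] copyOf(int[] arr) {
        return Arrays.copyOf(arr, arr.length);
    }
}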

Asymptotic notations
Asymptotic notations are mathematical tools used to describe the
efficiency of algorithms in terms of time and space complexity as the input
size grows. They provide a high-level understanding of an algorithm's
performance and are essential for comparing different algorithms.

Here are the most common asymptotic notations:

1. Big O Notation (O)

• Purpose: Describes the upper bound of an algorithm's time or space
complexity.
• Indicates: The worst-case scenario.
• Example: O(n) means that the algorithm's performance grows
linearly with the input size.
2. Omega Notation (Ω)

• Purpose: Describes the lower bound of an algorithm's time or space
complexity.
• Indicates: The best-case scenario.
• Example: Ω(n) means that the algorithm takes at least linear time in
the best case.

3. Theta Notation (Θ)

• Purpose: Describes the exact bound of an algorithm's time or space
complexity.
• Indicates: Both the upper and lower bounds.
• Example: Θ(n) means that the algorithm's performance grows
linearly with the input size in both the best and worst cases.

4. Little o Notation (o)

• Purpose: Describes an upper bound that is not asymptotically tight.
• Indicates: A strict upper bound: the growth rate is strictly slower than
the given function.
• Example: o(n^2) means the algorithm's performance grows strictly
slower than quadratic for large input sizes; for instance, n is o(n^2)
because n / n^2 tends to 0 as n grows.

5. Little omega Notation (ω)

• Purpose: Describes a lower bound that is not asymptotically tight.
• Indicates: A strict lower bound: the growth rate is strictly faster than
the given function.
• Example: ω(n) means the algorithm's performance grows strictly
faster than linear for large input sizes; for instance, n^2 is ω(n)
because n^2 / n tends to infinity as n grows.

Big O Notation
Big O Notation is crucial for understanding the efficiency of algorithms in
Java, just as in any other programming language. It helps you evaluate how
the runtime or space requirements of an algorithm grow with the size of the
input.

Key Concepts

1. Upper Bound:
a. Big O Notation describes the upper bound or worst-case
scenario of an algorithm's time or space complexity.
b. It abstracts away constant factors and lower-order terms to
focus on the dominant term that affects growth as input size
increases.
2. Growth Rate:
a. Big O Notation helps in understanding the growth rate of an
algorithm, providing insights into its performance scalability.

EXAMPLE

O(log n) - Logarithmic Time

• Description: The algorithm’s running time grows logarithmically with
the input size.
• Example in Java: Binary search in a sorted array.

Example
java
public int binarySearch(int[] arr, int target) {
    int left = 0, right = arr.length - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2; // avoids int overflow of (left + right) / 2
        if (arr[mid] == target) {
            return mid; // Found after at most O(log n) halvings
        } else if (arr[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return -1;
}

O(n) - Linear Time

• Description: The running time grows linearly with the input size.
• Example in Java: Iterating through an array.

Example
java
public int sumArray(int[] arr) {
    int total = 0;
    for (int num : arr) {
        total += num; // O(n)
    }
    return total;
}

Theta Notation
Theta (Θ) Notation is used in computer science to describe the exact
asymptotic behavior of an algorithm’s complexity. It gives a tight bound on
the running time or space requirements of an algorithm by defining both
the upper and lower bounds.

Key Concepts of Theta Notation:

1. Exact Bound:
a. Θ Notation provides an exact asymptotic characterization of
an algorithm's growth rate.
b. It means the algorithm’s performance lies within a constant
factor of this bound for sufficiently large input sizes.
2. Expression:
a. An algorithm is said to have a time complexity of Θ(f(n)) if there
exist positive constants c1, c2, and n0 such that for all n ≥ n0,
the running time T(n) satisfies:

c1 · f(n) ≤ T(n) ≤ c2 · f(n)

For example, T(n) = 3n + 5 is Θ(n): taking c1 = 3, c2 = 4, and n0 = 5
gives 3n ≤ 3n + 5 ≤ 4n for all n ≥ 5.

Example:
Consider the linear search algorithm, which searches for an element in an
array:

java
public int linearSearch(int[] arr, int target) {
    for (int i = 0; i < arr.length; i++) {
        if (arr[i] == target) {
            return i; // Target found
        }
    }
    return -1; // Target not found
}

When the target is absent, the loop always runs exactly n times, so the
worst-case running time is Θ(n): at least and at most a constant multiple of n.

Omega Notation
Omega (Ω) Notation is used in computer science to describe the lower
bound of an algorithm's complexity. It provides a way to express the best-
case scenario of an algorithm's performance, essentially guaranteeing that
the algorithm will take at least this much time or use this much space.

Key Concepts of Omega Notation:

1. Lower Bound:
a. Definition: Ω Notation describes the minimum time or space
an algorithm will require for any input of size n.
b. Purpose: To establish a baseline: the algorithm cannot run
faster than this bound.
2. Expression:
a. An algorithm is said to have a time complexity of Ω(f(n)) if there
exist positive constants c and n0 such that for all n ≥ n0, the
running time T(n) satisfies:

T(n) ≥ c · f(n)

Significance:

• Best Case: Unlike Big O Notation, which focuses on the worst case, Ω
Notation focuses on the best-case scenario.
• Performance Guarantee: It guarantees that the algorithm will not
run faster than this bound.

Example:

Consider the linear search algorithm:

java
public int linearSearch(int[] arr, int target) {
    for (int i = 0; i < arr.length; i++) {
        if (arr[i] == target) {
            return i; // Target found
        }
    }
    return -1; // Target not found
}

In the best case the target sits at index 0 and the search finishes after a
single comparison, so linear search is Ω(1).

Logarithmic time complexity
Logarithmic time complexity, denoted as O(log n), describes an algorithm
whose running time grows logarithmically in relation to the input size. This
means that the time it takes to complete the algorithm increases slowly as
the size of the input grows larger.

Key Characteristics of Logarithmic Time Complexity:

• Divide and Conquer: Often, algorithms with logarithmic time
complexity follow a divide-and-conquer approach, where the
problem size is reduced by a constant factor (usually by half) in each
step.
• Efficiency: Logarithmic time complexity is highly efficient, especially
for large datasets, as the number of operations required grows very
slowly compared to the input size.

Common Examples:

1. Binary Search: One of the most well-known algorithms with O(log n)
time complexity. It works by repeatedly dividing a sorted array in half
to find a target element.

java
public int binarySearch(int[] arr, int target) {
    int left = 0, right = arr.length - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2; // avoids int overflow of (left + right) / 2
        if (arr[mid] == target) {
            return mid; // Target found
        } else if (arr[mid] < target) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return -1; // Target not found
}

Linear search
Linear search is a fundamental algorithm used to find an element within a
list or array. It works by sequentially checking each element of the list until
the desired element is found or the list ends.

Characteristics:

• Time Complexity: O(n), where n is the number of elements in the
list. This means that in the worst-case scenario, every element in the
list must be checked.
• Space Complexity: O(1), which means it requires a constant
amount of additional memory regardless of the input size.
• Unsorted Data: Linear search can be used on both sorted and
unsorted lists.
• Simplicity: It is simple and easy to implement but not very efficient
for large lists compared to more advanced search algorithms like
binary search.

Example in Java:

java
public class LinearSearch {

    public static int linearSearch(int[] arr, int target) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == target) {
                return i; // Target found, return index
            }
        }
        return -1; // Target not found
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40, 50};
        int target = 30;

        int result = linearSearch(array, target);
        if (result != -1) {
            System.out.println("Element found at index: " + result);
        } else {
            System.out.println("Element not found in the array.");
        }
    }
}

Binary Search
Binary search is an efficient algorithm used to find the position of a target
value within a sorted array. It works by repeatedly dividing the search
interval in half. If the target value is less than the middle element of the
interval, the search continues in the lower half; otherwise, it continues in
the upper half. This process is repeated until the target value is found or
the interval is empty.

Characteristics:

• Time Complexity: O(log n), where n is the number of elements in the
array. This makes binary search much more efficient than linear
search, especially for large datasets.
• Space Complexity: O(1) for the iterative implementation, which means
it uses a constant amount of extra memory. For the recursive
implementation, the space complexity is O(log n) due to the call
stack (see the recursive sketch after the example below).

Example in Java:

java
public class BinarySearch {
    public static int binarySearch(int[] arr, int target) {
        int left = 0;
        int right = arr.length - 1;

        while (left <= right) {
            int mid = left + (right - left) / 2; // avoids int overflow of (left + right) / 2
            if (arr[mid] == target) {
                return mid; // Target found, return index
            } else if (arr[mid] < target) {
                left = mid + 1; // Search in the right half
            } else {
                right = mid - 1; // Search in the left half
            }
        }
        return -1; // Target not found
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40, 50};
        int target = 30;
        int result = binarySearch(array, target);
        if (result != -1) {
            System.out.println("Element found at index: " + result);
        } else {
            System.out.println("Element not found in the array.");
        }
    }
}
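
To illustrate the O(log n) call-stack space mentioned above, here is a
minimal recursive variant (an added sketch, not part of the original notes);
each call halves the search interval, so the recursion depth is about log2 n.

java
public class RecursiveBinarySearch {
    // Each call halves [left, right], so the call stack is O(log n) deep
    public static int binarySearch(int[] arr, int target, int left, int right) {
        if (left > right) {
            return -1; // Interval is empty: target not found
        }
        int mid = left + (right - left) / 2;
        if (arr[mid] == target) {
            return mid;
        } else if (arr[mid] < target) {
            return binarySearch(arr, target, mid + 1, right); // Right half
        } else {
            return binarySearch(arr, target, left, mid - 1);  // Left half
        }
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40, 50};
        System.out.println(binarySearch(array, 30, 0, array.length - 1)); // 2
    }
}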

Selection sort
Selection sort is a simple comparison-based sorting algorithm. It works by
repeatedly finding the minimum (or maximum) element from the unsorted
part of the array and moving it to the beginning (or end).

Characteristics:

• Time Complexity: O(n^2) for both the best and worst cases, as it
always involves two nested loops.
• Space Complexity: O(1) since it sorts the array in place and does
not require additional memory.
• Stability: Selection sort is not stable, as it may change the relative
order of equal elements.
• Simplicity: It is easy to understand and implement but not efficient
for large datasets compared to more advanced algorithms like
quicksort or mergesort.

Pseudocode for Selection Sort:

procedure selectionSort(arr: list of items)
    n = length(arr)                  // Get the length of the array
    for i = 0 to n - 1 do            // Iterate over each element in the array
        minIndex = i                 // Assume the current index is the minimum
        for j = i + 1 to n - 1 do    // Iterate over the remaining unsorted elements
            if arr[j] < arr[minIndex] then
                minIndex = j         // Update minIndex if a smaller element is found
            end if
        end for
        // Swap the minimum element with the first unsorted element
        if minIndex != i then
            swap(arr[i], arr[minIndex])
        end if
    end for
end procedure

procedure swap(a, b: int)
    temp = a
    a = b
    b = temp
end procedure

Example in Java:

java
import java.util.Arrays;

public class SelectionSort {
    public static void selectionSort(int[] arr) {
        int n = arr.length;
        for (int i = 0; i < n - 1; i++) {
            // Find the minimum element in the unsorted part
            int minIndex = i;
            for (int j = i + 1; j < n; j++) {
                if (arr[j] < arr[minIndex]) {
                    minIndex = j;
                }
            }
            // Swap the found minimum element with the first unsorted element
            int temp = arr[minIndex];
            arr[minIndex] = arr[i];
            arr[i] = temp;
        }
    }

    public static void main(String[] args) {
        int[] array = {64, 25, 12, 22, 11};
        System.out.println("Unsorted Array:");
        System.out.println(Arrays.toString(array));

        selectionSort(array);
        System.out.println("Sorted Array:");
        System.out.println(Arrays.toString(array));
    }
}

Bubble sort
Bubble sort is a simple comparison-based sorting algorithm. It works by
repeatedly stepping through the list to be sorted, comparing each pair of
adjacent elements, and swapping them if they are in the wrong order. This
process is repeated until the list is sorted. The algorithm gets its name
because smaller elements "bubble" to the top of the list.

Characteristics:

• Time Complexity:
o Worst-case: O(n^2)
o Average-case: O(n^2)
o Best-case: O(n) (when the array is already sorted)
• Space Complexity: O(1), meaning it sorts the array in place without
needing extra memory.
• Stability: Bubble sort is stable, meaning it maintains the relative
order of equal elements.
• Simplicity: It is easy to understand and implement but inefficient for
large datasets.

Pseudocode for Bubble Sort:

procedure bubbleSort(arr: list of items)
    n = length(arr)                       // Get the length of the array
    for i = 0 to n - 1 do                 // Loop through each element in the array
        swapped = false                   // Initialize swapped as false
        for j = 0 to n - i - 2 do         // Inner loop for comparing adjacent elements
            if arr[j] > arr[j + 1] then
                swap(arr[j], arr[j + 1])  // Swap the elements
                swapped = true            // Set swapped to true after a swap
            end if
        end for
        // If no elements were swapped, the array is sorted
        if swapped = false then
            break
        end if
    end for
end procedure

procedure swap(a, b: int)
    temp = a
    a = b
    b = temp
end procedure

Example in Java:

java
import java.util.Arrays;

public class BubbleSort {
    public static void bubbleSort(int[] arr) {
        int n = arr.length;
        boolean swapped;
        for (int i = 0; i < n - 1; i++) {
            swapped = false;
            for (int j = 0; j < n - i - 1; j++) {
                if (arr[j] > arr[j + 1]) {
                    // Swap arr[j] and arr[j + 1]
                    int temp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                    swapped = true;
                }
            }
            // If no two elements were swapped, the array is sorted
            if (!swapped) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        int[] array = {64, 34, 25, 12, 22, 11, 90};
        System.out.println("Unsorted Array:");
        System.out.println(Arrays.toString(array));

        bubbleSort(array);
        System.out.println("Sorted Array:");
        System.out.println(Arrays.toString(array));
    }
}

Insertion Sort
Insertion sort is a simple and intuitive comparison-based sorting algorithm.
It builds the final sorted array one element at a time by repeatedly picking
the next element and inserting it into its correct position among the
already-sorted elements. This process is akin to the way you might sort
playing cards in your hand.
Characteristics:

• Time Complexity:
o Worst-case: O(n^2)
o Average-case: O(n^2)
o Best-case: O(n) (when the array is already sorted)
• Space Complexity: O(1), since it sorts the array in place without
requiring additional memory.
• Stability: Insertion sort is stable, meaning it maintains the relative
order of equal elements.
• Simplicity: It is easy to implement and efficient for small or nearly
sorted arrays.

Pseudocode for Insertion Sort:

procedure insertionSort(arr: list of items)
    n = length(arr)          // Get the length of the array
    for i = 1 to n - 1 do    // Iterate from the second element to the last element
        key = arr[i]         // Store the current element in key
        j = i - 1            // Initialize j to the previous index
        // Move elements of arr[0..i-1] that are greater than key
        // one position ahead of their current position
        while j >= 0 and arr[j] > key do
            arr[j + 1] = arr[j]    // Shift element to the right
            j = j - 1              // Decrement j
        end while
        arr[j + 1] = key     // Insert the key at its correct position
    end for
end procedure

Example in Java:

java
import java.util.Arrays;

public class InsertionSort {
    public static void insertionSort(int[] arr) {
        int n = arr.length;
        for (int i = 1; i < n; ++i) {
            int key = arr[i];
            int j = i - 1;

            // Move elements of arr[0..i-1] that are greater than key
            // one position ahead of their current position
            while (j >= 0 && arr[j] > key) {
                arr[j + 1] = arr[j];
                j = j - 1;
            }
            arr[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] array = {12, 11, 13, 5, 6};
        System.out.println("Unsorted Array:");
        System.out.println(Arrays.toString(array));

        insertionSort(array);
        System.out.println("Sorted Array:");
        System.out.println(Arrays.toString(array));
    }
}

Quick Sort
Quicksort is one of the most efficient sorting algorithms, and it works by
selecting a 'pivot' element from the array and partitioning the other
elements into two sub-arrays, according to whether they are less than or
greater than the pivot. The sub-arrays are then sorted recursively.

Characteristics:

• Time Complexity:
o Average-case: O(n log n)
o Worst-case: O(n^2) (rare, usually mitigated by choosing a good
pivot)
• Space Complexity: O(log n) for the recursive stack space.
• In-place: Uses constant extra space for sorting (apart from the
recursion stack).
• Divide-and-Conquer: Efficiently handles large datasets.

Pseudocode for Quicksort:

procedure quickSort(arr: list of items, low: int, high: int)
    if low < high then
        // Partition the array and get the pivot index
        pi = partition(arr, low, high)
        // Recursively sort elements before and after partition
        quickSort(arr, low, pi - 1)
        quickSort(arr, pi + 1, high)
    end if
end procedure

procedure partition(arr: list of items, low: int, high: int) -> int
    pivot = arr[high]    // Choose the pivot element
    i = low - 1          // Index of the smaller element
    for j = low to high - 1 do
        if arr[j] <= pivot then
            i = i + 1                // Increment the index of the smaller element
            swap(arr[i], arr[j])     // Swap the current element with the element at index i
        end if
    end for
    swap(arr[i + 1], arr[high])      // Swap the pivot element with the element at index i + 1
    return i + 1                     // Return the partition index
end procedure

procedure swap(a: int, b: int)
    temp = a
    a = b
    b = temp
end procedure
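
The original notes give no Java example for quicksort, so here is a minimal
sketch (an addition for illustration) that follows the Lomuto partition
scheme from the pseudocode above, using the last element as the pivot.

java
import java.util.Arrays;

public class QuickSort {
    public static void quickSort(int[] arr, int low, int high) {
        if (low < high) {
            int pi = partition(arr, low, high); // Pivot index
            quickSort(arr, low, pi - 1);        // Sort the left sub-array
            quickSort(arr, pi + 1, high);       // Sort the right sub-array
        }
    }

    private static int partition(int[] arr, int low, int high) {
        int pivot = arr[high]; // Choose the last element as the pivot
        int i = low - 1;       // Index of the smaller element
        for (int j = low; j < high; j++) {
            if (arr[j] <= pivot) {
                i++;
                int temp = arr[i]; // Move the smaller element to the left side
                arr[i] = arr[j];
                arr[j] = temp;
            }
        }
        int temp = arr[i + 1]; // Place the pivot at its final position
        arr[i + 1] = arr[high];
        arr[high] = temp;
        return i + 1;
    }

    public static void main(String[] args) {
        int[] array = {10, 7, 8, 9, 1, 5};
        quickSort(array, 0, array.length - 1);
        System.out.println(Arrays.toString(array)); // [1, 5, 7, 8, 9, 10]
    }
}
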
Heap Sort
Heap sort is a comparison-based sorting algorithm that uses a binary heap
data structure. It is an efficient algorithm with a time complexity of O(n log
n) and is particularly useful for sorting large datasets. Heap sort works by
first converting the array into a max heap, then repeatedly extracting the
maximum element from the heap and rebuilding the heap until the array is
sorted.

Characteristics:

• Time Complexity:
o Worst-case: O(n log n)
o Average-case: O(n log n)
o Best-case: O(n log n)
• Space Complexity: O(1) as it is an in-place sorting algorithm.
• Stability: Heap sort is not stable, meaning it does not maintain the
relative order of equal elements.
• Efficiency: It is efficient for large datasets but may not be as fast as
quicksort in practice due to less efficient data access patterns.

Pseudocode for Heap Sort:

procedure heapSort(arr: list of items)
    n = length(arr)    // Get the length of the array

    // Build a max heap
    for i = n / 2 - 1 downto 0 do
        heapify(arr, n, i)
    end for

    // Extract elements from the heap one by one
    for i = n - 1 downto 1 do
        // Move the current root to the end
        swap(arr[0], arr[i])
        // Call max heapify on the reduced heap
        heapify(arr, i, 0)
    end for
end procedure

procedure heapify(arr: list of items, n: int, i: int)
    largest = i           // Initialize largest as root
    left = 2 * i + 1      // Left child index
    right = 2 * i + 2     // Right child index

    // If left child is larger than root
    if left < n and arr[left] > arr[largest] then
        largest = left
    end if

    // If right child is larger than largest so far
    if right < n and arr[right] > arr[largest] then
        largest = right
    end if

    // If largest is not root
    if largest != i then
        swap(arr[i], arr[largest])
        // Recursively heapify the affected sub-tree
        heapify(arr, n, largest)
    end if
end procedure

procedure swap(a, b: int)
    temp = a
    a = b
    b = temp
end procedure

Example in Java:

java
import java.util.Arrays;

public class HeapSort {
    public static void heapSort(int[] arr) {
        int n = arr.length;

        // Build max heap
        for (int i = n / 2 - 1; i >= 0; i--) {
            heapify(arr, n, i);
        }

        // Extract elements from the heap
        for (int i = n - 1; i > 0; i--) {
            // Swap the root (maximum element) with the last element
            int temp = arr[0];
            arr[0] = arr[i];
            arr[i] = temp;

            // Rebuild the heap with the reduced size
            heapify(arr, i, 0);
        }
    }

    private static void heapify(int[] arr, int n, int i) {
        int largest = i;          // Initialize largest as root
        int left = 2 * i + 1;     // Left child
        int right = 2 * i + 2;    // Right child

        // If left child is larger than root
        if (left < n && arr[left] > arr[largest]) {
            largest = left;
        }

        // If right child is larger than largest so far
        if (right < n && arr[right] > arr[largest]) {
            largest = right;
        }

        // If largest is not root
        if (largest != i) {
            int swap = arr[i];
            arr[i] = arr[largest];
            arr[largest] = swap;

            // Recursively heapify the affected sub-tree
            heapify(arr, n, largest);
        }
    }

    public static void main(String[] args) {
        int[] array = {12, 11, 13, 5, 6, 7};
        System.out.println("Unsorted Array:");
        System.out.println(Arrays.toString(array));

        heapSort(array);
        System.out.println("Sorted Array:");
        System.out.println(Arrays.toString(array));
    }
}
