Unit 2: Data and File Structures (Part 1)
Sorting is the process of arranging data in a particular order, typically ascending or
descending order. Sorting is important in data structures because it makes data easier to search,
faster to retrieve, and better organized for processing.
Types of Sorting:
1. Ascending Order: The data is arranged from the smallest to the largest.
2. Descending Order: The data is arranged from the largest to the smallest.
Example of Sorting:
[12, 5, 9, 1, 15, 8]
1. Start with the first element and compare it with the next element.
2. If the first element is greater than the second element, swap them.
3. Move to the next pair of elements and repeat the process.
4. Continue this until the entire array is sorted.
First pass:
[12, 5, 9, 1, 15, 8] → [5, 12, 9, 1, 15, 8] → [5, 9, 12, 1, 15, 8] → [5, 9, 1, 12, 15, 8] → [5, 9, 1, 12, 8, 15]
After the first pass, the largest number (15) is at the end of the array.
Second pass:
[5, 9, 1, 12, 8, 15] → [5, 1, 9, 12, 8, 15] → [5, 1, 9, 8, 12, 15]
After the second pass, the second largest number (12) is in its correct place.
Continue this process until the entire array is sorted: [1, 5, 8, 9, 12, 15].
Sorting Algorithms:
Bubble Sort: A simple but inefficient algorithm that compares adjacent elements and swaps
them if they are in the wrong order.
Selection Sort: Divides the array into two parts, one sorted and one unsorted, and selects
the smallest element from the unsorted part and places it at the end of the sorted part.
Insertion Sort: Builds the sorted array one element at a time by picking the next element
and inserting it into the correct position in the already sorted part.
Merge Sort: A divide-and-conquer algorithm that divides the array into two halves,
recursively sorts them, and then merges the sorted halves.
Quick Sort: Another divide-and-conquer algorithm that selects a pivot element and
partitions the array around it, sorting the subarrays recursively.
Sorting is a fundamental operation in many algorithms, such as searching, merging data sets, or
optimizing processes.
Bubble Sort
Bubble Sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent
elements, and swaps them if they are in the wrong order. The algorithm gets its name because
smaller elements "bubble" to the top (beginning of the array) and larger elements "sink" to the
bottom (end of the array) with each pass.
1. Starting from the first element, compare it with the next element.
2. If the first element is greater than the second, swap them.
3. Move to the next pair of elements, compare, and swap if necessary.
4. After one full pass through the array, the largest element will have "bubbled up" to the last
position.
5. Repeat the process for the remaining unsorted part of the array, reducing the number of
comparisons each time, since the largest elements are already sorted at the end.
Example array: [5, 2, 9, 1, 5, 6]
Pass 1:
[5, 2, 9, 1, 5, 6] → [2, 5, 9, 1, 5, 6] → [2, 5, 1, 9, 5, 6] → [2, 5, 1, 5, 9, 6] → [2, 5, 1, 5, 6, 9]
After the first pass, the largest element (9) is in its correct position.
Pass 2:
[2, 5, 1, 5, 6, 9] → [2, 1, 5, 5, 6, 9]
After the second pass, the two largest elements are in place.
Pass 3:
[2, 1, 5, 5, 6, 9] → [1, 2, 5, 5, 6, 9]
No swaps occur in the following pass, so the array is sorted: [1, 2, 5, 5, 6, 9].
Time Complexity:
Best Case: O(n) – if the array is already sorted and an early-exit check is used (no swaps in a pass).
Average and Worst Case: O(n²) – every pair of elements may need to be compared and swapped.
Space Complexity:
O(1) – Bubble Sort sorts in place and needs only a constant amount of extra space.
Bubble Sort is simple to understand but not efficient for large datasets due to its O(n²) time
complexity. It is mainly used for educational purposes or small arrays.
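To make the steps above concrete, here is a minimal Python sketch of Bubble Sort. The function name bubble_sort and the early-exit swapped flag are illustrative choices:

def bubble_sort(arr):
    """Sort arr in place in ascending order using Bubble Sort."""
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i elements are already in their final places.
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:  # no swaps in a full pass: the array is sorted (best case O(n))
            break
    return arr

print(bubble_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]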
Selection Sort
Selection Sort is a simple and intuitive sorting algorithm that works by repeatedly finding the
smallest (or largest, depending on the order) element from the unsorted part of the array and
swapping it with the first unsorted element. This process continues until the entire array is sorted.
Example array: [64, 25, 12, 22, 11]
Pass 1:
The smallest element in the whole array is 11.
Swap 11 with the first element (64).
After the swap, the array becomes: [11, 25, 12, 22, 64]
Pass 2:
Now consider the subarray starting from the second element: [25, 12, 22, 64].
The smallest element in this subarray is 12.
Swap 12 with the second element (25).
After the swap, the array becomes: [11, 12, 25, 22, 64]
Pass 3:
Now consider the subarray starting from the third element: [25, 22, 64].
The smallest element in this subarray is 22.
Swap 22 with the third element (25).
After the swap, the array becomes: [11, 12, 22, 25, 64]
Pass 4:
Now consider the subarray starting from the fourth element: [25, 64].
The smallest element in this subarray is 25, which is already in the correct position.
No swap is needed.
Pass 5:
Only the last element ([64]) remains, and it is already in its correct position.
Final sorted array: [11, 12, 22, 25, 64]
Time Complexity:
Best, Worst, and Average Case: O(n²) because, in each pass, you need to find the minimum
from the unsorted portion of the array (which takes O(n) time) and do this for each element.
Space Complexity:
O(1), as the algorithm is in-place, meaning it only requires a constant amount of extra space
for the swaps.
Key Characteristics:
In-place: It doesn't require extra space for another array; elements are only swapped within the array.
Unstable: The relative order of equal elements might change.
Simple: Easy to understand and implement, but inefficient for large datasets.
Selection Sort is rarely used in practice for large datasets because its time complexity is O(n²).
However, it can be useful for small arrays or when memory space is very limited because it doesn't
require additional memory.
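A minimal Python sketch of Selection Sort, matching the passes shown above (names such as selection_sort and min_idx are illustrative):

def selection_sort(arr):
    """Sort arr in place by repeatedly selecting the minimum of the unsorted part."""
    n = len(arr)
    for i in range(n - 1):
        min_idx = i
        # Find the smallest element in the unsorted portion arr[i:].
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Swap it with the first unsorted element, extending the sorted prefix.
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr

print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]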
Insertion Sort
Insertion Sort is a simple sorting algorithm that builds the sorted array one element at a time by
repeatedly picking the next element and inserting it into the correct position within the already
sorted portion of the array. It is much like the way you might sort playing cards in your hands:
starting with one card and inserting each new card into its correct position in the sorted hand.
1. Start from the second element (because a single element is already considered sorted).
2. Compare the current element with the elements in the sorted portion of the array (to its
left).
3. Shift all larger elements one position to the right to make space for the current element.
4. Insert the current element into its correct position.
5. Continue this process for all elements until the entire array is sorted.
Example array: [5, 2, 9, 1, 5, 6]
Pass 1:
Current element: 2
Compare 2 with 5 (the element before it). Since 2 is smaller, shift 5 one position to the right.
Insert 2 in the first position.
Array after the first pass:
[2, 5, 9, 1, 5, 6]
Pass 2:
Current element: 9
Compare 9 with 5 (the element before it). Since 9 is larger, no shifting is needed.
Insert 9 in its current position.
Array after the second pass:
[2, 5, 9, 1, 5, 6]
Pass 3:
Current element: 1
Compare 1 with 9 → shift 9 one position to the right.
Compare 1 with 5 → shift 5 one position to the right.
Compare 1 with 2 → shift 2 one position to the right.
Insert 1 in the first position.
Array after the third pass:
[1, 2, 5, 9, 5, 6]
Pass 4:
Current element: 5
Compare 5 with 9 → shift 9 one position to the right.
Compare 5 with 5 → no shift needed (they are equal).
Insert 5 in the position before 9.
Array after the fourth pass:
[1, 2, 5, 5, 9, 6]
Pass 5:
Current element: 6
Compare 6 with 9 → shift 9 one position to the right.
Compare 6 with 5 → no shift needed.
Insert 6 after the second 5, in the position just before 9.
Array after the fifth pass:
[1, 2, 5, 5, 6, 9]
Time Complexity:
Best Case (already sorted array): O(n) – In this case, each element only needs to be
compared with the element before it, and no shifts occur.
Worst Case (reverse sorted array): O(n²) – Every element will need to be shifted past all
previously sorted elements.
Average Case: O(n²)
Space Complexity:
O(1) – Insertion Sort is an in-place sorting algorithm, meaning it only requires a constant
amount of extra space.
Insertion Sort is particularly useful for applications that require frequent insertion of new data into
an already sorted collection, like maintaining a sorted list in a dynamic environment.
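A minimal Python sketch of Insertion Sort, following the shift-and-insert steps above (the function name and the variable key are illustrative):

def insertion_sort(arr):
    """Sort arr in place by inserting each element into the sorted prefix to its left."""
    for i in range(1, len(arr)):
        key = arr[i]      # current element to insert
        j = i - 1
        # Shift larger elements one position to the right.
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key  # equal elements are never shifted, so the sort is stable
    return arr

print(insertion_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]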
Quick Sort
Quick Sort is a highly efficient, divide-and-conquer sorting algorithm that works by selecting a
"pivot" element from the array, partitioning the array into two subarrays, and recursively sorting
those subarrays. The key idea behind Quick Sort is that it efficiently sorts the data by comparing and
partitioning rather than using nested loops like simpler algorithms such as Bubble Sort or Selection
Sort.
1. Pick a pivot element from the array. Various strategies can be used to pick the pivot (e.g.,
choosing the first element, the last element, the middle element, or even a random
element).
2. Partition the array: Reorder the array so that:
o All elements smaller than the pivot are placed before it.
o All elements greater than the pivot are placed after it.
o After the partitioning, the pivot is in its final sorted position.
3. Recursively apply the above steps to the left and right subarrays (the elements before and
after the pivot).
4. Continue recursively dividing and sorting the subarrays until the base case is reached, which
occurs when the subarray has only one element (or is empty).
Example array: [10, 80, 30, 90, 40, 50, 70]
Step 1: Choose a pivot. Here we pick the last element, 70.
Step 2: Partition the array. We compare each element with the pivot (70) and rearrange the array so that:
o All elements smaller than 70 are on the left.
o All elements greater than 70 are on the right.
Step 3: After partitioning, the array becomes [10, 30, 40, 50, 70, 90, 80], and the pivot (70) is in its final sorted position.
Step 4: Apply Quick Sort on the Left Subarray [10, 30, 40, 50]
Choosing the last element (50) as the pivot and partitioning puts 50 in its correct position. The remaining subarray [10, 30, 40] is recursively sorted in the same way.
Step 5: Apply Quick Sort on the Right Subarray [90, 80]
Choosing 80 as the pivot and partitioning gives [80, 90].
Final Sorted Array: [10, 30, 40, 50, 70, 80, 90]
Time Complexity:
Best Case: O(n log n) – Occurs when the pivot divides the array into roughly two equal
halves.
Average Case: O(n log n) – This is typically the case when the pivot element is chosen
randomly or when the input array is randomly distributed.
Worst Case: O(n²) – This happens when the pivot is always the smallest or largest element
(e.g., in already sorted or reverse-sorted arrays).
Space Complexity:
O(log n) – This is the space complexity for the recursive stack in the average case (when the
pivot divides the array into balanced subarrays).
O(n) – In the worst case (when the array is sorted in reverse order and the pivot always picks
the smallest or largest element), the recursion depth could go as deep as the size of the
array, leading to a space complexity of O(n).
Key Characteristics:
Efficient on large datasets: Quick Sort is one of the fastest sorting algorithms for large datasets.
In-place sorting: It doesn't require extra space for another array, making it memory-
efficient.
Unstable: It may change the relative order of equal elements (e.g., it might not maintain the
original order of elements with the same value).
Divide-and-conquer: Quick Sort uses the divide-and-conquer approach, which helps break
down the problem into smaller sub-problems.
Quick Sort is generally preferred for large datasets because, on average, it has a time complexity of
O(n log n), making it faster than other O(n²) algorithms like Bubble Sort, Selection Sort, and Insertion
Sort. However, care must be taken to avoid the worst-case scenario by choosing a good pivot
(randomized or using techniques like the median-of-three pivot selection).
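A minimal Python sketch of Quick Sort that, like the worked example above, uses the last element as the pivot (a Lomuto-style partition; the function names are illustrative):

def partition(arr, low, high):
    """Place arr[high] (the pivot) into its final position and return that index."""
    pivot = arr[high]
    i = low - 1  # boundary of the "smaller than pivot" region
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]  # move pivot into place
    return i + 1

def quick_sort(arr, low=0, high=None):
    """Sort arr[low..high] in place by partitioning and recursing on both sides."""
    if high is None:
        high = len(arr) - 1
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)   # left subarray (elements < pivot)
        quick_sort(arr, p + 1, high)  # right subarray (elements > pivot)
    return arr

print(quick_sort([10, 80, 30, 90, 40, 50, 70]))  # [10, 30, 40, 50, 70, 80, 90]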
Merge Sort
Merge Sort is a divide-and-conquer sorting algorithm that breaks down the array into smaller
subarrays, recursively sorts them, and then merges the sorted subarrays to produce the final sorted
array. It is highly efficient for large datasets and guarantees a time complexity of O(n log n), making
it one of the most efficient comparison-based sorting algorithms.
1. Divide: Recursively split the array into two halves until each subarray contains only one
element (a single element is considered sorted).
2. Conquer: Once the array is divided into individual elements, merge them back together in
sorted order. During the merge process, the algorithm compares the elements of the
subarrays and combines them into a larger sorted subarray.
3. Combine: The merging process continues recursively, eventually combining the subarrays
into a fully sorted array.
[38, 27, 43, 3, 9, 82, 10]
↙ ↘
[38, 27, 43] [3, 9, 82, 10]
↙ ↘ ↙ ↘
[38, 27] [43] [3, 9] [82, 10]
↙ ↘ ↙ ↘
[38] [27] [43] [3] [9] [82] [10]
Now we begin merging the individual elements back together in sorted order.
Merge [38] and [27] → [27, 38]
Merge [27, 38] and [43] → [27, 38, 43]
Merge [3] and [9] → [3, 9]
Merge [82] and [10] → [10, 82]
Merge [3, 9] and [10, 82] → [3, 9, 10, 82]
Merge [27, 38, 43] and [3, 9, 10, 82] → [3, 9, 10, 27, 38, 43, 82]
Time Complexity:
Best, Average, and Worst Case: O(n log n) – the array is halved O(log n) times, and each level of merging takes O(n) work.
Space Complexity:
O(n): Merge Sort requires extra space to store the temporary subarrays during the merging
process, which makes it not an in-place sorting algorithm.
Merge Sort is ideal for applications where stable sorting is required, such as when sorting
linked lists or large datasets that don't fit into memory (external sorting). It's also very useful
when you need guaranteed O(n log n) time performance regardless of the input data.
However, due to its extra space requirement, it might not be as efficient as in-place
algorithms like Quick Sort when memory usage is a concern.
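A minimal Python sketch of Merge Sort (this version returns a new list rather than sorting in place; the function names are illustrative):

def merge(left, right):
    """Merge two sorted lists into one sorted list (<= keeps equal elements stable)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # append whichever side has leftovers
    result.extend(right[j:])
    return result

def merge_sort(arr):
    """Divide the list in half, sort each half recursively, then merge."""
    if len(arr) <= 1:         # base case: one element is already sorted
        return arr
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]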
1. Bubble Sort
Time Complexity:
o Best Case: O(n) (if the array is already sorted)
o Average and Worst Case: O(n²)
Space Complexity: O(1) (in-place sorting)
Stability: Stable (doesn't change the relative order of equal elements)
Efficiency:
o Very inefficient for large datasets.
o Best suited for small arrays or when simplicity is more important than performance.
Use Case: Educational purposes or when working with small datasets.
2. Selection Sort
Time Complexity:
o Best, Average, and Worst Case: O(n²)
Space Complexity: O(1) (in-place sorting)
Stability: Unstable (relative order of equal elements might change)
Efficiency:
o More efficient than Bubble Sort in practice (less swapping), but still inefficient for
large datasets.
Use Case: Good for small arrays or when memory space is limited.
3. Insertion Sort
Time Complexity:
o Best Case: O(n) (if the array is already sorted)
o Average and Worst Case: O(n²)
Space Complexity: O(1) (in-place sorting)
Stability: Stable
Efficiency:
o Highly efficient for small datasets or nearly sorted arrays.
o For larger datasets, it’s slower than more efficient algorithms like Quick Sort or
Merge Sort.
Use Case: Used for small datasets, partially sorted datasets, or when simplicity is desired.
4. Quick Sort
Time Complexity:
o Best and Average Case: O(n log n) (when the pivot divides the array fairly evenly)
o Worst Case: O(n²) (if the pivot is always the smallest or largest element)
Space Complexity: O(log n) (recursive stack space)
Stability: Unstable (relative order of equal elements might change)
Efficiency:
o Very efficient for large datasets on average.
o The worst-case time complexity is rare with good pivot selection (e.g., random or
median-of-three pivot).
Use Case: Quick Sort is ideal for large datasets, and when time efficiency is crucial. It is the
go-to algorithm in most practical situations.
5. Merge Sort
Time Complexity:
o Best, Average, and Worst Case: O(n log n)
Space Complexity: O(n) (requires additional space for temporary subarrays)
Stability: Stable
Efficiency:
o Consistently efficient with O(n log n) time complexity.
o Space complexity is higher compared to in-place algorithms.
Use Case: Ideal for large datasets that require guaranteed performance and stable sorting.
Excellent for sorting linked lists and external sorting.
6. Heap Sort
Time Complexity:
o Best, Average, and Worst Case: O(n log n)
Space Complexity: O(1) (in-place sorting)
Stability: Unstable
Efficiency:
o Generally less efficient than Quick Sort for most use cases because of constant
factors, despite both having O(n log n) time complexity.
Use Case: Useful for scenarios where in-place sorting with guaranteed O(n log n)
performance is needed, but stability is not a concern (e.g., priority queues).
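Heap Sort is not worked through earlier in these notes, so here is a minimal Python sketch for reference: build a max-heap, then repeatedly swap the maximum to the end (all names are illustrative):

def heap_sort(arr):
    """Sort arr in place: build a max-heap, then repeatedly extract the maximum."""
    n = len(arr)

    def sift_down(root, end):
        # Restore the max-heap property for the subtree at `root`, within arr[:end].
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and arr[child + 1] > arr[child]:
                child += 1  # pick the larger child
            if arr[root] >= arr[child]:
                return
            arr[root], arr[child] = arr[child], arr[root]
            root = child

    for i in range(n // 2 - 1, -1, -1):  # heapify the whole array: O(n)
        sift_down(i, n)
    for end in range(n - 1, 0, -1):      # n extractions, each O(log n)
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(0, end)
    return arr

print(heap_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]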
7. Radix Sort
Time Complexity:
o Best, Average, and Worst Case: O(nk) where n is the number of elements and k is
the number of digits (or bits) in the largest number.
Space Complexity: O(n + k)
Stability: Stable
Efficiency:
o Very efficient for large numbers or datasets with a limited range of values (e.g.,
integers).
o When k (the number of digits) is small, it can outperform comparison-based
algorithms.
Use Case: Best for sorting integers or strings, especially when the range of values is known in
advance.
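A minimal Python sketch of LSD (least-significant-digit) Radix Sort, assuming non-negative integer keys (all names are illustrative):

def radix_sort(arr):
    """Sort non-negative integers digit by digit, least significant first."""
    if not arr:
        return arr
    exp = 1
    while max(arr) // exp > 0:  # one pass per decimal digit (k passes total)
        buckets = [[] for _ in range(10)]
        for x in arr:
            buckets[(x // exp) % 10].append(x)  # stable distribution by current digit
        arr = [x for bucket in buckets for x in bucket]
        exp *= 10
    return arr

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]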
8. Counting Sort
Time Complexity:
o Best, Average, and Worst Case: O(n + k) where n is the number of elements and k is
the range of the input (i.e., the difference between the maximum and minimum
values).
Space Complexity: O(k)
Stability: Stable
Efficiency:
o Extremely efficient when the range of the data (k) is not too large, as it sorts in linear
time.
Use Case: Best for sorting integers or categorical data when the range of values is relatively
small compared to the size of the array.
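A minimal Python sketch of Counting Sort, assuming small non-negative integer keys (all names are illustrative):

def counting_sort(arr):
    """Sort non-negative integers by tallying occurrences of each value."""
    if not arr:
        return arr
    count = [0] * (max(arr) + 1)  # one slot per possible value: O(k) space
    for x in arr:
        count[x] += 1
    result = []
    for value, c in enumerate(count):
        result.extend([value] * c)  # emit each value as many times as it occurred
    return result

print(counting_sort([4, 2, 2, 8, 3, 3, 1]))  # [1, 2, 2, 3, 3, 4, 8]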
9. Bucket Sort
Time Complexity:
o Best and Average Case: O(n + k), where n is the number of elements and k is the
number of buckets.
o Worst Case: O(n²), when the data is badly distributed and most elements land in a
single bucket.
Space Complexity: O(n + k)
Stability: Stable (if the sorting algorithm used within the buckets is stable)
Efficiency:
o Very efficient when the data is uniformly distributed.
o Can outperform comparison-based algorithms (e.g., Quick Sort) for certain types of
data.
Use Case: Useful when sorting data with a known uniform distribution, like floating-point
numbers in a fixed range.
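A minimal Python sketch of Bucket Sort, assuming the inputs are floats uniformly distributed in [0, 1) (the bucket count and names are illustrative):

def bucket_sort(arr, num_buckets=10):
    """Distribute values into buckets by range, sort each bucket, then concatenate."""
    buckets = [[] for _ in range(num_buckets)]
    for x in arr:
        buckets[int(x * num_buckets)].append(x)  # assumes 0 <= x < 1
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))  # any stable sort works within a bucket
    return result

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12]))
# [0.12, 0.17, 0.21, 0.26, 0.39, 0.72, 0.78, 0.94]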
Summary of Efficiency:
Algorithm       | Best       | Average    | Worst      | Space    | Stable | Good for Large Data?
Bubble Sort     | O(n)       | O(n²)      | O(n²)      | O(1)     | Yes    | No
Selection Sort  | O(n²)      | O(n²)      | O(n²)      | O(1)     | No     | No
Insertion Sort  | O(n)       | O(n²)      | O(n²)      | O(1)     | Yes    | No
Quick Sort      | O(n log n) | O(n log n) | O(n²)      | O(log n) | No     | Yes
Merge Sort      | O(n log n) | O(n log n) | O(n log n) | O(n)     | Yes    | Yes
Heap Sort       | O(n log n) | O(n log n) | O(n log n) | O(1)     | No     | Yes
Counting Sort   | O(n + k)   | O(n + k)   | O(n + k)   | O(k)     | Yes    | Yes (with small ranges)
Radix Sort      | O(nk)      | O(nk)      | O(nk)      | O(n + k) | Yes    | Yes (with numeric data)
Bucket Sort     | O(n + k)   | O(n + k)   | O(n²)      | O(n + k) | Yes    | Yes (with uniform data)
Conclusion:
Quick Sort and Merge Sort are the most efficient algorithms for general use with large
datasets due to their O(n log n) time complexity.
Merge Sort is stable and guarantees O(n log n) performance, but requires extra space.
Quick Sort is typically faster in practice due to its smaller constant factors, but it can degrade
to O(n²) if the pivot selection is poor.
Radix Sort, Counting Sort, and Bucket Sort can outperform comparison-based algorithms in
specific scenarios, such as when working with integers or uniformly distributed data, though
they require additional space and are not comparison-based.