Count of distinct numbers in an Array in a range for Online Queries using Merge Sort Tree
Last Updated: 20 Apr, 2023
Given an array arr[] of size N and Q queries of the form [L, R], the task is to find, for each query, the number of distinct values in arr[L..R]. The examples below use 1-based indices, while the implementations further down use 0-based indices.
Examples:
Input: arr[] = {4, 1, 9, 1, 3, 3}, Q = {{1, 3}, {1, 5}}
Output: 3 4
Explanation: For query {1, 3}, elements are {4, 1, 9}.
Therefore, count of distinct elements = 3
For query {1, 5}, elements are {4, 1, 9, 1, 3}.
Therefore, count of distinct elements = 4
Input: arr[] = {4, 2, 1, 1, 4}, Q = {{2, 4}, {3, 5}}
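Output: 2 2
Explanation: For query {2, 4}, elements are {2, 1, 1}.
Therefore, count of distinct elements = 2
For query {3, 5}, elements are {1, 1, 4}.
Therefore, count of distinct elements = 2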
Naive Approach: For every query, iterate over the array from L to R and insert each element into a set. The size of the set is the number of distinct elements from L to R.
Time Complexity: O(Q * N)
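The naive approach can be sketched in a few lines of Python. This is only an illustration: the function name count_distinct_naive is hypothetical, and the queries are assumed to be 1-based and inclusive, as in the examples above.

```python
def count_distinct_naive(arr, queries):
    """Answer each 1-based inclusive query [l, r] by scanning the range."""
    answers = []
    for l, r in queries:
        # Convert the 1-based bounds to the 0-based slice arr[l-1 : r]
        answers.append(len(set(arr[l - 1:r])))
    return answers


print(count_distinct_naive([4, 1, 9, 1, 3, 3], [(1, 3), (1, 5)]))  # [3, 4]
```

Each query touches up to N elements, which is where the O(Q * N) bound comes from.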
Efficient Approach: The idea is to use Merge Sort Tree to solve this problem.
- For every index i, we store the index of the next occurrence of arr[i] in a temporary array.
- Then, for every query [L, R], we count the positions i in [L, R] whose stored value is greater than R. Each such position is the last occurrence of its value inside the range, so every distinct value is counted exactly once.
Step 1: Build an array next_right, where next_right[i] holds the index of the next occurrence of arr[i] to the right of i. If arr[i] does not occur again, next_right[i] = N (the length of the array).
Step 2: Build a Merge Sort Tree over the next_right array and answer the queries with it. The number of distinct elements in [L, R] equals the number of indices i in [L, R] with next_right[i] > R.
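Before bringing in the tree, the reduction itself can be checked with a small brute-force sketch. The helper names below are hypothetical, indices are assumed to be 0-based and inclusive, and next_right is computed directly from its definition with a quadratic scan purely for clarity.

```python
def next_right_by_definition(arr):
    """For each i, find the nearest j > i with arr[j] == arr[i], else len(arr)."""
    n = len(arr)
    nxt = [n] * n
    for i in range(n):
        for j in range(i + 1, n):
            if arr[j] == arr[i]:
                nxt[i] = j
                break
    return nxt


def count_by_next_right(arr, left, right):
    """Count indices in [left, right] whose next occurrence lies beyond right."""
    nxt = next_right_by_definition(arr)
    return sum(1 for i in range(left, right + 1) if nxt[i] > right)


arr = [4, 1, 9, 1, 3, 3]
# Compare against a set-based count on every possible range
ok = all(
    count_by_next_right(arr, l, r) == len(set(arr[l:r + 1]))
    for l in range(len(arr))
    for r in range(l, len(arr))
)
print(ok)
```

The Merge Sort Tree only speeds up the counting step; the quantity being counted is the same.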
Construction of Merge Sort Tree from given array
- We start with a segment arr[0 . . . n-1].
- While the current segment has more than one element, we divide it into two halves and call the same procedure on both halves. Each node stores the sorted elements of its segment, obtained by merging the sorted lists of its two children as in merge sort.
- Also, the tree will be a Full Binary Tree because we always divide segments into two halves at every level.
- Since the constructed tree is always a full binary tree with N leaves, there will be N-1 internal nodes. So the total number of nodes will be 2*N - 1. The implementations below still allocate 4*N slots, because nodes are addressed as 1, 2*i and 2*i + 1 and some indices remain unused.
Here is an example for the array 1 5 2 6 9 4 7 1. Each row is one level of the tree, with the root on top.
|1 1 2 4 5 6 7 9|
|1 2 5 6|1 4 7 9|
|1 5|2 6|4 9|1 7|
|1|5|2|6|9|4|7|1|
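The levels shown above can be reproduced with a short recursive sketch. The function name build_levels is hypothetical, and heapq.merge from the standard library stands in for a hand-written merge step.

```python
from heapq import merge


def build_levels(values):
    """Return the sorted list of every segment, grouped by tree level."""
    levels = []

    def build(start, end, depth):
        if len(levels) <= depth:
            levels.append([])
        if start == end:
            node = [values[start]]
        else:
            mid = (start + end) // 2
            left = build(start, mid, depth + 1)
            right = build(mid + 1, end, depth + 1)
            # Merge two sorted children into one sorted parent
            node = list(merge(left, right))
        levels[depth].append(node)
        return node

    build(0, len(values) - 1, 0)
    return levels


for level in build_levels([1, 5, 2, 6, 9, 4, 7, 1]):
    print(level)
```

Grouping by depth is only for display; the implementations below store nodes in a flat array indexed by node number instead.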
Construction of next_right array
- For every index, we store the index of the next occurrence of its element to the right.
- If the index holds the last occurrence of the element, we store N (the length of the array).
Example:
arr = [2, 3, 2, 3, 5, 6];
next_right = [2, 3, 6, 6, 6, 6]
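One way to build this array in a single pass is to walk from right to left while remembering the most recent index of each value. The sketch below assumes 0-based indices; build_next_right and last_seen are illustrative names.

```python
def build_next_right(arr):
    """Return next_right, where next_right[i] is the next index holding arr[i]."""
    n = len(arr)
    next_right = [n] * n  # default: no later occurrence
    last_seen = {}        # value -> nearest index to the right seen so far
    for i in range(n - 1, -1, -1):
        if arr[i] in last_seen:
            next_right[i] = last_seen[arr[i]]
        last_seen[arr[i]] = i
    return next_right


print(build_next_right([2, 3, 2, 3, 5, 6]))  # [2, 3, 6, 6, 6, 6]
```

Testing membership in the dictionary, rather than comparing a stored index with 0, keeps the check correct even when an occurrence sits at index 0.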
Below is the implementation of the above approach:
C++
// C++ implementation to find
// count of distinct elements
// in a range L to R for Q queries
#include <bits/stdc++.h>
using namespace std;
// Function to merge the right
// and the left tree
void merge(vector<int> tree[], int treeNode)
{
int len1 = tree[2 * treeNode].size();
int len2 = tree[2 * treeNode + 1].size();
int index1 = 0, index2 = 0;
// Fill this array in such a
// way such that values
// remain sorted similar to mergesort
while (index1 < len1 && index2 < len2) {
// If the element on the left part
// is greater than the right part
if (tree[2 * treeNode][index1]
> tree[2 * treeNode + 1][index2]) {
tree[treeNode].push_back(
tree[2 * treeNode + 1][index2]);
index2++;
}
else {
tree[treeNode].push_back(
tree[2 * treeNode][index1]);
index1++;
}
}
// Insert the leftover elements
// from the left part
while (index1 < len1) {
tree[treeNode].push_back(
tree[2 * treeNode][index1]);
index1++;
}
// Insert the leftover elements
// from the right part
while (index2 < len2) {
tree[treeNode].push_back(
tree[2 * treeNode + 1][index2]);
index2++;
}
return;
}
// Recursive function to build
// segment tree by merging the
// sorted segments in sorted way
void build(vector<int> tree[], int* arr, int start, int end,
int treeNode)
{
// Base case
if (start == end) {
tree[treeNode].push_back(arr[start]);
return;
}
int mid = (start + end) / 2;
// Building the left tree
build(tree, arr, start, mid, 2 * treeNode);
// Building the right tree
build(tree, arr, mid + 1, end, 2 * treeNode + 1);
// Merges the right tree
// and left tree
merge(tree, treeNode);
return;
}
// Function similar to query() method
// as in segment tree
int query(vector<int> tree[], int treeNode, int start,
int end, int left, int right)
{
// Current segment is out of the range
if (start > right || end < left) {
return 0;
}
// Current segment completely
// lies inside the range
if (start >= left && end <= right) {
// The elements are in sorted order,
// so the number of elements greater
// than right can be found with
// binary search (upper_bound)
return tree[treeNode].end()
- upper_bound(tree[treeNode].begin(),
tree[treeNode].end(), right);
}
int mid = (start + end) / 2;
// Query on the left tree
int op1 = query(tree, 2 * treeNode, start, mid, left,
right);
// Query on the Right tree
int op2 = query(tree, 2 * treeNode + 1, mid + 1, end,
left, right);
return op1 + op2;
}
// Driver Code
int main()
{
// const so that the array sizes below
// are compile-time constants
const int n = 5;
int arr[] = { 1, 2, 1, 4, 2 };
int next_right[n];
// Initialising the tree
vector<int> tree[4 * n];
unordered_map<int, int> ump;
// Construction of next_right
// array to store the
// next index of occurrence
// of elements
for (int i = n - 1; i >= 0; i--) {
// No occurrence to the right: store n
if (ump.find(arr[i]) == ump.end()) {
next_right[i] = n;
}
else {
next_right[i] = ump[arr[i]];
}
// Remember the nearest index seen so far
ump[arr[i]] = i;
}
// building the mergesort tree
// by using next_right array
build(tree, next_right, 0, n - 1, 1);
int ans;
// Queries use zero-based, inclusive indices
// Each query takes O(log^2 N): a binary
// search in each of the O(log N) visited nodes
// first query
int left1 = 0;
int right1 = 2;
ans = query(tree, 1, 0, n - 1, left1, right1);
cout << ans << endl;
// Second Query
int left2 = 1;
int right2 = 4;
ans = query(tree, 1, 0, n - 1, left2, right2);
cout << ans << endl;
}
Java
// Java implementation to find
// count of distinct elements
// in a range L to R for Q queries
import java.util.*;
public class Main {
// Function to merge the right
// and the left tree
static void merge(List<Integer>[] tree, int treeNode){
int len1 = tree[2 * treeNode].size();
int len2 = tree[2 * treeNode + 1].size();
int index1 = 0, index2 = 0;
// Fill this array in such a
// way such that values
// remain sorted similar to mergesort
while (index1 < len1 && index2 < len2) {
// If the element on the left part
// is greater than the right part
if (tree[2 * treeNode].get(index1)
> tree[2 * treeNode + 1].get(index2)) {
tree[treeNode].add(
tree[2 * treeNode + 1].get(index2));
index2++;
}
else {
tree[treeNode].add(
tree[2 * treeNode].get(index1));
index1++;
}
}
// Insert the leftover elements
// from the left part
while (index1 < len1) {
tree[treeNode].add(
tree[2 * treeNode].get(index1));
index1++;
}
// Insert the leftover elements
// from the right part
while (index2 < len2) {
tree[treeNode].add(
tree[2 * treeNode + 1].get(index2));
index2++;
}
}
// Recursive function to build
// segment tree by merging the
// sorted segments in sorted way
static void build(List<Integer>[] tree, int[] arr, int start, int end,
int treeNode){
// Base case
if (start == end) {
tree[treeNode].add(arr[start]);
return;
}
int mid = (start + end) / 2;
// Building the left tree
build(tree, arr, start, mid, 2 * treeNode);
// Building the right tree
build(tree, arr, mid + 1, end, 2 * treeNode + 1);
// Merges the right tree
// and left tree
merge(tree, treeNode);
}
// Function similar to query() method
// as in segment tree
static int query(List<Integer>[] tree, int treeNode, int start,
int end, int left, int right)
{
// Current segment is out of the range
if (start > right || end < left) {
return 0;
}
// Current segment completely
// lies inside the range
if (start >= left && end <= right) {
// The list is sorted, so a manual upper-bound
// search finds the first value greater than right.
// Collections.binarySearch is not suitable here: it
// returns a negative value when the key is absent
List<Integer> node = tree[treeNode];
int lo = 0, hi = node.size();
while (lo < hi) {
int m = (lo + hi) / 2;
if (node.get(m) <= right)
lo = m + 1;
else
hi = m;
}
return node.size() - lo;
}
int mid = (start + end) / 2;
// Query on the left tree
int op1 = query(tree, 2 * treeNode, start, mid, left,
right);
// Query on the Right tree
int op2 = query(tree, 2 * treeNode + 1, mid + 1, end,
left, right);
return op1 + op2;
}
// Driver Code
public static void main(String[] args) {
int n = 5;
int[] arr = { 1, 2, 1, 4, 2 };
int[] next_right = new int[n];
// Initialising the tree
List<Integer>[] tree = new ArrayList[4 * n];
for (int i = 0; i < 4 * n; i++) {
tree[i] = new ArrayList<Integer>();
}
Map<Integer, Integer> ump = new HashMap<Integer, Integer>();
// Construction of next_right
// array to store the
// next index of occurrence
// of elements
for (int i = n - 1; i >= 0; i--) {
if (ump.get(arr[i]) == null) {
next_right[i] = n;
ump.put(arr[i], i);
}
else {
next_right[i] = ump.get(arr[i]);
ump.put(arr[i], i);
}
}
// building the mergesort tree
// by using next_right array
build(tree, next_right, 0, n - 1, 1);
int ans;
// Queries use zero-based, inclusive indices
// Each query takes O(log^2 N): a binary
// search in each of the O(log N) visited nodes
// first query
int left1 = 0;
int right1 = 2;
ans = query(tree, 1, 0, n - 1, left1, right1);
System.out.println(ans);
// Second Query
int left2 = 1;
int right2 = 4;
ans = query(tree, 1, 0, n - 1, left2, right2);
System.out.println(ans);
}
}
// This code is contributed by shiv1o43g
Python3
from bisect import bisect_right


# Function to merge the right and the left tree
def merge(tree, treeNode):
    len1 = len(tree[2 * treeNode])
    len2 = len(tree[2 * treeNode + 1])
    index1 = 0
    index2 = 0
    # Fill this array in such a
    # way such that values
    # remain sorted similar to mergesort
    while index1 < len1 and index2 < len2:
        # If the element on the left part
        # is greater than the right part
        if tree[2 * treeNode][index1] > tree[2 * treeNode + 1][index2]:
            tree[treeNode].append(tree[2 * treeNode + 1][index2])
            index2 += 1
        else:
            tree[treeNode].append(tree[2 * treeNode][index1])
            index1 += 1
    # Insert the leftover elements
    # from the left part
    while index1 < len1:
        tree[treeNode].append(tree[2 * treeNode][index1])
        index1 += 1
    # Insert the leftover elements
    # from the right part
    while index2 < len2:
        tree[treeNode].append(tree[2 * treeNode + 1][index2])
        index2 += 1


# Recursive function to build
# segment tree by merging the
# sorted segments in sorted way
def build(tree, arr, start, end, treeNode):
    # Base case
    if start == end:
        tree[treeNode].append(arr[start])
        return
    mid = (start + end) // 2
    # Building the left tree
    build(tree, arr, start, mid, 2 * treeNode)
    # Building the right tree
    build(tree, arr, mid + 1, end, 2 * treeNode + 1)
    # Merges the right tree
    # and left tree
    merge(tree, treeNode)


# Function similar to query() method
# as in segment tree
def query(tree, treeNode, start, end, left, right):
    # Current segment is out of the range
    if start > right or end < left:
        return 0
    # Current segment completely lies inside the range
    if start >= left and end <= right:
        # The elements are in sorted order, so the number
        # of elements greater than right can be found
        # with binary search (bisect_right)
        return len(tree[treeNode]) - bisect_right(tree[treeNode], right)
    mid = (start + end) // 2
    # Query on the left tree
    op1 = query(tree, 2 * treeNode, start, mid, left, right)
    # Query on the Right tree
    op2 = query(tree, 2 * treeNode + 1, mid + 1, end, left, right)
    return op1 + op2


# Driver code
if __name__ == '__main__':
    n = 5
    arr = [1, 2, 1, 4, 2]
    next_right = [0] * n
    # Initialising the tree
    tree = [[] for i in range(4 * n)]
    ump = dict()
    # Construction of next_right
    # array to store the
    # next index of occurrence
    # of elements
    for i in range(n - 1, -1, -1):
        if arr[i] not in ump:
            next_right[i] = n
            ump[arr[i]] = i
        else:
            next_right[i] = ump[arr[i]]
            ump[arr[i]] = i
    # building the mergesort tree
    # by using next_right array
    build(tree, next_right, 0, n - 1, 1)
    # Queries use zero-based, inclusive indices
    # Each query takes O(log^2 N): a binary
    # search in each of the O(log N) visited nodes
    # first query
    left1 = 0
    right1 = 2
    ans = query(tree, 1, 0, n - 1, left1, right1)
    print(ans)
    # Second Query
    left2 = 1
    right2 = 4
    ans = query(tree, 1, 0, n - 1, left2, right2)
    print(ans)
C#
using System;
using System.Collections.Generic;
using System.Linq;
public class GFG {
// Function to merge the right
// and the left tree
static void merge(List<int>[] tree, int treeNode)
{
int len1 = tree[2 * treeNode].Count();
int len2 = tree[2 * treeNode + 1].Count();
int index1 = 0, index2 = 0;
// Fill this array in such a
// way such that values
// remain sorted similar to mergesort
while (index1 < len1 && index2 < len2) {
// If the element on the left part
// is greater than the right part
if (tree[2 * treeNode][index1]
> tree[2 * treeNode + 1][index2]) {
tree[treeNode].Add(
tree[2 * treeNode + 1][index2]);
index2++;
}
else {
tree[treeNode].Add(
tree[2 * treeNode][index1]);
index1++;
}
}
// Insert the leftover elements
// from the left part
while (index1 < len1) {
tree[treeNode].Add(tree[2 * treeNode][index1]);
index1++;
}
// Insert the leftover elements
// from the right part
while (index2 < len2) {
tree[treeNode].Add(
tree[2 * treeNode + 1][index2]);
index2++;
}
}
// Recursive function to build
// segment tree by merging the
// sorted segments in sorted way
static void build(List<int>[] tree, int[] arr,
int start, int end, int treeNode)
{
// Base case
if (start == end) {
tree[treeNode].Add(arr[start]);
return;
}
int mid = (start + end) / 2;
// Building the left tree
build(tree, arr, start, mid, 2 * treeNode);
// Building the right tree
build(tree, arr, mid + 1, end, 2 * treeNode + 1);
// Merges the right tree
// and left tree
merge(tree, treeNode);
}
// Function similar to query() method
// as in segment tree
static int query(List<int>[] tree, int treeNode,
int start, int end, int left,
int right)
{
// Current segment is out of the range
if (start > right || end < left) {
return 0;
}
// Current segment completely
// lies inside the range
if (start >= left && end <= right) {
// The list is sorted, so a manual upper-bound
// search finds the first value greater than right.
// List.BinarySearch is not suitable here: it
// returns a negative value when the key is absent
List<int> node = tree[treeNode];
int lo = 0, hi = node.Count;
while (lo < hi) {
int m = (lo + hi) / 2;
if (node[m] <= right)
lo = m + 1;
else
hi = m;
}
return node.Count - lo;
}
int mid = (start + end) / 2;
// Query on the left tree
int op1 = query(tree, 2 * treeNode, start, mid,
left, right);
// Query on the Right tree
int op2 = query(tree, 2 * treeNode + 1, mid + 1,
end, left, right);
return op1 + op2;
}
// Driver Code
static void Main(string[] args)
{
int n = 5;
int[] arr = new int[] { 1, 2, 1, 4, 2 };
int[] next_right = new int[n];
// Initialising the tree
List<int>[] tree = new List<int>[ 4 * n ];
for (int i = 0; i < tree.Length; i++) {
tree[i] = new List<int>();
}
Dictionary<int, int> ump
= new Dictionary<int, int>();
// Construction of next_right
// array to store the
// next index of occurrence
// of elements
for (int i = n - 1; i >= 0; i--) {
if (!ump.ContainsKey(arr[i])) {
next_right[i] = n;
ump[arr[i]] = i;
}
else {
next_right[i] = ump[arr[i]];
ump[arr[i]] = i;
}
}
// building the mergesort tree
// by using next_right array
build(tree, next_right, 0, n - 1, 1);
int ans;
// Queries use zero-based, inclusive indices
// Each query takes O(log^2 N): a binary
// search in each of the O(log N) visited nodes
// first query
int left1 = 0;
int right1 = 2;
ans = query(tree, 1, 0, n - 1, left1, right1);
Console.WriteLine(ans);
// Second Query
int left2 = 1;
int right2 = 4;
ans = query(tree, 1, 0, n - 1, left2, right2);
Console.WriteLine(ans);
}
}
JavaScript
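// Note: this version follows the naive approach (scan the range per query),
// not the Merge Sort Tree; each call costs O(right - left + 1).
// Define a function that takes an array and two zero-based, inclusive indices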
function countDistinctInRange(arr, left, right) {
// Create an empty object to store unique elements and their counts
let map = {};
// Iterate over the elements in the given range of the array
for (let i = left; i <= right; i++) {
// If the current element is already in the map, increment its count
if (arr[i] in map) {
map[arr[i]] += 1;
} else {
// If the current element is not in the map, add it with a count of 1
map[arr[i]] = 1;
}
}
// Return the number of unique elements in the given range of the array
return Object.keys(map).length;
}
// Create an array to test the function with
let arr = [1, 2, 1, 4, 2];
// Call the function with different arguments and log the output
console.log(countDistinctInRange(arr, 0, 2)); // Output: 2
console.log(countDistinctInRange(arr, 1, 4)); // Output: 3
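Time Complexity: O(N log N) to build the Merge Sort Tree and O(log^2 N) per query, since a query visits O(log N) nodes and runs a binary search in each of them. The total is O(N log N + Q log^2 N).
Space Complexity: O(N log N), since every element of next_right is stored once at each level of the Merge Sort Tree.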
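As a rough check on the storage cost, the sketch below counts how many values a Merge Sort Tree holds across all of its nodes. The function name stored_values is hypothetical. Every level of the tree holds each array element once, so the total grows roughly like N log N.

```python
def stored_values(n):
    """Count the values held across all nodes of a merge sort tree on n items."""
    def count(start, end):
        size = end - start + 1  # this node stores its whole segment
        if start == end:
            return size
        mid = (start + end) // 2
        return size + count(start, mid) + count(mid + 1, end)

    return count(0, n - 1)


print(stored_values(8))  # 32, i.e. 8 values on each of 4 levels
```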