SQL Scenario Based Questions-1
1. Question: Retrieve all employees from the Employees table whose salaries are higher
than the average salary.
Answer:
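One possible answer, sketched under assumptions: the Employees columns and the sample rows are hypothetical, and the query is run through Python's built-in sqlite3 module only so that it can be executed as written.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, Name TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "Asha", 40000), (2, "Ben", 60000), (3, "Chen", 80000)],
)

# The scalar subquery computes the overall average once;
# the outer query keeps only the rows above it.
query = """
SELECT EmployeeID, Name, Salary
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees)
ORDER BY EmployeeID
"""
above_average = conn.execute(query).fetchall()
print(above_average)
```

The SQL text itself is standard and should carry over to other engines unchanged.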
5. Question: Write a query to delete duplicate rows in the Products table while keeping
one.
Answer:
WITH CTE AS (
    SELECT ProductID,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ProductID) AS RowNum
    FROM Products
)
DELETE FROM Products
WHERE ProductID IN (SELECT ProductID FROM CTE WHERE RowNum > 1);
6. Question: Get the top 3 customers with the highest purchase amounts.
Answer:
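A minimal sketch of one way to answer this, assuming a hypothetical Orders(CustomerID, Amount) table with made-up rows. It runs through Python's sqlite3 module, so it uses LIMIT; SQL Server would use TOP 3 instead.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (CustomerID INTEGER, Amount REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, 100), (1, 50), (2, 300), (3, 20), (4, 200)],
)

# Sum purchases per customer, sort descending, keep the first three.
query = """
SELECT CustomerID, SUM(Amount) AS TotalPurchases
FROM Orders
GROUP BY CustomerID
ORDER BY TotalPurchases DESC
LIMIT 3
"""
top_customers = conn.execute(query).fetchall()
print(top_customers)
```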
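-- Recursive CTE: returns every employee who reports, directly or indirectly,
-- to the manager with EmployeeID 1. SQL Server accepts plain WITH;
-- PostgreSQL and MySQL require WITH RECURSIVE.
WITH CTE AS (
    SELECT EmployeeID, ManagerID FROM Employees
    WHERE ManagerID = 1
    UNION ALL
    SELECT e.EmployeeID, e.ManagerID FROM Employees e
    INNER JOIN CTE c ON e.ManagerID = c.EmployeeID
)
SELECT * FROM CTE;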
9. Question: Write a query to find the total sales amount for each product category and
region.
Answer:
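One possible answer, as a sketch: the Sales(Category, Region, Amount) table and its rows are hypothetical, and sqlite3 is used only to make the query runnable.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Category TEXT, Region TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [("Books", "East", 100), ("Books", "East", 50),
     ("Books", "West", 70), ("Toys", "East", 30)],
)

# Grouping by both columns gives one total per (category, region) pair.
query = """
SELECT Category, Region, SUM(Amount) AS TotalSales
FROM Sales
GROUP BY Category, Region
ORDER BY Category, Region
"""
totals = conn.execute(query).fetchall()
print(totals)
```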
10. Question: Fetch the employee names with the highest salary in each department.
Answer:
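A hedged sketch of one approach, using a correlated subquery. The Employees(Name, DepartmentID, Salary) columns and the sample rows are assumptions made for the example. Employees tied for the top salary in a department are all returned.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, DepartmentID INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("Asha", 1, 50000), ("Ben", 1, 70000), ("Chen", 2, 60000), ("Dee", 2, 60000)],
)

# For each row, the subquery finds the maximum salary in that row's department.
query = """
SELECT e.DepartmentID, e.Name, e.Salary
FROM Employees e
WHERE e.Salary = (
    SELECT MAX(m.Salary) FROM Employees m WHERE m.DepartmentID = e.DepartmentID
)
ORDER BY e.DepartmentID, e.Name
"""
top_earners = conn.execute(query).fetchall()
print(top_earners)
```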
11. Question: Retrieve the EmployeeID and the cumulative salary for each employee ordered
by hire date.
Answer:
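A running total is usually written with a window function. The sketch below assumes a hypothetical Employees(EmployeeID, HireDate, Salary) table and an SQLite build with window-function support (3.25 or later); EmployeeID is added to the ordering as a tie-breaker for equal hire dates.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, HireDate TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "2021-03-01", 50000), (2, "2020-01-15", 40000), (3, "2022-07-10", 60000)],
)

# SUM(...) OVER (ORDER BY ...) accumulates salaries in hire-date order.
query = """
SELECT EmployeeID,
       SUM(Salary) OVER (ORDER BY HireDate, EmployeeID) AS CumulativeSalary
FROM Employees
ORDER BY HireDate, EmployeeID
"""
running_totals = conn.execute(query).fetchall()
print(running_totals)
```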
12. Question: Write a query to find orders where the total amount is greater than the average
total amount across all orders.
Answer:
14. Question: Find the total number of orders placed on each weekday.
Answer:
15. Question: Identify employees who earn a salary above the department average.
Answer:
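One way to sketch this is a correlated subquery that compares each salary with the average of the same department. The table layout and rows are hypothetical.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, DepartmentID INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("Asha", 1, 50000), ("Ben", 1, 70000), ("Chen", 2, 60000), ("Dee", 2, 60000)],
)

# The subquery is re-evaluated per row, averaging only that row's department.
query = """
SELECT e.Name, e.DepartmentID, e.Salary
FROM Employees e
WHERE e.Salary > (
    SELECT AVG(m.Salary) FROM Employees m WHERE m.DepartmentID = e.DepartmentID
)
ORDER BY e.Name
"""
above_department_average = conn.execute(query).fetchall()
print(above_department_average)
```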
16. Question: Write a query to find products that have never been sold.
Answer:
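A common pattern here is a LEFT JOIN that keeps only the unmatched rows. The sketch assumes hypothetical Products(ProductID, Name) and Sales(SaleID, ProductID) tables.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Sales (SaleID INTEGER, ProductID INTEGER)")
conn.executemany("INSERT INTO Products VALUES (?, ?)", [(1, "Pen"), (2, "Ink"), (3, "Pad")])
conn.executemany("INSERT INTO Sales VALUES (?, ?)", [(100, 1), (101, 1), (102, 3)])

# Products with no matching sale get NULLs from the Sales side of the join.
query = """
SELECT p.ProductID, p.Name
FROM Products p
LEFT JOIN Sales s ON s.ProductID = p.ProductID
WHERE s.ProductID IS NULL
ORDER BY p.ProductID
"""
never_sold = conn.execute(query).fetchall()
print(never_sold)
```

A NOT EXISTS subquery expresses the same condition and is an equally reasonable choice.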
17. Question: List customers who bought the same product on more than one day.
Answer:
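A minimal sketch, assuming a hypothetical Sales(CustomerID, ProductID, SaleDate) table in which SaleDate holds a date without a time part. If it held timestamps, the date would need to be extracted first.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (CustomerID INTEGER, ProductID INTEGER, SaleDate TEXT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 10, "2024-01-01"), (1, 10, "2024-01-05"),
     (2, 10, "2024-01-01"), (2, 10, "2024-01-01"), (2, 20, "2024-01-02")],
)

# COUNT(DISTINCT SaleDate) ignores repeat purchases made on the same day.
query = """
SELECT CustomerID, ProductID
FROM Sales
GROUP BY CustomerID, ProductID
HAVING COUNT(DISTINCT SaleDate) > 1
ORDER BY CustomerID, ProductID
"""
repeat_buyers = conn.execute(query).fetchall()
print(repeat_buyers)
```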
18. Question: Write a query to find the average sales per day for the last 30 days.
Answer:
19. Question: Get the CategoryID and the percentage contribution of each category to total
sales.
Answer:
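One possible answer divides each category total by the grand total. The Sales(CategoryID, Amount) table and its rows are assumptions for the example.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (CategoryID INTEGER, Amount REAL)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)", [(1, 50), (1, 25), (2, 25)])

# Multiplying by 100.0 forces floating-point division on engines
# that would otherwise divide integers.
query = """
SELECT CategoryID,
       ROUND(100.0 * SUM(Amount) / (SELECT SUM(Amount) FROM Sales), 2) AS PctOfTotal
FROM Sales
GROUP BY CategoryID
ORDER BY CategoryID
"""
contribution = conn.execute(query).fetchall()
print(contribution)
```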
20. Question: Write a query to identify orders that are missing shipping information.
Answer:
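What counts as "shipping information" depends on the schema. The sketch below assumes a hypothetical Orders table with ShippingAddress and ShippedDate columns and treats an order as incomplete when either is NULL.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, ShippingAddress TEXT, ShippedDate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, "12 Elm St", "2024-01-03"), (2, None, "2024-01-04"), (3, "9 Oak Ave", None)],
)

# NULL must be tested with IS NULL; "= NULL" never matches.
query = """
SELECT OrderID
FROM Orders
WHERE ShippingAddress IS NULL OR ShippedDate IS NULL
ORDER BY OrderID
"""
missing_shipping = [row[0] for row in conn.execute(query)]
print(missing_shipping)
```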
7. Fetch the details of employees who joined after 1st January 2023.
11. Get the second highest salary from the Salaries table.
26. Delete duplicate rows in the Orders table, keeping only one.
WITH CTE AS (
    SELECT OrderID,
           ROW_NUMBER() OVER (PARTITION BY OrderDetails ORDER BY OrderID) AS RowNum
    FROM Orders
)
DELETE FROM Orders
WHERE OrderID IN (SELECT OrderID FROM CTE WHERE RowNum > 1);
28. List customers who bought the same product on more than one day.
Real-Life Scenarios
31. Find customers who placed orders worth more than $5000 in total.
33. Identify employees who have been with the company for more than 10 years.
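-- COLUMN_NAME is never NULL in INFORMATION_SCHEMA.COLUMNS, so filter on
-- IS_NULLABLE to list the columns that allow NULL values.
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TableName' AND IS_NULLABLE = 'YES';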
14. Question: How do you apply row-level security (RLS) in Power BI?
Answer:
o Define roles in the Model view.
o Use DAX filters like [Region] = "East".
o Assign roles to users in Power BI Service.
15. Question: What is the difference between a line chart and an area chart?
Answer:
o Line Chart: Displays trends over time.
o Area Chart: Adds shading below the line for emphasis on volume.
16. Question: How do you remove duplicates in Power Query?
Answer:
o In Power Query Editor, select the columns and click Remove Duplicates in the
Home tab.
17. Question: What is the use of the CALCULATE function in DAX?
Answer:
It modifies the filter context of an expression.
18. Question: How can you implement a drill-through filter?
Answer:
o Create a drill-through page with the desired fields.
o Add the field to drill through on to the Drill through well in the Visualizations pane.
19. Question: How do you create a KPI visual in Power BI?
Answer:
o Select the KPI visual from the Visualizations pane.
o Add a field for value, target, and trend.
20. Question: How do you export data from a visualization?
Answer:
o Click the ellipsis (…) on the visual.
o Select Export data and choose a format.
RollingAverage =
AVERAGEX(
    DATESINPERIOD('Date'[Date], LASTDATE('Date'[Date]), -30, DAY),
    [SalesAmount]
)
SalesContribution =
DIVIDE(SUM(Sales[Amount]), CALCULATE(SUM(Sales[Amount]), ALL(Sales)))
CumulativeProfit =
CALCULATE(
    SUM(Sales[Profit]),
    FILTER(ALL('Date'), 'Date'[Date] <= MAX('Date'[Date]))
)
reversed_string = "example"[::-1]
def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(num ** 0.5) + 1):
        if num % i == 0:
            return False
    return True
3. Question: How do you find the largest number in a list without using max()?
Answer:
def find_max(lst):
    if not lst:
        raise ValueError("find_max() requires a non-empty list")
    largest = lst[0]
    for num in lst:
        if num > largest:
            largest = num
    return largest
5. Question: Write a Python program to count the occurrences of each word in a string.
Answer:
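A minimal sketch using collections.Counter. The helper name count_words is hypothetical, and the example splits on whitespace only, so punctuation stays attached to words.

```python
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # split() with no arguments splits on any run of whitespace.
    return Counter(text.lower().split())


word_counts = count_words("the cat and the hat")
print(word_counts)
```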
a, b = b, a  # swap two variables without a temporary
import os
file_exists = os.path.isfile("example.txt")
11. Question: How can you read a large file line by line?
Answer:
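Iterating over the file object reads one line at a time, so the whole file never has to fit in memory. The sketch below shows this on a small temporary file standing in for a large one; the count_lines helper is a hypothetical name.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating over the file object, one line at a time."""
    total = 0
    with open(path, "r", encoding="utf-8") as handle:
        for _line in handle:  # the file object yields lines lazily
            total += 1
    return total


# Small temporary file used as a stand-in for a large one.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("first\nsecond\nthird\n")

line_total = count_lines(tmp.name)
os.remove(tmp.name)
print(line_total)
```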
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero.")
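import functools
import time
def timer(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # monotonic clock suited to timing
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"Execution time: {end - start:.6f} seconds")
        return result
    return wrapper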
square = lambda x: x ** 2
def is_palindrome(s):
    return s == s[::-1]
import sys
args = sys.argv[1:]
import random
random_number = random.randint(1, 100)
import threading
def print_numbers():
    for i in range(5):
        print(i)
thread = threading.Thread(target=print_numbers)
thread.start()
class MyClass:
    def __init__(self, name):
        self.name = name
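import json
json_string = '{"name": "example"}'  # sample input (illustrative)
data = json.loads(json_string)  # JSON text -> Python object
json_string = json.dumps(data)  # Python object -> JSON text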
def my_generator():
    yield 1
    yield 2
    yield 3
27. Question: What is a Python metaclass?
Answer:
A metaclass is a class of a class, controlling the creation of classes.
28. Question: How do you implement a singleton class?
Answer:
class Singleton:
    _instance = None
    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            # object.__new__() accepts no extra arguments, so pass only cls
            cls._instance = super().__new__(cls)
        return cls._instance
import pickle
obj = {"key": "value"}  # sample object to serialize (illustrative)
with open("file.pkl", "wb") as f:
    pickle.dump(obj, f)
Real-World Scenarios
General Questions
1. Question: How would you approach cleaning a dataset with missing values?
Answer:
o Identify missing values using tools like isnull() in Python or conditional
formatting in Excel.
o Handle missing values by:
    - Imputing with the mean, median, or mode.
    - Removing rows or columns with excessive missing data.
    - Using advanced techniques such as interpolation.
2. Question: What steps would you take if you found inconsistencies in the dataset?
Answer:
o Investigate data sources for errors.
o Standardize formats (e.g., dates, text casing).
o Remove duplicates or outliers.
o Validate data with stakeholders.
3. Question: How do you ensure the quality of your analysis?
Answer:
o Perform data validation and checks.
o Use visualization to identify anomalies.
o Peer review and cross-check results.
o Ensure reproducibility by documenting steps.
4. Question: How do you prioritize tasks under a tight deadline?
Answer:
o Break the task into smaller chunks.
o Focus on high-impact tasks first.
o Communicate with stakeholders to set realistic expectations.
5. Question: What KPIs would you track for an e-commerce company?
Answer:
o Sales revenue, conversion rate, customer lifetime value (CLV), average order
value (AOV), cart abandonment rate.
SQL-Related Questions
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Excel-Related Questions
11. Question: How would you analyze monthly sales trends in Excel?
Answer:
o Use a pivot table to summarize sales by month.
o Create a line chart to visualize trends.
12. Question: What formula would you use to find duplicate rows in Excel?
Answer:
Use the formula:
16. Question: How would you visualize sales performance across regions?
Answer:
o Create a map visualization.
o Use slicers/filters for detailed insights.
o Display KPIs using cards.
17. Question: How do you handle data blending in Tableau?
Answer:
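o Connect to a primary and a secondary data source.
o Link the two sources on a common field; Tableau aggregates the secondary source to the level of that field before combining the results.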
18. Question: Explain how you would use DAX to calculate year-over-year sales.
Answer:
Python-Related Questions
import pandas as pd
df.fillna(value, inplace=True)
23. Question: Write a Python script to group data by category and calculate the mean.
Answer:
grouped = df.groupby('category')['value'].mean()
Advanced Scenarios
Basic Operations
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
2. Question: How do you filter rows where the "Age" column is greater than 25?
Answer:
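Boolean indexing is the usual way to do this. A minimal sketch, assuming pandas is installed and using a small sample DataFrame:

```python
import pandas as pd

# Small sample DataFrame, for illustration only.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# The comparison yields a boolean Series; indexing with it keeps the True rows.
older_than_25 = df[df["Age"] > 25]
print(older_than_25)
```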
df.dropna(inplace=True)  # remove rows that contain missing values
Intermediate Operations
6. Question: How do you fill missing values with the mean of a column?
Answer:
df['Age'] = df['Age'].fillna(df['Age'].mean())  # assign back; inplace on a selected column is unreliable in recent pandas
duplicates = df[df.duplicated()]
8. Question: How do you group data by a column and calculate the mean?
Answer:
grouped = df.groupby('Age')['Salary'].mean()
Data Manipulation
df['Date'] = pd.to_datetime(df['Date'])
Advanced Scenarios
df['Rolling_Avg'] = df['Sales'].rolling(window=3).mean()
correlation = df.corr(numeric_only=True)  # skip non-numeric columns
19. Question: How do you rank rows based on a column?
Answer:
df['Rank'] = df['Salary'].rank()
Time-Series Analysis
21. Question: How do you filter rows for a specific date range?
Answer:
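One way to sketch this is a boolean mask built from two comparisons against the datetime column. The column names and sample rows are assumptions, and both bounds are treated as inclusive.

```python
import pandas as pd

# Sample data, for illustration only.
df = pd.DataFrame(
    {"Date": ["2024-01-05", "2024-02-10", "2024-03-15"], "Sales": [100, 200, 300]}
)
df["Date"] = pd.to_datetime(df["Date"])

# Combine the two comparisons with &; each side needs its own parentheses.
start, end = pd.Timestamp("2024-02-01"), pd.Timestamp("2024-03-31")
in_range = df[(df["Date"] >= start) & (df["Date"] <= end)]
print(in_range)
```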
22. Question: How do you extract the year and month from a datetime column?
Answer:
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
24. Question: How do you calculate the difference between consecutive rows?
Answer:
df['Difference'] = df['Sales'].diff()
Q1 = df['Sales'].quantile(0.25)
Q3 = df['Sales'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['Sales'] < Q1 - 1.5 * IQR) | (df['Sales'] > Q3 + 1.5 * IQR)]
Performance Optimization
26. Question: How do you handle large datasets in Pandas?
Answer:
o Use the chunksize parameter of read_csv() for incremental loading.
o Optimize datatypes using astype() or the dtype parameter of read_csv().
o Filter unnecessary columns and rows.
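The points above can be sketched together as follows. The CSV content is made up and read from an in-memory string so the example is self-contained; with a real file, the path would be passed to read_csv() instead.

```python
import io

import pandas as pd

# In-memory stand-in for a large CSV file (illustrative data).
csv_text = "category,value\na,1\nb,2\na,3\nb,4\na,5\n"

total = 0
# chunksize makes read_csv() return an iterator of DataFrames;
# usecols and dtype limit what is loaded and how much memory it takes.
for chunk in pd.read_csv(
    io.StringIO(csv_text),
    chunksize=2,
    usecols=["value"],
    dtype={"value": "int32"},
):
    total += int(chunk["value"].sum())

print(total)
```

Only one chunk is held in memory at a time, so the running total is computed without loading the full file.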
27. Question: How do you reduce memory usage of a DataFrame?
Answer:
df['Int_Column'] = df['Int_Column'].astype('int32')
df['Float_Column'] = df['Float_Column'].astype('float32')
df.to_csv('output.csv', index=False)
df['Age'].plot(kind='hist')