SQL Scenario Based Questions-1

The document contains a comprehensive list of SQL and Power BI scenario-based questions and answers, covering basic to advanced topics. It includes queries for data retrieval, manipulation, and analysis in SQL, as well as functionalities and features in Power BI. The content serves as a guide for users looking to enhance their SQL and Power BI skills.

Uploaded by

vishakha chavan

SQL Scenario Based Questions

Basic SQL Scenarios

1. Question: Retrieve all employees from the Employees table whose salaries are higher
than the average salary.
Answer:

SELECT * FROM Employees


WHERE Salary > (SELECT AVG(Salary) FROM Employees);

2. Question: Find the second-highest salary from the Salaries table.


Answer:

SELECT MAX(Salary) AS SecondHighestSalary FROM Salaries


WHERE Salary < (SELECT MAX(Salary) FROM Salaries);
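
One way to sanity-check the query above is to run it against a throwaway table. The sketch below uses Python's built-in sqlite3 module and made-up salary values (both are assumptions for illustration; the original targets a SQL Server style database). It suggests that ties for the top salary are skipped and that NULL comes back when no second distinct value exists.

```python
import sqlite3


def second_highest_salary(salaries):
    """Return the second-highest distinct salary, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Salaries (Salary INTEGER)")
    conn.executemany("INSERT INTO Salaries VALUES (?)", [(s,) for s in salaries])
    row = conn.execute(
        "SELECT MAX(Salary) AS SecondHighestSalary FROM Salaries "
        "WHERE Salary < (SELECT MAX(Salary) FROM Salaries)"
    ).fetchone()
    conn.close()
    return row[0]


print(second_highest_salary([100, 300, 300, 200]))  # 200
print(second_highest_salary([500, 500]))            # None
```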

3. Question: Write a query to count the number of employees in each department.


Answer:

SELECT DepartmentID, COUNT(*) AS EmployeeCount FROM Employees


GROUP BY DepartmentID;

4. Question: Fetch details of employees who joined in the last 6 months.


Answer:

SELECT * FROM Employees


WHERE JoinDate >= DATEADD(MONTH, -6, GETDATE());

5. Question: Write a query to delete duplicate rows in the Products table while keeping
one.
Answer:

WITH CTE AS (
    SELECT ProductID,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ProductID) AS RowNum
    FROM Products
)
DELETE FROM Products
WHERE ProductID IN (SELECT ProductID FROM CTE WHERE RowNum > 1);
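
The ROW_NUMBER pattern above can be tried end to end with a small in-memory table. This is a minimal sketch using Python's sqlite3 module (an assumption; it needs a SQLite build with window functions, 3.25 or later) and invented product rows. The function name dedupe_products is hypothetical.

```python
import sqlite3


def dedupe_products(rows):
    """Delete duplicate names, keeping the lowest ProductID per name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT)")
    conn.executemany("INSERT INTO Products VALUES (?, ?)", rows)
    conn.execute(
        """
        WITH CTE AS (
            SELECT ProductID,
                   ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ProductID) AS RowNum
            FROM Products
        )
        DELETE FROM Products
        WHERE ProductID IN (SELECT ProductID FROM CTE WHERE RowNum > 1)
        """
    )
    remaining = conn.execute(
        "SELECT ProductID, Name FROM Products ORDER BY ProductID"
    ).fetchall()
    conn.close()
    return remaining


print(dedupe_products([(1, "Pen"), (2, "Pen"), (3, "Ink"), (4, "Pen")]))
# [(1, 'Pen'), (3, 'Ink')]
```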

6. Question: Get the top 3 customers with the highest purchase amounts.
Answer:

SELECT TOP 3 CustomerID, SUM(Amount) AS TotalAmount FROM Orders
GROUP BY CustomerID
ORDER BY TotalAmount DESC;

7. Question: Find all employees who report directly or indirectly to a manager with
ManagerID = 1.
Answer:

WITH CTE AS (
SELECT EmployeeID, ManagerID FROM Employees
WHERE ManagerID = 1
UNION ALL
SELECT e.EmployeeID, e.ManagerID FROM Employees e
INNER JOIN CTE c ON e.ManagerID = c.EmployeeID
)
SELECT * FROM CTE;
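
To see how the recursive CTE walks the hierarchy, the sketch below runs the same query shape in SQLite through Python's sqlite3 module. The sample rows are invented, and SQLite spells the clause WITH RECURSIVE, whereas SQL Server uses plain WITH; both points are assumptions of this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, ManagerID INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    # 2 and 3 report to 1; 4 reports to 2; 5 reports to 4; 6 reports to 7
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 4), (6, 7), (7, None)],
)
rows = conn.execute(
    """
    WITH RECURSIVE CTE AS (
        -- anchor member: direct reports of manager 1
        SELECT EmployeeID, ManagerID FROM Employees WHERE ManagerID = 1
        UNION ALL
        -- recursive member: reports of anyone already found
        SELECT e.EmployeeID, e.ManagerID FROM Employees e
        INNER JOIN CTE c ON e.ManagerID = c.EmployeeID
    )
    SELECT EmployeeID FROM CTE ORDER BY EmployeeID
    """
).fetchall()
conn.close()

report_ids = [r[0] for r in rows]
print(report_ids)  # [2, 3, 4, 5]
```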

8. Question: Identify customers who made purchases in every month of 2024.


Answer:

SELECT CustomerID FROM Orders


WHERE YEAR(OrderDate) = 2024
GROUP BY CustomerID
HAVING COUNT(DISTINCT MONTH(OrderDate)) = 12;

9. Question: Write a query to find the total sales amount for each product category and
region.
Answer:

SELECT CategoryID, Region, SUM(SalesAmount) AS TotalSales FROM Sales


GROUP BY CategoryID, Region;

10. Question: Fetch the employee names with the highest salary in each department.
Answer:

SELECT DepartmentID, EmployeeName, Salary FROM Employees e


WHERE Salary = (SELECT MAX(Salary) FROM Employees
WHERE DepartmentID = e.DepartmentID
);

11. Question: Retrieve the EmployeeID and the cumulative salary for each employee ordered
by hire date.
Answer:

SELECT EmployeeID, Salary,
       SUM(Salary) OVER (ORDER BY HireDate) AS CumulativeSalary
FROM Employees;

12. Question: Write a query to find orders where the total amount is greater than the average
total amount across all orders.
Answer:

SELECT * FROM Orders
WHERE TotalAmount > (SELECT AVG(TotalAmount) FROM Orders);

13. Question: Get the ProductID of all products sold more than twice to the same customer.
Answer:

SELECT DISTINCT ProductID FROM Orders
GROUP BY CustomerID, ProductID
HAVING COUNT(*) > 2;

14. Question: Find the total number of orders placed on each weekday.
Answer:

SELECT DATENAME(WEEKDAY, OrderDate) AS Weekday, COUNT(*) AS OrderCount


FROM Orders
GROUP BY DATENAME(WEEKDAY, OrderDate);

15. Question: Identify employees who earn a salary above the department average.
Answer:

SELECT EmployeeID, EmployeeName, Salary FROM Employees e


WHERE Salary > (SELECT AVG(Salary) FROM Employees
WHERE DepartmentID = e.DepartmentID
);

16. Question: Write a query to find products that have never been sold.
Answer:

SELECT * FROM Products
WHERE ProductID NOT IN (SELECT ProductID FROM Sales WHERE ProductID IS NOT NULL);
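
A NOT IN subquery deserves care: if the subquery returns even one NULL, the comparison evaluates to unknown for every non-matching row and nothing is returned. The sketch below demonstrates this with Python's sqlite3 module and invented rows (assumptions for illustration), and compares it with a NOT EXISTS form that is not affected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER)")
conn.execute("CREATE TABLE Sales (ProductID INTEGER)")
conn.executemany("INSERT INTO Products VALUES (?)", [(1,), (2,), (3,)])
# One sale row has a NULL ProductID
conn.executemany("INSERT INTO Sales VALUES (?)", [(1,), (None,)])

not_in = conn.execute(
    "SELECT ProductID FROM Products "
    "WHERE ProductID NOT IN (SELECT ProductID FROM Sales)"
).fetchall()

not_exists = conn.execute(
    "SELECT p.ProductID FROM Products p "
    "WHERE NOT EXISTS (SELECT 1 FROM Sales s WHERE s.ProductID = p.ProductID) "
    "ORDER BY p.ProductID"
).fetchall()
conn.close()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```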

17. Question: List customers who bought the same product on more than one day.
Answer:

SELECT CustomerID, ProductID FROM Orders


GROUP BY CustomerID, ProductID
HAVING COUNT(DISTINCT OrderDate) > 1;

18. Question: Write a query to find the average sales per day for the last 30 days.
Answer:

SELECT AVG(DailySales)
FROM (
    SELECT CAST(OrderDate AS DATE) AS SaleDate, SUM(Amount) AS DailySales
    FROM Sales
    WHERE OrderDate >= DATEADD(DAY, -30, GETDATE())
    GROUP BY CAST(OrderDate AS DATE)
) AS DailySalesData;

19. Question: Get the CategoryID and the percentage contribution of each category to total
sales.
Answer:

SELECT CategoryID,
       SUM(SalesAmount) * 100.0 / (SELECT SUM(SalesAmount) FROM Sales) AS PercentageContribution
FROM Sales
GROUP BY CategoryID;

20. Question: Write a query to identify orders that are missing shipping information.
Answer:

SELECT * FROM Orders


WHERE ShippingAddress IS NULL;

1. Retrieve all employees whose salary is greater than $5000.

SELECT * FROM Employees


WHERE Salary > 5000;

2. Find the employees whose names start with 'A'.

SELECT * FROM Employees


WHERE Name LIKE 'A%';

3. Count the number of employees in each department.

SELECT DepartmentID, COUNT(*) AS EmployeeCount FROM Employees


GROUP BY DepartmentID;

4. Retrieve all customers who placed an order in 2024.

SELECT DISTINCT CustomerID FROM Orders


WHERE YEAR(OrderDate) = 2024;

5. List all unique cities from the Customers table.

SELECT DISTINCT City FROM Customers;

6. Find all orders where the quantity is greater than 100.

SELECT * FROM Orders


WHERE Quantity > 100;

7. Fetch the details of employees who joined after 1st January 2023.

SELECT * FROM Employees


WHERE JoinDate > '2023-01-01';

8. Retrieve the total revenue from the Sales table.

SELECT SUM(Amount) AS TotalRevenue FROM Sales;

9. Find the highest salary in the Employees table.


SELECT MAX(Salary) AS HighestSalary FROM Employees;

10. Display all employees who do not have a manager.

SELECT * FROM Employees


WHERE ManagerID IS NULL;

11. Get the second highest salary from the Salaries table.

SELECT MAX(Salary) AS SecondHighestSalary


FROM Salaries
WHERE Salary < (SELECT MAX(Salary) FROM Salaries);

12. Find all products that have never been ordered.

SELECT * FROM Products
WHERE ProductID NOT IN (SELECT ProductID FROM Orders WHERE ProductID IS NOT NULL);

13. Retrieve the top 5 customers by total sales.

SELECT TOP 5 CustomerID, SUM(SalesAmount) AS TotalSales FROM Sales


GROUP BY CustomerID
ORDER BY TotalSales DESC;

14. List departments with more than 10 employees.

SELECT DepartmentID, COUNT(*) AS EmployeeCount FROM Employees


GROUP BY DepartmentID
HAVING COUNT(*) > 10;

15. Fetch the product with the maximum number of orders.

SELECT TOP 1 ProductID, COUNT(*) AS OrderCount FROM Orders
GROUP BY ProductID
ORDER BY OrderCount DESC;

16. Find employees who work in multiple departments.

SELECT EmployeeID FROM EmployeeDepartments


GROUP BY EmployeeID
HAVING COUNT(DISTINCT DepartmentID) > 1;

17. Retrieve the total quantity of products sold by each salesperson.

SELECT SalespersonID, SUM(Quantity) AS TotalQuantity


FROM Sales
GROUP BY SalespersonID;
18. Find orders where the total amount is higher than the average order amount.

SELECT * FROM Orders


WHERE TotalAmount > (SELECT AVG(TotalAmount) FROM Orders);

19. Retrieve all orders placed in the last 7 days.

SELECT * FROM Orders


WHERE OrderDate >= DATEADD(DAY, -7, GETDATE());

20. Fetch the average salary of employees in each department.

SELECT DepartmentID, AVG(Salary) AS AvgSalary


FROM Employees
GROUP BY DepartmentID;

Advanced SQL Scenarios

21. List employees who earn more than their managers.

SELECT e.EmployeeID, e.Name


FROM Employees e
JOIN Employees m ON e.ManagerID = m.EmployeeID
WHERE e.Salary > m.Salary;

22. Get the cumulative sales for each day.

SELECT OrderDate,
       SUM(SUM(SalesAmount)) OVER (ORDER BY OrderDate) AS CumulativeSales
FROM Sales
GROUP BY OrderDate;

23. Fetch orders along with the customer's name.

SELECT o.OrderID, o.OrderDate, c.CustomerName


FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID;

24. List the highest sales for each region.

SELECT Region, MAX(SalesAmount) AS HighestSales FROM Sales


GROUP BY Region;

25. Find duplicate records in the Products table.

SELECT ProductName, COUNT(*) FROM Products


GROUP BY ProductName
HAVING COUNT(*) > 1;

26. Delete duplicate rows in the Orders table, keeping only one.

WITH CTE AS (
    SELECT OrderID,
           ROW_NUMBER() OVER (PARTITION BY OrderDetails ORDER BY OrderID) AS RowNum
    FROM Orders
)
DELETE FROM Orders
WHERE OrderID IN (SELECT OrderID FROM CTE WHERE RowNum > 1);

27. Retrieve the top 3 salaries in each department.

SELECT DepartmentID, EmployeeID, Salary
FROM (
    SELECT DepartmentID, EmployeeID, Salary,
           ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS SalaryRank
    FROM Employees
) AS Ranked
WHERE SalaryRank <= 3;

28. List customers who bought the same product on more than one day.

SELECT CustomerID, ProductID FROM Orders


GROUP BY CustomerID, ProductID
HAVING COUNT(DISTINCT OrderDate) > 1;

29. Fetch the month-wise sales growth percentage.

SELECT YEAR(OrderDate) AS Year, MONTH(OrderDate) AS Month,
       (SUM(SalesAmount) - LAG(SUM(SalesAmount)) OVER (ORDER BY YEAR(OrderDate), MONTH(OrderDate))) * 100.0
       / NULLIF(LAG(SUM(SalesAmount)) OVER (ORDER BY YEAR(OrderDate), MONTH(OrderDate)), 0) AS GrowthPercentage
FROM Sales
GROUP BY YEAR(OrderDate), MONTH(OrderDate);
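
The growth arithmetic is easy to verify by hand. The following sketch applies the same formula, (current - previous) * 100 / previous, to a short list of monthly totals in plain Python. The function name monthly_growth and the figures are hypothetical; the first month, and any month that follows a zero total, gets None, much as LAG yields NULL for the first row.

```python
def monthly_growth(totals):
    """totals: list of (year, month, amount) tuples in chronological order.

    Returns (year, month, growth_percentage) tuples; growth is None when
    there is no previous month or the previous total is zero.
    """
    result = []
    previous = None
    for year, month, amount in totals:
        if previous is None or previous == 0:
            growth = None
        else:
            growth = (amount - previous) * 100.0 / previous
        result.append((year, month, growth))
        previous = amount
    return result


growth = monthly_growth([(2023, 12, 200.0), (2024, 1, 250.0), (2024, 2, 200.0)])
print(growth)
# [(2023, 12, None), (2024, 1, 25.0), (2024, 2, -20.0)]
```

Keeping the year next to the month matters: ordering by month number alone would merge December 2023 with December 2024.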

Real-Life Scenarios

31. Find customers who placed orders worth more than $5000 in total.

SELECT CustomerID FROM Orders


GROUP BY CustomerID
HAVING SUM(TotalAmount) > 5000;

32. Fetch the last 5 orders placed in the system.

SELECT TOP 5 * FROM Orders
ORDER BY OrderDate DESC;

33. Identify employees who have been with the company for more than 10 years.

SELECT * FROM Employees
WHERE HireDate < DATEADD(YEAR, -10, GETDATE());
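
DATEDIFF(YEAR, ...) counts calendar-year boundaries crossed, not elapsed years, so a filter of "> 10" can miss employees with between ten and eleven years of service. The sketch below mimics both approaches in plain Python; the helper names and dates are hypothetical, and the cutoff logic mirrors a comparison against DATEADD(YEAR, -10, today).

```python
from datetime import date


def datediff_year(start, end):
    """Mimic DATEDIFF(YEAR, start, end): year boundaries crossed."""
    return end.year - start.year


def hired_more_than_n_years_ago(hire, today, n=10):
    """True when hire falls before the date n years prior to today."""
    try:
        cutoff = today.replace(year=today.year - n)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        cutoff = today.replace(year=today.year - n, day=28)
    return hire < cutoff


hire, today = date(2015, 1, 1), date(2025, 12, 31)
print(datediff_year(hire, today) > 10)           # False, although tenure is almost 11 years
print(hired_more_than_n_years_ago(hire, today))  # True
```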
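
34. Write a query to find out if a table has any null values in any column.

INFORMATION_SCHEMA.COLUMNS describes column definitions, not the stored data, so it can only list the columns that allow NULLs:

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'TableName' AND IS_NULLABLE = 'YES';

To count the NULLs actually stored, query the table itself, with one expression per nullable column:

SELECT COUNT(*) - COUNT(ColumnName) AS NullCount
FROM TableName;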

Basic Power BI Scenarios

1. Question: How can you connect Power BI to a SQL Server database?


Answer:
o Open Power BI Desktop.
o Click Home > Get Data > SQL Server.
o Enter the server name, database name, and credentials, then click OK.
2. Question: What is the difference between a calculated column and a measure?
Answer:
o Calculated Column: Computed row by row at the data level, stored in the model.
o Measure: Calculated on the fly during visualization interactions.
3. Question: What is the purpose of relationships in Power BI?
Answer:
Relationships connect tables in a data model, enabling cross-table filtering and data
aggregation.
4. Question: How do you create a slicer in Power BI?
Answer:
o Add a slicer visual from the visualization pane.
o Drag a field to the slicer, such as category or region.
5. Question: How can you set a custom color for data points in a chart?
Answer:
o Click on the chart, go to the Format pane > Data colors, and assign custom
colors to each category.
6. Question: What is the role of Power Query in Power BI?
Answer:
Power Query is used for ETL (Extract, Transform, Load) operations to clean and shape
data before loading it into the model.
7. Question: How do you create a new column in Power Query?
Answer:
o Open Power Query Editor.
o Select Add Column > Custom Column and write the formula.
8. Question: What are the storage modes in Power BI?
Answer:
o Import (the default): Data is loaded into the Power BI model and queried in memory.
o DirectQuery: Data stays in the source, and queries are sent to it at report time.
o Dual: The table can act as either Import or DirectQuery, depending on the query.
9. Question: What are the key components of a Power BI dashboard?
Answer:
o Tiles, visualizations, KPIs, slicers, and pinned reports.
10. Question: How do you publish a report to Power BI Service?
Answer:
o Save the report in Power BI Desktop.
o Click Publish, log in, and choose a workspace.

Intermediate Power BI Scenarios

11. Question: How do you create a relationship between two tables?


Answer:
o Go to the Model view, drag a field from one table to a related field in another,
and configure the relationship.
12. Question: What is a Star Schema?
Answer:
A Star Schema organizes data into fact and dimension tables, with a fact table at the
center connected to multiple dimension tables.
13. Question: How can you calculate year-over-year growth in Power BI?
Answer:

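YoY Growth =
VAR CurrentSales = SUM(Sales[Amount])
VAR PriorSales = CALCULATE(SUM(Sales[Amount]), SAMEPERIODLASTYEAR('Date'[Date]))
RETURN DIVIDE(CurrentSales - PriorSales, PriorSales)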

14. Question: How do you apply row-level security (RLS) in Power BI?
Answer:
o Define roles in Power BI Desktop under Modeling > Manage roles.
o Use DAX filters like [Region] = "East".
o Assign roles to users in Power BI Service.
15. Question: What is the difference between a line chart and an area chart?
Answer:
o Line Chart: Displays trends over time.
o Area Chart: Adds shading below the line for emphasis on volume.
16. Question: How do you remove duplicates in Power Query?
Answer:
o In Power Query Editor, select the columns and click Remove Duplicates in the
Home tab.
17. Question: What is the use of the CALCULATE function in DAX?
Answer:
It modifies the filter context of an expression.
18. Question: How can you implement a drill-through filter?
Answer:
o Create a drill-through page with the desired visuals.
o Drag the field to drill through on into the Drill through well of the Visualizations pane.
19. Question: How do you create a KPI visual in Power BI?
Answer:
o Select the KPI visual from the pane.
o Add a field for value, target, and trend.
20. Question: How do you export data from a visualization?
Answer:
o Click the ellipsis (…) on the visual.
o Select Export data and choose a format.

Advanced Power BI Scenarios

21. Question: How do you handle large datasets in Power BI?


Answer:
o Use DirectQuery or aggregations.
o Optimize the model by removing unused columns and tables.
22. Question: Explain the difference between calculated tables and regular tables.
Answer:
o Calculated Table: Created using DAX.
o Regular Table: Imported from a data source.
23. Question: How do you create a rolling average in Power BI?
Answer:

RollingAverage =
AVERAGEX(DATESINPERIOD('Date'[Date], LASTDATE('Date'[Date]), -30, DAY),
[SalesAmount]
)

24. Question: How do you implement What-If parameters in Power BI?


Answer:
o Go to Modeling > New Parameter.
o Set the range and increment.
o Use the parameter in DAX measures.
25. Question: How do you track refresh history in Power BI Service?
Answer:
o Go to the dataset in the workspace.
o Check the Refresh History tab.
26. Question: How do you optimize DAX formulas?
Answer:
o Use variables (VAR) to store intermediate results.
o Avoid nested calculations.
o Replace filters with simpler expressions.
27. Question: What is the difference between ALL and REMOVEFILTERS?
Answer:
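o ALL: Returns all rows of a table or all values of a column, ignoring filters. It can be used as a table expression or as a CALCULATE modifier.
o REMOVEFILTERS: Clears filters from the specified tables or columns. It can be used only as a CALCULATE modifier and does not return a table.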
28. Question: How do you handle circular relationships in Power BI?
Answer:
o Avoid bidirectional relationships.
o Use calculated columns or measures instead.
29. Question: How do you create a waterfall chart?
Answer:
o Use the waterfall chart visual.
o Add fields for category, breakdown, and values.
30. Question: How can you create bookmarks for navigation in Power BI?
Answer:
o Arrange visuals and filters.
o Add a bookmark via View > Bookmarks.
o Use buttons to navigate bookmarks.

Real-Life Business Scenarios

31. Create a year-to-date sales measure.

YTD Sales = CALCULATE(SUM(Sales[Amount]), DATESYTD('Date'[Date]))

32. Highlight the top 5 customers by sales in a bar chart.


o Use a DAX measure with RANKX to rank customers.
33. Filter data dynamically using slicers.
o Add slicers and bind them to fields for dynamic filtering.
34. Track the percentage contribution of sales by region.

SalesContribution =
DIVIDE(SUM(Sales[Amount]), CALCULATE(SUM(Sales[Amount]), ALL(Sales)))

35. Display cumulative profit by month.

CumulativeProfit =
CALCULATE(SUM(Sales[Profit]), FILTER(ALL('Date'), 'Date'[Date] <=
MAX('Date'[Date])))

36. Detect outliers in sales data.


o Use scatter plots and conditional formatting.
37. Combine multiple datasets using Power Query.
o Use Append Queries or Merge Queries.
38. Track employee headcount trends over time.
o Create a measure using COUNTROWS and time-based filters.
39. Implement RLS for different regions.
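o Define a role with a DAX filter such as [UserEmail] = USERPRINCIPALNAME() on a table that maps users to regions (the UserEmail column is an assumed name).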
40. Visualize sales targets vs. actuals.
o Use combination charts or KPI visuals.

Additional Advanced Scenarios

41. Create a custom tooltip.


42. Set up scheduled refresh.
43. Use Python or R visuals.
44. Handle multi-language support in Power BI.
45. Implement live connections to datasets.
46. Build a custom theme file for branding.
47. Optimize model size by removing unused columns.
48. Use drill-down functionality in hierarchical visuals.
49. Track sales pipeline stages using funnel charts.
50. Add a trendline in line charts to forecast data.

Basic Python Scenarios

1. Question: How can you reverse a string in Python?


Answer:

reversed_string = "example"[::-1]

2. Question: Write a function to check if a number is prime.


Answer:

def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(num ** 0.5) + 1):
        if num % i == 0:
            return False
    return True

3. Question: How do you find the largest number in a list without using max()?
Answer:

def find_max(lst):
    largest = lst[0]
    for num in lst:
        if num > largest:
            largest = num
    return largest

4. Question: How can you remove duplicates from a list?


Answer:

unique_list = list(set([1, 2, 2, 3, 4]))
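
Converting to a set removes duplicates but does not guarantee the original order. When order matters, one common alternative is dict.fromkeys, since dictionaries keep insertion order in Python 3.7 and later. A minimal sketch with made-up values:

```python
items = [3, 1, 3, 2, 1]

# dict keys are unique and remember first-seen order
unique_in_order = list(dict.fromkeys(items))

print(unique_in_order)  # [3, 1, 2]
```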

5. Question: Write a Python program to count the occurrences of each word in a string.
Answer:

from collections import Counter

text = "this is a test this is only a test"
word_counts = Counter(text.split())

6. Question: How do you swap two variables without using a temporary variable?
Answer:

a, b = b, a

7. Question: How can you iterate over a dictionary?


Answer:

for key, value in my_dict.items():
    print(key, value)

8. Question: What’s the difference between a tuple and a list?


Answer:
o Tuple: Immutable, uses parentheses ( ).
o List: Mutable, uses square brackets [ ].
9. Question: How do you merge two dictionaries?
Answer:

merged_dict = {**dict1, **dict2}

10. Question: How can you check if a file exists in Python?


Answer:

import os
file_exists = os.path.isfile("example.txt")

Intermediate Python Scenarios

11. Question: How can you read a large file line by line?
Answer:

with open("large_file.txt") as file:
    for line in file:
        print(line.strip())

12. Question: How do you handle exceptions in Python?


Answer:

try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero.")

13. Question: Write a decorator to measure the execution time of a function.


Answer:

import time

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"Execution time: {end - start}")
        return result
    return wrapper
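
A slightly more robust variant of the decorator, shown here as a sketch rather than the original author's version, uses time.perf_counter (a monotonic clock intended for measuring intervals) and functools.wraps so the decorated function keeps its name and docstring. The slow_sum function is a hypothetical example target.

```python
import functools
import time


def timer(func):
    @functools.wraps(func)  # preserve func.__name__ and func.__doc__
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} s")
        return result
    return wrapper


@timer
def slow_sum(n):
    """Sum the integers below n."""
    return sum(range(n))


total = slow_sum(1000)
print(total)  # 499500
```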

14. Question: How do you sort a list of dictionaries by a specific key?


Answer:

sorted_list = sorted(data, key=lambda x: x['key_name'])

15. Question: How do you create a lambda function in Python?


Answer:

square = lambda x: x ** 2

16. Question: How do you use zip() to combine two lists?


Answer:

combined = list(zip(list1, list2))

17. Question: Write a function to check if a string is a palindrome.


Answer:

def is_palindrome(s):
    return s == s[::-1]

18. Question: What’s the difference between deepcopy and copy?


Answer:
o copy: Shallow copy, references nested objects.
o deepcopy: Recursively copies all objects.
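
The difference shows up as soon as a nested object is mutated. A small sketch with a made-up dictionary:

```python
import copy

original = {"tags": ["a", "b"]}
shallow = copy.copy(original)   # new dict, same inner list
deep = copy.deepcopy(original)  # new dict and new inner list

original["tags"].append("c")

print(shallow["tags"])  # ['a', 'b', 'c']
print(deep["tags"])     # ['a', 'b']
```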
19. Question: How do you handle command-line arguments in Python?
Answer:

import sys
args = sys.argv[1:]

20. Question: How can you generate random numbers in Python?


Answer:

import random
random_number = random.randint(1, 100)

Advanced Python Scenarios


21. Question: Write a program to implement a binary search.
Answer:

def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
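
In everyday code the standard library's bisect module already provides the search step. The sketch below wraps bisect_left to return an index or -1, as the hand-written version does; the name binary_search_bisect is hypothetical, and the input list is assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def binary_search_bisect(arr, target):
    """Return the index of target in sorted arr, or -1 if it is absent."""
    i = bisect_left(arr, target)  # leftmost position where target could go
    if i < len(arr) and arr[i] == target:
        return i
    return -1


print(binary_search_bisect([1, 3, 5, 7], 5))  # 2
print(binary_search_bisect([1, 3, 5, 7], 4))  # -1
```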

22. Question: How can you create a thread in Python?


Answer:

import threading

def print_numbers():
    for i in range(5):
        print(i)

thread = threading.Thread(target=print_numbers)
thread.start()

23. Question: How do you create a class in Python?


Answer:

class MyClass:
    def __init__(self, name):
        self.name = name

24. Question: Explain the GIL (Global Interpreter Lock).


Answer:
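The GIL is a lock in CPython that lets only one thread execute Python bytecode at a time. It limits parallelism for CPU-bound threads; I/O-bound threads are less affected because the lock is released while they wait.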
25. Question: How do you handle JSON data in Python?
Answer:

import json
data = json.loads(json_string)
json_string = json.dumps(data)

26. Question: How can you create a generator?


Answer:

def my_generator():
    yield 1
    yield 2
    yield 3

27. Question: What is a Python metaclass?
Answer:
A metaclass is a class of a class, controlling the creation of classes.
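
As an illustration, the sketch below defines a metaclass that records every class created with it. The names RegistryMeta, Plugin and CsvPlugin are hypothetical; the point is only that the metaclass's __new__ runs when a class statement is executed, not when an instance is created.

```python
class RegistryMeta(type):
    """Metaclass that keeps a registry of the classes it creates."""

    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.registry[name] = cls  # runs once per class definition
        return cls


class Plugin(metaclass=RegistryMeta):
    pass


class CsvPlugin(Plugin):  # subclasses inherit the metaclass
    pass


print(sorted(RegistryMeta.registry))  # ['CsvPlugin', 'Plugin']
print(type(CsvPlugin) is RegistryMeta)  # True
```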
28. Question: How do you implement a singleton class?
Answer:

class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        # object.__new__ accepts no extra arguments, so pass only cls
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
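
One subtlety worth knowing: with a __new__-based singleton, __init__ still runs on every call, so later constructor arguments overwrite earlier state. The sketch below assumes a name attribute purely for illustration and passes only cls to object.__new__, which does not accept extra arguments.

```python
class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # Runs on every Singleton(...) call, even when the instance is reused
        self.name = name


first = Singleton("a")
second = Singleton("b")

print(first is second)  # True
print(first.name)       # b
```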

29. Question: How can you serialize an object in Python?


Answer:

import pickle

with open("file.pkl", "wb") as f:
    pickle.dump(obj, f)

30. Question: What are Python descriptors?


Answer:
Descriptors manage the behavior of an attribute with __get__, __set__, and
__delete__.
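
A short sketch may make this concrete: a descriptor that validates a value on assignment. The class names Positive and Order are hypothetical; __set_name__ (available since Python 3.6) is used to learn the attribute name the descriptor was assigned to.

```python
class Positive:
    """Data descriptor that only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        self.private_name = "_" + name  # where the value is stored on the instance

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.private_name, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity  # goes through Positive.__set__


order = Order(3)
print(order.quantity)  # 3
```

Assigning a non-positive quantity, for example Order(0), raises ValueError.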

Real-World Scenarios

31. Parse a CSV file using pandas.


32. Connect to a database using sqlite3.
33. Scrape a website with BeautifulSoup.
34. Create an API using Flask.
35. Automate emails with smtplib.
36. Visualize data using matplotlib.
37. Handle missing data in a DataFrame.
38. Load and process images with Pillow.
39. Train a machine learning model using scikit-learn.
40. Perform text analysis with NLTK.

Additional Advanced Scenarios

41. Asynchronous programming using asyncio.


42. Implement caching using functools.lru_cache.
43. Work with time zones using pytz.
44. Build a chatbot using ChatGPT API.
45. Integrate Python with Excel using openpyxl.
46. Test a Python application with pytest.
47. Secure sensitive data using cryptography.
48. Optimize code performance using cProfile.
49. Create a REST API client.
50. Use NumPy for array operations.

General Questions

1. Question: How would you approach cleaning a dataset with missing values?
Answer:
o Identify missing values using tools like isnull() in Python or conditional
formatting in Excel.
o Handle missing values by:
   - Imputing with mean/median/mode.
   - Removing rows/columns with excessive missing data.
   - Using advanced techniques like interpolation.
2. Question: What steps would you take if you found inconsistencies in the dataset?
Answer:
o Investigate data sources for errors.
o Standardize formats (e.g., dates, text casing).
o Remove duplicates or outliers.
o Validate data with stakeholders.
3. Question: How do you ensure the quality of your analysis?
Answer:
o Perform data validation and checks.
o Use visualization to identify anomalies.
o Peer review and cross-check results.
o Ensure reproducibility by documenting steps.
4. Question: How do you prioritize tasks in a tight deadline?
Answer:
o Break the task into smaller chunks.
o Focus on high-impact tasks first.
o Communicate with stakeholders to set realistic expectations.
5. Question: What KPIs would you track for an e-commerce company?
Answer:
o Sales revenue, conversion rate, customer lifetime value (CLV), average order
value (AOV), cart abandonment rate.

SQL-Related Questions

6. Question: Write a query to find duplicate records in a table.


Answer:

SELECT column_name, COUNT(*)


FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;

7. Question: How do you calculate the second-highest salary from a table?


Answer:

SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

8. Question: Write a query to find the total sales by region.


Answer:

SELECT region, SUM(sales) AS total_sales


FROM sales_data
GROUP BY region;

9. Question: How do you handle data joins with missing values?


Answer:
Use appropriate joins:
o INNER JOIN to exclude missing values.
o LEFT/RIGHT JOIN to include missing values and handle them later.
10. Question: Explain the difference between WHERE and HAVING.
Answer:
o WHERE: Filters rows before aggregation.
o HAVING: Filters aggregated results.
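
The two filters can be compared side by side on a tiny table. This sketch uses Python's sqlite3 module with invented rows (assumptions for illustration); the table and column names follow the sales_data example used in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("East", 100), ("East", 400), ("West", 50), ("West", 60)],
)

# WHERE drops individual rows (the 50) before the groups are summed
where_rows = conn.execute(
    "SELECT region, SUM(sales) FROM sales_data "
    "WHERE sales > 55 GROUP BY region ORDER BY region"
).fetchall()

# HAVING drops whole groups after they are summed
having_rows = conn.execute(
    "SELECT region, SUM(sales) FROM sales_data "
    "GROUP BY region HAVING SUM(sales) > 200 ORDER BY region"
).fetchall()
conn.close()

print(where_rows)   # [('East', 500), ('West', 60)]
print(having_rows)  # [('East', 500)]
```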

Excel-Related Questions

11. Question: How would you analyze monthly sales trends in Excel?
Answer:
o Use a pivot table to summarize sales by month.
o Create a line chart to visualize trends.
12. Question: What formula would you use to find duplicate rows in Excel?
Answer:
Use the formula:

=COUNTIF(range, criteria) > 1

13. Question: How do you calculate CAGR in Excel?


Answer:

=((End Value / Start Value)^(1 / Number of Years)) - 1
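
The same formula can be checked outside Excel. A minimal Python sketch, with a hypothetical function name and made-up figures: growing from 1000 to 1210 over 2 years corresponds to roughly 10% per year, since 1.1 * 1.1 = 1.21.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


rate = cagr(1000, 1210, 2)
print(f"{rate:.2%}")  # 10.00%
```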


14. Question: How do you handle large datasets in Excel?
Answer:
o Use filters and pivot tables.
o Break the data into smaller sheets.
o Use Power Query for efficient processing.
15. Question: How do you use conditional formatting for highlighting trends?
Answer:
o Use color scales for numerical trends.
o Use icons for increase/decrease indicators.

Power BI/Tableau Questions

16. Question: How would you visualize sales performance across regions?
Answer:
o Create a map visualization.
o Use slicers/filters for detailed insights.
o Display KPIs using cards.
17. Question: How do you handle data blending in Tableau?
Answer:
o Connect a primary and a secondary data source.
o Link them on a common field; Tableau aggregates the secondary source to the level of the linking field before combining the results.
18. Question: Explain how you would use DAX to calculate year-over-year sales.
Answer:

Prior Year Sales = CALCULATE(SUM(Sales[Sales]), SAMEPERIODLASTYEAR('Date'[Date]))
YoY Sales % = DIVIDE(SUM(Sales[Sales]) - [Prior Year Sales], [Prior Year Sales])

19. Question: How do you create a calculated field in Power BI?


Answer:
o Go to the "Modeling" tab, select "New Column" or "New Measure," and write
DAX expressions.
20. Question: What steps would you take to optimize a slow Power BI dashboard?
Answer:
o Reduce visualizations.
o Aggregate data before importing.
o Disable unnecessary interactions.
o Use query folding.

Python-Related Questions

21. Question: Write a Python function to calculate the median of a dataset.


Answer:

import statistics

def calculate_median(data):
    return statistics.median(data)

22. Question: How do you handle missing data in Python?


Answer:

import pandas as pd
df.fillna(value, inplace=True)

23. Question: Write a Python script to group data by category and calculate the mean.
Answer:

grouped = df.groupby('category')['value'].mean()

24. Question: How do you visualize a time series in Python?


Answer:

import matplotlib.pyplot as plt


plt.plot(df['date'], df['sales'])
plt.show()

25. Question: Explain the use of apply() in Pandas.


Answer:
apply() applies a function along an axis of the DataFrame.

Real-World Analysis Scenarios

26. Question: How would you forecast next quarter’s sales?


Answer:
o Use historical data.
o Apply regression models or time series techniques like ARIMA.
27. Question: How do you identify key drivers of revenue growth?
Answer:
o Use correlation analysis.
o Perform feature importance analysis.
28. Question: Describe how you’d create a churn analysis report.
Answer:
o Analyze retention and churn rates.
o Identify patterns in churned customers.
29. Question: How would you segment customers based on behavior?
Answer:
o Use clustering techniques (e.g., K-means).
o Segment based on RFM (Recency, Frequency, Monetary) analysis.
30. Question: How do you present data insights to non-technical stakeholders?
Answer:
o Use clear and simple visualizations.
o Focus on actionable insights.

Advanced Scenarios

31. Detect outliers using Z-score or IQR in Python.


32. Calculate rolling averages in Excel and Python.
33. Automate a daily report in Power BI.
34. Predict product demand using regression.
35. Analyze customer feedback using sentiment analysis.

Additional Real-World Scenarios

36. Track website performance using Google Analytics data.


37. Perform a competitor analysis for market trends.
38. Measure the impact of a marketing campaign.
39. Build a dashboard for tracking expenses.
40. Create a model to predict employee attrition.

More Analysis Scenarios

41. Identify profitable product categories.


42. Measure the ROI of ad campaigns.
43. Spot patterns in customer complaints.
44. Create dashboards for financial performance.
45. Use A/B testing for website optimization.
46. Measure sales funnel efficiency.
47. Detect anomalies in sales trends.
48. Model customer lifetime value.
49. Visualize supply chain bottlenecks.
50. Analyze and optimize inventory levels.

Basic Operations

1. Question: How do you create a DataFrame from a dictionary?


Answer:

import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)

2. Question: How do you filter rows where the "Age" column is greater than 25?
Answer:

filtered_df = df[df['Age'] > 25]

3. Question: How do you select specific columns from a DataFrame?


Answer:

selected_columns = df[['Name', 'Age']]

4. Question: How do you rename columns in a DataFrame?


Answer:

df.rename(columns={'Name': 'Full Name', 'Age': 'Years'}, inplace=True)

5. Question: How do you drop rows with missing values?


Answer:

df.dropna(inplace=True)

Intermediate Operations

6. Question: How do you fill missing values with the mean of a column?
Answer:

df['Age'] = df['Age'].fillna(df['Age'].mean())

7. Question: How do you find duplicate rows in a DataFrame?


Answer:

duplicates = df[df.duplicated()]

8. Question: How do you group data by a column and calculate the mean?
Answer:

grouped = df.groupby('Age')['Salary'].mean()

9. Question: How do you add a new column to a DataFrame?


Answer:

df['Salary'] = [50000, 60000]

10. Question: How do you reset the index of a DataFrame?


Answer:
df.reset_index(drop=True, inplace=True)

Data Manipulation

11. Question: How do you sort a DataFrame by a column?


Answer:

df.sort_values(by='Age', ascending=False, inplace=True)

12. Question: How do you merge two DataFrames on a common column?


Answer:

merged_df = pd.merge(df1, df2, on='common_column')

13. Question: How do you concatenate two DataFrames vertically?


Answer:

concatenated_df = pd.concat([df1, df2], axis=0)

14. Question: How do you convert a column’s datatype to datetime?


Answer:

df['Date'] = pd.to_datetime(df['Date'])

15. Question: How do you pivot a DataFrame?


Answer:

pivot_table = df.pivot(index='Region', columns='Product', values='Sales')

Advanced Scenarios

16. Question: How do you calculate a rolling average?


Answer:

df['Rolling_Avg'] = df['Sales'].rolling(window=3).mean()

17. Question: How do you create a column based on a condition?


Answer:

df['Status'] = df['Age'].apply(lambda x: 'Adult' if x >= 18 else 'Minor')

18. Question: How do you find the correlation between columns?


Answer:

correlation = df.corr(numeric_only=True)

19. Question: How do you rank rows based on a column?
Answer:

df['Rank'] = df['Salary'].rank()

20. Question: How do you split a column into multiple columns?


Answer:

df[['First', 'Last']] = df['Name'].str.split(' ', expand=True)

Time-Series Analysis

21. Question: How do you filter rows for a specific date range?
Answer:

filtered_df = df[(df['Date'] >= '2023-01-01') & (df['Date'] <= '2023-12-31')]

22. Question: How do you extract the year and month from a datetime column?
Answer:

df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month

23. Question: How do you resample a time-series dataset by month?


Answer:

monthly_data = df.resample('M', on='Date').sum()

24. Question: How do you calculate the difference between consecutive rows?
Answer:

df['Difference'] = df['Sales'].diff()

25. Question: How do you identify outliers in a column using IQR?


Answer:

Q1 = df['Sales'].quantile(0.25)
Q3 = df['Sales'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['Sales'] < Q1 - 1.5 * IQR) | (df['Sales'] > Q3 + 1.5 *
IQR)]
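
The same 1.5 * IQR rule can be reproduced without pandas. The sketch below uses statistics.quantiles from the standard library (Python 3.8+) with method="inclusive", which interpolates linearly between data points; the function name iqr_outliers and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


sales = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(sales))  # [95]
```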

Performance Optimization
26. Question: How do you handle large datasets in Pandas?
Answer:
o Use the chunksize parameter of read_csv() for incremental loading.
o Optimize datatypes using astype() or dtypes.
o Filter unnecessary columns and rows.
27. Question: How do you reduce memory usage of a DataFrame?
Answer:

df['Int_Column'] = df['Int_Column'].astype('int32')
df['Float_Column'] = df['Float_Column'].astype('float32')

28. Question: How do you parallelize operations on a large DataFrame?


Answer:
Use libraries like Dask or Vaex for parallelized Pandas-like operations.
29. Question: How do you export a DataFrame to a CSV without the index?
Answer:

df.to_csv('output.csv', index=False)

30. Question: How do you visualize the distribution of a column in Pandas?


Answer:

df['Age'].plot(kind='hist')
