SQL Excel Test
Task 1
Description
Write an SQL query that shows all employees who have fewer than three tasks. The query should return a table
with the following columns:
employee name
number of the employee’s tasks.
Dataset
employee
id  emp_name  rating
1   Vasya A   5
2   Vasya B   3
3   Petya C   9
4   Igor D    4
5   Nikita E  2

tasks
id  emp_id  rating
1   1       5
2   1       4
3   2       2
4   5       5
5   5       1
6   5       7
7   5       9
8   5       3
9   5       3
10  5       5
11  2       7
12  3       8
13  3       8
14  2       7
15  3       9
16  5       4
17  1       5
Task 1 Solution:
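-- LEFT JOIN keeps employees with no tasks, so they are returned with a count of 0.
-- Grouping by e.id as well keeps two employees with the same name apart.
SELECT
    e.emp_name,
    COUNT(t.id) AS tasks
FROM
    employee e
    LEFT JOIN tasks t ON e.id = t.emp_id
GROUP BY
    e.id,
    e.emp_name
HAVING
    COUNT(t.id) < 3;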
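As a quick sanity check, the query can be run against the sample data with Python's built-in sqlite3 module. This is only a sketch: the rows are transcribed from the Dataset section (reading names such as "Vasya A" as a single emp_name value is an assumption), and an in-memory SQLite database stands in for whatever database the test targets.

```python
import sqlite3

# Rows transcribed from the Dataset section (assumed reading of the table).
EMPLOYEES = [(1, "Vasya A", 5), (2, "Vasya B", 3), (3, "Petya C", 9),
             (4, "Igor D", 4), (5, "Nikita E", 2)]
TASKS = [(1, 1, 5), (2, 1, 4), (3, 2, 2), (4, 5, 5), (5, 5, 1), (6, 5, 7),
         (7, 5, 9), (8, 5, 3), (9, 5, 3), (10, 5, 5), (11, 2, 7), (12, 3, 8),
         (13, 3, 8), (14, 2, 7), (15, 3, 9), (16, 5, 4), (17, 1, 5)]

QUERY = """
SELECT e.emp_name, COUNT(t.id) AS tasks
FROM employee e
LEFT JOIN tasks t ON e.id = t.emp_id
GROUP BY e.id, e.emp_name
HAVING COUNT(t.id) < 3
ORDER BY e.emp_name;
"""

def employees_with_few_tasks():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, emp_name TEXT, rating INTEGER)")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, emp_id INTEGER, rating INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", EMPLOYEES)
    conn.executemany("INSERT INTO tasks VALUES (?, ?, ?)", TASKS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

print(employees_with_few_tasks())  # [('Igor D', 0)]
```

With this data, every employee except Igor D has at least three tasks, so the only row returned is the one with a zero count. An inner join would return no rows at all.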
Number of tasks: 3
Task 2
Description
You need to write an SQL query that will show the total payments for both the principal amount and interest. Calculate the sums
only for those contracts where payments were made towards both the principal amount and interest. Present the result as follows:
agreement ID
total payments towards the principal
total payments towards the interest.
Dataset
payments_principal                    payments_interest
payment_dt  agr_id  payment_amt       payment_dt  agr_id  payment_amt
5/11/19     31      8281              5/11/19     31      98
5/12/19     7       4622              5/12/19     7       90
5/13/19     5       7686              5/13/19     5       39
7/1/19      1       9917              7/1/19      1       82
7/23/19     1       6534              7/23/19     1       59
8/20/19     64      4336              8/20/19     50      96
8/24/19     3       7464              8/24/19     3       1
8/25/19     9       8505              8/25/19     9       22
8/27/19     1       9857              8/27/19     1       95
7/7/19      7       6294              7/7/19      7       79
7/17/19     7       3182              7/17/19     7       72
8/28/19     4       9708              8/28/19     4       61
8/29/19     4       8632              8/29/19     4       49
8/30/19     3       8303              8/30/19     3       78
9/1/19      7       3141              9/1/19      7       29
8/25/19     1       9139              8/25/19     1       88
8/25/19     2       7624              8/25/19     2       77
9/1/19      7       3793              9/1/19      6       6
9/1/19      3       3260              9/1/19      3       18
8/21/19     5       9002              8/21/19     5       15
8/22/19     2       5500              8/22/19     2       28
5/12/19     7       4622              8/23/19     2       23
8/23/19     2       3980              8/29/19     2       84
8/29/19     2       5849
Task 2 Solution:
Old Solution (the row-level join double-counts amounts when an agreement has several payments on the same date):
SELECT
    p.agr_id,
    SUM(p.payment_amt) AS 'Principal Payment',
    SUM(i.payment_amt) AS 'Interest Payment'
FROM
    payments_principal p
    JOIN payments_interest i
        ON p.payment_dt = i.payment_dt
        AND p.agr_id = i.agr_id
GROUP BY p.agr_id;
New Solution:
SELECT
    p.agr_id,
    SUM(p.payment_amt) AS 'Principal Payment',
    SUM(i.payment_amt) AS 'Interest Payment'
FROM
    (SELECT
         payment_dt,
         agr_id,
         SUM(payment_amt) AS payment_amt
     FROM payments_principal
     GROUP BY payment_dt, agr_id) AS p
    JOIN
    (SELECT
         payment_dt,
         agr_id,
         SUM(payment_amt) AS payment_amt
     FROM payments_interest
     GROUP BY payment_dt, agr_id) AS i
        ON p.payment_dt = i.payment_dt
        AND p.agr_id = i.agr_id
GROUP BY p.agr_id;
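The difference between joining raw rows and joining pre-aggregated totals can be illustrated on a few rows copied from the dataset. The sketch below uses Python's sqlite3 module with an in-memory database. The second query is a variant, not the solution above: it aggregates each table per agr_id only, on the assumption that the task wants totals per agreement regardless of whether the payment dates line up. The column names principal_total and interest_total are made up for the illustration.

```python
import sqlite3

# A few rows copied from the dataset: agreement 7 has two principal payments on 5/12/19.
PRINCIPAL = [("5/12/19", 7, 4622), ("5/12/19", 7, 4622),
             ("7/1/19", 1, 9917), ("8/20/19", 64, 4336)]
INTEREST = [("5/12/19", 7, 90), ("7/1/19", 1, 82), ("8/20/19", 50, 96)]

# Row-level join: every principal row is paired with every matching interest row.
ROW_LEVEL_JOIN = """
SELECT p.agr_id, SUM(p.payment_amt), SUM(i.payment_amt)
FROM payments_principal p
JOIN payments_interest i
  ON p.payment_dt = i.payment_dt AND p.agr_id = i.agr_id
GROUP BY p.agr_id
ORDER BY p.agr_id;
"""

# Variant (assumption): total each table per agreement first, then join on agr_id.
PER_AGREEMENT = """
SELECT p.agr_id, p.principal_total, i.interest_total
FROM (SELECT agr_id, SUM(payment_amt) AS principal_total
      FROM payments_principal GROUP BY agr_id) AS p
JOIN (SELECT agr_id, SUM(payment_amt) AS interest_total
      FROM payments_interest GROUP BY agr_id) AS i
  ON p.agr_id = i.agr_id
ORDER BY p.agr_id;
"""

def run_query(query):
    """Build both tables in a fresh in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("payments_principal", PRINCIPAL),
                        ("payments_interest", INTEREST)):
        conn.execute(f"CREATE TABLE {table} (payment_dt TEXT, agr_id INTEGER, payment_amt INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(run_query(ROW_LEVEL_JOIN))  # [(1, 9917, 82), (7, 9244, 180)]
print(run_query(PER_AGREEMENT))   # [(1, 9917, 82), (7, 9244, 90)]
```

In the row-level join, the single interest payment of 90 for agreement 7 is counted twice because it matches two principal rows. Agreements 64 and 50 drop out of both results because they appear in only one of the two tables.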
Task 3
Description
This is a funnel-type task. You are provided with a dataset containing information on applications (application_id) for the
period from May 17th to May 25th, 2022 (see the attached Excel file SQL_Task_3_dataset.xlsx). For each application, there is
information about the date and the final stage the application reached (up_stage – top-level stage, mid_stage – middle-level
stage, sub_stage – most detailed stage).
Imagine that there is a process of filling out an application, consisting of the following stages (up_stage):
1. Personal Details
2. KYC Details
3. Bank Verification
4. Application Summary
5. e-Nach
6. e-Sign
Each step is essentially a page with fields that the client must fill in to complete the application and get a
loan. However, not every client completes the application, for reasons described by the mid_stage and sub_stage fields.
In the end, the entire customer journey from the beginning of filling the application to disbursement is characterized by
"conversions", i.e. the proportion of clients moving from one page to another, from one stage to another, etc.
The main conversion is the Disbursement rate, equal to the ratio of all disbursed loans (is_disbursed = 1) to the total number of
applications.
In the provided dataset, you can see that between May 19th and May 22nd, the Disbursement rate (in other words, the proportion
of loans disbursed to all applications) significantly decreased compared to other days.
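The Disbursement rate defined above can be computed per day with a single grouped query. The following is a minimal sketch using Python's sqlite3 module. The table name applications and the column name application_date are hypothetical (the description does not name the date column), and the sample rows are illustrative only, not taken from SQL_Task_3_dataset.xlsx.

```python
import sqlite3

# Illustrative rows only: (application_date, application_id, is_disbursed).
SAMPLE = [("2022-05-18", 1, 1), ("2022-05-18", 2, 1),
          ("2022-05-18", 3, 0), ("2022-05-18", 4, 1),
          ("2022-05-19", 5, 0), ("2022-05-19", 6, 1),
          ("2022-05-19", 7, 0), ("2022-05-19", 8, 0)]

# Disbursement rate = disbursed loans / all applications, per day.
DAILY_RATE = """
SELECT application_date,
       COUNT(*) AS applications,
       SUM(is_disbursed) AS disbursed,
       1.0 * SUM(is_disbursed) / COUNT(*) AS disbursement_rate
FROM applications
GROUP BY application_date
ORDER BY application_date;
"""

def daily_disbursement_rate(rows):
    """Load rows into a hypothetical applications table and compute the daily rate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications "
                 "(application_date TEXT, application_id INTEGER, is_disbursed INTEGER)")
    conn.executemany("INSERT INTO applications VALUES (?, ?, ?)", rows)
    result = conn.execute(DAILY_RATE).fetchall()
    conn.close()
    return result

print(daily_disbursement_rate(SAMPLE))
# [('2022-05-18', 4, 3, 0.75), ('2022-05-19', 4, 1, 0.25)]
```

Multiplying by 1.0 forces floating-point division, since dividing two integers would otherwise truncate the rate to 0 or 1 in SQLite.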
A. Knowing the above, analyze the data and formulate hypotheses about the reasons for the decrease in the Disbursement
rate from May 19th to May 22nd. Investigate what caused this decline based on the data available and find the "root of the
problem". To search for a solution and to display the results of your analysis, you can use both SQL and Excel/Power BI.
B. Create a report in Power BI (or use Excel if you have no experience in Power BI) and visually display the following:
A chart with the Disbursement rate (line) and the number of applications (columns) by day.
A table by day (for Power BI, add week and month hierarchies) broken down by up_stage, mid_stage, and sub_stage,
with each value shown as a percentage of all applications (see the screenshot below as an example).
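The stage breakdown requested in part B could also be prototyped in SQL before building the report. The sketch below computes, for each day, the share of applications that ended at each up_stage. As before, the table name applications and the column application_date are hypothetical, the rows are illustrative only, and it assumes the percentages are taken relative to that day's applications.

```python
import sqlite3

# Illustrative rows only: (application_date, application_id, up_stage).
SAMPLE = [("2022-05-18", 1, "e-Sign"), ("2022-05-18", 2, "KYC Details"),
          ("2022-05-18", 3, "e-Sign"), ("2022-05-18", 4, "e-Sign"),
          ("2022-05-19", 5, "Bank Verification"), ("2022-05-19", 6, "Bank Verification"),
          ("2022-05-19", 7, "e-Sign"), ("2022-05-19", 8, "Bank Verification")]

# Share of each final up_stage within the day's applications, in percent.
STAGE_SHARE = """
SELECT a.application_date,
       a.up_stage,
       100.0 * COUNT(*) / d.total_apps AS pct_of_applications
FROM applications a
JOIN (SELECT application_date, COUNT(*) AS total_apps
      FROM applications
      GROUP BY application_date) AS d
  ON a.application_date = d.application_date
GROUP BY a.application_date, a.up_stage, d.total_apps
ORDER BY a.application_date, a.up_stage;
"""

def stage_share(rows):
    """Load rows into a hypothetical applications table and compute daily stage shares."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE applications "
                 "(application_date TEXT, application_id INTEGER, up_stage TEXT)")
    conn.executemany("INSERT INTO applications VALUES (?, ?, ?)", rows)
    result = conn.execute(STAGE_SHARE).fetchall()
    conn.close()
    return result

for row in stage_share(SAMPLE):
    print(row)
```

Adding mid_stage and sub_stage to the SELECT and GROUP BY lists gives the more detailed levels. A day on which one stage suddenly takes a much larger share, as Bank Verification does in this made-up sample, is the kind of shift to look for when explaining a drop in the Disbursement rate.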
Task 3 Solution:
See the attached Excel file SQL_Task3.