Thelook E-Commerce Analysis: Assignment Fahmi Dayanto
SELECT
  DATE_TRUNC(DATE(created_at), MONTH) AS month_year,
  status,
  COUNT(DISTINCT user_id) AS total_unique_user,
  -- COUNT(order_id) counts order_items rows (one per item), not distinct orders
  COUNT(order_id) AS total_order,
  ROUND(SUM(sale_price), 2) AS total_sale_price
FROM `bigquery-public-data.thelook_ecommerce.order_items`
WHERE DATE_TRUNC(DATE(created_at), MONTH) BETWEEN '2019-01-01' AND '2022-08-01'
  AND status = 'Complete'
GROUP BY month_year, status
ORDER BY month_year;
Result Preview
The chart above shows that the number of completed orders rises
significantly from January 2019 through July 2022.
Question Number 2
Create a query to get frequencies, average order value,
and total number of unique users where status is
complete, grouped by month.
Use the time frame from Jan 2019 until Aug 2022.
Expected output:
● Month-year
● Frequencies (number of orders per customer)
● AOV (sale price per order)
● Unique buyers
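As a small worked example of the three metrics listed above, the sketch below computes them in Python on a handful of made-up rows. The row layout, the sample values, and the function name `monthly_metrics` are assumptions for illustration only, not part of the dataset or the assignment. Frequency is taken here as distinct orders divided by distinct buyers, which is one reading of "number of orders per customer".

```python
from collections import defaultdict

# Hypothetical order_items rows: (month, user_id, order_id, sale_price).
# One row per item, so order 100 below contains two items.
SAMPLE_ROWS = [
    ("2022-01", 1, 100, 20.0),
    ("2022-01", 1, 100, 30.0),
    ("2022-01", 1, 101, 10.0),
    ("2022-01", 2, 102, 40.0),
]


def monthly_metrics(rows):
    """Return unique buyers, order frequency, and AOV per month."""
    by_month = defaultdict(lambda: {"users": set(), "orders": set(), "sales": 0.0})
    for month, user_id, order_id, sale_price in rows:
        bucket = by_month[month]
        bucket["users"].add(user_id)
        bucket["orders"].add(order_id)
        bucket["sales"] += sale_price

    metrics = {}
    for month, bucket in by_month.items():
        buyers = len(bucket["users"])
        orders = len(bucket["orders"])
        metrics[month] = {
            "unique_buyers": buyers,
            # Distinct orders per distinct buyer.
            "frequency": orders / buyers,
            # Total sale price divided by distinct orders.
            "aov": round(bucket["sales"] / orders, 2),
        }
    return metrics


if __name__ == "__main__":
    print(monthly_metrics(SAMPLE_ROWS))
```

Counting item rows instead of distinct orders would give a frequency of 2.0 for this sample rather than 1.5, so the choice of counting unit changes the metric.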
The Query of Question Number 2
SQL Syntax
SELECT
  DATE_TRUNC(DATE(created_at), MONTH) AS month_year,
  COUNT(DISTINCT user_id) AS unique_buyer,
  -- COUNT(order_id) counts order_items rows (one per item), not distinct orders
  COUNT(order_id) / COUNT(DISTINCT user_id) AS frequencies,
  ROUND(SUM(sale_price) / COUNT(DISTINCT order_id), 2) AS AOV
FROM `bigquery-public-data.thelook_ecommerce.order_items`
WHERE DATE_TRUNC(DATE(created_at), MONTH) BETWEEN '2019-01-01' AND '2022-08-01'
  AND status = 'Complete'
GROUP BY month_year
ORDER BY month_year;
Result Preview
From January 2020 onward, the average number of purchases per
customer tends to be more stable than in 2019.
Question Number 3
Find the user id, email, first and last name of users
whose status is refunded on Aug 22
Expected output:
● User id
● Email
● First name
● Last name
The Query of Question Number 3
SQL Syntax
SELECT DISTINCT
  A1.id,
  A1.email,
  A1.first_name,
  A1.last_name
FROM `bigquery-public-data.thelook_ecommerce.users` A1
INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` A2
  ON A1.id = A2.user_id
WHERE A2.status = 'Returned'
  AND DATE_TRUNC(DATE(A2.returned_at), MONTH) = '2022-08-01';
Result Preview
Question Number 4
Get the top 5 least and most profitable products over all
time.
Expected output:
a. Product id
b. Product name
c. Retail price
d. Cost
e. Profit
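To make the profit ranking concrete, here is a minimal Python sketch on made-up rows. It assumes profit per product is the sum of (sale price minus cost) over sold items; the sample data and the function names `profit_by_product` and `top_and_bottom` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sold items: (product_id, sale_price, cost).
SAMPLE_ITEMS = [
    (1, 10.0, 4.0),
    (1, 10.0, 4.0),
    (2, 5.0, 4.5),
    (3, 50.0, 20.0),
]


def profit_by_product(items):
    """Sum (sale_price - cost) per product_id."""
    profit = defaultdict(float)
    for product_id, sale_price, cost in items:
        profit[product_id] += sale_price - cost
    return dict(profit)


def top_and_bottom(items, n=5):
    """Return (least profitable, most profitable) lists of (product_id, profit)."""
    ranked = sorted(profit_by_product(items).items(), key=lambda pair: pair[1])
    least = ranked[:n]
    most = ranked[::-1][:n]  # highest profit first
    return least, most


if __name__ == "__main__":
    least, most = top_and_bottom(SAMPLE_ITEMS, n=1)
    print("least:", least)
    print("most:", most)
```

With fewer than 2n products the two lists overlap, which is worth keeping in mind when reading a combined top/bottom result.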
The Query of Question Number 4
SQL Syntax
WITH least_profit AS
(
  SELECT
    'Least_profit' AS Remark,
    A1.product_id,
    SUM(A2.sale_price - A1.cost) AS profit
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS A2
    ON A1.id = A2.inventory_item_id
  WHERE A2.status = 'Complete'
  GROUP BY 2
  ORDER BY profit
  LIMIT 5
),
most_profit AS
(
  SELECT
    'Most_profit' AS Remark,
    A1.product_id,
    SUM(A2.sale_price - A1.cost) AS profit
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS A2
    ON A1.id = A2.inventory_item_id
  WHERE A2.status = 'Complete'
  GROUP BY 2
  ORDER BY profit DESC
  LIMIT 5
)
SELECT
  A3.product_id,
  A4.Remark,
  A3.product_name,
  A3.product_retail_price,
  A3.cost,
  A4.profit
FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A3
INNER JOIN least_profit AS A4 USING (product_id)
UNION DISTINCT
SELECT
  A3.product_id,
  A5.Remark,
  A3.product_name,
  A3.product_retail_price,
  A3.cost,
  A5.profit
FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A3
INNER JOIN most_profit AS A5 USING (product_id)
ORDER BY profit;
Result Preview
The Air Jordan Dominate Shorts Mens is the product with the
highest profit, and the Indestructible Aluminum Aluma Wallet in
Red is the product with the lowest profit. Men's apparel
dominates the most profitable products.
Question Number 5
WITH Base_1 AS
(
  SELECT
    DATE_TRUNC(DATE(A1.created_at), MONTH) AS Month_year,
    A1.product_category,
    COUNT(A2.status) AS order_completed
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` A2
    ON A1.product_id = A2.product_id
  WHERE A2.status = "Complete"
  GROUP BY 1, 2
),
Base_2 AS
(
  SELECT
    Month_year,
    product_category,
    order_completed,
    -- Completed orders of the same category in the previous month
    LAG(order_completed, 1) OVER (PARTITION BY product_category ORDER BY Month_year)
      AS previous_order_completed
  FROM Base_1
)
SELECT
  Month_year,
  product_category,
  -- CONCAT needs STRING arguments, so the numeric growth is cast first
  CONCAT(CAST(((order_completed - previous_order_completed) / previous_order_completed) * 100 AS STRING), '%') AS Growth
FROM Base_2
WHERE Month_year BETWEEN "2019-01-01" AND "2022-04-30"
ORDER BY 1 DESC, 2;
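The month-over-month growth logic in the query (previous month's count per category via LAG, then percentage change) can be sketched in Python on made-up counts. The input shape, the sample numbers, and the function name `growth_by_category` are assumptions for illustration; months are assumed to be ISO-style strings so that they sort chronologically.

```python
# Hypothetical completed-order counts keyed by (product_category, month).
SAMPLE_COUNTS = {
    ("Jeans", "2022-01"): 100,
    ("Jeans", "2022-02"): 120,
    ("Socks", "2022-01"): 50,
    ("Socks", "2022-02"): 40,
}


def growth_by_category(counts):
    """Return percentage growth versus the previous month, per category.

    The first month of each category has no previous value, so its
    growth is None (the SQL LAG returns NULL in the same situation).
    """
    growth = {}
    previous = {}  # last seen count per category, plays the role of LAG
    for category, month in sorted(counts):
        current = counts[(category, month)]
        prev = previous.get(category)
        if prev is None or prev == 0:
            growth[(category, month)] = None
        else:
            growth[(category, month)] = round((current - prev) / prev * 100, 2)
        previous[category] = current
    return growth


if __name__ == "__main__":
    for key, value in growth_by_category(SAMPLE_COUNTS).items():
        label = "n/a" if value is None else f"{value}%"
        print(key, label)
```

A previous count of zero is also reported as None here to avoid dividing by zero; how to present that case is a design choice.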
Result Preview
Insight & Recommendation:
Insight:
● The total number of new users increased from 2953 in January 2022 to 4953 at the
beginning of October 2022, a rise of about 68%.
● In September 2022, the retention rate decreased to 7.57% from the previous month because
the dataset for October 2022 is incomplete.
● August 2022 had the highest first-month retention rate, at 18.60%.
● Out of all new users during this time range (41787 users), 9.66% are retained in
month 1, which means about 90% of customers do not place another order after their first month.
Recommendation