Thelook E-Commerce Analysis: Assignment Fahmi Dayanto

The document analyzes an e-commerce dataset from TheLook, a fictional clothing site. It works through seven questions, each answered with a SQL query, covering trends in orders, customers, products, inventory, and retention over time. The queries compute total sales, order frequency, the most and least profitable products, monthly profit by category, monthly inventory growth by category, and monthly retention cohorts. The analysis also identifies the categories with the highest average monthly inventory growth.

Uploaded by

AdamMaula

TheLook E-commerce Analysis

Assignment Fahmi Dayanto


Overview of Dataset

TheLook is a fictitious eCommerce clothing site developed by the Looker team.
The dataset contains information about customers, products, orders, logistics,
web events, and digital marketing campaigns. The contents of this dataset are
synthetic, and are provided to industry practitioners for the purpose of
product discovery, testing, and evaluation.
Question Number 1

Create a query to get the number of unique users, number of orders, and total
sale price per status and month.
Use the time frame from January 2019 to August 2022.
Expected output:
● Month-year
● Status
● Total Unique Users
● Total Orders
● Total Sale Price
The Query of Question Number 1
SQL Syntax

SELECT
  DATE_TRUNC(DATE(created_at), MONTH) AS month_year,
  status,
  COUNT(DISTINCT user_id) AS total_unique_user,
  -- order_items holds one row per item, so count each order only once
  COUNT(DISTINCT order_id) AS total_order,
  ROUND(SUM(sale_price), 2) AS total_sale_price
FROM `bigquery-public-data.thelook_ecommerce.order_items`
WHERE DATE_TRUNC(DATE(created_at), MONTH) BETWEEN '2019-01-01' AND '2022-08-01'
  AND status = 'Complete'
GROUP BY month_year, status
ORDER BY month_year;
Result Preview
The chart above shows the number of completed orders trending sharply upward
from January 2019 to July 2022.
Question Number 2
Create a query to get frequencies, average order value, and total number of
unique users where status is complete, grouped by month.
Use the time frame from January 2019 to August 2022.
Expected output:
● Month-year
● Frequencies (number of orders per customer)
● AOV (sale price per order)
● Unique buyers
The Query of Question Number 2
SQL Syntax

SELECT
  DATE_TRUNC(DATE(created_at), MONTH) AS month_year,
  COUNT(DISTINCT user_id) AS unique_buyer,
  -- distinct orders per distinct buyer; order_items has one row per item
  COUNT(DISTINCT order_id) / COUNT(DISTINCT user_id) AS frequencies,
  ROUND(SUM(sale_price) / COUNT(DISTINCT order_id), 2) AS AOV
FROM `bigquery-public-data.thelook_ecommerce.order_items`
WHERE DATE_TRUNC(DATE(created_at), MONTH) BETWEEN '2019-01-01' AND '2022-08-01'
  AND status = 'Complete'
GROUP BY month_year
ORDER BY month_year;
Result Preview
From January 2020 onward, the average number of purchases per customer is
more stable than in 2019.
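The three Question 2 metrics can be illustrated outside BigQuery with a minimal Python sketch. The rows below are made up, and the sketch assumes that frequency means distinct orders per distinct buyer and that AOV means total sale price per distinct order, as the expected output describes. The function name `monthly_metrics` is hypothetical.

```python
from collections import defaultdict

# Hypothetical order_items rows: (month, user_id, order_id, sale_price).
rows = [
    ("2022-01", 1, 100, 20.0),
    ("2022-01", 1, 100, 30.0),  # second item of the same order
    ("2022-01", 2, 101, 50.0),
    ("2022-02", 1, 102, 40.0),
]


def monthly_metrics(rows):
    """Return unique buyers, order frequency, and AOV per month."""
    users = defaultdict(set)
    orders = defaultdict(set)
    sales = defaultdict(float)
    for month, user_id, order_id, price in rows:
        users[month].add(user_id)    # distinct buyers
        orders[month].add(order_id)  # distinct orders, not items
        sales[month] += price
    return {
        month: {
            "unique_buyers": len(users[month]),
            "frequency": len(orders[month]) / len(users[month]),
            "aov": round(sales[month] / len(orders[month]), 2),
        }
        for month in sorted(users)
    }


metrics = monthly_metrics(rows)
print(metrics)
```

Because order 100 has two items, counting rows instead of distinct orders would report three orders for January and inflate the frequency.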
Question Number 3

Find the user id, email, first and last name of users
whose status is refunded on Aug 22
Expected output:
● User id
● Email
● First name
● Last name
The Query of Question Number 3
SQL Syntax
SELECT DISTINCT
  A1.id,
  A1.email,
  A1.first_name,
  A1.last_name
FROM `bigquery-public-data.thelook_ecommerce.users` A1
INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` A2
  ON A1.id = A2.user_id
-- the month is set to August 2022 to match the question
WHERE A2.status = 'Returned'
  AND DATE_TRUNC(DATE(A2.returned_at), MONTH) = '2022-08-01';
Result Preview
Question Number 4

Get the top 5 least and most profitable products over all time.
Expected output:
a. Product id
b. Product name
c. Retail price
d. Cost
e. Profit
The Query of Question Number 4
SQL Syntax
WITH least_profit AS
(
  SELECT
    'Least_profit' AS Remark,
    A1.product_id,
    SUM(A2.sale_price - A1.cost) AS profit
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS A2
    ON A1.id = A2.inventory_item_id
  WHERE A2.status = 'Complete'
  GROUP BY 2
  ORDER BY profit
  LIMIT 5
),
most_profit AS
(
  SELECT
    'Most_profit' AS Remark,
    A1.product_id,
    SUM(A2.sale_price - A1.cost) AS profit
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS A2
    ON A1.id = A2.inventory_item_id
  WHERE A2.status = 'Complete'
  GROUP BY 2
  ORDER BY profit DESC
  LIMIT 5
)
SELECT
  A3.product_id,
  A4.Remark,
  A3.product_name,
  A3.product_retail_price,
  A3.cost,
  A4.profit
FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A3
INNER JOIN least_profit AS A4 USING (product_id)
UNION DISTINCT
SELECT
  A3.product_id,
  A5.Remark,
  A3.product_name,
  A3.product_retail_price,
  A3.cost,
  A5.profit
FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS A3
INNER JOIN most_profit AS A5 USING (product_id)
ORDER BY profit;
Result Preview
The Air Jordan Dominate Shorts Mens is the product with the highest profit,
and the Indestructible Aluminum Aluma Wallet in Red is the product with the
lowest profit. Men's apparel dominates the most profitable products.
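The ranking logic of Question 4 (sum sale price minus cost per product, then take both ends of the sorted list) can be sketched in a few lines of Python. The sales rows and the helper name `rank_by_profit` are hypothetical, and the sketch uses n = 2 instead of 5 to keep the sample small.

```python
from collections import defaultdict

# Hypothetical completed sales: (product_id, sale_price, cost).
sales = [
    (1, 30.0, 10.0),
    (1, 30.0, 10.0),
    (2, 15.0, 12.0),
    (3, 50.0, 20.0),
    (4, 8.0, 7.5),
]


def rank_by_profit(sales, n):
    """Return (least profitable n, most profitable n) as (product_id, profit)."""
    profit = defaultdict(float)
    for product_id, sale_price, cost in sales:
        profit[product_id] += sale_price - cost
    ordered = sorted(profit.items(), key=lambda item: item[1])  # ascending
    least = ordered[:n]
    most = ordered[-n:][::-1]  # highest profit first
    return least, most


least, most = rank_by_profit(sales, 2)
print(least, most)
```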
Question Number 5

Create a query to get the month-to-date total profit for each product
category over the past 3 months (current date: 15 Aug 2022), broken down by
month and category.
Expected output:
a. Date (in date format)
b. Product
c. Profit in the past 3 month to date
The Query of Question Number 5 SQL Syntax

WITH profit AS
(
  SELECT
    DATE(o.shipped_at) AS order_date,
    p.category AS product_category,
    ROUND(SUM(o.sale_price - p.cost), 2) AS category_profit
  FROM `bigquery-public-data.thelook_ecommerce.order_items` AS o
  INNER JOIN `bigquery-public-data.thelook_ecommerce.products` AS p
    ON o.product_id = p.id
  WHERE o.shipped_at BETWEEN "2022-06-01" AND "2022-08-16"
    AND status = "Complete"
  GROUP BY 1, 2
  ORDER BY 2, 1
),
mtd_table AS
(
  SELECT
    order_date,
    product_category,
    category_profit,
    SUM(category_profit) OVER(
      PARTITION BY product_category, EXTRACT(MONTH FROM order_date)
      ORDER BY product_category, order_date) AS mtd
  FROM profit
  ORDER BY 2, 1
)
SELECT order_date, product_category, mtd
FROM mtd_table
WHERE order_date BETWEEN "2022-06-01" AND "2022-08-16"
  AND EXTRACT(DAY FROM order_date) = 15
Result Preview
Profit in the past 3 Month by Product Category

Almost all product categories show an increase in profit from June to
August 2022. The exceptions are Sweaters, which decreased from July to
August 2022, and Swim and Sleep & Lounge, which decreased from June to
July 2022.
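The month-to-date figure in Question 5 is a running total that restarts each month and is read off on day 15. A minimal Python sketch of that arithmetic follows. The daily profits and the function name `month_to_date` are hypothetical. Unlike the query, which returns a row only when a category has profit on the 15th itself, this sketch reports the total through day 15 whether or not that day has a sale.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily profit rows: (order_date, category, profit).
daily = [
    (date(2022, 6, 1), "Jeans", 10.0),
    (date(2022, 6, 15), "Jeans", 5.0),
    (date(2022, 6, 20), "Jeans", 7.0),  # after the cutoff, excluded
    (date(2022, 7, 3), "Jeans", 4.0),
    (date(2022, 7, 15), "Jeans", 6.0),
]


def month_to_date(daily, cutoff_day=15):
    """Sum profit per (category, year, month) up to and including cutoff_day."""
    totals = defaultdict(float)
    for order_date, category, profit in daily:
        if order_date.day <= cutoff_day:
            totals[(category, order_date.year, order_date.month)] += profit
    return {key: round(value, 2) for key, value in totals.items()}


mtd = month_to_date(daily)
print(mtd)
```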
Question Number 6
Find the monthly growth of inventory, in percent, broken down by product
category and ordered by time in descending order. After analyzing the
monthly growth, is there any interesting insight that we can get?
Time frame: January 2019 to April 2022.
Expected output:
a. Month_year
b. Categories
c. Growth (%)
The Query of Question Number 6 syntax SQL

WITH Base_1 AS
(
  SELECT
    DATE_TRUNC(DATE(A1.created_at), MONTH) AS Month_year,
    A1.product_category,
    COUNT(A2.status) order_completed
  FROM `bigquery-public-data.thelook_ecommerce.inventory_items` A1
  INNER JOIN `bigquery-public-data.thelook_ecommerce.order_items` A2
    ON A1.product_id = A2.product_id
  WHERE A2.status = "Complete"
  GROUP BY 1, 2
),
Base_2 AS
(
  SELECT
    Month_year,
    product_category,
    order_completed,
    LAG(order_completed, 1) OVER(
      PARTITION BY product_category
      ORDER BY Month_year) previous_order_completed
  FROM Base_1
)
SELECT
  Month_year,
  Product_category,
  -- parentheses rebalanced so CONCAT receives the percentage and the '%' sign
  CONCAT(((order_completed - previous_order_completed) / previous_order_completed) * 100, '%') AS Growth
FROM Base_2
WHERE Month_year BETWEEN "2019-01-01" AND "2022-04-30"
ORDER BY 1 DESC, 2;
Result Preview
Insight & Recommendation:

- On average, the three categories with the highest monthly inventory
growth from January 2019 to April 2022 are Clothing Sets (90.63%),
Underwear (75.08%), and Jumpsuits & Rompers (52.55%).

- Over the same period, Clothing Sets became the best-selling category on
TheLook, and the category that did not sell well was Active, with 24.26%.
More promotions and campaigns are needed for products whose growth is
below 30%.
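The growth figures in Question 6 come from comparing each month with the previous one, which is what LAG provides in the query. A minimal Python sketch of that formula, on made-up monthly counts for a single category, is shown here. The function name `monthly_growth` is hypothetical.

```python
def monthly_growth(counts):
    """counts: list of (month, value) sorted by month.

    Returns (month, growth in percent); the first month has no previous
    value, so its growth is None, like a NULL from LAG.
    """
    result = []
    previous = None
    for month, value in counts:
        if previous is None or previous == 0:
            result.append((month, None))
        else:
            growth = (value - previous) / previous * 100
            result.append((month, round(growth, 2)))
        previous = value
    return result


growth = monthly_growth([("2019-01", 40), ("2019-02", 50), ("2019-03", 45)])
print(growth)
```

Averaging these per-month percentages for each category would give the kind of average monthly growth figure quoted in the insight.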
Question Number 7

Create monthly retention cohorts (the cohorts can be defined by the date on
which a user completed a purchase), then find what percentage of each cohort
came back in the following months of 2022.
The Query of Question Number 7 syntax
SQL
WITH cohort_base AS (
  SELECT
    user_id,
    MIN(DATE(DATE_TRUNC(created_at, MONTH)))
      OVER(PARTITION BY user_id) cohort_month,
    DATE(DATE_TRUNC(created_at, MONTH)) order_time
  FROM `bigquery-public-data.thelook_ecommerce.orders`
),
activities AS (
  SELECT
    *,
    DATE_DIFF(order_time, cohort_month, MONTH) month
  FROM cohort_base
),
cohort_size AS (
  SELECT
    cohort_month,
    month,
    COUNT(user_id) count_user
  FROM activities
  WHERE month <= 9
  GROUP BY 1, 2
),
retention AS (
  SELECT
    cohort_month,
    count_user count_user_permonth
  FROM cohort_size
  WHERE month = 0
)
SELECT
  A1.cohort_month,
  A1.month,
  A2.count_user_permonth,
  A1.count_user,
  A1.count_user / A2.count_user_permonth AS percentage
FROM cohort_size A1
LEFT JOIN retention A2
  ON A1.cohort_month = A2.cohort_month
WHERE A1.cohort_month >= '2022-01-01'
ORDER BY 1, 2;
Result and Schema Preview
Cohort Analysis

Insight:
● The total number of new users increased from 2953 in January 2022 to 4953 at the
beginning of October 2022.
● In September 2022, the retention rate decreased to 7.57% from the previous month, because
the October 2022 data is incomplete.
● August 2022 had the highest first-month retention rate, at 18.60%.
● Of all the new users in this time range (41787 users), 9.66% were retained in month 1,
which means about 90% of customers did not place an order in the month after their first.
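The cohort table behind these figures can be sketched in Python as follows: each user's first order month defines the cohort, and every later order is placed at a month offset from it. The order rows and the names `month_index` and `retention` are hypothetical. The sketch counts distinct users per cell, which is an assumption and differs from a plain row count when a user orders more than once in a month.

```python
from collections import defaultdict

# Hypothetical orders: (user_id, order month as "YYYY-MM").
orders = [
    (1, "2022-01"), (1, "2022-02"),
    (2, "2022-01"), (2, "2022-01"),  # two orders in the same month
    (3, "2022-02"), (3, "2022-04"),
]


def month_index(year_month):
    """Convert "YYYY-MM" to a running month number for subtraction."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month)


def retention(orders):
    """Return {(cohort_month, month_offset): share of the cohort active}."""
    first = {}
    for user, ym in orders:
        if user not in first or ym < first[user]:
            first[user] = ym  # earliest order month defines the cohort
    active = defaultdict(set)
    for user, ym in orders:
        offset = month_index(ym) - month_index(first[user])
        active[(first[user], offset)].add(user)
    # Cohort size is the number of distinct users at offset 0.
    size = {c: len(u) for (c, off), u in active.items() if off == 0}
    return {(c, off): len(u) / size[c] for (c, off), u in active.items()}


rates = retention(orders)
print(rates)
```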
Recommendation

- TheLook should increase user engagement and retention by sending users
notifications about products they have searched for, with a statement that
purchasing the product now will earn cashback or points.
- Increase sales of products in the Active category with clearance sales
during high seasons such as summer holidays or payday sales.
- Another way to increase Active category sales is to bundle those products
with the best-selling products.
