Target SQL

Uploaded by

monthuno netflix
Copyright
© All Rights Reserved

-- 1. Import the dataset and do usual exploratory analysis steps like checking
the structure & characteristics of the dataset:
-- 1. Data type of all columns in the "customers" table. (done)

select column_name,data_type
from `dsml-march-384117.Target_SQL.INFORMATION_SCHEMA.COLUMNS`
where table_name='customers'

-- 2. Get the time range between which the orders were placed. (done)
select max(order_purchase_timestamp) as latest_date,min(order_purchase_timestamp)
as earliest_date
from `Target_SQL.Orders`;
INSIGHT: The first order was placed on 2016-09-04 and the latest order was placed
on 2018-10-17.

-- 3. Count the number of Cities and States in our dataset. (done)
select count(distinct(customer_city)) as number_of_cities
,count(distinct(customer_state)) as number_of_states from `Target_SQL.customers`

Insight: There are 4119 different cities across 27 different states.
-- 2. IN DEPTH EXPLORATION (done)
-- 1. Is there a growing trend in the no. of orders placed over the past years? (done)
select extract(year from order_purchase_timestamp) as year_of_order
,count(distinct(order_id)) as number_of_orders from `Target_SQL.Orders`
group by extract( year from order_purchase_timestamp);

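Insight: The number of orders grows from year to year. Note that 2016 and 2018 are only
partially covered (orders run from 2016-09-04 to 2018-10-17), so the yearly totals are not
directly comparable.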

-- 2. Can we see some kind of monthly seasonality in terms of the no. of orders being placed? (done)
select extract(year from order_purchase_timestamp) as year_of_order,extract(month
from order_purchase_timestamp) as month_of_order ,count(distinct(order_id)) as
number_of_orders from `Target_SQL.Orders`
group by extract( year from order_purchase_timestamp),extract(month from
order_purchase_timestamp);
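
The query above returns one row per year and month, which shows the trend over time but makes it hard to compare the same calendar month across years. A minimal sketch of a variant that pools all years by calendar month, assuming the same `Target_SQL.Orders` table; the column name years_covered is introduced here for illustration and is not part of the original assignment:

```sql
-- Sketch: pool all years to compare calendar months directly.
-- years_covered shows how many distinct years contribute to each month,
-- because the data does not span whole years at either end.
select extract(month from order_purchase_timestamp) as month_of_order,
       count(distinct order_id) as number_of_orders,
       count(distinct extract(year from order_purchase_timestamp)) as years_covered
from `Target_SQL.Orders`
group by month_of_order
order by month_of_order;
```

Months with a lower years_covered value will tend to have fewer orders simply because fewer years contribute to them, so they should be compared with care.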
-- 3.What time do Brazilian customers tend to buy
(Dawn, Morning, Afternoon or Night)? (done)
with cte as (
select *,extract(hour from order_purchase_timestamp) as hour_ordered,
case
when extract(hour from order_purchase_timestamp) between 0 and 6 then 'Dawn'
when extract(hour from order_purchase_timestamp) between 7 and 12 then 'Morning'
when extract(hour from order_purchase_timestamp) between 13 and 18 then 'Afternoon'
when extract(hour from order_purchase_timestamp) between 19 and 23 then 'Night'
end as time_of_day
from `Target_SQL.Orders`
) select time_of_day,count(order_id) as number_of_purchases from cte group by
time_of_day order by count(order_id);
Insights: Customers buy the most in the afternoon, and mornings and nights are close to each
other after that. A plausible explanation is that customers have more free time after lunch
and in the evening.

-- 3. Evolution of E-commerce orders in the Brazil region: (done)

-- 1. Get month on month orders by states (done)


select customer_state,Extract(year from order_purchase_timestamp) as
year_of_order,Extract(month from order_purchase_timestamp) as month_of_order
,count(distinct(t2.order_id)) as number_of_orders from `Target_SQL.customers` t1
join `Target_SQL.Orders` t2 on t1.customer_id=t2.customer_id
group by customer_state,Extract(month from order_purchase_timestamp),Extract(year
from order_purchase_timestamp);

-- 2. Distribution of customers across the states in Brazil (done)
select customer_state,count(distinct(customer_id)) as number_of_customers from
`Target_SQL.customers` group by customer_state order by number_of_customers desc;
Insights: The query output ranks the states by number of customers, with the top states
first. We could run an ad campaign to increase sales, and offer discounts in the other
states to help increase sales there.

-- 4. Impact on Economy: Analyze the money movement by e-commerce by looking at
order prices, freight and others. (done)
-- 1.Get the % increase in the cost of orders from year 2017
to 2018 (include months between Jan to Aug only).You can use
the "payment_value" column in the payments table to get the
cost of orders. (done)

with cte as (
select extract(year from order_purchase_timestamp) as year,sum(t2.payment_value) as
total_payment_value from `Target_SQL.Orders` t1 join `Target_SQL.payments` t2 on
t1.order_id=t2.order_id
where extract(year from order_purchase_timestamp) >= 2017 and extract(month from
order_purchase_timestamp)<=8
group by extract(year from order_purchase_timestamp)),
cte2 as (
select max(case when year=2018 then total_payment_value end) as sum_2018, max(case
when year=2017 then total_payment_value end) as sum_2017
from cte)
select round(100*(sum_2018-sum_2017)/(sum_2017),2) as percent_increase from cte2;

INSIGHT: Revenue for January to August increased by around 136% from 2017 to 2018.

-- 2. Calculate the Total & Average value of order price for each state. (done)
select customer_state,sum(payment_value) as total,avg(payment_value) as
avg_order_price from `Target_SQL.payments` t1 join `Target_SQL.Orders` t2 on
t2.order_id=t1.order_id
join `Target_SQL.customers` t3 on t3.customer_id=t2.customer_id
group by customer_state;
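
Note: avg(payment_value) in the query above averages over payment rows, not over orders. If an order can have more than one row in the payments table (an assumption about the data, not something stated in this document), the per-order average can be obtained by summing the payments of each order first. A minimal sketch under that assumption; order_totals and order_value are names introduced here for illustration:

```sql
with order_totals as (
  -- one row per order: add up every payment row that belongs to the order
  select order_id, sum(payment_value) as order_value
  from `Target_SQL.payments`
  group by order_id
)
select t3.customer_state,
       sum(t1.order_value) as total,            -- same total as summing payment rows
       avg(t1.order_value) as avg_order_price   -- average per order, not per payment row
from order_totals t1
join `Target_SQL.Orders` t2 on t2.order_id = t1.order_id
join `Target_SQL.customers` t3 on t3.customer_id = t2.customer_id
group by t3.customer_state
order by t3.customer_state;
```

The same consideration applies to the freight query, where avg(freight_value) averages over order items rather than over orders.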
-- 3.Calculate the Total & Average value of order
freight for each state. (done)
select customer_state,sum(freight_value) as total_freight,avg(freight_value) as
average_freight from `Target_SQL.order_items` t1 join `Target_SQL.Orders` t2 on
t1.order_id=t2.order_id
join `Target_SQL.customers` t3 on t3.customer_id=t2.customer_id
group by customer_state

-- 5. Analysis based on sales, freight and delivery time (done)
-- 1. Find the no. of days taken to deliver each order
from the order’s purchase date as delivery time.
-- Also, calculate the difference (in days) between
the estimated & actual delivery date of an order.
-- Do this in a single query.

-- You can calculate the delivery time and the difference between the estimated & actual
delivery date using the given formula:
-- time_to_deliver = order_delivered_customer_date - order_purchase_timestamp
-- diff_estimated_delivery = order_estimated_delivery_date - order_delivered_customer_date
(done)
select order_id,order_delivered_customer_date,order_purchase_timestamp,
order_estimated_delivery_date,
date_diff(order_delivered_customer_date,order_purchase_timestamp,day) as time_to_delivery_in_days,
-- estimated minus actual, as in the given formula: positive means delivered early
date_diff(order_estimated_delivery_date,order_delivered_customer_date,day) as diff_estimated_delivery_days
from `Target_SQL.Orders`;

-- 2.Find out the top 5 states with the highest & lowest
average freight value. (done )
with cte as (
select seller_state,avg(freight_value) as average_freight from
`Target_SQL.order_items` t1 join `Target_SQL.Orders` t2 on t1.order_id=t2.order_id
join `Target_SQL.sellers` t3 on t3.seller_id=t1.seller_id
group by seller_state)
select * from cte order by average_freight desc limit 5;

Highest average freight

Insights: The five seller states with the highest average freight value are returned by
the query above.

Lowest average freight

with cte as (
select seller_state,avg(freight_value) as average_freight from
`Target_SQL.order_items` t1 join `Target_SQL.Orders` t2 on t1.order_id=t2.order_id
join `Target_SQL.sellers` t3 on t3.seller_id=t1.seller_id
group by seller_state)
select * from cte order by average_freight limit 5

Insights: The five seller states with the lowest average freight value are returned by
the query above.
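
The two freight queries share the same aggregation and differ only in sort order, so both lists can be produced in one pass. A minimal sketch using window functions, assuming the same tables; the names ranked, rank_highest, rank_lowest and bucket are introduced here for illustration:

```sql
with cte as (
  select t3.seller_state, avg(t1.freight_value) as average_freight
  from `Target_SQL.order_items` t1
  join `Target_SQL.Orders` t2 on t1.order_id = t2.order_id
  join `Target_SQL.sellers` t3 on t3.seller_id = t1.seller_id
  group by t3.seller_state
),
ranked as (
  -- number every state from both ends of the ordering
  select *,
         row_number() over (order by average_freight desc) as rank_highest,
         row_number() over (order by average_freight asc) as rank_lowest
  from cte
)
select seller_state, average_freight,
       case when rank_highest <= 5 then 'highest' else 'lowest' end as bucket
from ranked
where rank_highest <= 5 or rank_lowest <= 5
order by average_freight desc;
```

If there were fewer than ten states, a state could qualify for both lists; in this sketch it would be labelled 'highest'. Ties are broken arbitrarily by row_number().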

--3.Find out the top 5 states with the highest & lowest
average delivery time. (done)

(highest 5 states)
with cte as (
select
customer_state,avg(date_diff(order_delivered_customer_date,order_purchase_timestamp
,day)) as avg_time_to_delivery_in_days
from `Target_SQL.Orders` t1 join `Target_SQL.customers` t2 on
t1.customer_id=t2.customer_id
group by customer_state )
select * from cte order by cte.avg_time_to_delivery_in_days desc limit 5;

(lowest 5 states)

with cte as (
select
customer_state,avg(date_diff(order_delivered_customer_date,order_purchase_timestamp
,day)) as avg_time_to_delivery_in_days
from `Target_SQL.Orders` t1 join `Target_SQL.customers` t2 on
t1.customer_id=t2.customer_id
group by customer_state )
select * from cte order by cte.avg_time_to_delivery_in_days limit 5;
-- 4.Find out the top 5 states where the order delivery is
really fast as compared to the estimated date of delivery.
(done)

with cte as (
select customer_state,
avg(date_diff(order_estimated_delivery_date,order_delivered_customer_date,day)) as day_diff
from `Target_SQL.Orders` t1 join `Target_SQL.customers` t2
on t1.customer_id=t2.customer_id
group by customer_state)
select * from cte order by day_diff desc limit 5

Insights: These are the states in which orders are delivered fastest relative to the
estimated delivery date.
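
An average can hide how often deliveries actually miss the estimate. As a complement, the share of late deliveries per state can be computed directly. A minimal sketch, assuming the same tables and considering only orders that have a delivery date; the column names delivered_orders, late_orders and late_percent are introduced here for illustration:

```sql
select t2.customer_state,
       count(*) as delivered_orders,
       -- an order is late when it arrived after the estimated delivery date
       countif(t1.order_delivered_customer_date > t1.order_estimated_delivery_date) as late_orders,
       round(100 * countif(t1.order_delivered_customer_date > t1.order_estimated_delivery_date)
             / count(*), 2) as late_percent
from `Target_SQL.Orders` t1
join `Target_SQL.customers` t2 on t1.customer_id = t2.customer_id
where t1.order_delivered_customer_date is not null
group by t2.customer_state
order by late_percent;
```

States with few delivered orders will have a less reliable late_percent, which is why the count is returned alongside it.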
-- 6. Analysis based on the payments: (done)
-- 1.Find the month on month no. of orders placed
using different payment types. (done)
select extract(year from order_purchase_timestamp
) as year_of_purchase,extract(month from
order_purchase_timestamp) as
month_of_purchase,payment_type,count(t1.order_id) as
number_of_orders
from `Target_SQL.payments` t1 join `Target_SQL.Orders` t2 on
t1.order_id=t2.order_id
group by payment_type,extract(year from
order_purchase_timestamp),extract(month from
order_purchase_timestamp);

Insights: Credit card is the payment type with the most orders, which suggests that
customers prefer to pay by credit card.
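
Note: count(t1.order_id) in the query above counts payment rows. If a single order can appear in the payments table more than once (an assumption about the data, not something stated in this document), counting distinct order ids gives the number of orders instead. A minimal sketch that returns both figures side by side; number_of_payment_rows is a name introduced here for illustration:

```sql
select extract(year from t2.order_purchase_timestamp) as year_of_purchase,
       extract(month from t2.order_purchase_timestamp) as month_of_purchase,
       t1.payment_type,
       count(distinct t1.order_id) as number_of_orders,   -- each order counted once per payment type
       count(*) as number_of_payment_rows                 -- what the original query counts
from `Target_SQL.payments` t1
join `Target_SQL.Orders` t2 on t1.order_id = t2.order_id
group by year_of_purchase, month_of_purchase, t1.payment_type
order by year_of_purchase, month_of_purchase, number_of_orders desc;
```

Where the two columns differ, some orders were paid with several payment rows of the same type.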
-- 2.Find the no. of orders placed on the basis of
the payment installments that have been paid. (done)

select payment_installments,count(t1.order_id) as number_of_orders
from `Target_SQL.payments` t1 join `Target_SQL.Orders` t2 on
t1.order_id=t2.order_id
group by payment_installments;

INSIGHTS:

Orders paid in a single installment (payment_installments = 1) outnumber the orders for any
other number of installments.

OVERALL INSIGHTS:

• Customer behaviour: Customers buy the most in the afternoon, between 13:00 and 19:00. We can
introduce discounts, offers and deals during this period to increase sales further.
• Operational efficiency: States AC, RO, AP, MP and RR deliver orders well ahead of the
estimated delivery date.
• Performance metrics: Sales for the months January to August increased by around 136% from
2017 to 2018.
• Payment type: The majority of payments use credit card as the payment method, so we
can offer discounts on credit card payments to increase sales further.
• Delivery time: States like RR, AP, AM, AL and PL have the highest average delivery time. This
could be because of operational inefficiency or a lack of proper infrastructure. We should
invest in these states.
