Assignment 5

--A. Create a YOY analysis for the count of customers enrolled with the company
--each month. The output should look like:

SELECT [Month], [2020] AS Year_2020, [2021] AS Year_2021
FROM (
    SELECT CustomerID, MONTH(DateEntered) AS [Month], YEAR(DateEntered) AS yr
    FROM Customers
) AS source_table
-- one row per month, one enrolment-count column per year
PIVOT (COUNT(CustomerID) FOR yr IN ([2020], [2021])) AS pivot_table
ORDER BY [Month];

--B. Find out the top 3 best-selling products in each of the categories that are
--currently active on the Website

SELECT *
FROM (
    SELECT P.ProductID,
           C.CategoryID,
           SUM(O.Quantity) AS sales,
           -- rank products within their own category, best seller first
           ROW_NUMBER() OVER (PARTITION BY C.CategoryID
                              ORDER BY SUM(O.Quantity) DESC) AS ranking
    FROM Products P
    INNER JOIN OrderDetails O ON O.ProductID = P.ProductID
    INNER JOIN Category C ON C.CategoryID = P.Category_ID
    WHERE active = 1
    GROUP BY C.CategoryID, P.ProductID
) t
WHERE ranking <= 3
ORDER BY CategoryID, ranking;

--C. Find out the least-selling products in each of the categories that are
--currently active on the website

SELECT *
FROM (
    SELECT P.ProductID,
           C.CategoryID,
           SUM(O.Quantity) AS sales,
           -- rank ascending so the lowest seller in each category gets 1
           ROW_NUMBER() OVER (PARTITION BY C.CategoryID
                              ORDER BY SUM(O.Quantity)) AS ranking
    -- products with no order lines are excluded by the INNER JOIN
    FROM Products P
    INNER JOIN OrderDetails O ON O.ProductID = P.ProductID
    INNER JOIN Category C ON C.CategoryID = P.Category_ID
    WHERE active = 1
    GROUP BY C.CategoryID, P.ProductID
) t
WHERE ranking = 1   -- keep only the lowest seller per category
ORDER BY CategoryID DESC;

--D. We are trying to find paired products that are often purchased together by
--the same user, such as chips and soft drinks, milk and curd, etc.
--Find the top paired product names.

-- returns product IDs; join Products on each ID to display names
SELECT c.original_ID, c.bought_with, COUNT(*) AS times_bought_together
FROM (
    -- a.ProductID < b.ProductID keeps each pair once instead of as (x, y) and (y, x)
    SELECT a.ProductID AS original_ID, b.ProductID AS bought_with
    FROM OrderDetails a
    INNER JOIN OrderDetails b
        ON a.OrderID = b.OrderID AND a.ProductID < b.ProductID
) c
GROUP BY c.original_ID, c.bought_with
HAVING COUNT(*) > 1
ORDER BY times_bought_together DESC;

--E. We want to understand the impact of running a campaign during
--July’21-Oct’21. What was the total sales generated for the categories
--“Beauty & Hygiene” and “Beverages” by
--b. customers who enrolled with the company during the same period

select count(B.OrderID) as total_sales, C.CategoryName
from Orders A
join OrderDetails B on A.OrderID = B.OrderID
join Products P on P.ProductID = B.ProductID
join Category C on C.CategoryID = P.Category_ID
join Customers D on D.CustomerID = A.CustomerID
-- IN avoids the AND/OR precedence trap of "x or y and z"
where C.CategoryName in ('Beauty & Hygiene', 'Beverages')
-- half-open ranges cover all of July through October 2021
and A.OrderDate >= '2021-07-01' and A.OrderDate < '2021-11-01'
and D.DateEntered >= '2021-07-01' and D.DateEntered < '2021-11-01'
group by C.CategoryName;

--F. Create a Quarter-wise ranking in terms of revenue generated in each
--category in Year 2020

-- note: total_order_amount is an order-level value, so it is counted once per order line
select qtr, categoryid, sum(total_order_amount) as RevenueGen,
-- rank categories by revenue within each calendar quarter
rank() over (partition by qtr order by sum(total_order_amount) desc) as QuarterRanking
from (
select c.categoryid, datepart(quarter, o.orderDate) as qtr, o.total_order_amount
from orders as o
join orderdetails as o1 on o1.orderid = o.orderid
join Products as p on o1.ProductID = p.ProductID
join Category as c on c.CategoryID = p.Category_ID
where year(o.orderDate) = 2020
) n
group by qtr, categoryid
order by qtr, QuarterRanking;
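
--G. Find the top 3 Shipper companies in terms of
--a. Average delivery time for each category for the latest year

select *
from (
select S.CompanyName, P.Category_ID,
-- * 1.0 avoids integer truncation of the average
avg(datediff(day, OrderDate, DeliveryDate) * 1.0) as avg_delivery_time,
-- fastest shipper in each category gets rank 1
row_number() over (partition by P.Category_ID
order by avg(datediff(day, OrderDate, DeliveryDate) * 1.0)) as ranking
from Orders O
join OrderDetails D on D.OrderID = O.OrderID
join Shippers S on S.ShipperID = O.ShipperID
join Products P on P.ProductID = D.ProductID
-- latest year present in the data rather than a hard-coded value
where year(OrderDate) = (select max(year(OrderDate)) from Orders)
group by S.CompanyName, P.Category_ID
) t
where ranking <= 3
order by Category_ID, ranking;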

--H. Find the top 25 customers in terms of
--a. Total no. of orders placed for Year 2021

select top(25) CustomerID, count(*) as total_orders
from Orders
where year(OrderDate) = 2021
group by CustomerID
order by total_orders desc;

--H. Find the top 25 customers in terms of
--b. Total Purchase Amount for the Year 2021

select top(25) CustomerID, sum(Total_order_amount) as total_purchase_amount
from Orders
where year(OrderDate) = 2021
group by CustomerID
-- order by the purchase amount, not by the number of orders
order by total_purchase_amount desc;
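
--J. Find the cumulative average order amount at a monthly level for year 2021
--b. Each customer

select CustomerID, month(OrderDate) as order_month,
-- running total of order amounts divided by running count of orders, per customer
sum(sum(Total_order_amount)) over (partition by CustomerID order by month(OrderDate)
rows unbounded preceding) * 1.0
/ sum(count(*)) over (partition by CustomerID order by month(OrderDate)
rows unbounded preceding) as cumulative_avg_amt
from Orders
where year(OrderDate) = 2021
group by CustomerID, month(OrderDate)
order by CustomerID, order_month;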

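--K. Find the 3-day rolling average for the total purchase amount by each
--customer

-- the window covers the current and two preceding purchase days of the same
-- customer; days without orders are not filled in
SELECT CustomerID, CAST(OrderDate AS date) AS order_day,
SUM(Total_order_amount) AS daily_amount,
AVG(SUM(Total_order_amount)) OVER (PARTITION BY CustomerID
ORDER BY CAST(OrderDate AS date)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAverageThreeDay
FROM Orders
GROUP BY CustomerID, CAST(OrderDate AS date)
ORDER BY CustomerID, order_day;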

Q2. What is the difference between Group by and Partition by?

A GROUP BY normally reduces the number of rows returned by rolling them up and calculating
averages or sums for each group. PARTITION BY does not affect the number of rows returned, but it
changes how a window function's result is calculated.
GROUP BY
The GROUP BY clause is used in SQL queries to define groups based on some given
criteria. These criteria are what we usually find as categories in reports. Examples
of criteria for grouping are:

• group all employees by their annual salary level
• group all trains by their first station
• group incomes and expenses by month
• group students according to the class in which they are enrolled

PARTITION BY
Depending on what you need to do, you can use PARTITION BY in your queries to
calculate aggregated values on the defined groups. PARTITION BY is combined
with OVER() and window functions to calculate aggregated values. This is very
similar to GROUP BY and aggregate functions, but with one important difference:
when you use PARTITION BY, the row-level details are preserved and not
collapsed. That is, you still have the original row-level details as well as the
aggregated values at your disposal. Most aggregate functions can be used as window
functions.
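
The contrast can be seen on a tiny data set. The following is a minimal sketch in T-SQL; the table variable @Sales and its columns are hypothetical and exist only for illustration.

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Sales TABLE (Category VARCHAR(20), Amount INT);
INSERT INTO @Sales VALUES ('Snacks', 10), ('Snacks', 30), ('Drinks', 20);

-- GROUP BY collapses the three rows into one row per category
-- (two rows: Drinks = 20, Snacks = 40).
SELECT Category, SUM(Amount) AS CategoryTotal
FROM @Sales
GROUP BY Category;

-- PARTITION BY keeps all three rows and repeats the category total on each.
SELECT Category, Amount,
       SUM(Amount) OVER (PARTITION BY Category) AS CategoryTotal
FROM @Sales;
```

The second query is the form to reach for when each detail row must be compared with its group total, for example to compute a share of category sales.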

Q3. What is the difference between a Temp Table and a View?

A view's definition is stored in the database, but its result set exists only for a single query: each
time you reference the view, its rows are rebuilt from the current data. A temporary table exists for
the entire database session in which it was created.

In a relational database like SQL Server, views give users a way to work with specific portions
of a larger schema. Temporary tables are another means of providing end users with a subset or
collation of data from base tables, and at first glance the two seem to be similar methods of
achieving the same end. So it is important to understand the difference between a view and a
temp table.

A temp table is a table that is not part of the permanent schema (SQL Server keeps it in tempdb).
It exists only while the database session remains active, and it must be populated in each session,
for example with SQL INSERT commands. A view, similarly, does not store data; it stores a query
that retrieves data. However, a view's rows are produced afresh each time it is queried, so it always
reflects current data. In contrast, a temp table exists for the entire database session, and once
populated it retains those records until the session ends or the table is dropped.
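
A short T-SQL sketch of this behaviour follows. DemoOrders, BigOrders and #BigOrders are hypothetical names used only for this illustration, and GO is the batch separator understood by tools such as SSMS and sqlcmd (CREATE VIEW has to be the first statement in its batch).

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE DemoOrders (OrderID INT, Amount INT);
INSERT INTO DemoOrders VALUES (1, 100), (2, 250);
GO

-- The view stores only the query text, not the rows.
CREATE VIEW BigOrders AS
SELECT OrderID, Amount FROM DemoOrders WHERE Amount > 200;
GO

-- The temp table stores a copy of the rows as they are right now.
SELECT OrderID, Amount INTO #BigOrders FROM DemoOrders WHERE Amount > 200;

-- New data arrives after both objects were created.
INSERT INTO DemoOrders VALUES (3, 500);

SELECT COUNT(*) AS ViewRows FROM BigOrders;   -- 2: the view re-runs its query
SELECT COUNT(*) AS TempRows FROM #BigOrders;  -- 1: the temp table is a snapshot

-- Clean up the demo objects.
DROP TABLE #BigOrders;
DROP VIEW BigOrders;
DROP TABLE DemoOrders;
```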

Q4. What is the difference between View and CTE?

Views are physical objects in the database (although they do not store data physically) and can be used in
multiple queries, so they provide flexibility and a centralized approach. CTEs, on the other hand, are
temporary and are created when they are used; that is why they are sometimes called inline views.

CTE:

CTE stands for Common Table Expression. It can be thought of as a temporary result set that is defined within the
execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is like a derived
table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE
can be self-referencing and can be referenced multiple times in the same query. CTEs improve the readability and
ease of maintenance of complex queries and sub-queries.

View:

A view is a virtual table that doesn’t physically store any data; it consists of columns from one or more tables.
Whenever we query a view, it retrieves data from the underlying base tables. It is a query stored as an
object. Views are also used for security purposes: by granting access to a view rather than to the base tables, we
can restrict which rows and columns a specific user can see. A view displays only the data returned by the query
that was defined when the view was created.
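
As a minimal T-SQL sketch of the CTE side of this comparison, the query below defines a CTE and references it twice within one statement. The table variable @Sales and the CTE name CategoryTotals are hypothetical.

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Sales TABLE (Category VARCHAR(20), Amount INT);
INSERT INTO @Sales VALUES ('Snacks', 10), ('Snacks', 30), ('Drinks', 20);

-- The CTE exists only for the single SELECT that follows it.
WITH CategoryTotals AS (
    SELECT Category, SUM(Amount) AS Total
    FROM @Sales
    GROUP BY Category
)
SELECT t.Category, t.Total
FROM CategoryTotals t
-- second reference to the same CTE inside the same statement
WHERE t.Total = (SELECT MAX(Total) FROM CategoryTotals);
-- returns one row: Snacks, 40
```

A later statement in the same batch could not refer to CategoryTotals. The same SELECT saved with CREATE VIEW would instead remain available to every subsequent query until the view is dropped.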

Q5. What is the difference between Row Number, Rank and Dense Rank?

RANK Function
The RANK function assigns a rank to each row based on the ORDER BY clause
inside OVER(). Rows that tie receive the same rank, and the ranks that
follow a tie are skipped. For example, if you want to find the name of the
car with the third-highest power, you can use the RANK function.

DENSE_RANK Function
The DENSE_RANK function is similar to the RANK function; however, the
DENSE_RANK function does not skip any ranks if there is a tie between the
ranks of the preceding records.

ROW_NUMBER Function
Unlike the RANK and DENSE_RANK functions, the ROW_NUMBER function
simply returns the row number of the sorted records, starting with 1. For
example, if the first two records have equal values in the ORDER BY
column, both of them are assigned 1 as their RANK and DENSE_RANK.
However, the ROW_NUMBER function will assign values 1 and 2 to those
rows, even though they are equal.
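
The three functions can be compared side by side. This is a small T-SQL sketch; the table variable @Cars, its columns and its values are made up for illustration.

```sql
-- Hypothetical sample data; cars A and B tie on HorsePower.
DECLARE @Cars TABLE (CarName VARCHAR(20), HorsePower INT);
INSERT INTO @Cars VALUES ('A', 300), ('B', 300), ('C', 250), ('D', 200);

SELECT CarName, HorsePower,
       ROW_NUMBER() OVER (ORDER BY HorsePower DESC) AS RowNum,    -- 1, 2, 3, 4
       RANK()       OVER (ORDER BY HorsePower DESC) AS Rnk,       -- 1, 1, 3, 4
       DENSE_RANK() OVER (ORDER BY HorsePower DESC) AS DenseRnk   -- 1, 1, 2, 3
FROM @Cars
ORDER BY HorsePower DESC;
```

Which of the tied cars A and B receives row number 1 is not deterministic, because the ORDER BY inside OVER() does not distinguish them.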

Q6. Consider the single-column table below.

COL1
1
1.2
1.5
2.2
NA
NAN
NULL

Answer the following questions:

1. What is the possible data type of the column ‘COL1’?


2. What will the output of the following SQL statements

a. ‘SELECT COUNT (*) AS ENTRIES FROM TABLE;’

b. ‘SELECT COUNT(COL1) AS ENTRIES FROM TABLE;’

c. ‘SELECT COUNT (DISTINCT COL1) AS ENTRIES FROM TABLE;’
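
One way to explore these questions is to load the values into a table variable and run the three counts. The sketch below assumes that COL1 is a character column, that NA and NAN are stored as ordinary strings, and that NULL is a genuine SQL NULL; the source does not state any of this, and the table variable @T is hypothetical.

```sql
-- Assumption: text column, 'NA' and 'NAN' are strings, NULL is a real NULL.
DECLARE @T TABLE (COL1 VARCHAR(10));
INSERT INTO @T VALUES ('1'), ('1.2'), ('1.5'), ('2.2'), ('NA'), ('NAN'), (NULL);

SELECT COUNT(*) AS ENTRIES FROM @T;              -- 7: counts every row
SELECT COUNT(COL1) AS ENTRIES FROM @T;           -- 6: the NULL is skipped
SELECT COUNT(DISTINCT COL1) AS ENTRIES FROM @T;  -- 6: six different non-NULL values
```

Under these assumptions a character type such as VARCHAR is the natural fit for the column, since a numeric type could not hold the strings NA and NAN.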
