Problem Statement
• Introduction: You are provided with a sample dataset from a retail store,
Super_Store. This dataset contains information about orders, customers, products,
and sales. Your task involves cleaning the data, analyzing sales, customer orders,
customer geography, and order processing time using Informatica PowerCenter.
Data Preparation:
• Username: system
• Password: Admin
Row_ID INT
Order_Date DATE
Ship_Date DATE
Ship_Mode VARCHAR(50)
Customer_ID VARCHAR(50)
Customer_Name VARCHAR(50)
Segment VARCHAR(50)
Country VARCHAR(50)
City VARCHAR(50)
State VARCHAR(50)
Postal_Code VARCHAR(50)
Region VARCHAR(50)
Product_ID VARCHAR(250)
Category VARCHAR(250)
Sub_Category VARCHAR(250)
Product_Name VARCHAR(250)
Sales INT
• NOTE: While loading data into the table, convert the Order_Date and Ship_Date
values to the DD/MM/YYYY date format.
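The date handling in the note above can be checked outside the database with a short Python sketch. This is only an illustration and not part of the required PowerCenter solution; the helper names parse_ddmmyyyy and format_ddmmyyyy are hypothetical, and the sketch assumes the dates arrive as text.

```python
from datetime import date, datetime

DATE_FORMAT = "%d/%m/%Y"  # DD/MM/YYYY, as required for Order_Date and Ship_Date


def parse_ddmmyyyy(text: str) -> date:
    """Parse a DD/MM/YYYY string; raises ValueError if the text does not match."""
    return datetime.strptime(text.strip(), DATE_FORMAT).date()


def format_ddmmyyyy(value: date) -> str:
    """Render a date back into DD/MM/YYYY text."""
    return value.strftime(DATE_FORMAT)
```

A value such as "31/12/2017" parses cleanly, while a month-first value such as "12/31/2017" raises ValueError, which makes wrongly formatted rows easy to spot before loading.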
• Username: Administrator
• Password: Administrator
The following are the steps to import the source table in the Informatica Source Analyzer:
Data Cleaning:
Operations:
• After cleaning, load the data into the Super_Store_Cleaned_Data target table (For
columns check sample output)
Username : Administrator
Password : Administrator
• Step 1 – Double-click the session object in Workflow Manager. It will open
a task window to modify the task properties.
CUSTOMER_ID_NAME
21925-Zushuss Donatelli
16585-Ken Black
21520-Tracy Blumstein
NOTE: The data in the Super_Store_Cleaned_Data table is used for all of the tasks below.
Analysis Tasks:
Problem Statement: Summarize total sales and average sales for each customer. Identify
customers with significant contribution to overall sales.
Operations:
• Summarize the sales data for each customer, grouped by 'Customer_ID_Name':
store the sum of sales in a Total_Sales column and the average of sales in an
Avg_Sales column.
• Order the summarized data in descending order based on the total sales
('Total_Sales').
• Filter customers with total sales greater than 3000 and average sales greater than
300 to focus on significant contributors.
• Load data into the Sales_Summary target table (For columns check sample
output)
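To cross-check the rows loaded into Sales_Summary, the operations above can be sketched in Python. This is a minimal sketch, not the PowerCenter mapping itself; the function name summarize_sales and the list-of-dicts row shape are assumptions made for illustration. Filtering before sorting gives the same result as sorting first.

```python
from collections import defaultdict


def summarize_sales(rows, min_total=3000, min_avg=300):
    """Return per-customer Total_Sales and Avg_Sales for significant contributors.

    rows: iterable of dicts with 'Customer_ID_Name' and 'Sales' keys (assumed shape).
    """
    sales_by_customer = defaultdict(list)
    for row in rows:
        sales_by_customer[row["Customer_ID_Name"]].append(row["Sales"])

    summary = []
    for customer, sales in sales_by_customer.items():
        total = sum(sales)
        average = total / len(sales)
        # Keep only customers above both thresholds from the task description.
        if total > min_total and average > min_avg:
            summary.append(
                {"Customer_ID_Name": customer, "Total_Sales": total, "Avg_Sales": average}
            )

    # Descending order on Total_Sales.
    summary.sort(key=lambda item: item["Total_Sales"], reverse=True)
    return summary
```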
Sample Output:
Problem Statement: Analyze customer orders to determine the most frequent buyers
and their order patterns.
Operations:
• Filter records where Category is 'Office Supplies' and City is 'San Francisco'
to analyze local customer behavior.
• Create a new column, orders_count, containing the count of orders for each
customer, to determine their order frequency.
• Sort the results by order count in descending order to identify the most frequent
buyers, and keep only the top 10 records.
• Load data into the Order_Analysis target table (For columns check sample output)
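The filter, count, sort, and top-10 steps above can be sketched in Python for comparison with the Order_Analysis output. This is a sketch under assumptions: the function name top_buyers is hypothetical, and since the table has no order identifier column, each row is counted as one order.

```python
from collections import Counter


def top_buyers(rows, category="Office Supplies", city="San Francisco", limit=10):
    """Return (Customer_ID_Name, orders_count) pairs, most frequent buyers first.

    rows: iterable of dicts with 'Customer_ID_Name', 'Category' and 'City' keys
    (assumed shape). Each matching row counts as one order.
    """
    counts = Counter(
        row["Customer_ID_Name"]
        for row in rows
        if row["Category"] == category and row["City"] == city
    )
    # most_common sorts by count in descending order and truncates to `limit`.
    return counts.most_common(limit)
```

Customers with equal counts keep the order in which they were first seen, so the tie order in PowerCenter may differ from this sketch.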
CUSTOMER_ID_NAME        ORDERS_COUNT
15520-Jeremy Pistek     4
16510-Keith Herrera     4
Operations:
• Group customers by region (North, South, East, West) based on their location data.
• Load data into the Geography_Analysis target table (For columns check sample
output)
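One possible reading of the grouping step above is a count of distinct customers per region, sketched below in Python. The output columns are an assumption because they are not listed here; the function name customers_per_region is hypothetical.

```python
from collections import defaultdict


def customers_per_region(rows):
    """Return {Region: number of distinct customers} (assumed output shape).

    rows: iterable of dicts with 'Region' and 'Customer_ID_Name' keys.
    """
    customers = defaultdict(set)
    for row in rows:
        # A set keeps each customer once per region, however many orders they placed.
        customers[row["Region"]].add(row["Customer_ID_Name"])
    return {region: len(names) for region, names in customers.items()}
```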
Problem Statement: Evaluate order processing efficiency by analyzing the time taken
between order placement and shipment.
Operations:
• Calculate the processing days for each order by subtracting the order date from
the ship date, and store the result in a new column, Processing_days.
• Categorize the processing days (e.g., less than 1 day: Immediate delivery; 1 to 3
days: Moderate Delivery; 3 or more days: Long term delivery).
• Count the number of orders falling within each processing-days category to
analyze the distribution of processing days.
• Load data into the Order_Processing target table (For columns check sample
output)
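The calculation, categorization, and counting steps above can be sketched in Python to verify the Order_Processing counts. This is a sketch under assumptions: the function names are hypothetical, the dates are already parsed into date objects, and an order that takes exactly 3 days is treated as Long term delivery because the stated ranges overlap at 3.

```python
from collections import Counter
from datetime import date


def categorize_processing_days(days: int) -> str:
    """Map Processing_days to a category label from the task description."""
    if days < 1:
        return "Immediate delivery"
    if days < 3:  # assumption: exactly 3 days belongs to the next category
        return "Moderate Delivery"
    return "Long term delivery"


def processing_distribution(rows):
    """Return a Counter of category -> orders count.

    rows: iterable of dicts with 'Order_Date' and 'Ship_Date' date values (assumed shape).
    """
    return Counter(
        categorize_processing_days((row["Ship_Date"] - row["Order_Date"]).days)
        for row in rows
    )
```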
CATEGORISE_PROCESSING_DAYS    ORDERS_COUNT
Immediate delivery            32
After completing the challenge, double-click the SAMPLE_TEST.EXE file to get the
sample score.