Module 4 Advanced Data Analytics Techniques BRM
ANALYTICS
TECHNIQUES
UNIT-4
UNDERSTANDING CORRELATION: A RETAIL CASE
STUDY
In a bustling retail environment, understanding the dynamics between different variables is crucial for optimizing
business strategies. A prominent retail chain, let's call it "Fashion Mart," sought to enhance its sales performance
through a deeper comprehension of the correlations between various factors affecting customer purchasing
behavior.
Fashion Mart collected data on several key variables, including store foot traffic, promotional activities, average
customer spending, and customer satisfaction ratings over a six-month period. Employing correlation analysis,
they aimed to unveil the relationships between these factors and identify areas for improvement.
The results revealed compelling insights. Firstly, there was a strong positive correlation between promotional
activities and store foot traffic, indicating that effective promotions attracted more customers to the stores.
Secondly, customer satisfaction ratings exhibited a significant positive correlation with average customer
spending, emphasizing the importance of providing exceptional customer service to drive sales.
CORRELATION
Correlation measures the degree of association between
two or more sets of variables.
TYPES OF
CORRELATION
ZERO
CORRELATION
QUANTITATIVE ESTIMATE OF A
LINEAR CORRELATION
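A common quantitative estimate of linear correlation is the Pearson correlation coefficient r, which ranges from -1 to +1. The following is a minimal sketch in plain Python; the function name and the sample figures are hypothetical illustrations and are not Fashion Mart's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly figures, for illustration only
promotions = [1, 2, 3, 4, 5, 6]
foot_traffic = [110, 135, 150, 180, 195, 230]
r = pearson_r(promotions, foot_traffic)
print(round(r, 3))
```

A value of r close to +1 indicates a strong positive linear association, a value close to -1 a strong negative one, and a value near 0 little or no linear association.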
TESTING THE
SIGNIFICANCE OF
THE
CORRELATION
COEFFICIENT
The statistical test for the significance of a correlation coefficient is conducted using a t-statistic.
The hypotheses to be tested are:
H0 : ρ = 0
H1 : ρ ≠ 0
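For a sample correlation r computed from n pairs of observations, the usual test statistic is t = r√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The sketch below computes it in plain Python; the function name and the sample values (r = 0.6, n = 27) are hypothetical and chosen only for illustration.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, given sample correlation r and sample size n."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical sample values, for illustration only
t = correlation_t_statistic(0.6, 27)
degrees_of_freedom = 27 - 2
print(round(t, 2), degrees_of_freedom)
```

H0 is rejected when the absolute value of t exceeds the critical value from the t-table for n − 2 degrees of freedom at the chosen significance level.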
REGRESSION
ANALYSIS
CASE STUDY
In a retail chain, the management noticed a decline in sales despite steady foot traffic. They turned to
regression analysis to uncover the underlying factors, analyzing historical sales data alongside
variables like advertising expenditure, seasonal trends, and competitor activity.
Results revealed a significant relationship between sales and advertising expenditure,
as well as a negative impact of competitor promotions. Armed with these insights, the company adjusted
its marketing strategy, allocating more budget towards targeted advertising campaigns and timing
promotions strategically to counter competitor activity. Consequently, sales rebounded, surpassing
previous levels. This case highlights the power of regression analysis in identifying key drivers of
business performance and informing data-driven decision-making.
ASSIGNMENT
• Using the simple linear regression equation: Y = 2X + 3, predict the value of Y when X = 8.
• Given the following data points for a simple linear regression:
X: [2, 4, 6, 8, 10]
Y: [5, 7, 9, 11, 13]
Calculate the slope (β1) and intercept (β0) of the regression line, where Y is dependent on
X.
• Given the linear regression model: Y = 3X + 2, interpret the coefficients β0 and β1.
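One way to check hand calculations for exercises like these is a short least-squares sketch. It uses the standard formulas β1 = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)² and β0 = Ȳ − β1·X̄; the function names are hypothetical, and the data are the points given in the assignment.

```python
def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Predicted Y for a given X under Y = intercept + slope * X."""
    return intercept + slope * x

# Data points from the assignment
X = [2, 4, 6, 8, 10]
Y = [5, 7, 9, 11, 13]
b0, b1 = fit_simple_regression(X, Y)
print(b0, b1)

# Prediction with the equation Y = 2X + 3 at X = 8
print(predict(3, 2, 8))
```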
LINEAR
REGRESSION
MULTIPLE
LINEAR
REGRESSION
DUMMY VARIABLES IN REGRESSION ANALYSIS
There could be situations where the dependent variable is influenced by qualitative variables like gender,
marital status, profession, geographical region, colour, or religion. For instance, the demand for cosmetics is
influenced not only by the price of cosmetics and the consumer's income but also by the gender of the respondents.
This matters because we have reason to believe that females use more cosmetics than males. Therefore, gender must be
included in the regression model as a regressor (independent variable). The question is how to quantify a
qualitative variable of this kind. In such situations, dummy variables come to our rescue: they are used to
quantify qualitative variables. The number of dummy variables required in the regression model is equal to the
number of categories less one.
For example, in the case of gender (male and female) we will use one dummy variable. If we are considering
four religions (Hindu, Sikh, Christian and Muslim), three dummy variables would be required in the model.
A dummy variable usually takes one of two values, 0 or 1.
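The "number of categories less one" rule can be sketched in plain Python as follows. The function name, the `is_` prefix, and the choice of the first listed category as the reference (the one coded as all zeros) are assumptions made for illustration.

```python
def dummy_encode(values, categories):
    """Encode each value with k - 1 dummy variables (0/1).

    The first entry of `categories` is treated as the reference category,
    so it is represented by all dummies being 0.
    """
    others = categories[1:]
    rows = []
    for value in values:
        if value not in categories:
            raise ValueError(f"unknown category: {value}")
        rows.append({f"is_{c}": int(value == c) for c in others})
    return rows

# Four religions need three dummy variables; "Hindu" is the reference here
religions = ["Hindu", "Sikh", "Christian", "Muslim"]
encoded = dummy_encode(["Sikh", "Hindu", "Muslim"], religions)
for row in encoded:
    print(row)

# Two genders need one dummy variable; "female" is the reference here
gender_encoded = dummy_encode(["male", "female"], ["female", "male"])
print(gender_encoded)
```

One category is left out because its membership is already implied when all other dummies are 0; including a dummy for every category alongside an intercept would make the regressors perfectly collinear.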
EXAMPLE OF DUMMY VARIABLE
Let us consider an example to illustrate the concept of dummy variables. Suppose the starting salary of a college
lecturer is influenced not only by years of teaching experience but also by gender. Therefore, the model could be
specified as:
ASSIGNMENT
• In a multiple linear regression model with two independent variables X1 and X2, if the coefficients
are β1 = 0.4 and β2 = 0.6, how would you interpret these coefficients in the context of predicting the
dependent variable Y?
• Consider a multiple linear regression model: Y = 2X1 + 3X2 + 4X3 + 5. If the value of X1 increases by 1
unit and the values of X2 and X3 remain constant, how much does the predicted value of Y change?
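The "other variables held constant" reading of a coefficient can be checked numerically. The sketch below evaluates the model from the second exercise at two points that differ only in X1; the function name and the starting values of X1, X2 and X3 are arbitrary choices made for illustration.

```python
def predict_y(x1, x2, x3):
    """Model from the exercise: Y = 2*X1 + 3*X2 + 4*X3 + 5."""
    return 2 * x1 + 3 * x2 + 4 * x3 + 5

# Arbitrary starting point (hypothetical values)
base = predict_y(1, 2, 3)
# Same X2 and X3, with X1 increased by one unit
bumped = predict_y(2, 2, 3)
change = bumped - base
print(base, bumped, change)
```

Whatever starting values are chosen, the change equals the coefficient of X1, because the terms in X2 and X3 cancel in the difference.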
REGRESSION ANALYSIS BY USING SPSS
SPSS RESULTS
INTERPRETATION OF SPSS RESULTS
From Table 15.12, the following estimated equation can be written.
The above estimated equation states that, other things being constant, as experience increases by one
year, the average starting salary increases by 1.545 thousand rupees. Further, other things being constant, the
starting salary of a male lecturer exceeds that of a female lecturer by 3.286 thousand rupees.
Both years of experience and gender are found to be significant variables, as the
p values for their coefficients are 0.000. Here, through an example, we have shown that the constant term varies
between the male and the female salary functions.
The R² for the model is 0.987 (Table 15.10), which is high and significant as seen from the p value of the F
statistic (Table 15.11).
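The way the gender dummy shifts the constant term can be sketched as follows. The two slope coefficients (1.545 and 3.286) are the ones quoted in the interpretation above; coding male as 1 and female as 0 is an assumption consistent with that interpretation, and the intercept is a placeholder because the estimated constant is not reproduced in this text.

```python
B_EXPERIENCE = 1.545  # thousand rupees per year of experience (from the interpretation above)
B_MALE = 3.286        # thousand rupees added when the dummy equals 1 (male assumed coded as 1)

def predicted_salary(experience_years, is_male, intercept):
    """Predicted starting salary in thousand rupees.

    `intercept` must be supplied by the caller: the estimated constant
    from the SPSS output is not available here.
    """
    return intercept + B_EXPERIENCE * experience_years + B_MALE * is_male

PLACEHOLDER_INTERCEPT = 0.0  # hypothetical value, NOT the estimated constant

male = predicted_salary(5, 1, PLACEHOLDER_INTERCEPT)
female = predicted_salary(5, 0, PLACEHOLDER_INTERCEPT)
gap = male - female
print(round(gap, 3))
```

The gap between the two predictions does not depend on the intercept or on the years of experience. The male salary function is the female one with its constant term raised by the dummy coefficient.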
ASSIGNMENT
--------------------------------------------------------------
Variable    Coefficient   Standard Error   t-value   p-value
--------------------------------------------------------------
Intercept      2.345          0.567          4.137     0.001
X1             0.874          0.123          7.098     0.000
X2            -1.235          0.198         -6.244     0.000
X3             0.543          0.087          6.248     0.000
X4             0.321          0.054          5.932     0.000
--------------------------------------------------------------
TEST OF SIGNIFICANCE OF
REGRESSION PARAMETERS
H0 : β = 0
H1 : β ≠ 0
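Under H0 : β = 0, the t statistic for a regression coefficient is the estimated coefficient divided by its standard error. The sketch below applies this to the coefficient table from the preceding assignment; the function name is hypothetical. The computed ratios agree with the reported t-values to within about 0.01, and the small differences are presumably due to rounding in the table (an assumption).

```python
def t_statistic(coefficient, standard_error):
    """t statistic for H0: beta = 0, i.e. estimate divided by its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error

# (variable, coefficient, standard error, reported t-value) from the assignment table
rows = [
    ("Intercept", 2.345, 0.567, 4.137),
    ("X1", 0.874, 0.123, 7.098),
    ("X2", -1.235, 0.198, -6.244),
    ("X3", 0.543, 0.087, 6.248),
    ("X4", 0.321, 0.054, 5.932),
]

computed = {name: t_statistic(b, se) for name, b, se, _ in rows}
for name, _, _, reported in rows:
    print(f"{name}: computed t = {computed[name]:.3f}, reported t = {reported}")
```

A large absolute t-value (equivalently, a small p-value) leads to rejecting H0, meaning the variable contributes significantly to explaining the dependent variable.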
ASSIGNMENT
GOODNESS OF FIT OF
REGRESSION EQUATION
ASSIGNMENT
• Suppose a simple linear regression model is used to predict monthly sales (Y) based on advertising
expenditure (X). After analyzing the data, the coefficient of determination (R²) is found to be 0.85.
Interpret this value in the context of the relationship between advertising expenditure and monthly
sales.
• Suppose two different regression models are developed to predict sales. Model A has an R² value of
0.75, while Model B has an R² value of 0.85. Which model provides a better fit to the data, and
why?
• Consider two different multiple regression models predicting house prices based on different sets of
independent variables. Model A has an R² value of 0.75, while Model B has an R² value of 0.65.
How would you interpret the differences in R² between these two models?
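The coefficient of determination discussed in these exercises can be computed as R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean of Y. The sketch below is a minimal illustration; the function name and the sales figures are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_actual = sum(actual) / len(actual)
    # Unexplained variation: squared gaps between observed and fitted values
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation: squared gaps between observed values and their mean
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical monthly sales and fitted values from some regression model
actual_sales = [10, 12, 15, 19, 24]
fitted_sales = [9, 13, 16, 19, 23]
r2 = r_squared(actual_sales, fitted_sales)
print(round(r2, 3))
```

An R² of 0.85, for example, means that 85% of the variation in the dependent variable is explained by the model, with the remaining 15% left unexplained.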
USES OF REGRESSION
ANALYSIS IN PREDICTION
CASE STUDY
In a retail chain, regression analysis emerged as a crucial tool for predicting customer demand. Facing
challenges in inventory management and stockouts, the company utilized regression models to forecast
sales accurately. By analyzing historical sales data alongside variables like promotions, seasonality, and
demographic trends, the company gained insights into consumer behavior patterns. Regression analysis
helped identify key drivers impacting sales, enabling the company to optimize inventory levels, plan
promotions effectively, and improve overall supply chain efficiency. As a result, stockouts decreased,
and customer satisfaction levels increased. Moreover, the predictive power of regression analysis
facilitated strategic decision-making, guiding product assortment, pricing strategies, and marketing
initiatives. This case exemplifies the pivotal role of regression analysis in the retail sector, empowering
companies to anticipate customer demand and enhance operational performance.
THANK YOU