Data Analytics and Strategy Final Report
For that purpose I will run Frequencies, Cross Tabulations, Anova, T-test, K-Means Cluster
analysis, regression and similar tests and interpret the results, both to demonstrate my
understanding of this module and to give the company I work for a meaningful interpretation
to support its development.
Table of Contents
1. Introduction
2. Consumers’ engagement
Frequencies
Interpretation
Cross Tabulation
Interpretation
Regression Analysis
Frequencies
Cross Tabulations
T-test
Anova
Recommendations
Conclusion
1. Introduction
The first requirement for this report is to set up a scenario of my own and build the report
around it, so I have not picked a complicated scenario and have kept it as simple as I can.
The timeframe matters because analytics data changes whenever the period changes, so my
data covers February 12, 2021 to February 18, 2021. I will also specify what I want from the
data and recommend and describe useful improvements for the website.
Each technique in SPSS has its own purpose, and I will apply those techniques and explain
them where necessary. The purpose of this module is to set up a marketing strategy, so we
need to understand the micro environment, that is, the consumers. I will analyse and then
interpret the data so that, working as an analyst, I could pass it to the marketing department
for them to use effectively in developing the business in future.
Objectives:
2. Consumers’ engagement
In order to understand consumers’ engagement with the website by country or city, I
first pick a subject. Subjects can be individuals, firms, places, objects, websites, time
periods and so on; in this case I choose a website. I also choose a variable, which is a
characteristic of the object being studied, such as the age or height of individuals, the
likes on a Facebook page or the sales in each month. Variables need not always be
numbers. Numeric variables are of two types, continuous and discrete (discontinuous);
a discrete variable takes only separate values with nothing in between. Variables can
also be dependent or independent. Two types of analysis relate them: correlation and
multiple regression.
Frequencies
To analyse the new users and pages per session of the website, I ran the frequencies
procedure in SPSS and interpreted the output. I exported the data from Google Analytics
to Microsoft Excel and then imported it into IBM SPSS. In SPSS I went to Analyze, then
Descriptive Statistics, then Frequencies, selected new users and pages per session, set
the format, statistics and chart options, and got the following results.
Statistics
                          New Users     Pages per Session
N        Valid            39            39
         Missing          0             0
Mean                      261.8462      2.9967
Median                    117.0000      2.8300
Mode                      44.00a        2.64a
Std. Deviation            659.35397     .74739
Variance                  434747.660    .559
Skewness                  5.626         2.482
Std. Error of Skewness    .378          .378
Interpretation: The frequencies output covers both new users and pages per session. The
means for new users and pages per session are 261.85 and 3.00 respectively, and the
medians are 117 and 2.83. The standard deviation is 659.35 for new users and .747 for
pages per session. Both variables have 39 valid cases and no missing values. For new
users the mean is far above the median and the skewness is 5.626, so the distribution is
strongly right-skewed.
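The statistics in the frequencies output can be reproduced outside SPSS. The following is a minimal sketch using only the Python standard library; the function name `describe` and the sample values are hypothetical and are not the report's data. The skewness uses the adjusted sample formula that statistical packages commonly report, which is assumed here to match the SPSS figure.

```python
import statistics

def describe(values):
    """Return summary statistics similar to those in a frequencies output."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Adjusted sample skewness: n / ((n - 1)(n - 2)) * sum(((x - mean) / sd) ** 3)
    cubed = sum((x - mean) ** 3 for x in values)
    skewness = n * cubed / ((n - 1) * (n - 2) * sd ** 3)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # every value tied for most frequent
        "std_dev": sd,
        "variance": statistics.variance(values),
        "skewness": skewness,
    }

# Hypothetical new-user counts, used only to illustrate the calculation
new_users = [40, 44, 44, 90, 117, 130, 966]
summary = describe(new_users)
for name, value in summary.items():
    print(name, value)
```

A single very large value, such as 966 in this sample, pulls the mean above the median and gives a positive skewness, which is the same pattern as in the table above.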
Cross Tabulation
Interpretation: To run the crosstabs I went to Descriptive Statistics, clicked on Crosstabs
and entered new users and pages per session. I selected the row, column and total
percentages and requested a bar chart, which gave the results above. Both variables have 39
valid cases. The highest number of new users is 966 and the highest value of pages per
session is 6.35, whereas the lowest number of new users and the lowest pages per session are
40 and 1.87 respectively.
Regression Analysis
For the regression analysis I selected Linear, entered average session duration as the
dependent variable and new users as the independent variable, then opened Statistics, added
R squared change and Descriptives, clicked OK and got the following results.
Descriptive Statistics
Correlations
Model Summary
a. Predictors: (Constant), New Users
ANOVA
Total: Sum of Squares 165804.886, df 38
Coefficients
Interpretation: In the model summary the adjusted R square is .034, which indicates that
new users explain only about 3% of the variation in average session duration. In the Anova
table the significance value is .135, which is above the .05 threshold, so the model is not
statistically significant. In the coefficients table the unstandardized coefficient for new
users is .024, with the same p value of .135, so it is also not significant. Taken at face
value, the coefficient means that each additional new user is associated with an increase of
.024 in average session duration.
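To show where the slope, R square and adjusted R square in a simple linear regression come from, here is a minimal sketch in plain Python. The function `simple_ols` and the sample data are hypothetical and are not the report's data; the adjusted R square formula shown applies to a model with one predictor.

```python
def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # unstandardized coefficient B
    intercept = mean_y - slope * mean_x    # the (Constant) term
    r2 = sxy ** 2 / (sxx * syy)            # share of variance explained
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)  # penalised for one predictor
    return slope, intercept, r2, adj_r2

# Hypothetical data: new users (x) and average session duration (y)
new_users = [40, 60, 117, 250, 966]
duration = [90.0, 150.0, 120.0, 140.0, 160.0]
slope, intercept, r2, adj_r2 = simple_ols(new_users, duration)
print(slope, intercept, r2, adj_r2)
```

The slope is read as the change in the dependent variable for a one-unit change in the independent variable, so a slope for new users describes how average session duration changes per additional new user.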
Before running the K-Means Cluster analysis, the first step is to standardise the sessions
and average session duration variables.
Descriptive Statistics
                         N     Minimum    Maximum    Mean        Std. Deviation
Avg# Session Duration    39    44.85      411.13     138.0926    66.05518
Valid N (listwise)       39
I now have two new variables holding the Z scores. Of the two original variables, sessions
has the higher mean, 370, and the higher standard deviation.
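Standardising a variable means subtracting its mean and dividing by its standard deviation, so that variables measured on different scales become comparable. A minimal sketch follows, assuming the sample standard deviation is used; the function name `z_scores` and the values are hypothetical.

```python
import statistics

def z_scores(values):
    """Convert raw values to Z scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical session counts, not the report's data
sessions = [2, 4, 6]
print(z_scores(sessions))
```

After standardising, each variable has a mean of 0 and a standard deviation of 1, which stops the variable with the larger scale from dominating the distances used in the cluster analysis.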
To analyse and interpret K-Means clusters of countries I selected two variables, the Z
scores of sessions and of average session duration. In SPSS I went to the Analyze tab, then
Classify, and selected K-Means Cluster. I kept the number of clusters at the default of 2
and the maximum iterations at the default of 10. I also requested the Anova table because I
will interpret it, and when I clicked OK I got the following tables.
Final Cluster Centers
                                 Cluster
                                 1          2
Zscore(Sessions)                 5.92983    -.15605
Zscore: Avg# Session Duration    1.54821    -.04074

Iteration Historya
             Change in Cluster Centers
Iteration    1        2
1            .000     1.218
2            .000     .000

ANOVA
    Cluster               Error                 F    Sig.
    Mean Square    df     Mean Square    df
I now have the initial and final cluster centres for clusters 1 and 2, and the iteration
history shows 2 iterations. If I double-click the final cluster centres table, right-click on it
and select bar chart, I get a bar chart of the two clusters.
In the cluster centres, cluster 1 shows sessions far above average (Z score 5.93) and average
session duration also above average but much less so (Z score 1.55). Cluster 2 shows both
sessions and average session duration slightly below average (Z scores -.16 and -.04). In
the Anova table the result for average session duration is not statistically significant, with
a value of .118, but the result for sessions is significant.
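The K-Means procedure itself is short: assign each case to its nearest centre, recompute each centre as the mean of its cases, and repeat until the centres stop moving. The sketch below is a plain Python version of that loop, not the SPSS implementation; the function `k_means`, the starting centres and the Z score pairs are all hypothetical.

```python
def k_means(points, centers, max_iter=10):
    """Assign points to the nearest centre, then recompute centres, until stable."""
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the closest centre
        clusters = [[] for _ in centers]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centre becomes the mean of its members
        new_centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else c
            for members, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # no centre moved, so the solution is stable
            break
        centers = new_centers
    return centers, clusters

# Hypothetical (Z sessions, Z average session duration) pairs
points = [(5.9, 1.5), (-0.2, 0.1), (-0.1, -0.3), (0.0, 0.2), (-0.3, -0.1)]
centers, clusters = k_means(points, [points[0], points[1]])
print(centers)
print([len(c) for c in clusters])
```

In this sample one extreme case forms a cluster of its own while the remaining cases share a centre slightly below zero, which shows how a single outlier can dominate a two-cluster solution.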
As an analyst working for a company, my primary task is to analyse and interpret the data
and to identify groups of consumers, so that I can estimate the potential value and number
of consumers in an area where the business can be developed. I will interpret the
frequencies of users in particular areas and identify the age groups, so that I can
understand which age group spends more time browsing and report this to my company’s
marketing department, helping them identify their target markets more easily.
Frequencies
As my task is to find the customer groups, the variables I am going to use are age and new
users in particular cities. These variables give me a clear view of the demographics of the
audience visiting the website and of how many of them are starting to visit the site as new
users.
Statistics
                    City    NewUsers    Users
N        Valid      60      59          59
         Missing    0       1           1
Mean                        16.97       20.12
Median                      12.00       14.00
Mode                        10          12
City
          Frequency    Percent    Valid Percent    Cumulative Percent
Paris     2            3.3        3.3              60.0

NewUsers
          Frequency    Percent    Valid Percent    Cumulative Percent
50        1            1.7        1.7              96.6
Interpretation: New users has 59 valid cases, with a mean of 16.97, a median of 12 and a
mode of 10. The highest frequency for any single value of new users is 7. On the histograms
the mean is 16.97 and the standard deviation is 16.6 for new users, and the mean is 20.12
and the standard deviation is 18.67 for users. The location with the highest frequency is
Bangladesh, which appears 6 times.
Cross Tabulations
Before analysing the crosstabs I first transformed time per session into a categorical
variable. After that I cross-tabulated age against time per session in order to see which
age group visits the website more.
Age * time per session Crosstabulation
                          time per session
                          1.00      3.00      Total
       Count              0         1         1
       Count              1         21        22
       Count              0         26        26
       Count              0         5         5
       Count              0         4         4
       Count              0         1         1
       Count              0         1         1
65+    % within Age       0.0%      100.0%    100.0%
Here it is seen that the fewest users are in the age groups over 55 and the most users are
in the 25-34 age group.
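The two steps described here, turning a continuous time into categories and then counting cases in each age and time combination, can be sketched in plain Python as follows. The cut points in `time_band` are hypothetical because the report does not state the ones used, and the records are illustrative rather than the report's data.

```python
from collections import Counter

def time_band(seconds):
    """Recode time per session into a category (hypothetical cut points)."""
    if seconds < 60:
        return 1
    if seconds < 180:
        return 2
    return 3

# Hypothetical (age group, time per session in seconds) records
records = [("18-24", 45), ("25-34", 200), ("25-34", 310), ("35-44", 190), ("65+", 400)]

# Cross tabulation: count of cases for every (age group, time band) pair
table = Counter((age, time_band(seconds)) for age, seconds in records)

for age in sorted({age for age, _ in records}):
    counts = [table[(age, band)] for band in (1, 2, 3)]
    print(age, counts, "total", sum(counts))
```

Each printed row corresponds to one Count row of a crosstabulation, with the row total at the end.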
T-test
A T-test compares means, and the one-sample version used here compares the mean of one
variable with a known value. It is an easy test to run. For this test I picked new users. In
SPSS I went to Analyze, then One-Sample T Test, put new users in the test variable box and
entered 21 as the known value. When I pressed OK, two tables came out.
One-Sample Statistics
One-Sample Test (Test Value = 21)
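Here the t value is -1.88 with 58 degrees of freedom. New users has 59 valid cases and a mean
of 16.97. The mean difference from the test value is -4.034 and the standard error of the
mean is 2.15. Because the size of t, 1.88, is below the two-tailed 5% critical value of about
2.00 for 58 degrees of freedom, the mean of new users does not differ significantly from 21.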
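The t value in a one-sample test is the mean difference divided by the standard error of the mean. The sketch below shows the calculation on raw data and then checks it against the summary figures quoted above; the function `one_sample_t` and the sample values are hypothetical.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """Return (t, degrees of freedom) for a one-sample T-test."""
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (mean - test_value) / std_error, n - 1

# Hypothetical sample tested against a known value of 21
t_value, df = one_sample_t([10, 12, 14, 16, 18], 21)
print(t_value, df)

# Check using the report's figures: mean difference -4.034, standard error 2.15
t_reported = -4.034 / 2.15
print(round(t_reported, 2))
```

Dividing the reported mean difference by the reported standard error gives the same t value of -1.88 that SPSS printed.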
Anova
I performed a one-way Anova test, which needs one dependent variable and another variable
as the factor. I took goal completions as my dependent variable and bounce rate as the
factor. Under Options I selected all the statistics except fixed and random effects, then
pressed OK to run the analysis.
ANOVA
GoalCompletions
The within-groups mean square reported by the one-way Anova is 13. This is the error
variance against which the between-groups mean square is compared to obtain the F ratio.
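To make the role of the mean squares concrete, the following minimal sketch computes the between-groups mean square, the within-groups mean square and the F ratio for a one-way Anova. The function `one_way_anova` and the three groups are hypothetical and are not the report's data.

```python
def one_way_anova(groups):
    """Return (between-groups mean square, within-groups mean square, F)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the values around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(all_values) - len(groups))
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical goal completions grouped by three levels of a factor
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ms_between, ms_within, f_ratio = one_way_anova(groups)
print(ms_between, ms_within, f_ratio)
```

A large F ratio means the group means differ by more than the within-groups variation would suggest, which is why the within-groups mean square is read together with F and its significance value.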
Recommendations
Bounce rate can rise when a website is poorly designed, hard to use or not device
friendly, so the website design should be improved to retain customers.
(Venvidel, 2014)
Users aged 18-35 are seen to be the most active and view websites and social
media most frequently. (Schultz, 2018)
Conclusion
After running many tests in SPSS I found this software easy to operate and able to analyse
complex statistical data in a very short time. Although I am not a statistician, I could run
different types of tests with ease and was able to interpret them for a company’s future
development programmes.
References
1. Arikan, A. (2008), Multichannel Marketing: Metrics and Methods for On and Offline
Success, Wiley Publishing, Indiana.
2. Cutler, M. and Sterne, J. (2000), E-Metrics: Business Metrics for the New Economy,
NetGenesis, Chicago, IL.
4. Phippen, A., Sheppard, L. and Furnell, S. (2004), “A practical evaluation of web analytics”,
Internet Research, Volume 14, Number 4, pp. 284-293.
5. Sterne, J. (2002), Web Metrics: Proven Methods for Measuring Web Site Success, John
Wiley & Sons, New York.