Assignment 2

This document summarizes the results of market research assignment 2 conducted by Apurva Negi. The summary analyzes the data collected through various statistical tests and data cleaning methods. It identifies outliers, checks for normality and multicollinearity. Factor analysis and cluster analysis are conducted to understand underlying patterns in the data. Cronbach's alpha is used to test reliability. Discriminant analysis and structural equation modeling are also performed. The document concludes with identifying distinct properties and clusters in the data.

2019

Market Research Assignment 2

Apurva Negi (2018011)

Section B
8/14/2019
1. Error of Commission:

No errors of commission were found, as the means of all the variables lie within the range of the scale points.
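A stricter check than comparing means is to scan every answer for values outside the scale. The following is a minimal sketch of that idea; the 1 to 5 scale, the variable names and the toy data are assumptions for illustration and are not taken from the assignment's data set.

```python
def out_of_range(responses, low=1, high=5):
    """Return (respondent_id, variable, value) for every value outside [low, high]."""
    errors = []
    for rid, answers in responses.items():
        for var, value in answers.items():
            # Missing answers (None) are a separate problem and are skipped here
            if value is not None and not (low <= value <= high):
                errors.append((rid, var, value))
    return errors

# Hypothetical toy data: respondent id -> {variable: answer}
sample = {
    1: {"Joy_1": 4, "Joy_2": 5},
    2: {"Joy_1": 55, "Joy_2": 3},    # 55 is a typing error for 5
    3: {"Joy_1": None, "Joy_2": 2},
}
print(out_of_range(sample))  # [(2, 'Joy_1', 55)]
```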

2. Missing Values
Respondent IDs 288 and 303 have missing values in all the variables, so these two responses are removed.
The other missing values are replaced by the variable mean rounded to a whole number, in the following cases:
3, 10, 12, 28, 31, 172, 184, 275
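Replacing a missing answer by the mean rounded to a whole scale point can be sketched as follows. The column of responses is hypothetical, and the helper name `impute_rounded_mean` is an illustration rather than anything used in the assignment.

```python
from statistics import mean

def impute_rounded_mean(values):
    """Replace None with the mean of the observed values, rounded to a whole scale point."""
    observed = [v for v in values if v is not None]
    fill = round(mean(observed))
    return [fill if v is None else v for v in values]

column = [4, 5, None, 3, 4, None, 5]  # hypothetical responses to one item
print(impute_rounded_mean(column))    # [4, 5, 4, 3, 4, 4, 5]
```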

3. Outliers:
Respondent cases that are extreme outliers are removed:
56 unique cases were removed as extreme outliers.
Remaining cases: 327 of 384
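The report does not state which rule was used to flag extreme outliers. One common convention treats a value as extreme when it lies more than 3 interquartile ranges outside the quartiles; the sketch below assumes that rule and uses made-up values.

```python
from statistics import quantiles

def extreme_outliers(values, k=3.0):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

scores = [2, 3, 3, 4, 4, 4, 5, 5, 3, 4, 40]  # hypothetical values; 40 is a wild entry
print(extreme_outliers(scores))  # [40]
```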
4. Skewness and Kurtosis:
Likert scale data will usually be skewed or kurtotic, so removing the variables that show high skewness or kurtosis makes little sense. If the data were continuous, we would have applied the 3 x standard error criterion (skewness or kurtosis beyond three times its standard error) to remove variables.
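For reference, the 3 x standard error criterion for skewness can be sketched as below, using the usual large-sample formula for the standard error of skewness. The sample size of 327 is the remaining case count reported above; the skewness values passed in are hypothetical.

```python
import math

def skewness_standard_error(n):
    """Standard error of sample skewness for n observations."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def exceeds_three_se(skewness, n):
    """True when |skewness| is larger than three times its standard error."""
    return abs(skewness) > 3 * skewness_standard_error(n)

n = 327
se = skewness_standard_error(n)
print(round(se, 3))
print(exceeds_three_se(0.5, n), exceeds_three_se(0.3, n))  # hypothetical skewness values
```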

5. Normality:
The Shapiro-Wilk test tells us whether our data follow a normal distribution. Its null hypothesis is that the data are normally distributed. Since all the significance values are less than 0.05, the null hypothesis is rejected at the 95% confidence level, so none of the variables is normally distributed. However, testing normality makes little sense for Likert scale data, as such data can be skewed, and we still proceed with the analysis.
6. Correlation:
To check for multicollinearity, any pair of variables with a Karl Pearson correlation > 0.9 would lead to a variable being removed. The correlation matrix shows no pair of variables with such a high correlation, so there is no multicollinearity.

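Also, the reported determinant of the correlation matrix is < 0.01. The determinant is not itself a significance test; as a rule of thumb it should stay above 0.00001 for multicollinearity not to be a concern.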
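The pairwise check against the 0.9 threshold can be sketched in a few lines. The three variables `A`, `B` and `C` are hypothetical toy columns, not survey items.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(data, threshold=0.9):
    """Return the variable pairs whose absolute correlation exceeds the threshold."""
    return [(a, b) for a, b in combinations(data, 2)
            if abs(pearson(data[a], data[b])) > threshold]

data = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10], "C": [5, 1, 4, 2, 3]}
print(collinear_pairs(data))  # [('A', 'B')]
```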

7. Factor Analysis:

The KMO value is 0.923, which means that the sampling adequacy is good enough for the test to proceed. Bartlett’s Test is significant, which shows that the correlation matrix is not an identity matrix.
All communalities are > 0.4, meaning that every variable shares enough variance with the others.
The top 7 components explain approximately 69% of the total variance.

The rotation converged after 6 iterations, and the variables grouped into 7 factors (components). These 7 constructs are the same as defined earlier. Two variables exhibited cross-loadings, which are not acceptable because the factor analysis checks the unidimensionality of the data.
Thus, InfoAcq_4 and InfoAcq_5 were removed and the factor analysis was run again. After this, the following were observed:
KMO and Bartlett’s values remain good enough to proceed with the test.

Variance explained increased to approximately 70%.

All the variables are now grouped in their respective 7 constructs.

Factor 1: Useful
Factor 2: Joy
Factor 3: Decision Quality
Factor 4: Playful
Factor 5: Usage Type
Factor 6: Competency
Factor 7: Acquired Information
8. Cronbach’s Alpha:

Factor 1: Useful

Factor 2: Joy

Factor 3: Decision Quality

Factor 4: Playful

Factor 5: Usage Type

Factor 6: Competency
Factor 7: Acquired Information
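The alpha values for each factor are not reproduced in the text above. As a reminder of what the statistic measures, here is a minimal sketch of the standard Cronbach's alpha formula, k/(k-1) x (1 - sum of item variances / variance of total score). The three item columns are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of item columns, each a list of responses in the same respondent order."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]          # total score per respondent
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of 5 respondents to 3 items of one construct
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))  # 0.92
```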

2. Discriminant Analysis
Experience is used as the dependent variable after recoding it into a median-split variable. The new variable created is Exp_Split. The median value for Experience (through descriptives and frequencies) is 3. Thus, Experience values less than or equal to 3 are termed ‘Low’ with value 0, and Experience values greater than 3 are termed ‘High’ with value 1.
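The recoding rule can be written out as a small sketch. The median of 3 and the 0/1 coding come from the text; the list of Experience scores is hypothetical.

```python
def median_split(value, median=3):
    """Code 'Low' (0) for values <= median and 'High' (1) for values above it."""
    return 0 if value <= median else 1

experience = [1, 3, 4, 5, 2, 3]  # hypothetical Experience scores
exp_split = [median_split(v) for v in experience]
print(exp_split)  # [0, 0, 1, 1, 0, 0]
```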

The variables Education, Experience, Playful, Competence, Type Usage and Information Acquired are significant at the 95% confidence level (sig value < 0.05). Wilks’ lambda tells us the relative importance of the independent variables: the smaller the Wilks’ lambda value, the greater the importance. Thus, in order of importance:
Experience > Comp_Mean > Type_Mean > Playful_Mean > Education > others

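The test of equality of covariance matrices (Box's M in SPSS) has the null hypothesis that the covariance matrices of the two groups are equal. Since Sig < 0.05, this null hypothesis is rejected, so the covariance matrices are not equal.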

The canonical correlation between the discriminant scores and the two groups is high (0.8). Its square is 0.64, so 64% of the variance in the discriminant scores is explained by the difference between the two groups (High and Low Exp) and 36% is not (Wilks’ lambda = 0.36). Thus, the function has good discriminatory ability.
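With a single discriminant function, the unexplained share follows directly from the correlation of 0.8, as this short check shows.

```python
canonical_r = 0.8                      # correlation reported in the text
explained = canonical_r ** 2           # share of variance explained by group membership
wilks_lambda = 1 - explained           # share left unexplained
print(round(explained, 2), round(wilks_lambda, 2))  # 0.64 0.36
```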

The associated chi-square statistic tests the hypothesis that the means of the functions listed
are equal across groups. The small significance value indicates that the discriminant function
does better than chance at separating the groups High and Low Exp.

The larger the absolute value of a coefficient, the greater the discriminatory ability of the variable; the sign only shows whether the effect is positive or negative. Experience has the highest ability.
The ordering in the structure matrix is the same as that suggested by the tests of equality of group means and is different from that in the standardized coefficients table. Experience is the only variable with a value greater than 0.4.

The functions at group centroids give the mean discriminant scores of the Low and High groups. The Zone of Confusion lies between them, from -1.211 to 1.464.

The model classifies 98.2% of the cases correctly:

(Low-Low + High-High) / Total => (179+142)/327 = 98.17%
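The hit ratio can be recomputed from the classification counts quoted above.

```python
correct_low, correct_high, total = 179, 142, 327   # counts from the classification table
hit_ratio = (correct_low + correct_high) / total
print(f"{hit_ratio:.2%}")  # 98.17%
```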
3. Cluster Analysis

Hierarchical Cluster Analysis

Variables: Experience, Playful_Mean, Comp_Mean, Joy_Mean, Info_Mean, Useful_Mean, Type_Mean, Decision_Mean
The distance coefficients show a huge jump at stage two. Since this is a subjective test, to be safe we consider that three or four clusters may form. Thus, in the next step of the cluster analysis, we input 4 clusters.
Also, we get a long dendrogram, which shows 3-4 distinct clusters.
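Reading the number of clusters off an agglomeration schedule can be sketched as follows: find the largest increase between consecutive distance coefficients and stop merging just before it. The coefficients below are hypothetical and the helper `suggested_clusters` is only an illustration; it does not replace the subjective judgment described above.

```python
def suggested_clusters(coefficients):
    """coefficients: distance coefficient of each merge stage, in stage order.

    A schedule with n - 1 stages describes n cases. Stopping after the last
    stage before the biggest jump leaves n - stage clusters.
    """
    n_cases = len(coefficients) + 1
    jumps = [later - earlier for earlier, later in zip(coefficients, coefficients[1:])]
    last_good_stage = jumps.index(max(jumps)) + 1   # 1-based stage before the jump
    return n_cases - last_good_stage

schedule = [1.0, 1.2, 1.5, 6.0, 6.8]   # hypothetical coefficients for 6 cases
print(suggested_clusters(schedule))    # 3
```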
K-Means Cluster Analysis

Although the cluster sizes vary considerably with 4 clusters, they vary far more drastically when K-Means Cluster Analysis is run with 3 or 5 clusters.

We can see distinct properties of the four clusters.

TwoStep Cluster Analysis

Cluster sizes are considerably different, so the number of clusters input to the test is decreased.
With two clusters, the ratio of cluster sizes was smaller than in the earlier tests.
Conclusion: 2 Clusters
The predictor importance of frequency is the highest, so we check what happens if we remove it.
Removing frequency brings us to 3 clusters.
After removing gender too, we are left with 2 clusters.

The two clusters still differ in size: 128 and 199 respondents.

After this, all further TwoStep Cluster analyses gave widely varying cluster sizes.
Thus, we stop here and conclude that only 2 clusters can be formed.
Structural Equation Modelling
Looking at the estimates, we see that all the observed variables significantly measure the unobserved variables with confidence greater than 99.9% (every C.R. value exceeds 3.29, and P is shown as ***, i.e. p < 0.001). The relative importance of each measure for its construct can be seen from the Estimate value or from the standardized regression weights.

Estimate S.E. C.R. P

Playful_1 <--- Playful 1.000
Playful_2 <--- Playful 1.343 .102 13.139 ***
Playful_3 <--- Playful 1.220 .099 12.301 ***
Playful_4 <--- Playful 1.262 .102 12.373 ***
Playful_5 <--- Playful 1.370 .106 12.966 ***
Playful_6 <--- Playful 1.418 .107 13.305 ***
Playful_7 <--- Playful 1.072 .098 10.958 ***
CompLatent_5 <--- Competency 1.000
CompLatent_4 <--- Competency .955 .090 10.595 ***
CompLatent_3 <--- Competency 1.019 .089 11.509 ***
CompLatent_1 <--- Competency 1.009 .090 11.212 ***
AtypUse_1 <--- Usage 1.000
AtypUse_2 <--- Usage 1.075 .060 17.844 ***
AtypUse_3 <--- Usage .980 .050 19.752 ***
AtypUse_4 <--- Usage 1.026 .053 19.227 ***
AtypUse_5 <--- Usage .969 .052 18.772 ***
Useful_7 <--- Useful 1.000
Useful_6 <--- Useful .976 .072 13.646 ***
Useful_5 <--- Useful 1.104 .075 14.752 ***
Useful_4 <--- Useful 1.216 .073 16.706 ***
Useful_3 <--- Useful 1.230 .076 16.273 ***
Useful_2 <--- Useful 1.150 .074 15.584 ***
Useful_1 <--- Useful 1.109 .076 14.553 ***
Joy_1 <--- Joy 1.000
Joy_2 <--- Joy 1.133 .070 16.078 ***
Joy_3 <--- Joy 1.062 .069 15.439 ***
Joy_4 <--- Joy 1.008 .062 16.353 ***
Joy_5 <--- Joy 1.162 .074 15.666 ***
Joy_6 <--- Joy 1.014 .065 15.559 ***
Joy_7 <--- Joy 1.061 .066 16.115 ***
InfoAcq_3 <--- InfoAcq 1.000
InfoAcq_2 <--- InfoAcq 1.125 .091 12.339 ***
InfoAcq_1 <--- InfoAcq 1.178 .087 13.586 ***
DecQual_8 <--- DecQual 1.000
DecQual_7 <--- DecQual 1.178 .135 8.729 ***
DecQual_6 <--- DecQual 1.050 .132 7.964 ***
DecQual_5 <--- DecQual 1.223 .135 9.069 ***
DecQual_4 <--- DecQual 1.236 .139 8.866 ***
DecQual_3 <--- DecQual 1.353 .143 9.427 ***
DecQual_2 <--- DecQual 1.240 .134 9.271 ***
DecQual_1 <--- DecQual 1.143 .134 8.551 ***
The correlations among the constructs show that no two constructs are very highly correlated with each other (the largest is .652, between DecQual and InfoAcq), which means that the constructs are distinct from one another.

Estimate
DecQual <--> InfoAcq .652
DecQual <--> Joy .419
DecQual <--> Useful .553
DecQual <--> Usage .186
DecQual <--> Competency .166
DecQual <--> Playful .291
Joy <--> InfoAcq .501
Useful <--> InfoAcq .576
Usage <--> InfoAcq .188
Competency <--> InfoAcq .290
Playful <--> InfoAcq .373
Useful <--> Joy .450
Usage <--> Joy .240
Competency <--> Joy .322
Playful <--> Joy .509
Usage <--> Useful .210
Competency <--> Useful .255
Playful <--> Useful .322
Competency <--> Usage .426
Playful <--> Usage .414
Playful <--> Competency .437

A snippet of the standardized total effects table shows the same structure that we got from our Exploratory Factor Analysis, thereby confirming our constructs and measures.

Model Fit Summary

CMIN

Model                NPAR   CMIN        DF    P      CMIN/DF
Default model        144    1531.829    758   .000   2.021
Saturated model      902    .000        0
Independence model   82     10471.692   820   .000   12.770

The CMIN/DF value has to be less than 3 for a good-fit model; here it is 2.021.

Baseline Comparisons

Model                NFI (Delta1)   RFI (rho1)   IFI (Delta2)   TLI (rho2)   CFI
Default model        .854           .842         .920           .913         .920
Saturated model      1.000                       1.000                       1.000
Independence model   .000           .000         .000           .000         .000

The CFI value needs to be near 1 for a good-fit model; here it is .920.

RMSEA

Model                RMSEA   LO 90   HI 90   PCLOSE
Default model        .056    .052    .060    .008
Independence model   .190    .187    .193    .000

RMSEA measures the badness of fit and needs to be near 0 (strict cut-off: 0.05); here it is .056, slightly above the strict cut-off.
Thus, through the model fit tests, we come to know that our model fits the data acceptably.
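As a cross-check, the main fit indices can be recomputed from the chi-square (CMIN) values of the default and independence models using their standard formulas. This is a sketch, not AMOS output: the sample size of 327 is an assumption taken from the remaining case count reported earlier.

```python
import math

chi2_m, df_m = 1531.829, 758     # default model (CMIN, DF)
chi2_b, df_b = 10471.692, 820    # independence (baseline) model
n = 327                          # assumed sample size (remaining cases)

cmin_df = chi2_m / df_m
nfi = 1 - chi2_m / chi2_b
cfi = 1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, chi2_m - df_m, 0)
tli = (chi2_b / df_b - chi2_m / df_m) / (chi2_b / df_b - 1)
rmsea = math.sqrt(max(chi2_m - df_m, 0) / (df_m * (n - 1)))

print(f"CMIN/DF={cmin_df:.3f} NFI={nfi:.3f} CFI={cfi:.3f} TLI={tli:.3f} RMSEA={rmsea:.3f}")
# CMIN/DF=2.021 NFI=0.854 CFI=0.920 TLI=0.913 RMSEA=0.056
```

The recomputed values agree with the tables above to three decimals, which also supports the assumed sample size.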
