Rm Unit 4 - Overview
UNIT IV
Syllabus
• Summarizing the Data: Mean, Median,
Mode and Standard Deviation
• Data Analysis Techniques: Univariate and
Bivariate Analysis (Chi Square, ANOVA, Sign
test); Multivariate Analysis (Discriminant
Analysis, Cluster Analysis, Factor Analysis,
Multiple Linear Regression).
Central Tendency
• Mean
• Median
• Mode
Deviation
• Mean Deviation
• Standard Deviation
Univariate and Bivariate Analysis
Univariate Analysis
Univariate Analysis refers to the analysis of a single variable, such as the average
weight of employees in an organisation. Here, the variable is not related to any
other variable.
Bivariate Analysis
Bivariate Analysis refers to the analysis of two variables, such as the age and weight
of employees. Here, the correlation between the two variables can be determined.
Data Analysis and Data Type
• Nominal/Categorical data (attributes) → Chi-Square Test (non-parametric)
– Goodness of fit
– Independence of attributes
• Ordinal/Interval/Ratio scaled data, small sample → Sign Test (non-parametric)
– Single sample
– Paired samples
• Ordinal/Interval/Ratio scaled data, large sample → Sign Test (normal approximation)
– Single sample
– Paired samples
Chi-Square: General Structure
• For analysis of categorical data
– Test for equality of percentages (goodness of fit)
– Test for independence
• The chi-square statistic measures the difference between
the actual counts and the expected counts (assuming
validity of the null hypothesis) as follows:

χ²stat = Σ i=1..n (Oi – Ei)² / Ei

• It is used when we need to find out whether two or more qualitative
attributes are independent.
Test for Goodness of Fit
Example: Coin
A coin is tossed 50 times and heads appears 30 times. Test at the five
percent level of significance whether the coin is unbiased.
Solution
H0: The coin is unbiased
H1: The coin is biased

χ²stat = Σ (Oi – Ei)² / Ei,  df = n – 1

Outcome   O    E    (O – E)²/E
Head      30   25   1
Tail      20   25   1
               χ² =  2

Vc = 2, Vt = 3.841 (5%, df = 1)
Vc < Vt
H0 is accepted
The coin is unbiased.
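This worked example can be checked with SciPy's goodness-of-fit test (a sketch, assuming SciPy is available):

```python
from scipy.stats import chisquare

# Observed: 30 heads, 20 tails in 50 tosses; expected 25/25 for an unbiased coin
stat, p = chisquare(f_obs=[30, 20], f_exp=[25, 25])
print(stat, p)  # statistic = 2.0; p ≈ 0.157 > 0.05, so H0 (unbiased) is accepted
```

Note that the library reports a p value directly, so no chi-square table lookup is needed.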
Chi-Square: Independence of Attributes
Synergy Ltd. organises a training programme for its 500 employees to improve
performance. Some attend the programme while others do not. The observations are as
follows:
H0: Training is not effective (attendance and improvement are independent).
H1: Training is effective.

View           Improve   Not Improve   Row Total
Attend         132       91            223
Not Attend     140       137           277
Column Total   272       228           Grand Total = 500

Expected Frequency = (Row Total × Column Total) / Grand Total
df = (c – 1)(r – 1); where c = Columns, r = Rows

Calculation of Expected Frequencies
a11 = (223 × 272)/500 = 121.31
a12 = (223 × 228)/500 = 101.69
a21 = (277 × 272)/500 = 150.69
a22 = (277 × 228)/500 = 126.31
O     E        (O – E)²   (O – E)²/E
132   121.31   114.23     0.94
91    101.69   114.23     1.12
140   150.69   114.23     0.76
137   126.31   114.23     0.90
                  χ² =    3.73

df = (c – 1)(r – 1) = (2 – 1)(2 – 1) = 1
Table value of χ² at 5% for 1 df = 3.841
Vc = 3.73, Vt = 3.841
Vc < Vt
H0 is accepted.
The training is not effective.
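The same contingency-table test in SciPy (a sketch; `correction=False` matches the hand calculation, which applies no Yates continuity correction):

```python
from scipy.stats import chi2_contingency

observed = [[132, 91],    # Attend:     Improve, Not Improve
            [140, 137]]   # Not Attend: Improve, Not Improve
stat, p, df, expected = chi2_contingency(observed, correction=False)
print(round(stat, 2), df)  # 3.73 with df = 1; p ≈ 0.053 > 0.05, keep H0
```

The `expected` array it returns reproduces the a11–a22 values computed above.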
ANOVA: General Structure
• One-Way ANOVA

Variance          SS    DF    MS                F-ratio   F-table
Between Sample    SSB   k-1   MSB = SSB/(k-1)   MSB/MSW
Within Sample     SSW   n-k   MSW = SSW/(n-k)
Total             SST   n-1

• Two-Way ANOVA

Variance          SS    DF           MS                         F-ratio   F-table
Between Columns   SSC   c-1          MSC = SSC/(c-1)            MSC/MSE
Between Rows      SSR   r-1          MSR = SSR/(r-1)            MSR/MSE
Residual Error    SSE   (c-1)(r-1)   MSE = SSE/[(c-1)(r-1)]
Total             SST   n-1
One-Way ANOVA

Variance          SS    DF           MS             F-ratio
Between Sample    SSB   V1 = k – 1   MSB = SSB/V1   MSB/MSW
Within Sample     SSW   V2 = N – k   MSW = SSW/V2
Total             SST   N – 1

T = Sum Total
Correction Factor CF = T²/N
SST = SSB + SSW
SSB = (ΣX1)²/n1 + (ΣX2)²/n2 + (ΣX3)²/n3 + … + (ΣXn)²/nn – T²/N
SST = Σi Xi² – T²/N
df = (v1, v2); v1 = k – 1; v2 = N – k
Example
Healthy Agro Ltd. sows three samples each of three kinds of seeds in a farm. The
productivity in tons is observed as follows:
H0: There is no difference in the seeds.
H1: There is a difference in the seeds.

      Seed 1   Seed 2   Seed 3   S1²   S2²   S3²
      5        7        18       25    49    324
      5        7        18       25    49    324
      7        12       18       49    144   324
SUM   17       26       54       99    242   972
      Total T = 97               Total ΣX² = 1313

CF = T²/N = 97²/9 = 1045.44
SST = ΣXi² – CF = 1313 – 1045.44 = 267.56
SSB = Σ(Ti²/ni) – CF = (17² + 26² + 54²)/3 – 1045.44 = 248.22
SSW = SST – SSB = 19.33
F = MSB/MSW = (248.22/2)/(19.33/6) = 124.11/3.22 = 38.52
F-table (v1 = 2, v2 = 6) at 5% = 5.14
Fc > Ft, so H0 is rejected: the seeds differ in productivity.
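The same one-way ANOVA in SciPy (a sketch, assuming SciPy is available):

```python
from scipy.stats import f_oneway

# productivity in tons for the three kinds of seeds
seed1, seed2, seed3 = [5, 5, 7], [7, 7, 12], [18, 18, 18]
F, p = f_oneway(seed1, seed2, seed3)
print(round(F, 2))  # 38.52; p < 0.05, so H0 is rejected
```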
Two-Way ANOVA

Variance          SS    DF                    MS                           F-ratio
Between Columns   SSC   V1 = c – 1            MSC = SSC/(c – 1)            MSC/MSE
Between Rows      SSR   V1 = r – 1            MSR = SSR/(r – 1)            MSR/MSE
Residual Error    SSE   V2 = (c – 1)(r – 1)   MSE = SSE/[(c – 1)(r – 1)]
Total             SST   N – 1

T = Sum Total
Correction Factor CF = T²/N
SST = SSC + SSR + SSE
SST = Σi Xi² – CF
SSC = (ΣX1)²/n1 + (ΣX2)²/n2 + (ΣX3)²/n3 + … + (ΣXn)²/nn – CF   (column totals)
SSR = (ΣY1)²/n1 + (ΣY2)²/n2 + (ΣY3)²/n3 + … + (ΣYn)²/nn – CF   (row totals)
df: v1 (for columns) = c – 1; v1 (for rows) = r – 1; v2 = (c – 1)(r – 1)
Example
Three brands of detergent have been used in three water
temperatures to wash similar kinds of cloth. The cleanliness is observed
as follows:

         Surf Excel   Tide   Wheel
Cold     5            7      18
Normal   7            12     21
Warm     10           14     25
H01: There is no significant difference in cleanliness due to the water temperatures.
H02: There is no significant difference in cleanliness due to the detergents.
              Surf Excel (X1)   Tide (X2)   Wheel (X3)   Total     X1²   X2²   X3²   Total
Cold (Y1)     5                 7           18           30        25    49    324   398
Normal (Y2)   7                 12          21           40        49    144   441   634
Warm (Y3)     10                14          25           49        100   196   625   921
Total         22                33          64           T = 119                     1953

CF = T²/N = 119²/9 = 1573.44
SST = ΣXi² – CF = 1953 – 1573.44 = 379.56
SSC = (22² + 33² + 64²)/3 – CF = 1889.67 – 1573.44 = 316.22
SSR = (30² + 40² + 49²)/3 – CF = 1633.67 – 1573.44 = 60.22
SSE = SST – (SSC + SSR) = 379.56 – 376.44 = 3.12
Source of Variation   SS             df          MS                        F                                       F crit
Columns               SSC = 316.22   V1(c) = 2   MSC = 316.22/2 = 158.11   Fc = MSC/MSE = 158.11/0.78 = 203.29     6.94
Rows                  SSR = 60.22    V1(r) = 2   MSR = 60.22/2 = 30.11     Fr = MSR/MSE = 30.11/0.78 = 38.71       6.94
Error                 SSE = 3.12     V2 = 4      MSE = 3.12/4 = 0.78

Both F-ratios exceed the table value of 6.94, so the null hypotheses are rejected:
cleanliness differs significantly across both the detergents and the water temperatures.
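The hand computation above can be reproduced in a few lines of NumPy (a sketch of the classical sums-of-squares arithmetic, not a library ANOVA call; results match up to rounding):

```python
import numpy as np

# rows = water temperatures (Cold, Normal, Warm); columns = detergents
X = np.array([[5, 7, 18],
              [7, 12, 21],
              [10, 14, 25]], dtype=float)
N = X.size
CF = X.sum() ** 2 / N                                # correction factor T²/N
SST = (X ** 2).sum() - CF
SSC = (X.sum(axis=0) ** 2 / X.shape[0]).sum() - CF   # between columns
SSR = (X.sum(axis=1) ** 2 / X.shape[1]).sum() - CF   # between rows
SSE = SST - SSC - SSR                                # residual error
MSE = SSE / ((3 - 1) * (3 - 1))
Fc, Fr = (SSC / 2) / MSE, (SSR / 2) / MSE
print(round(Fc, 1), round(Fr, 1))  # ≈ 203.3 and 38.7, both above F crit = 6.94
```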
Sign Test
Small Samples
Single Sample
Binomial probability: P(r) = nCr pʳ qⁿ⁻ʳ

Solution
H0: Median = 15
H1: Median ≠ 15

Observations (sign of deviation from 15):
1.00 –    8.90 –
2.00 –    9.00 –
3.00 –    9.30 –
4.00 –    9.70 –
5.00 –    12.00 –
6.00 –    12.25 –
6.70 –    14.25 –
7.00 –    14.45 –
7.10 –    18.00 +
7.25 –    19.00 +

+ = 2, – = 18, n = 20

nCr = n!/(r!(n – r)!), so 20C2 = (20 × 19)/2 = 190

P = 2(20C2 p²q¹⁸ + 20C1 p¹q¹⁹ + 20C0 p⁰q²⁰)   with p = q = 0.5
  = 2((190 × (0.5)²⁰) + (20 × (0.5)²⁰) + (1 × (0.5)²⁰))
  = 0.00040245
P < 0.05
H0 is rejected
Median ≠ 15
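SciPy's exact binomial test gives the same two-sided P value (a sketch, assuming SciPy ≥ 1.7 for `binomtest`):

```python
from scipy.stats import binomtest

# 2 plus signs out of n = 20 observations, under H0 p = 0.5
result = binomtest(k=2, n=20, p=0.5, alternative='two-sided')
print(round(result.pvalue, 8))  # 0.00040245 < 0.05, so reject H0
```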
Sign Test
Small Sample, Paired Samples
In an institute, two scientists, Mr. Goldsworthy and Mr. Sheraton, develop
two methods of training new employees. Since the data come from a very
small group of employees, they cannot be assumed to be normally
distributed. The observations are as follows:

Sr. No.   1  2  3  4  5  6  7  8  9  10 11 12 13 14
Method1   20 24 28 24 20 29 19 27 20 30 18 28 26 24
Method2   16 26 18 17 20 21 23 22 23 20 18 21 17 26
Solution
H0: Median 1 = Median 2
H1: Median 1 ≠ Median 2

Sr. No.   Method1   Method2   d
1         20        16        –
2         24        26        +
3         28        18        –
4         24        17        –
5         20        20        =
6         29        21        –
7         19        23        +
8         27        22        –
9         20        23        +
10        30        20        –
11        18        18        =
12        28        21        –
13        26        17        –
14        24        26        +

+ = 4, – = 8, = (ties) = 2, so n = 14 – 2 = 12

P = 2(12C4 p⁴q⁸ + 12C3 p³q⁹ + 12C2 p²q¹⁰ + 12C1 p¹q¹¹ + 12C0 p⁰q¹²)   with p = q = 0.5
  = 0.3876953125
P > 0.05
H0 is accepted
Median 1 = Median 2
There is no significant difference between the two methods.
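The same paired sign test via SciPy's exact binomial test (a sketch, assuming SciPy ≥ 1.7):

```python
from scipy.stats import binomtest

# 4 plus signs out of n = 12 untied pairs, under H0 p = 0.5
result = binomtest(k=4, n=12, p=0.5, alternative='two-sided')
print(result.pvalue)  # 0.3876953125 > 0.05, so H0 is accepted
```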
Sign Test
For Large Samples

Z = (x – np0) / √(np0(1 – p0))    or    Z = (x – 0.5n) / (0.5√n)
Sign Test
For Large Samples, Single Sample
Gannet India Ltd. estimates the median age of its employees to be 45 years. A sample
of 100 employees is taken, out of which 60 are above 45 years, 8 are exactly 45 years and
32 are below 45 years of age. Test whether the median age of the employees of the
company is 45 years.

Solution
H0: Median = 45
H1: Median ≠ 45

Z = ((X + 0.5) – 0.5n) / (0.5√n)    (with continuity correction)

Here n = 100 – 8 = 92 (ties excluded), X = 32, p = 0.5
Z = ((32 + 0.5) – 0.5 × 92) / (0.5√92) = –13.5 / 4.795 = –2.81

|Vc| = 2.81, Vt = 1.96
Vc > Vt
H0 is rejected
The median age is not 45 years.
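The Z computation is short enough to verify directly:

```python
import math

n = 100 - 8            # drop the 8 ties
x = 32                 # employees below 45 (the smaller sign count)
z = ((x + 0.5) - 0.5 * n) / (0.5 * math.sqrt(n))
print(round(abs(z), 2))  # 2.81 > 1.96, so reject H0
```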
Sign Test
For Large Samples, Paired Samples
Gannet India Ltd. provides training to its 100 employees. The findings are as follows:
Improved = 70
Worse than Previous = 25
No Change = 5
Test whether the training programme is effective.

Solution
H0: The training programme is not effective.
H1: The training programme is effective.

Z = ((X + 0.5) – 0.5n) / (0.5√n)

Here n = 100 – 5 = 95 (ties excluded), X = 25, p = 0.5
Z = ((25 + 0.5) – 0.5 × 95) / (0.5√95) = –22 / 4.87 = –4.51

|Vc| = 4.51, Vt = 1.645 (one-tailed)
Vc > Vt
H0 is rejected
The training programme is effective.
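And the paired large-sample version, computed the same way:

```python
import math

n = 100 - 5            # drop the 5 "no change" cases
x = 25                 # the smaller sign count (worse than previous)
z = ((x + 0.5) - 0.5 * n) / (0.5 * math.sqrt(n))
print(round(abs(z), 2))  # 4.51 > 1.645, so reject H0
```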
Factor Analysis
It is used
• Checking Reloads
Factor Analysis
It discovers how the original variables are organised in a particular way,
reflecting another 'latent variable'.
• Most often it is used in multivariate technique of research
studies, particularly in social and behavioural sciences
Construct = Latent Variable = Factor
(Path diagram: a latent factor, e.g. Intelligence, with loadings on observed
variables X1–X7; the loading is the degree of correlation. Each Xi carries an
error term ei. ei (i = 1, 2, 3, …) is the unique contribution of the variable
that cannot be predicted from the remaining variables; it is equal to 1 – R².)
Factor Analysis
Why do we look at “dimensions”?
• We study phenomena that cannot be directly observed
– (ego, personality, intelligence)
• We have too many observations
– need to “reduce” them to a smaller set of factors
• Items are representations of underlying or latent factors.
– We want to know what these factors are.
– We have an idea of the phenomena that a set of items represent (construct
validity).
• To find underlying latent constructs
– As manifested in individual items
• To assess the association between these factors
• To produce usable scores that reflect critical aspects of any complex
phenomenon
– (e.g. personality, intelligence, values, air)
• An end in itself and a major step toward creating error-free measures
Factor Analysis
Basic Concept
• But suppose one item is just a little better than another at representing the
underlying phenomenon?
• FACTOR ANALYSIS looks for the phenomena underlying the observed variance
and covariance in a set of variables.
• These phenomena are called "factors" or "principal components."
Condition
• For interval or ratio scaled data only
• Usually the sample size should be at least five times the total
number of variables.
Adequacy
• Bartlett's chi-square p value should be less than 0.05
• KMO statistic should be greater than 0.5
• Determinant of correlation matrix |R| > 0.00001 (Field 2012, p. 771)

KMO           Interpretation
> 0.90        Marvellous
0.80 – 0.90   Meritorious
0.70 – 0.79   Middling
0.60 – 0.69   Mediocre
0.50 – 0.59   Miserable
< 0.50        Unacceptable
Statistics Associated with Factor Analysis
• Bartlett's test of sphericity. Bartlett's test of sphericity is a test statistic used
to examine the hypothesis that the variables are uncorrelated in the
population. In other words, the population correlation matrix is an identity
matrix; each variable correlates perfectly with itself (r = 1) but has no
correlation with the other variables (r = 0).
• The Bartlett Test of Sphericity compares the correlation matrix with a matrix of
zero correlations (technically called the identity matrix, which consists of all
zeros except the 1’s along the diagonal).
χ² = –[(n – 1) – (2p + 5)/6] × ln|R|
df = p(p – 1)/2
where
n = sample size
p = number of variables
ln = natural log
|R| = determinant of the correlation matrix
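Bartlett's statistic is easy to compute directly with NumPy (a sketch; the 3-variable correlation matrix and sample size here are invented for illustration):

```python
import math
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and df for Bartlett's test of sphericity."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * math.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, df

# toy correlation matrix for p = 3 variables, sample size n = 50
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
chi2, df = bartlett_sphericity(R, n=50)
print(round(chi2, 2), df)  # a large chi-square with df = 3 rejects sphericity
```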
Kaiser-Meyer-Olkin Measure of Sampling Adequacy

KMO = Σi Σj≠i rij² / (Σi Σj≠i rij² + Σi Σj≠i aij²)

where
a = partial correlation
aij = Vij / √(Vii × Vjj)
V = inverse of the correlation matrix r
Vii, Vjj = diagonal elements of V
Role of Correlation in Factor Analysis
• Holzinger and Swineford (1939) – variable sets must be clustered with high correlation
• Tabachnick and Fidell (2001) – the maximum correlation must be greater than 0.3
• Correlation matrix. A correlation matrix is a lower triangle matrix showing the simple
correlations, r, between all possible pairs of variables included in the analysis. The
diagonal elements, which are all 1, are usually omitted.

Name of Matrix        Elements                        Good Signs                               Bad Signs
Correlation R         Correlations                    Many above 0.3 and possible clustering   Few above 0.3
Partial correlation   Partial correlations            Few above 0.3                            Many above 0.3 and possible clustering
Anti-image            Partial correlations reversed   Few above 0.3                            Many above 0.3 and possible clustering
X1 X2 X3 X4 X5 X6 X7
X1 1.000 0.770 0.810 0.210 0.180 0.190 0.210
X2 0.770 1.000 0.870 0.250 0.170 0.210 0.220
X3 0.810 0.870 1.000 0.180 0.210 0.240 0.410
X4 0.210 0.250 0.180 1.000 0.270 0.240 0.210
X5 0.180 0.170 0.210 0.270 1.000 0.870 0.900
X6 0.190 0.210 0.240 0.240 0.870 1.000 0.870
X7 0.210 0.220 0.410 0.210 0.900 0.870 1.000
Elements in Principal Component Method
The proportion of total variance explained by a factor = Ei / n,
where Ei is the factor's eigenvalue and n is the number of variables
(for a correlation matrix the eigenvalues sum to n, so the proportions sum to 1).
• Factor loadings. Factor loadings are simple correlations between the variables and the factors, i.e. the
correlation between a specific observed variable and a specific factor. Higher values mean a closer
relationship. They are equivalent to standardised regression coefficients (β weights) in multiple regression.
The higher the value, the better.
• Communality. Communality is the amount of variance a variable shares with all the other variables being
considered. This is also the proportion of variance explained by the common factors. It is the total influence
on a single observed variable from all the factors associated with it. It equals the sum of the squared factor
loadings, over all retained factors (eigenvalue greater than 1), for that observed variable, and is the same as
R² in multiple regression. The higher the value, the better.
(1 – communality) is the proportion of a variable's variance not explained or predicted by the model.

hi² = Σk Xik²   (sum of squared loadings of variable i across factors F1 … Fk)

• Factor loading plot. A factor loading plot is a plot of the original variables using the factor loadings as
coordinates.
• Factor matrix. A factor matrix contains the factor loadings of all the variables on all the factors extracted.
• Rotation
Rotation is selected as per the nature of the interrelations of the variables. It is of two types:
• Oblique; and
• Orthogonal
• Oblique
If the variables are assumed to be dependent or related, Oblique Rotation is
selected. It consists of Direct Oblimin or Promax.
(Diagram: in oblique rotation the angle θ between the factor axes is not fixed at 90°.)
• Orthogonal
If the variables are assumed to be independent, Orthogonal Rotation is selected.
It consists of Varimax, Quartimax and Equimax.
(Diagram: in orthogonal rotation the factor axes stay at 90° to one another.)
Rotated Component Matrix
The Rotated Component Matrix identifies the group of similar variables
with the latent factor. In the initial eigenvalues, the difference in eigenvalues
between the factors is higher than in the rotated loadings.
Transformation Matrix
It explains how much a particular factor is rotated. For example, if the
value is 0.707, the rotation is 45° because cos 45° = 0.707.
In Excel: cos x (in degrees) = COS(RADIANS(x)); cos⁻¹ x in degrees = ACOS(x) * 180 / PI().
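The same angle recovery can be checked in Python:

```python
import math

# a transformation-matrix entry of 0.707 implies a rotation of about 45 degrees
angle = math.degrees(math.acos(0.707))
print(round(angle))  # 45
```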
Eigenvalue in Matrix
The eigenvalues λ of a square matrix A with eigenvectors V satisfy
(A – λI)V = 0, i.e. det(A – λI) = 0
where
A = square matrix
I = identity matrix
V = eigenvectors

In Factor Analysis, the eigenvalue of a factor is the sum of the squares of the loadings on that factor.

      F1    F2    F3    ---   Fi
X1    X11   X12   X13   ---   X1i
X2    X21   X22   X23   ---   X2i
X3    X31   X32   X33   ---   X3i
X4    X41   X42   X43   ---   X4i
X5    X51   X52   X53   ---   X5i
X6    X61   X62   X63   ---   X6i
X7    X71   X72   X73   ---   X7i

Eigenvalue for Factor 1 = Σ i=1..7 Xi1²
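Both readings of "eigenvalue" can be checked with NumPy (an illustrative sketch; the matrices are invented):

```python
import numpy as np

# matrix eigenvalues: solutions of det(A - lambda*I) = 0
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigenvalues = sorted(np.linalg.eigvals(A))
print(eigenvalues)  # [2.0, 3.0]

# factor-analysis eigenvalue: column sums of squared loadings
loadings = np.array([[0.9, 0.1],
                     [0.8, 0.2],
                     [0.1, 0.7]])
print((loadings ** 2).sum(axis=0))  # one eigenvalue per factor
```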
Multiple Regression Analysis
• A procedure for analyzing associative relationships between a metric
dependent variable and one or more independent variables
– Existence of a relationship
– Strength of the relationship
– Predict the values of the dependent variable
– Control for other independent variables when evaluating the contributions
of a specific variable or a set of variables
Multiple Regression
Multiple Regression allows us to:
Use several variables at once to explain the variation in a continuous
dependent variable.
Isolate the unique effect of one variable on the continuous dependent
variable while taking into consideration that other variables are affecting it
too.
Write a mathematical equation that tells us the overall effects of several
variables together and the unique effects of each on a continuous
dependent variable.
Control for other variables to demonstrate whether bivariate relationships
are spurious
Regression Analysis
Deterministic Model
Yi = β0 + β1X1 + β2X2 + β3X3 + … + βiXi
Probabilistic Model
Yi = β0 + β1X1 + β2X2 + β3X3 + … + βiXi + μ,  where μ is the random error term
Multiple Regression Analysis
The general form of the multiple regression model is as follows:
Y = a + b1X1 + b2X2 + b3X3 + … + bkXk
The coefficient a represents the intercept, and the b's are the partial regression
coefficients, i.e. slopes.
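A minimal sketch of fitting such a model by ordinary least squares with NumPy (the data are made up for illustration and generated without noise, so the coefficients are recovered exactly):

```python
import numpy as np

# illustrative data generated from Y = 1 + 2*X1 + 3*X2
X1 = np.array([0., 1., 2., 3., 4.])
X2 = np.array([1., 0., 2., 1., 3.])
Y = 1 + 2 * X1 + 3 * X2

A = np.column_stack([np.ones_like(X1), X1, X2])   # design matrix [1, X1, X2]
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(coeffs)  # intercept a and partial slopes b1, b2: [1. 2. 3.]
```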
Cluster Analysis
(Illustration: cases plotted on variables such as Age, Income and Gender form
natural groups.)
Cluster Analysis
Introduction
• A technique used to classify objects or cases into relatively homogeneous
groups called clusters
Cluster Analysis
Application
• Market segmentation based on benefits sought by the customers
Statistics Associated with Cluster Analysis
• Agglomeration schedule. An agglomeration schedule gives information on
the objects or cases being combined at each stage of a hierarchical clustering
process.
• Cluster centroid. The cluster centroid is the mean values of the variables for
all the cases or objects in a particular cluster.
• Cluster centers. The cluster centers are the initial starting points in
nonhierarchical clustering. Clusters are built around these centers, or seeds.
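The centroid-and-center idea above can be sketched with a tiny k-means loop in NumPy (the customer data and the two-cluster choice are invented for illustration):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: assign cases to the nearest center, then recompute centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # initial cluster centers
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dist.argmin(axis=1)                        # nearest cluster center
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# hypothetical customers described by (age, income in thousands)
X = np.array([[25, 20], [27, 22], [26, 21],      # younger, lower income
              [55, 80], [58, 85], [60, 82]], float)
labels, centers = kmeans(X, k=2)
print(labels)  # the first three cases share one label, the last three the other
```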
Discriminant Analysis
Introduction
• Need to classify people into two or more groups
– Buyers / Non-Buyers
– Good / Bad credit risk
– Superior / Average / Poor Products
• Goal
– To establish a procedure to find the predictors that best classify subjects
• Uses
– Market segmentation research
Discriminant Analysis
Introduction
• Dependent variable is categorical (nominal or non-metric)
– Nominal: Gender, Religion
– Promotion: Low, Medium, High
• Predictor variables are interval in nature
• Involves deriving a variate, the linear combination of the two (or more)
independent variables that will discriminate best between a priori defined
groups
• Hypothesis: the group means of a set of independent variables for two or more
groups are equal
Discriminant Analysis
Objectives
• Development of discriminant functions, or linear combinations of the predictor or
independent variables, which will best discriminate between the categories of the
criterion or dependent variable (groups).
• Examination of whether significant differences exist among the groups, in terms of the
predictor variables.
• Classification of cases to one of the groups based on the values of the predictor
variables.
Discriminant Analysis
Discriminant Function
• Discriminant analysis is done by calculating a linear function of the form
Di = d0 + d1X1 + d2X2 + d3X3 + . . . + dpXp
Where
– Di is the score on discriminant function i.
– The di's are weighting coefficients; d0 is a constant.
– The X's are the values of the discriminating variables used in the analysis.
• Number of discriminant equations required
– Two groups – one; three groups – two; N groups – N – 1 equations
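A two-group discriminant function of this form can be sketched with Fisher's approach in NumPy (the buyer/non-buyer data are invented for illustration):

```python
import numpy as np

# two a priori groups measured on two interval-scaled predictors
buyers     = np.array([[8., 7.], [9., 6.], [7., 8.], [8., 6.]])
non_buyers = np.array([[2., 3.], [3., 2.], [1., 3.], [2., 2.]])

m1, m2 = buyers.mean(axis=0), non_buyers.mean(axis=0)
# pooled within-group scatter matrix
Sw = np.cov(buyers.T) * (len(buyers) - 1) + np.cov(non_buyers.T) * (len(non_buyers) - 1)
w = np.linalg.solve(Sw, m1 - m2)      # weighting coefficients d1, d2
cut = w @ (m1 + m2) / 2               # cut-off midway between projected group means

def classify(x):
    return "buyer" if w @ np.asarray(x, float) > cut else "non-buyer"

print(classify([9, 7]), classify([1, 2]))  # buyer non-buyer
```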
Discriminant Analysis
Good     5   5   5
Bad      7   5   7
Normal   3   5   7