
Discriminant Analysis


Discriminant Analysis

18-2

Discriminant Analysis
Discriminant analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the predictor or independent variables are interval in nature.

18-3

Application in MR

In terms of demographic characteristics, how do customers who exhibit store loyalty differ from those who do not?

Do heavy, medium, and light users of soft drinks differ in terms of their consumption of frozen foods?

Do various market segments differ in their media consumption habits?

18-4

Objectives

Development of discriminant functions, or linear combinations of the independent variables, which will best discriminate between the categories of the dependent variable (groups).

Examination of whether significant differences exist among the groups.

Determination of which predictor variables contribute to most of the intergroup differences.

Classification of cases to one of the groups based on the values of the predictor variables.

Evaluation of the accuracy of classification.

18-5

Similarities and Differences

                                     ANOVA         REGRESSION    DISCRIMINANT ANALYSIS
Similarities
Number of dependent variables        One           One           One
Number of independent variables      Multiple      Multiple      Multiple

Differences
Nature of the dependent variables    Metric        Metric        Categorical
Nature of the independent variables  Categorical   Metric        Metric

18-6

Discriminant Analysis

When the criterion variable has two categories, the technique is known as two-group discriminant analysis.

When three or more categories are involved, the technique is referred to as multiple discriminant analysis.

The main distinction is that, in the two-group case, it is possible to derive only one discriminant function. In multiple discriminant analysis, more than one function may be computed.

18-7

Discriminant Analysis Model


The discriminant analysis model involves linear combinations of the following form:

D = b0 + b1X1 + b2X2 + b3X3 + . . . + bkXk


where

D    = discriminant score
b's  = discriminant coefficients or weights
X's  = predictors or independent variables

The coefficients, or weights (b), are estimated so that the groups differ as much as possible on the values of the discriminant function.
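One common way to obtain such weights in the two-group case is Fisher's rule, which sets the weights proportional to the inverse pooled within-group covariance matrix times the difference in group means. The sketch below is only an illustration under stated assumptions: it uses two predictors, a tiny made-up data set, and a hypothetical helper named fisher_two_group. Statistical packages typically rescale the coefficients, so their printed values will differ by a constant factor.

```python
def fisher_two_group(g1, g2):
    """Return (b0, b1, b2) for D = b0 + b1*X1 + b2*X2 (two predictors only).

    g1, g2: lists of (x1, x2) pairs, one list per group.
    Weights follow Fisher's rule: b = Sw^-1 (m1 - m2), Sw = pooled covariance.
    """
    def stats(g):
        xs = [p[0] for p in g]
        ys = [p[1] for p in g]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        syy = sum((b - my) ** 2 for b in ys)
        sxy = sum((a - mx) * (b - my) for a, b in g)
        return (mx, my), (sxx, sxy, syy)

    (m1, s1), (m2, s2) = stats(g1), stats(g2)
    dof = len(g1) + len(g2) - 2
    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx, sxy, syy = [(a + b) / dof for a, b in zip(s1, s2)]
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    # Multiply the 2x2 inverse by the mean difference
    b1 = (syy * dx - sxy * dy) / det
    b2 = (-sxy * dx + sxx * dy) / det
    # Constant chosen so that D = 0 halfway between the two group means
    b0 = -(b1 * (m1[0] + m2[0]) + b2 * (m1[1] + m2[1])) / 2
    return b0, b1, b2


# Hypothetical toy data, not taken from the slides
group_a = [(6, 5), (7, 6), (8, 5), (7, 7)]
group_b = [(3, 2), (4, 3), (2, 3), (3, 4)]
b0, b1, b2 = fisher_two_group(group_a, group_b)

def score(x1, x2):
    return b0 + b1 * x1 + b2 * x2

d_a = score(7, 5.75)   # score at the mean of group_a
d_b = score(3, 3)      # score at the mean of group_b
```

With this choice of constant, the two group means receive scores of equal size and opposite sign, which is the sense in which the groups "differ as much as possible" on D.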

18-8

Tibetan Skull Case

18-9

The data consist of five measurements on each of 32 skulls found in the southwestern and eastern districts of Tibet. The five measurements (all in millimeters) are as follows:

Greatest length of skull (Length)
Greatest horizontal breadth of skull (Breadth)
Height of skull (Height)
Upper face length (Flength)
Face breadth between outermost points of cheekbones (Fbreadth)

18-10

The skulls fall into two groups. The first comprises skulls 1 to 17, found in graves in Sikkim and the neighboring area of Tibet (Type A skulls). The remaining 15 skulls (Type B skulls) were picked up on a battlefield in the Lhasa district and are believed to be those of native soldiers from the eastern province of Khams. These skulls were of particular interest since it was thought at the time that Tibetans from Khams might be survivors of a particular human type, unrelated to the Mongolian and Indian types that surrounded them.

18-11

Questions that might be of interest for these data:

Do the five measurements discriminate between the two assumed groups of skulls and can they be used to produce a useful rule for classifying other skulls that might become available?

18-12

Statistics Associated with Discriminant Analysis

Canonical correlation. Canonical correlation measures the extent of association between the discriminant scores and the groups. It is a measure of association between the single discriminant function and the set of dummy variables that define the group membership.

Centroid. The centroid is the mean of the discriminant scores for a particular group. There are as many centroids as there are groups, one for each group. The means for a group on all the functions are the group centroids.

Classification matrix. Sometimes also called the confusion or prediction matrix, the classification matrix contains the number of correctly classified and misclassified cases.

18-13

Statistics Associated with Discriminant Analysis

Discriminant function coefficients. The discriminant function coefficients (unstandardized) are the multipliers of the variables when the variables are in the original units of measurement.

Discriminant scores. The unstandardized coefficients are multiplied by the values of the variables. These products are summed and added to the constant term to obtain the discriminant scores.

Eigenvalue. For each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Large eigenvalues imply superior functions.
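The eigenvalue and the canonical correlation are two views of the same quantity: for a function with eigenvalue e, the canonical correlation equals sqrt(e / (1 + e)), and a function's share of the variance is its eigenvalue divided by the sum of all eigenvalues. The short sketch below checks these standard identities against the eigenvalues reported in Tables 18.4 and 18.5 of this deck; the function names are illustrative only.

```python
import math

def canonical_correlation(eigenvalue):
    """Canonical correlation implied by a discriminant function's eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))

def percent_of_variance(eigenvalues):
    """Share of the between-group variance attributed to each function."""
    total = sum(eigenvalues)
    return [100.0 * e / total for e in eigenvalues]

# Eigenvalues as reported in the deck's result tables
two_group = canonical_correlation(1.7862)                      # table reports 0.8007
three_group = [canonical_correlation(e) for e in (3.8190, 0.2469)]  # 0.8902, 0.4450
shares = percent_of_variance([3.8190, 0.2469])                 # 93.93, 6.07
```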

18-14

Statistics Associated with Discriminant Analysis

F values and their significance. These are calculated from a one-way ANOVA, with the grouping variable serving as the categorical independent variable. Each predictor, in turn, serves as the metric dependent variable in the ANOVA.

Group means and group standard deviations. These are computed for each predictor for each group.

Pooled within-group correlation matrix. The pooled within-group correlation matrix is computed by averaging the separate covariance matrices for all the groups.

18-15

Statistics Associated with Discriminant Analysis

Standardized discriminant function coefficients. The standardized discriminant function coefficients are the discriminant function coefficients used as the multipliers when the variables have been standardized to a mean of 0 and a variance of 1.

Structure correlations. Also referred to as discriminant loadings, the structure correlations represent the simple correlations between the predictors and the discriminant function.

Total correlation matrix. If the cases are treated as if they were from a single sample and the correlations computed, a total correlation matrix is obtained.

Wilks' λ. Sometimes also called the U statistic, Wilks' λ for each predictor is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of λ (near 1) indicate that group means do not seem to be different. Small values of λ (near 0) indicate that the group means seem to be different.

18-16

Conducting Discriminant Analysis


Formulate the Problem

Estimate the Discriminant Function Coefficients

Determine the Significance of the Discriminant Function

Interpret the Results

Assess Validity of Discriminant Analysis

Conducting Discriminant Analysis Formulate the Problem

18-17

Identify the objectives, the criterion variable, and the independent variables. The criterion variable must consist of two or more mutually exclusive and collectively exhaustive categories. The predictor variables should be selected based on a theoretical model or previous research, or the experience of the researcher. One part of the sample, called the estimation or analysis sample, is used for estimation of the discriminant function. The other part, called the holdout or validation sample, is reserved for validating the discriminant function. Often the distribution of the number of cases in the analysis and validation samples follows the distribution in the total sample.

18-18

Information on Resort Visits: Analysis Sample


No.  Resort  Annual Family   Attitude        Importance Attached  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Toward Travel   to Family Vacation   Size       of Household   Family Vacation
 1   1       50.2            5               8                    3          43             M (2)
 2   1       70.3            6               7                    4          61             H (3)
 3   1       62.9            7               5                    6          52             H (3)
 4   1       48.5            7               5                    5          36             L (1)
 5   1       52.7            6               6                    4          55             H (3)
 6   1       75.0            8               7                    5          68             H (3)
 7   1       46.2            5               3                    3          62             M (2)
 8   1       57.0            2               4                    6          51             M (2)
 9   1       64.1            7               5                    4          57             H (3)
10   1       68.1            7               6                    5          45             H (3)
11   1       73.4            6               7                    5          44             H (3)
12   1       71.9            5               8                    4          64             H (3)
13   1       56.2            1               8                    6          54             M (2)
14   1       49.3            4               2                    3          56             H (3)
15   1       62.0            5               6                    2          58             H (3)

18-19

Information on Resort Visits: Analysis Sample


No.  Resort  Annual Family   Attitude        Importance Attached  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Toward Travel   to Family Vacation   Size       of Household   Family Vacation
16   2       32.1            5               4                    3          58             L (1)
17   2       36.2            4               3                    2          55             L (1)
18   2       43.2            2               5                    2          57             M (2)
19   2       50.4            5               2                    4          37             M (2)
20   2       44.1            6               6                    3          42             M (2)
21   2       38.3            6               6                    2          45             L (1)
22   2       55.0            1               2                    2          57             M (2)
23   2       46.1            3               5                    3          51             L (1)
24   2       35.0            6               4                    5          64             L (1)
25   2       37.3            2               7                    4          54             L (1)
26   2       41.8            5               1                    3          56             M (2)
27   2       57.0            8               3                    2          36             M (2)
28   2       33.4            6               8                    2          50             L (1)
29   2       37.5            3               2                    3          48             L (1)
30   2       41.3            3               3                    2          42             L (1)

18-20

Information on Resort Visits: Holdout Sample

Table 18.3
No.  Resort  Annual Family   Attitude        Importance Attached  Household  Age of Head    Amount Spent on
     Visit   Income ($000)   Toward Travel   to Family Vacation   Size       of Household   Family Vacation
 1   1       50.8            4               7                    3          45             M (2)
 2   1       63.6            7               4                    7          55             H (3)
 3   1       54.0            6               7                    4          58             M (2)
 4   1       45.0            5               4                    3          60             M (2)
 5   1       68.0            6               6                    6          46             H (3)
 6   1       62.1            5               6                    3          56             H (3)
 7   2       35.0            4               3                    4          54             L (1)
 8   2       49.6            5               3                    5          39             L (1)
 9   2       39.4            6               5                    3          44             H (3)
10   2       37.0            2               6                    5          51             L (1)
11   2       54.5            7               3                    3          37             M (2)
12   2       38.2            2               2                    3          49             L (1)

Conducting Discriminant Analysis Estimate the Discriminant Function Coefficients

18-21

The direct method involves estimating the discriminant function so that all the predictors are included simultaneously. In stepwise discriminant analysis, the predictor variables are entered sequentially, based on their ability to discriminate among groups.

18-22

Results of Two-Group Discriminant Analysis


Table 18.4
Group Means
VISIT    INCOME     TRAVEL    VACATION   HSIZE     AGE
1        60.52000   5.40000   5.80000    4.33333   53.73333
2        41.91333   4.33333   4.06667    2.80000   50.13333
Total    51.21667   4.86667   4.93333    3.56667   51.93333

Group Standard Deviations
VISIT    INCOME     TRAVEL    VACATION   HSIZE     AGE
1         9.83065   1.91982   1.82052    1.23443   8.77062
2         7.55115   1.95180   2.05171    0.94112   8.27101
Total    12.79523   1.97804   2.09981    1.33089   8.57395

Pooled Within-Groups Correlation Matrix
           INCOME     TRAVEL     VACATION   HSIZE      AGE
INCOME      1.00000
TRAVEL      0.19745    1.00000
VACATION    0.09148    0.08434    1.00000
HSIZE       0.08887   -0.01681    0.07046    1.00000
AGE        -0.01431   -0.19709    0.01742   -0.04301   1.00000

Wilks' λ (U-statistic) and univariate F ratio with 1 and 28 degrees of freedom
Variable   Wilks' λ   F        Significance
INCOME     0.45310    33.800   0.0000
TRAVEL     0.92479     2.277   0.1425
VACATION   0.82377     5.990   0.0209
HSIZE      0.65672    14.640   0.0007
AGE        0.95441     1.338   0.2572
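The univariate statistics for a single predictor can be recomputed directly from the analysis sample. The sketch below does this for INCOME, using the 30 income values listed in the analysis-sample table: Wilks' λ is the within-group sum of squares divided by the total sum of squares, and the one-way ANOVA F ratio follows as ((1 - λ) / λ) × (df within / df between). The function name is illustrative, not part of any package.

```python
# Annual family income ($000) from the analysis sample, by resort-visit group
income_visit_1 = [50.2, 70.3, 62.9, 48.5, 52.7, 75.0, 46.2, 57.0,
                  64.1, 68.1, 73.4, 71.9, 56.2, 49.3, 62.0]
income_visit_2 = [32.1, 36.2, 43.2, 50.4, 44.1, 38.3, 55.0, 46.1,
                  35.0, 37.3, 41.8, 57.0, 33.4, 37.5, 41.3]

def univariate_wilks_and_f(groups):
    """Return (Wilks' lambda, F ratio) for one predictor split into groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)
    wilks = ss_within / ss_total
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = ((1.0 - wilks) / wilks) * df_within / df_between
    return wilks, f_ratio

wilks_income, f_income = univariate_wilks_and_f([income_visit_1, income_visit_2])
# Table 18.4 reports Wilks' lambda = 0.45310 and F = 33.800 for INCOME
```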

Contd.

18-23

Results of Two-Group Discriminant Analysis


Table 18.4 cont.
CANONICAL DISCRIMINANT FUNCTIONS
Function  Eigenvalue  % of Variance  Cum %   Canonical    After     Wilks' λ  Chi-square  df  Significance
                                             Correlation  Function
                                                          0         0.3589    26.130      5   0.0001
1*        1.7862      100.00         100.00  0.8007

* marks the 1 canonical discriminant function remaining in the analysis.

Standardized Canonical Discriminant Function Coefficients
           FUNC 1
INCOME     0.74301
TRAVEL     0.09611
VACATION   0.23329
HSIZE      0.46911
AGE        0.20922

Structure Matrix: Pooled within-groups correlations between discriminating variables and canonical discriminant functions (variables ordered by size of correlation within function)
           FUNC 1
INCOME     0.82202
HSIZE      0.54096
VACATION   0.34607
TRAVEL     0.21337
AGE        0.16354

Contd.

18-24

Results of Two-Group Discriminant Analysis


Table 18.4 cont.
Unstandardized Canonical Discriminant Function Coefficients
             FUNC 1
INCOME        0.8476710E-01
TRAVEL        0.4964455E-01
VACATION      0.1202813
HSIZE         0.4273893
AGE           0.2454380E-01
(constant)   -7.975476

Canonical discriminant functions evaluated at group means (group centroids)
Group   FUNC 1
1        1.29118
2       -1.29118

Classification results for cases selected for use in the analysis
                              Predicted Group Membership
Actual Group   No. of Cases   1            2
Group 1        15             12 (80.0%)    3 (20.0%)
Group 2        15              0 (0.0%)    15 (100.0%)

Percent of grouped cases correctly classified: 90.00%

Contd.
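The unstandardized coefficients above can be applied by hand to reproduce the group centroids and to score a new case. The following sketch plugs the group means from Table 18.4 into the discriminant function. The cutoff of 0 (the midpoint between the two centroids) is an assumption that suits equal group sizes, and the function names are illustrative.

```python
# Unstandardized coefficients and constant as reported in Table 18.4
COEFFICIENTS = {
    "INCOME": 0.08476710,
    "TRAVEL": 0.04964455,
    "VACATION": 0.1202813,
    "HSIZE": 0.4273893,
    "AGE": 0.02454380,
}
CONSTANT = -7.975476

def discriminant_score(case):
    """D = b0 + sum of b_i * X_i for a case given as a dict of predictor values."""
    return CONSTANT + sum(COEFFICIENTS[name] * case[name] for name in COEFFICIENTS)

def classify(case, cutoff=0.0):
    """Assign group 1 (visited a resort) above the cutoff, group 2 otherwise."""
    return 1 if discriminant_score(case) > cutoff else 2

# Group means from Table 18.4; scoring them should give the group centroids
mean_group_1 = {"INCOME": 60.52000, "TRAVEL": 5.40000, "VACATION": 5.80000,
                "HSIZE": 4.33333, "AGE": 53.73333}
mean_group_2 = {"INCOME": 41.91333, "TRAVEL": 4.33333, "VACATION": 4.06667,
                "HSIZE": 2.80000, "AGE": 50.13333}

centroid_1 = discriminant_score(mean_group_1)   # table reports  1.29118
centroid_2 = discriminant_score(mean_group_2)   # table reports -1.29118
```

Because the centroids are symmetric about zero, a case with a positive score lies closer to the centroid of group 1 and a case with a negative score lies closer to that of group 2.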

18-25

Results of Two-Group Discriminant Analysis


Table 18.4 cont.
Classification results for cases not selected for use in the analysis (holdout sample)
                              Predicted Group Membership
Actual Group   No. of Cases   1            2
Group 1        6              4 (66.7%)    2 (33.3%)
Group 2        6              0 (0.0%)     6 (100.0%)

Percent of grouped cases correctly classified: 83.33%.

Conducting Discriminant Analysis

18-26

Determine the Significance of Discriminant Function


The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested. In SPSS this test is based on Wilks' λ. If several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks' λ statistic is the product of the univariate λ for each function. The significance level is estimated based on a chi-square transformation of the statistic. If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results.
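The product rule and the chi-square transformation can be checked numerically. The sketch below assumes that each function contributes a factor 1 / (1 + eigenvalue) to Wilks' λ and that the transformation is Bartlett's approximation, chi-square = -(n - 1 - (p + g) / 2) ln λ, with n cases, p predictors, and g groups. The slides do not name the transformation, so treat that formula as an assumption; it does reproduce the chi-square values reported in Tables 18.4 and 18.5.

```python
import math

def wilks_lambda(eigenvalues):
    """Overall Wilks' lambda as the product of 1 / (1 + eigenvalue) per function."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result

def bartlett_chi_square(wilks, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda (assumed form)."""
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2.0
    return -multiplier * math.log(wilks)

# Two-group analysis: one function, 30 cases, 5 predictors, 2 groups
lam_two = wilks_lambda([1.7862])                        # table reports 0.3589
chi_two = bartlett_chi_square(lam_two, 30, 5, 2)        # table reports 26.130

# Three-group analysis: two functions, 30 cases, 5 predictors, 3 groups
lam_three = wilks_lambda([3.8190, 0.2469])              # table reports 0.1664
chi_three = bartlett_chi_square(lam_three, 30, 5, 3)    # table reports 44.831

# After removing the first function, only the second remains
lam_after_1 = wilks_lambda([0.2469])                    # table reports 0.8020
chi_after_1 = bartlett_chi_square(lam_after_1, 30, 5, 3)  # table reports 5.517
```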

Conducting Discriminant Analysis Interpret the Results

18-27

The interpretation of the discriminant weights, or coefficients, is similar to that in multiple regression analysis. Given the multicollinearity in the predictor variables, there is no unambiguous measure of the relative importance of the predictors in discriminating between the groups. With this caveat in mind, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardized discriminant function coefficients. Some idea of the relative importance of the predictors can also be obtained by examining the structure correlations, also called canonical loadings or discriminant loadings. These simple correlations between each predictor and the discriminant function represent the variance that the predictor shares with the function. Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the predictor variables.

Conducting Discriminant Analysis Assess Validity of Discriminant Analysis

18-28

Many computer programs, such as SPSS, offer a leave-one-out cross-validation option. When a holdout sample is used instead, the discriminant weights, estimated by using the analysis sample, are multiplied by the values of the predictor variables in the holdout sample to generate discriminant scores for the cases in the holdout sample. The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule. The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases. It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance. Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance.
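As a rough illustration, the hit ratio and the chance comparison can be computed from a classification matrix as follows. Two points are assumptions rather than statements from the slides: chance accuracy is taken as 1 / (number of groups), which suits equal-sized groups, and "25% greater than chance" is read as 1.25 times the chance accuracy. The matrices are the two-group classification results from Table 18.4.

```python
def hit_ratio(matrix):
    """Share of cases on the diagonal of a square classification matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def beats_chance(matrix, margin=0.25):
    """True if the hit ratio is at least (1 + margin) times chance accuracy.

    Chance accuracy is assumed to be 1 / number of groups (equal group sizes).
    """
    chance = 1.0 / len(matrix)
    return hit_ratio(matrix) >= (1.0 + margin) * chance

# Rows are actual groups, columns are predicted groups (Table 18.4)
analysis_matrix = [[12, 3], [0, 15]]
holdout_matrix = [[4, 2], [0, 6]]

analysis_hits = hit_ratio(analysis_matrix)   # 27 of 30 cases
holdout_hits = hit_ratio(holdout_matrix)     # 10 of 12 cases
```

Under these assumptions the threshold for two equal groups is 62.5%, so both the analysis sample and the holdout sample clear it.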

18-29

Results of Three-Group Discriminant Analysis


Table 18.5
Group Means
AMOUNT   INCOME     TRAVEL    VACATION   HSIZE     AGE
1        38.57000   4.50000   4.70000    3.10000   50.30000
2        50.11000   4.00000   4.20000    3.40000   49.50000
3        64.97000   6.10000   5.90000    4.20000   56.00000
Total    51.21667   4.86667   4.93333    3.56667   51.93333

Group Standard Deviations
AMOUNT   INCOME     TRAVEL    VACATION   HSIZE     AGE
1         5.29718   1.71594   1.88856    1.19722   8.09732
2         6.00231   2.35702   2.48551    1.50555   9.25263
3         8.61434   1.19722   1.66333    1.13529   7.60117
Total    12.79523   1.97804   2.09981    1.33089   8.57395

Pooled Within-Groups Correlation Matrix
           INCOME    TRAVEL    VACATION   HSIZE   AGE
INCOME     1.00000
TRAVEL     0.05120   1.00000
VACATION   0.30681   0.03588   1.00000
HSIZE      (not shown)
AGE        (not shown)

Contd.

18-30

Results of Three-Group Discriminant Analysis


Table 18.5 cont.
Wilks' λ (U-statistic) and univariate F ratio with 2 and 27 degrees of freedom
Variable   Wilks' λ   F        Significance
INCOME     0.26215    38.00    0.0000
TRAVEL     0.78790    3.634    0.0400
VACATION   0.88060    1.830    0.1797
HSIZE      0.87411    1.944    0.1626
AGE        0.88214    1.804    0.1840

CANONICAL DISCRIMINANT FUNCTIONS
Function  Eigenvalue  % of Variance  Cum %   Canonical    After     Wilks' λ  Chi-square  df  Significance
                                             Correlation  Function
                                                          0         0.1664    44.831      10  0.00
1*        3.8190      93.93          93.93   0.8902       1         0.8020     5.517       4  0.24
2*        0.2469       6.07          100.00  0.4450

* marks the two canonical discriminant functions remaining in the analysis.

Standardized Canonical Discriminant Function Coefficients
           FUNC 1     FUNC 2
INCOME      1.04740   -0.42076
TRAVEL      0.33991    0.76851
VACATION   -0.14198    0.53354
HSIZE      -0.16317    0.12932
AGE         0.49474    0.52447

Contd.

18-31

Results of Three-Group Discriminant Analysis


Table 18.5 cont.
Structure Matrix: Pooled within-groups correlations between discriminating variables and canonical discriminant functions (variables ordered by size of correlation within function)
           FUNC 1      FUNC 2
INCOME     0.85556*   -0.27833
HSIZE      0.19319*    0.07749
VACATION   0.21935     0.58829*
TRAVEL     0.14899     0.45362*
AGE        0.16576     0.34079*

Unstandardized canonical discriminant function coefficients
             FUNC 1            FUNC 2
INCOME        0.1542658        -0.6197148E-01
TRAVEL        0.1867977         0.4223430
VACATION     -0.6952264E-01     0.2612652
HSIZE        -0.1265334         0.1002796
AGE           0.5928055E-01     0.6284206E-01
(constant)  -11.09442          -3.791600

Canonical discriminant functions evaluated at group means (group centroids)
Group   FUNC 1     FUNC 2
1       -2.04100    0.41847
2       -0.40479   -0.65867
3        2.44578    0.24020

Contd.

18-32

Results of Three-Group Discriminant Analysis


Table 18.5 cont.
Classification Results:
                              Predicted Group Membership
Actual Group   No. of Cases   1           2           3
Group 1        10             9 (90.0%)   1 (10.0%)   0 (0.0%)
Group 2        10             1 (10.0%)   9 (90.0%)   0 (0.0%)
Group 3        10             0 (0.0%)    2 (20.0%)   8 (80.0%)

Percent of grouped cases correctly classified: 86.67%

Classification results for cases not selected for use in the analysis
                              Predicted Group Membership
Actual Group   No. of Cases   1           2           3
Group 1        4              3 (75.0%)   1 (25.0%)   0 (0.0%)
Group 2        4              0 (0.0%)    3 (75.0%)   1 (25.0%)
Group 3        4              1 (25.0%)   0 (0.0%)    3 (75.0%)

Percent of grouped cases correctly classified: 75.00%

18-33

All-Groups Scattergram
Fig. 18.2
[Figure: scatterplot of the cases, labeled by group (1, 2, 3), on the two discriminant functions. Across: Function 1. Down: Function 2. * indicates a group centroid.]

18-34

Territorial Map
Fig. 18.3
[Figure: territorial map showing the regions assigned to groups 1, 2, and 3. Across: Function 1. Down: Function 2. * indicates a group centroid.]

18-35

Stepwise Discriminant Analysis

Stepwise discriminant analysis is analogous to stepwise multiple regression (see Chapter 17) in that the predictors are entered sequentially based on their ability to discriminate between the groups. An F ratio is calculated for each predictor by conducting a univariate analysis of variance in which the groups are treated as the categorical variable and the predictor as the criterion variable. The predictor with the highest F ratio is the first to be selected for inclusion in the discriminant function, if it meets certain significance and tolerance criteria. A second predictor is added based on the highest adjusted or partial F ratio, taking into account the predictor already selected.

18-36

Stepwise Discriminant Analysis


Each predictor selected is tested for retention based on its association with other predictors selected. The process of selection and retention is continued until all predictors meeting the significance criteria for inclusion and retention have been entered in the discriminant function. The selection of the stepwise procedure is based on the optimizing criterion adopted. The Mahalanobis procedure is based on maximizing a generalized measure of the distance between the two closest groups. The order in which the variables were selected also indicates their importance in discriminating between the groups.

18-37

SPSS Windows
The DISCRIMINANT program performs both two-group and multiple discriminant analysis. To select this procedure using SPSS for Windows, click: Analyze>Classify>Discriminant
