Discriminant Analysis
Discriminant Analysis With 2 Groups
Comparison of 2-Group Discriminant Analysis With Logistic Regression
Discriminant Analysis With More Than 2 Groups
Notes on Discriminant Analysis: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University
Key Concepts
Discriminant function
A priori categories or groups
Homogeneity of variance/covariance matrices
Differences between discriminant analysis and logistic regression
Partitioning of sums of squares in discriminant analysis: TSS = BSS + WSS
Discriminant score
Discriminant weight or coefficient
Discriminant constant
Discriminant analysis assumptions
Steps in the discriminant analysis process
Box's M test and its null hypothesis
Wilks' lambda
Stepwise method in discriminant analysis
Pin and Pout criteria
F-test to determine the effect of adding or deleting a variable from the model
Unstandardized and standardized discriminant weights
Measures of goodness-of-fit
Eigenvalue
Canonical correlation
Model Wilks' lambda
Classification table and hit ratio
t-test for a hit ratio
Maximum chance criteria
Proportional chance criteria
Press's Q statistic
Histogram of discriminant scores
Casewise plot of the predictions
Calculation of the cutting score: equal and unequal groups
Prior probability
Conditional probability
Bayes' theorem and posterior probability
Structure coefficient or discriminant loading
Group centroid
Testing the collinearity of the predictor variables
Assumptions about multiple discriminant functions
  Number: g - 1 or k, whichever is less
  Functions may be collinear
  Discriminant scores must be independent
Interpretation of multiple discriminant functions
Territorial map
Scatterplot of the discriminant scores across the discriminant functions
Lecture Outline
What is discriminant analysis
The concept of partitioning sums of squares
Discriminant assumptions
Stepwise discriminant analysis with Wilks' lambda
Testing the goodness-of-fit of the model
Determining the significance of the predictor variables
A 2-group discriminant problem
A multi-group discriminant problem
Discriminant Analysis
$Z = a + W_1X_1 + W_2X_2 + \dots + W_kX_k$
Dependency technique
  Dependent variable is nonmetric
  Independent variables can be metric and/or nonmetric
Used to predict or explain a nonmetric dependent variable with two or more a priori categories
Assumptions
  $X_k$ are multivariate normally distributed
  Homogeneity of variance-covariance matrices of $X_k$ across groups
  $X_k$ are independent, non-collinear
  The relationship is linear
  Absence of outliers
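In Discriminant Analysis
The Total SS, $\sum_i (Z_i - \bar{Z})^2$, is partitioned into:
  Between Groups SS: $\sum_j \sum_i (\bar{Z}_j - \bar{Z})^2$
  Within Groups SS: $\sum_j \sum_i (Z_{ij} - \bar{Z}_j)^2$
$\sum_i (Z_i - \bar{Z})^2 = \sum_j \sum_i (\bar{Z}_j - \bar{Z})^2 + \sum_j \sum_i (Z_{ij} - \bar{Z}_j)^2$
(each sum runs over all cases; i = an individual case, j = group j)
$Z_i$ = individual discriminant score
$\bar{Z}$ = grand mean of the discriminant scores
$\bar{Z}_j$ = mean discriminant score for group j
Goal
  Estimate parameters that minimize the Within Groups SS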
$Z$ = discriminant score, a number used to predict group membership of a case
$a$ = discriminant constant
$W_k$ = discriminant weight or coefficient, a measure of the extent to which variable $X_k$ discriminates among the groups of the DV
$X_k$ = an IV or predictor variable. Can be metric or nonmetric.
Discriminant analysis uses OLS to estimate the values of the parameters $a$ and $W_k$ that minimize the Within Groups SS
Predicting whether a felony offender will receive a probated or prison sentence as a function of various background factors.
Dependent Variable
  Type of sentence (type_sent) (0 = probation, 1 = prison)
Independent Variables
  Degree of drug dependency (dr_score)
  Age at first arrest (age_firs)
  Level of work skill (skl_indx)
  The seriousness of the crime (ser_indx)
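[Figure: discriminant scores Z for the Probation (0) and Prison (1) groups]
Between SS = $\sum (\bar{Z}_0 - \bar{Z})^2 + \sum (\bar{Z}_1 - \bar{Z})^2 = \sum_j \sum_i (\bar{Z}_j - \bar{Z})^2$ = BSS
Within SS = $\sum (Z_{i0} - \bar{Z}_0)^2 + \sum (Z_{i1} - \bar{Z}_1)^2 = \sum_j \sum_i (Z_{ij} - \bar{Z}_j)^2$ = WSS
Total SS = $\sum_i (Z_i - \bar{Z})^2$ = TSS
$\bar{Z}_0$ and $\bar{Z}_1$ are called centroids, the mean discriminant score for each group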
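The partition can be checked numerically. The sketch below uses made-up discriminant scores (the values and group sizes are hypothetical, chosen only for illustration) and confirms that TSS = BSS + WSS when every case contributes its own group's centroid deviation to BSS.

```python
# Hypothetical discriminant scores for two groups (illustrative values only).
scores = {
    0: [-1.2, -0.4, -0.9, 0.1],        # probation
    1: [0.8, 1.5, 0.3, 0.6, 1.1],      # prison
}

def mean(values):
    return sum(values) / len(values)

all_scores = [z for group in scores.values() for z in group]
grand_mean = mean(all_scores)
centroids = {g: mean(zs) for g, zs in scores.items()}

# Total SS: every case against the grand mean.
tss = sum((z - grand_mean) ** 2 for z in all_scores)
# Between SS: each case contributes its group's squared centroid deviation.
bss = sum(len(zs) * (centroids[g] - grand_mean) ** 2 for g, zs in scores.items())
# Within SS: every case against its own group centroid.
wss = sum((z - centroids[g]) ** 2 for g, zs in scores.items() for z in zs)

print(f"TSS = {tss:.4f}, BSS + WSS = {bss + wss:.4f}")
```

The two printed values agree up to floating-point rounding, whatever scores are supplied.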
The predictor variables are multivariate normal, ipso facto univariate normal
The variance-covariance matrices of the predictor variables across the various groups are the same in the population, i.e. homogeneous
The groups defined by the DV exist a priori
The predictor variables are noncollinear
The relationship is linear in its parameters
Absence of leverage point outliers
The sample is large enough, say 30 cases for each predictor variable
Specify the dependent and the predictor variables
Test the model's assumptions a priori
Determine the method for selection and criteria for entering the predictor variables into the model
Estimate the parameters of the model
Determine the goodness-of-fit of the model and examine the residuals
Determine the significance of the predictors
Test the assumptions ex post facto
Validate the results
The Sentence Type Discriminant Model
[Figures: plots of the predictors DR_SCORE, SKL_INDX, AGE_FIRS, and SER_INDX]
Covariance Matrices

TYPE_SEN = .00
            DR_SCORE   AGE_FIRS   SKL_INDX   SER_INDX
DR_SCORE       7.590     -1.945      1.932      2.110
AGE_FIRS      -1.945      5.553      -.329     -1.225
SKL_INDX       1.932      -.329      8.632      -.857
SER_INDX       2.110     -1.225      -.857      3.441

TYPE_SEN = 1.00
            DR_SCORE   AGE_FIRS   SKL_INDX   SER_INDX
DR_SCORE       6.466     -2.688      -.648      1.474
AGE_FIRS      -2.688      4.500       .469     -2.219
SKL_INDX       -.648       .469      7.922      -.507
SER_INDX       1.474     -2.219      -.507      3.405

The variances are on the diagonals, and the covariances are on the off-diagonals.
Box's M test
H0: the variance/covariance matrices of the two groups are the same in the population.

Log Determinants
Rank   Log Determinant
2      3.076
2      2.988
2      3.040
The ranks and natural logarithms of determinants printed are those of the group covariance matrices.

Test Results
Box's M        .361
F   Approx.    .116
    df1        3
    df2        1476249
    Sig.       .951

Box's M = 0.361, approximate F = 0.116, p = 0.951
Conclusion: The null hypothesis of homogeneity of the variance/covariance matrices in the population is not rejected.
How Will the Predictor Variables Be Entered into the Discriminant Model?
SPSS offers two methods for building a discriminant model
  Entering all the variables simultaneously
  Stepwise method
In this example, the variables will be entered in a stepwise fashion using the Wilks' lambda criterion
Step 2: Identify the predictor variable that has the lowest significant Wilks' lambda ($\Lambda$) and enter it into the discriminant model, i.e. ser_indx. (Pin default = 0.05)
Step 4: Of the variables not in the model, select the predictor that has the lowest significant $\Lambda$ and enter it into the model. Determine if the addition of the variable was significant. Now check if the predictor(s) previously entered are still significant. (Pout default = 0.10)
Step 5: Repeat Step 4 until all the predictor variables are entered into the model or until none of the variables outside the model have significant $\Lambda$'s.
How Is the Significance of Change Determined When a Variable is Entered Into the Discriminant Function?
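Use an F-ratio comparing the Wilks' lambda of the model with the greater number of predictors ($\Lambda_k$) with that of the model with the lesser number of predictors ($\Lambda_{k-1}$)
$F = \frac{N - g - 1}{g - 1} \cdot \frac{1 - \Lambda_k / \Lambda_{k-1}}{\Lambda_k / \Lambda_{k-1}}$
$\Lambda$ = WSS / TSS of the function
N = total sample size
g = number of DV groups
df = (g - 1) for the numerator and (N - g - 1) for the denominator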
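A minimal sketch of the F-to-enter calculation, assuming the usual partial-F form F = [(N - g - 1) / (g - 1)] x [(1 - ratio) / ratio], where ratio is the larger model's Wilks' lambda divided by the smaller model's. The inputs are assumptions drawn from elsewhere in these notes: 0.832 is taken to be the lambda of the model containing ser_indx alone, and the two-predictor lambda is derived from the reported eigenvalue of 0.305 as 1 / (1 + 0.305). The function name `f_change` is hypothetical.

```python
def f_change(lambda_smaller, lambda_larger, n_total, n_groups):
    """Partial F for adding a second predictor (assumed form, see lead-in)."""
    ratio = lambda_larger / lambda_smaller
    return ((n_total - n_groups - 1) / (n_groups - 1)) * (1 - ratio) / ratio

# Assumption: Wilks' lambda with ser_indx only.
lambda_one_predictor = 0.832
# Assumption: Wilks' lambda of the two-predictor model, from eigenvalue 0.305.
lambda_two_predictors = 1 / (1 + 0.305)

f_value = f_change(lambda_one_predictor, lambda_two_predictors, n_total=70, n_groups=2)
print(f"F to enter dr_score = {f_value:.2f} with df = (1, 67)")
```

Under these assumptions the F value is a little under 6, which is consistent with the significance level of .019 reported for the second predictor.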
What values of the constant ($a$) and the discriminant coefficients $W_k$ best predict whether a case will receive a probated or a prison sentence?
After variable selection by a stepwise process using Wilks' $\Lambda$, the best equation was found to be
Z = -0.706 - 0.235 (dr_score) + 0.564 (ser_indx)

Canonical Discriminant Function Coefficients (unstandardized)
              Function 1
DR_SCORE         -.235
SER_INDX          .564
(Constant)       -.706
Prediction of a case
Take a case with dr_score = 9, ser_indx = 1, and an actual sentence = 0 (i.e. a probated case)
Z = -0.706 - 0.235 (9) + 0.564 (1) = -2.257
Since -2.257 is closer to the code 0 than to the code 1, the case would be predicted to be a probated case, i.e. code 0.
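The same calculation can be written as a short function. This is only a sketch: the coefficients are the unstandardized values reported above, while the names `CONSTANT`, `WEIGHTS`, and `discriminant_score` are hypothetical.

```python
# Unstandardized coefficients reported for the sentence-type model.
CONSTANT = -0.706
WEIGHTS = {"dr_score": -0.235, "ser_indx": 0.564}

def discriminant_score(case):
    """Z = a + sum of W_k * X_k over the predictors in the model."""
    return CONSTANT + sum(WEIGHTS[name] * case[name] for name in WEIGHTS)

z = discriminant_score({"dr_score": 9, "ser_indx": 1})
print(round(z, 3))  # -2.257
```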
What is an Eigenvalue?
In matrix algebra, an eigenvalue is a constant which, if subtracted from the diagonal elements of a matrix, results in a new matrix whose determinant equals zero.
An example
Given the matrix:
$A = \begin{bmatrix} 4 & 1 \\ 2 & 5 \end{bmatrix}$
Subtract x from the diagonal elements and set the determinant equal to zero:
$\begin{vmatrix} 4 - x & 1 \\ 2 & 5 - x \end{vmatrix} = 0$
Calculating the determinant:
$(4 - x)(5 - x) - (2)(1) = 0$
$20 - 4x - 5x + x^2 - 2 = 0$
$x^2 - 9x + 18 = 0$
This quadratic equation has two solutions, or eigenvalues: +6 and +3
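The worked example can be reproduced by solving the characteristic quadratic directly. The helper `eigenvalues_2x2` is a hypothetical name, and the sketch handles only the real-root case that applies here.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from x^2 - (a + d) x + (ad - bc) = 0."""
    trace = a + d
    det = a * d - b * c
    disc = trace ** 2 - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex for this matrix")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

print(eigenvalues_2x2(4, 1, 2, 5))  # (6.0, 3.0)
```

Swapping the two off-diagonal elements leaves the result unchanged, because only their product enters the determinant.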
$\lambda = BSS / WSS = \sum_j \sum_i (\bar{Z}_j - \bar{Z})^2 \,/\, \sum_j \sum_i (Z_{ij} - \bar{Z}_j)^2$
Interpretation
  If $\lambda$ = 0.00, the model has no discriminatory power, BSS = 0.0
  The larger the value of $\lambda$, the greater the discriminatory power of the model
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %
1          .305         100.0           100.0

The eigenvalue of the discriminant function = 0.305
The % of the explained variance accounted for by this discriminant function = 100%*
The cumulative % of the variance explained by the 1st discriminant function = 100%*
* With two DV groups, only one discriminant function can be extracted, which therefore accounts for all the variance explained by the model. With three groups, two functions can be extracted; with g groups, (g - 1) functions can be extracted, or k functions if k is less than g. In that case, each successive function accounts for a different % of the total variance explained.
$\rho$ = the canonical correlation, i.e. the correlation of the predictor(s) with the discriminant scores produced by the model
$\rho^2$ = coefficient of determination
$1 - \rho^2$ = coefficient of non-determination
For the sentence-type example
$\rho = \sqrt{0.3050 / (1 + 0.3050)} = 0.483$
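A short check of this arithmetic, assuming the single-function relationship in which the squared canonical correlation is eigenvalue / (1 + eigenvalue) and Wilks' lambda equals the coefficient of non-determination, 1 / (1 + eigenvalue). Variable names are illustrative.

```python
import math

eigenvalue = 0.305  # reported for the sentence-type discriminant function

rho_squared = eigenvalue / (1 + eigenvalue)   # coefficient of determination
rho = math.sqrt(rho_squared)                  # canonical correlation
wilks_lambda = 1 - rho_squared                # equals 1 / (1 + eigenvalue) here

print(f"rho = {rho:.3f}, rho^2 = {rho_squared:.3f}, Wilks' lambda = {wilks_lambda:.3f}")
```

Note that 0.305 / 1.305 is about 0.234 before the square root is taken; the reported 0.483 is its square root.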
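Classification Results (original cases)
                        Predicted 0   Predicted 1   Total
Count   0 (probation)        27            10          37
        1 (prison)           14            19          33
%       0 (probation)      73.0          27.0       100.0
        1 (prison)         42.4          57.6       100.0

Overall results
  Overall hit ratio = 65.7%
  Correctly classified probationers = 73.0%
  Correctly classified prisoners = 57.6%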
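As a rough illustration of how the hit ratios are obtained, the sketch below assumes the cell counts 27/10 and 14/19. These counts are inferred from the reported percentages and the group sizes of 37 and 33; they are an assumption, not values read from the classification table itself.

```python
# Assumed cell counts: rows are actual groups, inner keys are predicted groups.
counts = {
    0: {0: 27, 1: 10},   # actual probation
    1: {0: 14, 1: 19},   # actual prison
}

total_cases = sum(sum(row.values()) for row in counts.values())
hits = sum(counts[g][g] for g in counts)  # diagonal = correct classifications

overall_hit_ratio = 100 * hits / total_cases
group_hit_ratio = {g: 100 * counts[g][g] / sum(counts[g].values()) for g in counts}

print(f"overall: {overall_hit_ratio:.1f}%")
for g, pct in group_hit_ratio.items():
    print(f"group {g}: {pct:.1f}%")
```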
Does the Model Predict Any Better Than Chance? (cont.)
Proportional chance criterion (Cpro)
Randomly classify the cases proportionate to the number of cases in either group.
$C_{pro} = p^2 + (1 - p)^2$
p = proportion of subjects in one group
(1 - p) = proportion of cases in the other group
Proportion of probationers = (37 / 70) = 0.5286
Proportion of prisoners = (33 / 70) = 0.4714
$C_{pro} = 0.5286^2 + 0.4714^2 = 0.5016$, or a hit ratio of 50.16%
Comparison of hit ratios
  The model   65.71%
  MCC         52.86%
  Cpro        50.16%
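Both chance criteria follow directly from the group sizes. The sketch below computes the maximum chance criterion as the proportion of the largest group and the proportional chance criterion as p squared plus (1 - p) squared, with p = 37/70; the count of 46 correctly classified cases is the one used for Press's Q in these notes. Variable names are illustrative.

```python
group_sizes = {"probation": 37, "prison": 33}
n_total = sum(group_sizes.values())
proportions = {g: n / n_total for g, n in group_sizes.items()}

# Maximum chance criterion: classify every case into the largest group.
max_chance = max(proportions.values())
# Proportional chance criterion: p^2 + (1 - p)^2 for two groups.
prop_chance = sum(p ** 2 for p in proportions.values())
# Model hit ratio, using 46 correctly classified cases out of 70.
model_hit_ratio = 46 / n_total

print(f"model = {model_hit_ratio:.2%}, MCC = {max_chance:.2%}, Cpro = {prop_chance:.2%}")
```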
df = (N - 2)
Press's Q statistic
$Q = [N - (n)(g)]^2 / [N (g - 1)]$
N = total number of subjects
n = number of cases correctly classified
g = number of groups
Q is chi-square distributed for df = 1
For the sentence-type example
$Q = [70 - (46)(2)]^2 / [70 (2 - 1)] = 484 / 70 = 6.914$, p < 0.01
Decision
  The null hypothesis that the model hit ratio is no better than chance is rejected
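A minimal sketch of the statistic, using the formula as stated with N (g - 1) in the denominator. The function name `press_q` is hypothetical; 6.635 is the standard chi-square critical value for df = 1 at the 0.01 level.

```python
def press_q(n_total, n_correct, n_groups):
    """Press's Q = [N - n*g]^2 / [N * (g - 1)]."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))

CRITICAL_CHI2_DF1_P01 = 6.635  # chi-square critical value, df = 1, alpha = 0.01

q = press_q(n_total=70, n_correct=46, n_groups=2)
print(f"Q = {q:.3f}, exceeds 0.01 critical value: {q > CRITICAL_CHI2_DF1_P01}")
```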
How Can a Cutting Score Be Established to Sort the Cases Into Either Group Based on Their Discriminant Scores?
When $n_0 = n_1$
$Z_{cutting} = (\bar{Z}_0 + \bar{Z}_1) / 2$
($\bar{Z}_j$ = mean discriminant score for group j)
For the sentence-type study ($n_0$ = 37, $n_1$ = 33)
$\bar{Z}_0 = -0.5141$ and $\bar{Z}_1 = +0.5764$
$Z_{cutting} = [(37)(-0.5141) + (33)(+0.5764)] / 2$
$Z_{cutting} = -0.00025$, or slightly less than 0.0
[Figure: box-and-whisker plot of the distributions of discriminant scores for the probation (N = 37) and prison (N = 33) cases, with the cutting score set at -0.00025]
What is the Best Way to See the Predictions Made on Individual Cases?
Casewise Statistics

        Highest Group                                  Second Highest Group             Discriminant Score
Case    P(D>d | G=g)        P(G=g | D=d)   Sq. Mahal.  Group   P(G=g | D=d)   Sq. Mahal.   Function 1
        p        df                        Distance                           Distance
1       .081     1          .932           3.040       1       .068           8.031        -2.258
2       .157     1          .905           2.000       1       .095           6.273        -1.928
3       .048     1          .946           3.914       1       .054           9.418        -2.493
4       .203     1          .891           1.621       1       .109           5.588        -1.787
5       .238     1          .880           1.390       1       .120           5.151        -1.693
6       .704     1          .755            .144       1       .245           2.162         -.894
7       .081     1          .932           3.040       1       .068           8.031        -2.258
8       .395     1          .837            .722       1       .163           3.765        -1.364
9       .635     1          .773            .225       1       .227           2.448         -.988
10      .238     1          .880           1.390       1       .120           5.151        -1.693

Sq. Mahal. Distance = squared Mahalanobis distance to the group centroid.
The discriminant scores are in the column on the extreme right-hand side of the table.
For case 1, discriminant score = -2.2575
D = the discriminant score (i.e. Z)
$P(G_i \mid D)$ = posterior probability that a case is in group i, given that it has a specific discriminant score D
$P(D \mid G_i)$ = conditional probability that a case has a discriminant score of D, given that it is in group i
$P(G_i)$ = prior probability that a case is in group i, which would be equal to $n_i / N$
How Does SPSS Classify Cases? (cont.)
The Bayesian probabilities associated with being in either group are calculated, and the greater of the two probabilities is used to classify the case.
Example: Case 1
  Posterior probability of being in the probation group: $P(G_{probation} \mid D) = 0.932$
  Posterior probability of being in the prison group: $P(G_{prison} \mid D) = 0.068$
  Since $P(G_{probation} \mid D) > P(G_{prison} \mid D)$ (0.932 > 0.068), the case is classified as a probation case.
The column labeled "Actual Group" shows the group the case actually belongs to. If the Bayesian probability misclassifies the case, the case is marked with two asterisks (**). These are the errors produced by the model, which can also be seen in the classification table in a previous exhibit.
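Bayes' theorem combines these quantities as P(G_i | D) = P(D | G_i) P(G_i) / [sum over j of P(D | G_j) P(G_j)]. The sketch below reproduces the Case 1 posteriors under two stated assumptions: the conditional density of D within each group is taken to be proportional to exp(-d^2 / 2), where d^2 is the squared Mahalanobis distance to that group's centroid (normal scores with a common variance), and the priors are the group proportions 37/70 and 33/70. The function name `posteriors` is hypothetical.

```python
import math

def posteriors(sq_distances, priors):
    """Posterior P(G | D) from squared Mahalanobis distances and prior probabilities."""
    # Assumption: P(D | G) is proportional to exp(-d^2 / 2) for every group.
    joint = {g: priors[g] * math.exp(-sq_distances[g] / 2) for g in priors}
    total = sum(joint.values())
    return {g: value / total for g, value in joint.items()}

priors = {"probation": 37 / 70, "prison": 33 / 70}
# Squared Mahalanobis distances reported for Case 1 in the casewise statistics.
case_1 = {"probation": 3.040, "prison": 8.031}

post = posteriors(case_1, priors)
predicted_group = max(post, key=post.get)
print({g: round(p, 3) for g, p in post.items()}, predicted_group)
```

Under these assumptions the posterior for probation comes out at about 0.93, in line with the value SPSS reports for this case.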
$\Lambda$ = WSS / TSS

Variables in the Analysis
Step   Variable    Tolerance   Sig. of F to Remove   Wilks' Lambda
1      SER_INDX    1.000       .000
2      SER_INDX     .864       .000                  .983
       DR_SCORE     .864       .019                  .832
For dr_score
  As dr_score increases by one unit, the discriminant score Z decreases by 0.235
  Holding the seriousness of the offence (ser_indx) constant, the more drug dependent the defendant, the more likely he/she will be granted probation (code = 0)
For ser_indx
  As ser_indx increases by one unit, the discriminant score Z increases by 0.564
  Holding the drug dependency (dr_score) constant, the more serious the offence, the more likely the defendant will be sent to prison (code = 1)
How Can the Relative Impact on the DV of the Different Predictor Variables be Compared?
Two ways
  Compare the standardized discriminant weights, i.e. coefficients
  Compare the structure coefficients, also called the discriminant loadings
Standardized discriminant coefficient ($C_k$)
  The relative sizes of the discriminant coefficients cannot be compared if the predictor variables are in different units of measurement.
  The discriminant coefficients must first be converted to standardized coefficients ($C_k$)
$C_k = W_k \sqrt{\dfrac{\sum (X_k - \bar{X}_k)^2}{N - g}}$
$W_k$ = the unstandardized discriminant coefficient of variable k
$\sum (X_k - \bar{X}_k)^2$ = SS of the predictor variable
N = total sample size
g = number of DV groups
Since +1.044 is greater in absolute value than -0.6345, ser_indx has greater discriminatory impact than dr_score.
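The standardized weight for ser_indx can be approximately reproduced from figures given elsewhere in these notes. The sketch assumes that the SS in the formula is the pooled within-groups sum of squares, so that SS / (N - g) is the pooled within-groups variance; the group variances 3.441 and 3.405 come from the covariance matrices and the group sizes are 37 and 33. Variable names are illustrative.

```python
import math

GROUP_SIZES = {0: 37, 1: 33}
SER_INDX_VARIANCE = {0: 3.441, 1: 3.405}  # diagonal entries of the group covariance matrices
W_SER_INDX = 0.564                        # unstandardized coefficient

n_total = sum(GROUP_SIZES.values())
n_groups = len(GROUP_SIZES)

# Assumption: SS is pooled within groups, i.e. sum over groups of (n_j - 1) * variance_j.
within_ss = sum((GROUP_SIZES[j] - 1) * SER_INDX_VARIANCE[j] for j in GROUP_SIZES)
pooled_sd = math.sqrt(within_ss / (n_total - n_groups))

c_ser_indx = W_SER_INDX * pooled_sd
print(f"standardized coefficient for ser_indx = {c_ser_indx:.3f}")
```

The result is close to the +1.044 quoted for ser_indx; small differences can arise because the inputs are rounded.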
[Table: discriminant function coefficients for DR_SCORE and SER_INDX]
Notice that there is no constant (a) in a standardized discriminant function equation since the mean of a standardized variable equals zero.
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
a. This variable not used in the analysis.
ser_indx has the highest correlation with the discriminant scores, followed by dr_score, skl_indx and age_firs. The algebraic sign ($\pm$) indicates the direction of the relationship.
On Average, How Well Did the Discriminant Function Divide the Two Groups?
Group Centroids
One way to determine the degree of separation between the two groups is to compute the mean discriminant score for each group. These means are called the group centroids.
Q: Are dr_score and ser_indx significantly correlated?
Notwithstanding the stepwise process, the final two predictors are significantly correlated (r = 0.2791, p = 0.019).
Hit ratios
  Discriminant analysis   65.71%
  Logistic regression     65.71%
---------------------- Variables in the Equation ----------------------
Variable        B      S.E.      Wald   df     Sig        R    Exp(B)
DR_SCORE   -.2720     .1183    5.2810    1   .0216   -.1841     .7619
SER_INDX    .6111     .1716   12.6776    1   .0004    .3321    1.8424
Constant   -.8011     .7311    1.2006    1   .2732

--------------- Variables not in the Equation ----------------
Residual Chi Square 1.360 with 2 df   Sig = .5067
Variable      Score   df     Sig       R
AGE_FIRS     1.0003    1   .3172   .0000
SKL_INDX      .3970    1   .5287   .0000
18.338 5.931
Classification Table for TYPE_SEN
                   Predicted
Observed        .0 (0)   1.0 (1)   Percent Correct
 .0  (0)          27        10         72.97%
1.0  (1)          14        19         57.58%
Overall                                65.71%
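To see how the logistic output is used, the sketch below recomputes Exp(B) from the B coefficients and scores the same illustrative case as in the discriminant example (dr_score = 9, ser_indx = 1). The 0.5 classification cutoff is an assumption, as are the names `COEFFICIENTS`, `CONSTANT`, and `prison_probability`.

```python
import math

# B coefficients from "Variables in the Equation".
COEFFICIENTS = {"dr_score": -0.2720, "ser_indx": 0.6111}
CONSTANT = -0.8011

# Exp(B): multiplicative change in the odds of prison per one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in COEFFICIENTS.items()}

def prison_probability(case):
    """P(type of sentence = 1) from the logistic model."""
    logit = CONSTANT + sum(COEFFICIENTS[name] * case[name] for name in COEFFICIENTS)
    return 1 / (1 + math.exp(-logit))

p = prison_probability({"dr_score": 9, "ser_indx": 1})
predicted = 1 if p >= 0.5 else 0  # assumption: 0.5 cutoff

print({name: round(v, 4) for name, v in odds_ratios.items()})
print(f"P(prison) = {p:.3f}, predicted code = {predicted}")
```

For this case both methods point the same way: the discriminant score falls on the probation side and the logistic probability of prison is well below 0.5.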
When there are three groups
  Two functions can be extracted from the data
When there are g groups
  (g - 1) functions can be extracted from the data,
  or k functions if the number of predictor variables (k) is less than the number of groups (g)
Geometry of Two Discriminant Functions (cont.)
[Figure: three groups plotted in the predictor space (axis X1 shown) with two fitted vectors, Z1 and Z2]
Two vectors are fit to the data
  Z1 (1st discriminant function): reasonably good fit for groups 1 and 3, but a bad fit to group 2
  Z2 (2nd discriminant function): reasonably good fit for group 2, but a bad fit for groups 1 and 3
The two vectors taken together better explain the three groups than either one by itself.
Assumptions About Multiple Discriminant Functions
Q: Must the various discriminant functions be independent of each other, i.e. noncollinear?
No, they may be collinear or noncollinear, whatever best fits the data. Geometrically, the functions can be other than 90° apart.
Covariance Matrices

PRE_STAT = 1.00
            AGE_FIRS       AGE     DR_SCORE     PR_ARRST      COUNSEL
AGE_FIRS       3.415    -2.579        -.391       -1.639         .232
AGE           -2.579    19.080        4.389       -1.729        -.229
DR_SCORE       -.391     4.389        4.770       -1.510         .103
PR_ARRST      -1.639    -1.729       -1.510        6.814        -.138
COUNSEL         .232     -.229         .103        -.138         .136

PRE_STAT = 2.00
            AGE_FIRS       AGE     DR_SCORE     PR_ARRST      COUNSEL
AGE_FIRS       5.190     1.557       -2.326        -.152         .210
AGE            1.557     5.957         .457         .814        -.257
DR_SCORE      -2.326      .457        8.490         .631   -7.381E-02
PR_ARRST       -.152      .814         .631         .662        -.148
COUNSEL         .210     -.257   -7.381E-02        -.148         .190

PRE_STAT = 3.00
            AGE_FIRS       AGE     DR_SCORE     PR_ARRST      COUNSEL
AGE_FIRS       5.882    -1.305       -3.890        -.938         .298
AGE           -1.305     6.382        -.515        2.313        -.327
DR_SCORE      -3.890     -.515       10.154        1.125   -9.559E-02
PR_ARRST       -.938     2.313        1.125        3.000        -.500
COUNSEL         .298     -.327   -9.559E-02        -.500         .154
Discriminant Analysis of Pre-Disposition Status (cont.)

Log Determinants
Rank: 2, 2, 2, 2
The ranks and natural logarithms of determinants printed are those of the group covariance matrices.

Test Results
Box's M        12.391
F   Approx.    1.968
    df1        6
    df2        38111.579
    Sig.       .066

Decision
The null hypothesis that the variance/covariance matrices are equal in the population is not rejected.
Unstandardized coefficients
1st Function
  Z1 = 2.375 - 0.146 (age) + 1.946 (counsel)
2nd Function
  Z2 = -6.655 + 0.253 (age) + 1.682 (counsel)
Given a 22-year-old offender with retained counsel
  Z1 = 2.375 - 0.146 (22) + 1.946 (1) = 1.109
  Z2 = -6.655 + 0.253 (22) + 1.682 (1) = 0.593
These two discriminant scores will be used to classify the offender into one of the three pre-disposition groups.
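With more than one function, each case receives one score per function. A minimal sketch using the unstandardized coefficients above; the structure `FUNCTIONS` and the helper `discriminant_scores` are hypothetical names.

```python
# Unstandardized coefficients of the two discriminant functions.
FUNCTIONS = [
    {"constant": 2.375, "age": -0.146, "counsel": 1.946},   # 1st function
    {"constant": -6.655, "age": 0.253, "counsel": 1.682},   # 2nd function
]

def discriminant_scores(case):
    """Return one discriminant score per function for a single case."""
    return [
        f["constant"] + f["age"] * case["age"] + f["counsel"] * case["counsel"]
        for f in FUNCTIONS
    ]

# A 22-year-old offender with retained counsel (counsel = 1).
z1, z2 = discriminant_scores({"age": 22, "counsel": 1})
print(round(z1, 3), round(z2, 3))  # 1.109 0.593
```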
[Table: tests of Functions 1 and 2; chi-square df = 4 and 1]
1st Function
  Eigenvalue = 0.8819
  Of the variance explained by the two functions, the 1st explains 97.97%
  The canonical correlation ($\rho$) between the two predictor variables and the discriminant scores produced by the 1st function = 0.6846
  The chi-square test of the Wilks' $\Lambda$ is significant ($\chi^2$ = 43.254, p < 0.0001).
  The null hypothesis that in the population the BSS = 0, $\lambda$ = 0, is rejected.
2nd Function
  Eigenvalue = 0.0183
  Of the variance explained by the two functions, the 2nd explains 2.03%
  The canonical correlation ($\rho$) between the two predictor variables and the discriminant scores produced by the 2nd function = 0.1341
  The chi-square test of the Wilks' $\Lambda$ is not significant ($\chi^2$ = 1.206, p = 0.272).
  The null hypothesis that in the population the BSS = 0, $\lambda$ = 0, is not rejected.
Decision
  Since the second function is not significant, its associated statistics will not be used in the interpretation of the effect of age and counsel on pre-disposition status.
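The statistics quoted for the two functions can be recovered from the eigenvalues alone. The sketch below assumes N = 70 cases (the total in the classification results), p = 2 predictors (age and counsel), and g = 3 groups, and it uses Bartlett's chi-square approximation, chi-square = -[N - 1 - (p + g)/2] ln(Lambda), with df = (p - k)(g - k - 1) after k functions have been removed. Variable names are illustrative.

```python
import math

eigenvalues = [0.8819, 0.0183]
N, p, g = 70, 2, 3   # assumptions: sample size, predictors in the model, groups

total = sum(eigenvalues)
results = []
for k, ev in enumerate(eigenvalues):
    pct_variance = 100 * ev / total
    canonical_corr = math.sqrt(ev / (1 + ev))
    # Wilks' lambda for functions k+1 through the last: product of 1 / (1 + eigenvalue).
    wilks_lambda = 1.0
    for later in eigenvalues[k:]:
        wilks_lambda *= 1 / (1 + later)
    chi_square = -(N - 1 - (p + g) / 2) * math.log(wilks_lambda)
    df = (p - k) * (g - k - 1)
    results.append((pct_variance, canonical_corr, wilks_lambda, chi_square, df))

for number, row in enumerate(results, start=1):
    print(f"function {number}: %var={row[0]:.2f}, rho={row[1]:.4f}, "
          f"lambda={row[2]:.4f}, chi2={row[3]:.3f}, df={row[4]}")
```

Because the eigenvalues are rounded, the chi-square values agree with the SPSS figures only to about two decimal places.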
1st Function
  Zz1 = -0.508 (age) + 0.770 (counsel)
2nd Function
  Zz2 = 0.883 (age) + 0.666 (counsel)
Nota Bene
  Recall that the 2nd function was found not to be significant.
  Of the two variables in the 1st function, counsel has the greater impact.
What is the Correlation Between Each of the Predictor Variables and the Discriminant Scores Produced By the Two Functions?
Structure coefficients, or loadings
Structure Matrix
                 Function
                 1         2
COUNSEL          .867*     .499
AGE_FIRS (a)     .291*     .067
PR_ARRST (a)    -.219*    -.190
AGE             -.654      .757*
DR_SCORE (a)    -.109      .195*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function
a. This variable not used in the analysis.
The predictors counsel, age_firs, and pr_arrst load highest on the 1st function, while age and dr_score load highest on the 2nd function.
What is the Mean Discriminant Score for Each Pre-Disposition Group on Each Discriminant Function?
Recall that these mean discriminant scores are called centroids and that the 2nd discriminant function is not significant.

Functions at Group Centroids
                 Function
PRE_STAT         1          2
1.00            -1.001     -1.64E-03
2.00              .856      -.160
3.00              .827       .201
Notice how numerically similar the centroids of the 1st function are for groups 2 and 3, i.e. bail and ROR. This means that the 1st function, while significant, will do a poor job discriminating between the bail and ROR groups, and most of its discriminatory power will be discriminating between the jail group versus the other two groups.
What Would a Scatterplot of the Discriminant Scores of the Three Pre-Disposition Groups Reveal?
[Figure: Canonical Discriminant Functions. Scatterplot of the discriminant scores by PRE_STAT group (Groups 1, 2, and 3) with the group centroids marked; Function 1 on the horizontal axis]
Reading across horizontally, notice how the 1st discriminant function separates the centroid of the jail group (1) from those of the bail (2) and ROR (3) groups. Reading vertically, however, notice that the 2nd discriminant function fails to separate the centroids of the three groups. This is why the 2nd function was not found to be significant.
Casewise Statistics

Highest Group                                   Second Highest Group                 Discriminant Scores
Predicted   P(D>d | G=g)      P(G=g | D=d)      Group   P(G=g | D=d)   Sq. Mahal.    Function 1   Function 2
Group       p       df                                                 Distance
2           .682    2         .578              3       .391           1.127          1.694        -.411
2           .787    2         .550              3       .409            .649          1.548        -.158
2**         .682    2         .578              3       .391           1.127          1.694        -.411
2           .833    2         .521              3       .426            .342          1.403         .096
2           .787    2         .550              3       .409            .649          1.548        -.158
2**         .811    2         .490              3       .442            .206          1.257         .349
2           .833    2         .521              3       .426            .342          1.403         .096
2**         .811    2         .490              3       .442            .206          1.257         .349
3           .800    2         .465              2       .426           1.044           .965         .856
2**         .682    2         .578              3       .391           1.127          1.694        -.411
1**         .154    2         .596              2       .283           4.394          -.398       -1.840
2           .682    2         .578              3       .391           1.127          1.694        -.411
3**         .662    2         .470              2       .392           1.613           .819        1.109
1**         .551    2         .773              2       .144           3.707          -.836       -1.080
1**         .392    2         .721              2       .184           3.765          -.690       -1.333
1           .154    2         .596              2       .283           4.394          -.398       -1.840
1           .914    2         .910              2       .049           5.185         -1.419        -.067
1**         .711    2         .818              2       .112           3.820          -.981        -.827
1           .842    2         .855              2       .086           4.104         -1.127        -.573
1**         .085    2         .526              2       .341           4.965          -.252       -2.094

Sq. Mahal. Distance = squared Mahalanobis distance to the centroid of the second highest group.
**. Misclassified case
Classification Results (original cases)
                      Predicted Group Membership
          PRE_STAT    1.00     2.00     3.00
Count     1.00          27        1        4
          2.00           5       14        2
          3.00           3       11        3
%         1.00        84.4      3.1     12.5
          2.00        23.8     66.7      9.5
          3.00        17.6     64.7     17.6

Hit Ratio = (44 / 70) (100) = 62.86%
Errors = (26 / 70) (100) = 37.14%