
Discriminant Analysis


What Is It?
♦ Analysis used when the dependent variable is categorical (nominal) and the independent variables are metric.
♦ Discriminates between, or classifies, individuals into groups on the basis of the independent variables.
♦ Involves deriving a variate that is a linear combination of the independent variables.
♦ This variate is obtained by maximizing the between-group variance relative to the within-group variance.
♦ The linear combination is called the discriminant function:
Z = a + W1X1 + W2X2 + ... + WnXn
♦ Can also be used to test the hypothesis that the group means of a set of independent variables are equal for 2 or more groups.
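As a rough illustration of how weights W that maximize between-group relative to within-group variance can be obtained for two groups, here is a minimal NumPy sketch of Fisher's linear discriminant. The function name, the simulated data and the choice of intercept (Z = 0 midway between the group means) are assumptions made for this example. SPSS scales its canonical coefficients differently, so the numbers will not match SPSS output.

```python
import numpy as np

def fisher_discriminant(X_a, X_b):
    """Return (weights, intercept) of a two-group linear discriminant function."""
    X_a, X_b = np.asarray(X_a, float), np.asarray(X_b, float)
    mean_a, mean_b = X_a.mean(axis=0), X_b.mean(axis=0)
    n_a, n_b = len(X_a), len(X_b)
    # Pooled within-group covariance matrix (assumes equal covariances).
    S_w = ((n_a - 1) * np.cov(X_a, rowvar=False)
           + (n_b - 1) * np.cov(X_b, rowvar=False)) / (n_a + n_b - 2)
    # Direction that maximizes between-group over within-group variance.
    w = np.linalg.solve(S_w, mean_b - mean_a)
    # Assumed convention: Z = 0 at the midpoint of the two group means.
    a = -w @ (mean_a + mean_b) / 2
    return w, a

# Simulated data: two metric variables, two groups of 50 cases each.
rng = np.random.default_rng(0)
group_a = rng.normal([0, 0], 1.0, size=(50, 2))
group_b = rng.normal([3, 1], 1.0, size=(50, 2))

w, a = fisher_discriminant(group_a, group_b)
z_a = a + group_a @ w   # discriminant scores, group A
z_b = a + group_b @ w   # discriminant scores, group B
print(z_a.mean(), z_b.mean())
```

With this intercept the two group centroids sit symmetrically around zero, which makes the sign of Z a natural classification rule when the groups are the same size.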
Muhamad Jantan & T. Ramayah

Discriminant Analysis



Objectives
♦ Profile Analysis
♦ Predictive Technique
♦ Test differences between groups on the average score profiles of a set of variables.
♦ Determine the impact of the independent variables on the differences in average score profiles of two or more groups.
♦ Classify units on the basis of their scores on a set of independent variables.
♦ Establish the number and composition of the dimensions of discrimination between groups formed from the set of independent variables.



Assumptions
♦ Independent variables are distributed as multivariate normal with unknown but equal covariance matrices across the groups.
♦ Non-normality affects estimation of the discriminant function. If normality is violated, use logistic regression instead.
♦ Inequality of covariance matrices across groups affects the classification process. If the sample size is small, the estimation process is also affected, and groups with larger covariances will be overclassified. Rectify by using larger sample sizes or quadratic classification techniques.
♦ Multicollinearity of independent variables: especially a concern when the stepwise procedure is used, and especially critical when the results are used for explanation. When interpreting the results, always be aware of the level of collinearity.
♦ Linearity of relationships.
♦ Outliers may have a substantial impact on the results.



Estimation And Assessment
♦ Estimation
  • Simultaneous method, or
  • Stepwise method

♦ Assessment
  • Statistical significance of the discriminant function
    • Wilks' λ, Hotelling's trace, Pillai's criterion: evaluate the discriminatory power of the function
    • Roy's maximum root: evaluates the first discriminant function only
    • For the stepwise procedure: Mahalanobis D² and Rao's V
    • D² uses distances and adjusts for unequal covariances
    • With 3 or more groups: evaluate the overall significance and the significance of each individual function



SPSS Commands

♦ Dividing the sample into estimation and split/holdout samples (random selection):
  TRANSFORM ⇒ RANDOM NUMBER SEED
  TRANSFORM ⇒ COMPUTE Randz = UNIFORM(1) > 0.65
  Randz is 0 for ≈ 65% of respondents (the estimation sample) and 1 for the remainder (the holdout sample).
♦ Estimating the discriminant function(s):
  SPSS ⇒ CLASSIFY ⇒ DISCRIMINANT
  • This opens the Discriminant dialogue box: select the grouping (dependent) variable and the independent variables. Also use the SELECT option to identify the units in the estimation sample: in this case use Randz with SET VALUE at 0.
  • Options available:
    • Method: Stepwise or Simultaneous
    • Classify: provides options for prior probabilities, the variance-covariance matrices to use, plots and display
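The random split described above can be mimicked outside SPSS. The sketch below is a hypothetical NumPy equivalent of the `Randz = UNIFORM(1) > 0.65` step; the seed value and the sample size of 100 are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(12345)   # fixed seed, analogous to RANDOM NUMBER SEED
n_respondents = 100                   # hypothetical sample size

# 1 where the uniform draw exceeds 0.65 (holdout), 0 otherwise (estimation).
randz = (rng.uniform(0.0, 1.0, size=n_respondents) > 0.65).astype(int)

estimation_idx = np.flatnonzero(randz == 0)   # roughly 65% of cases
holdout_idx = np.flatnonzero(randz == 1)      # roughly 35% of cases
print(len(estimation_idx), len(holdout_idx))
```

Because the split is random, the realized proportions only approximate 65/35; fixing the seed makes the split reproducible.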



SPSS: Discriminant Analysis

SPSS Command: Analyze ⇒ Classify ⇒ Discriminant ⇒ Select Cases ⇒ Statistics ⇒ Classify



SPSS: Results for Two Groups

Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .283            67.544       7    .000

Wilks' λ is significant, indicating that we have a significant discriminant function.



SPSS: Results for Two Groups

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          2.534 (a)    100.0           100.0          .847
a. First 1 canonical discriminant functions were used in the analysis.

Indicates that (0.847)² ≈ 72% of the variance in the dependent variable is explained by the independent variables.
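The eigenvalue, Wilks' λ, canonical correlation and chi-square reported for the two-group example are linked by standard formulas. The sketch below recomputes them from the eigenvalue alone. It assumes the chi-square is Bartlett's approximation, which appears to reproduce the reported 67.544 up to rounding, and takes N = 59 (28 + 31 estimation cases) and 7 independent variables from the tables in this deck.

```python
import math

eigenvalue = 2.534   # from the Eigenvalues table
n_cases = 59         # estimation sample: 28 + 31
n_vars = 7           # independent variables
n_groups = 2

# With a single function, Wilks' lambda is 1 / (1 + eigenvalue).
wilks_lambda = 1.0 / (1.0 + eigenvalue)
# Canonical correlation between the function and group membership.
canonical_corr = math.sqrt(eigenvalue / (1.0 + eigenvalue))
# Bartlett's chi-square approximation (assumed to be what the output reports).
chi_square = -(n_cases - 1 - (n_vars + n_groups) / 2) * math.log(wilks_lambda)
df = n_vars * (n_groups - 1)

print(round(wilks_lambda, 3), round(canonical_corr, 3), round(chi_square, 2), df)
```

Squaring the canonical correlation gives the roughly 72% of variance quoted on the slide.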




Descriptives

Group Statistics

                                                                      Valid N (listwise)
Articulation Of Needs   Variable             Mean    Std. Deviation   Unweighted   Weighted
Specification Buying    Delivery Speed       2.404   1.0772           28           28.000
                        Price Level          2.946   1.1971           28           28.000
                        Price Flexibility    6.707   .8602            28           28.000
                        Manufacturer Image   5.343   .9155            28           28.000
                        Service              2.646   .9913            28           28.000
                        Salesforce Image     2.693   .6248            28           28.000
                        Product Quality      8.421   .8883            28           28.000
Total Value Analysis    Delivery Speed       4.242   1.0810           31           31.000
                        Price Level          2.048   1.0491           31           31.000
                        Price Flexibility    8.613   1.1433           31           31.000
                        Manufacturer Image   5.123   1.3732           31           31.000
                        Service              3.132   .5369            31           31.000
                        Salesforce Image     2.587   .8628            31           31.000
                        Product Quality      6.023   1.2672           31           31.000
Total                   Delivery Speed       3.369   1.4149           59           59.000
                        Price Level          2.475   1.2004           59           59.000
                        Price Flexibility    7.708   1.3935           59           59.000
                        Manufacturer Image   5.227   1.1738           59           59.000
                        Service              2.902   .8163            59           59.000
                        Salesforce Image     2.637   .7547            59           59.000
                        Product Quality      7.161   1.6302           59           59.000


Assessment Of Overall Fit
♦ Why?
  • Concept of R²: the classification matrix measures the predictive ability of the discriminant functions.
  • Hit ratio: how well the functions classify the units; the % of correct classifications.
  • Chi-square test of D²: equivalent to the F test for R².
♦ Cutting score: the score used for constructing the classification matrix. The optimal cutting score depends on the sizes of the groups. If the groups are equal in size, it is halfway between the two group centroids. In general, for groups A and B with sizes N_A, N_B and centroids Z_A, Z_B:
$$Z_{CU} = \frac{N_A Z_B + N_B Z_A}{N_A + N_B}$$
♦ Probabilities of classification: need to be specified by the researcher.
  • Default is equal probabilities. Used when unsure whether the sample proportions are representative.
  • Proportional to group size: when the sample is drawn randomly from the population.
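As a small worked example of the cutting score, the sketch below applies the weighted formula to the two-group example in this deck (28 cases with centroid -1.646 and 31 cases with centroid 1.487). The function name is an assumption for illustration.

```python
def cutting_score(n_a, z_a, n_b, z_b):
    """Weighted cutting score between two group centroids.

    Each centroid is weighted by the size of the other group, so the
    cut-off shifts toward the centroid of the smaller group.
    """
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)

# Group sizes and centroids taken from the two-group example in this deck.
z_cu = cutting_score(n_a=28, z_a=-1.646, n_b=31, z_b=1.487)
print(round(z_cu, 5))

# With equal group sizes the score is simply the midpoint of the centroids.
midpoint = cutting_score(n_a=10, z_a=-1.0, n_b=10, z_b=3.0)
```

The first call should give about -0.159, matching the cutting score shown with the group centroids for this example.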



Measures of Predictive Accuracy
♦ How good is the hit ratio? Compute the hit ratio for the split (holdout) sample and compare it against:
  • Maximum chance criterion: this is just the proportion of cases in the largest group. It is the minimum criterion to be met by the hit ratio.
  • Proportional chance criterion: should be used when group sizes are unequal. For two groups it is given by
$$C_{pro} = p^2 + (1 - p)^2$$
    where p = proportion of cases in one group (and 1 - p the proportion in the other).
  • Press's Q: compares the number of correct classifications (n) against the total sample size (N) and the number of groups (k):
$$\text{Press's } Q = \frac{[N - (n \times k)]^2}{N(k - 1)}$$
    Q ∼ χ² with 1 degree of freedom.
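The three benchmarks above are simple to compute by hand or in code. The sketch below uses made-up numbers (groups of 40 and 60 cases, 85 classified correctly) purely for illustration. The proportional chance criterion is written as a sum of squared group proportions, which reduces to p² + (1 - p)² for two groups.

```python
def max_chance(group_sizes):
    """Proportion of cases in the largest group."""
    return max(group_sizes) / sum(group_sizes)

def proportional_chance(group_sizes):
    """Sum of squared group proportions (p^2 + (1 - p)^2 for two groups)."""
    total = sum(group_sizes)
    return sum((size / total) ** 2 for size in group_sizes)

def press_q(n_total, n_correct, n_groups):
    """Press's Q = [N - (n * k)]^2 / [N (k - 1)]."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))

# Hypothetical holdout sample: two groups of 40 and 60 cases, 85 hits.
sizes = [40, 60]
n_correct = 85
hit_ratio = n_correct / sum(sizes)

c_max = max_chance(sizes)
c_pro = proportional_chance(sizes)
q = press_q(sum(sizes), n_correct, len(sizes))
print(hit_ratio, c_max, c_pro, q)
```

In this illustration the hit ratio of 0.85 clears both chance criteria (0.60 and 0.52), and Q = 49 is far above 6.63, the 1% critical value of χ² with 1 degree of freedom.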


SPSS Results: Assessment

Classification Results (a, b)

                                                                Predicted Group Membership
                                Articulation Of Needs    Specification Buying   Total Value Analysis   Total
Cases Selected       Original   Count   Specification Buying     27                     1                      28
                                        Total Value Analysis     4                      27                     31
                                %       Specification Buying     96.4                   3.6                    100.0
                                        Total Value Analysis     12.9                   87.1                   100.0
Cases Not Selected   Original   Count   Specification Buying     10                     2                      12
                                        Total Value Analysis     4                      25                     29
                                %       Specification Buying     83.3                   16.7                   100.0
                                        Total Value Analysis     13.8                   86.2                   100.0

a. 91.5% of selected original grouped cases correctly classified.
b. 85.4% of unselected original grouped cases correctly classified.

Cases not selected are held out for validation purposes.

Hit ratio = % of correct classification.

When comparing the hit ratio with the chance criteria, use the hold-out sample; the model's accuracy should be at least 25% better than chance.

For Selected Cases, compared to:
Maximum Chance Criterion: 70.7%
Proportional Chance Criterion: 58.4%
Press Q = 99.90



SPSS Results: Assessment

Structure Matrix
                     Function 1
Product Quality      -.693
Price Flexibility     .597
Delivery Speed        .544
Price Level          -.256
Service               .197
Manufacturer Image   -.060
Salesforce Image     -.044
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.

For example, -.693 is the linear correlation between Product Quality (PQ) and the discriminant function.



SPSS Results: Assessment

Canonical Discriminant Function Coefficients
                     Function 1
Delivery Speed        .419
Price Level           .116
Price Flexibility     .560
Manufacturer Image   -.049
Service              -.141
Salesforce Image      .342
Product Quality      -.623
(Constant)           -1.788
Unstandardized coefficients

These coefficients are used in the discriminant function to calculate the discriminant scores, which are used to classify the individuals.

Discriminant Function:
Z = -1.788 + .419X1 + .116X2 + .560X3 - .049X4 - .141X5 + .342X6 - .623X7
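To see how the unstandardized coefficients turn a respondent's profile into a discriminant score, the sketch below evaluates the function at the two group mean profiles from the Group Statistics table in this deck. The results should land close to the group centroids reported for this example (-1.646 and 1.487); small differences come from the rounded coefficients. The dictionary layout and function name are assumptions for illustration.

```python
# Unstandardized coefficients and constant from the coefficient table.
CONSTANT = -1.788
COEFFICIENTS = {
    "Delivery Speed": 0.419,
    "Price Level": 0.116,
    "Price Flexibility": 0.560,
    "Manufacturer Image": -0.049,
    "Service": -0.141,
    "Salesforce Image": 0.342,
    "Product Quality": -0.623,
}

def discriminant_score(profile):
    """Z = constant + sum of (coefficient * value) over the seven variables."""
    return CONSTANT + sum(COEFFICIENTS[name] * profile[name] for name in COEFFICIENTS)

# Group mean profiles from the Group Statistics table in this deck.
spec_buying_means = {
    "Delivery Speed": 2.404, "Price Level": 2.946, "Price Flexibility": 6.707,
    "Manufacturer Image": 5.343, "Service": 2.646, "Salesforce Image": 2.693,
    "Product Quality": 8.421,
}
total_value_means = {
    "Delivery Speed": 4.242, "Price Level": 2.048, "Price Flexibility": 8.613,
    "Manufacturer Image": 5.123, "Service": 3.132, "Salesforce Image": 2.587,
    "Product Quality": 6.023,
}

z_spec = discriminant_score(spec_buying_means)
z_total = discriminant_score(total_value_means)
print(round(z_spec, 2), round(z_total, 2))
```

Scoring an individual respondent works the same way: pass that respondent's seven ratings instead of the group means and compare the resulting Z with the cutting score.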



SPSS Results: Assessment

Functions at Group Centroids
Articulation Of Needs   Function 1
Specification Buying    -1.646
Total Value Analysis     1.487
Unstandardized canonical discriminant functions evaluated at group means

These are the discriminant scores evaluated at the means of (x1, x2, x3, x4, x5, x6, x7) for the two groups.

Cutting Score:
$$Z_{CU} = \frac{31(-1.646) + 28(1.487)}{28 + 31} = -0.15915$$

This means all respondents with Z scores less than -0.15915 will be classified into Specification Buying, and into Total Value Analysis otherwise.

Note: No. in Group 0 (Specification) = 27 + 1 = 28 and in Group 1 (Total) = 27 + 4 = 31.


Interpretation Of Results
♦ Relative importance of each independent variable in discriminating between the groups:
  • Discriminant weight (or coefficient): relative contribution of the variable to the function; equivalent to the beta weight in regression, but the weight is unstable, so interpret with caution.
  • Discriminant loading (or structure correlation): measures the simple linear correlation between the independent variable and the discriminant function.
  • Partial F-values: only when the stepwise procedure is used. Large F values indicate a large contribution.
  • Potency index: a relative measure among variables; the composite contribution of a variable to all the significant discriminant functions; used when there is more than one significant discriminant function.



SPSS Results – 3 groups
Interpretation
♦ Use Low, Moderate and High Satisfaction
♦ How to assess the results?
  • Significance of the discriminant functions
    • Wilks' λ, Hotelling's trace, Pillai's criterion: evaluate the discriminatory power of the function
♦ Predictive accuracy: classification table, summary and individual
  • Hit ratio
  • Classification results
  • Discriminant function
  • Cutoff points: territorial map
♦ Relative importance of variables
  • Discriminant weights
  • Discriminant loadings
  • Potency index



Discriminant Analysis – 3 group example
Profile Analysis: Which companies treat each purchase from HATCO as a straight rebuy, a modified rebuy, or a new task?
Method: Enter



SPSS: Results for 3 Groups

Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          3.952 (a)    80.7            80.7           .893
2          .948 (a)     19.3            100.0          .698
a. First 2 canonical discriminant functions were used in the analysis.

Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1 through 2           .104            120.131      14   .000
2                     .513            35.342       6    .000

Both discriminant functions are significant.
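With more than one function, the Wilks' Lambda table tests the functions sequentially: first all of them together, then what remains after removing the first. The sketch below recomputes both rows from the two eigenvalues. It assumes Bartlett's chi-square approximation and the degrees-of-freedom rule (p - k)(g - k - 1) for the test after removing k functions; these appear to reproduce the reported values up to rounding. N = 59, 7 variables and 3 groups are taken from this deck.

```python
import math

eigenvalues = [3.952, 0.948]        # from the Eigenvalues table
n_cases, n_vars, n_groups = 59, 7, 3

results = []
for k in range(len(eigenvalues)):
    # Wilks' lambda for functions k+1 through the last one.
    wilks = 1.0
    for ev in eigenvalues[k:]:
        wilks /= (1.0 + ev)
    # Bartlett's chi-square approximation (assumed).
    chi_square = -(n_cases - 1 - (n_vars + n_groups) / 2) * math.log(wilks)
    df = (n_vars - k) * (n_groups - k - 1)
    results.append((round(wilks, 3), round(chi_square, 2), df))

for row in results:
    print(row)
```

The first row corresponds to "1 through 2" and the second to "2" in the table.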



Descriptives

Group Statistics

                                                                         Valid N (listwise)
Type of Buying Situation   Variable             Mean    Std. Deviation   Unweighted   Weighted
New Task                   Delivery Speed       2.213   .9593            23           23.000
                           Price Level          2.183   .8239            23           23.000
                           Price Flexibility    6.843   .9154            23           23.000
                           Manufacturer Image   4.991   1.0004           23           23.000
                           Service              2.148   .5720            23           23.000
                           Salesforce Image     2.591   .6230            23           23.000
                           Product Quality      7.983   1.2561           23           23.000
Modified Rebuy             Delivery Speed       3.371   .9770            14           14.000
                           Price Level          3.879   1.1081           14           14.000
                           Price Flexibility    7.007   1.2332           14           14.000
                           Manufacturer Image   5.814   1.0399           14           14.000
                           Service              3.607   .6522            14           14.000
                           Salesforce Image     2.886   .7461            14           14.000
                           Product Quality      7.900   1.2521           14           14.000
Straight Rebuy             Delivery Speed       4.577   .9904            22           22.000
                           Price Level          1.886   .8593            22           22.000
                           Price Flexibility    9.059   .6967            22           22.000
                           Manufacturer Image   5.100   1.3342           22           22.000
                           Service              3.241   .3996            22           22.000
                           Salesforce Image     2.527   .8751            22           22.000
                           Product Quality      5.832   1.3275           22           22.000
Total                      Delivery Speed       3.369   1.4149           59           59.000
                           Price Level          2.475   1.2004           59           59.000
                           Price Flexibility    7.708   1.3935           59           59.000
                           Manufacturer Image   5.227   1.1738           59           59.000
                           Service              2.902   .8163            59           59.000
                           Salesforce Image     2.637   .7547            59           59.000
                           Product Quality      7.161   1.6302           59           59.000



SPSS Results: Assessment

Classification Results (a, b)

                                                                    Predicted Group Membership
                                Type of Buying Situation   New Task   Modified Rebuy   Straight Rebuy   Total
Cases Selected       Original   Count   New Task           21         1                1                23
                                        Modified Rebuy     0          11               3                14
                                        Straight Rebuy     0          2                20               22
                                %       New Task           91.3       4.3              4.3              100.0
                                        Modified Rebuy     .0         78.6             21.4             100.0
                                        Straight Rebuy     .0         9.1              90.9             100.0
Cases Not Selected   Original   Count   New Task           7          2                2                11
                                        Modified Rebuy     3          7                8                18
                                        Straight Rebuy     0          0                12               12
                                %       New Task           63.6       18.2             18.2             100.0
                                        Modified Rebuy     16.7       38.9             44.4             100.0
                                        Straight Rebuy     .0         .0               100.0            100.0

a. 88.1% of selected original grouped cases correctly classified.
b. 63.4% of unselected original grouped cases correctly classified.

Hit ratio = % of correct classification.

Compared to, for the hold-out sample (unselected cases):
Maximum Chance Criterion: 43.90%
Proportional Chance Criterion: 35.10%
$$\text{Press's } Q = \frac{[41 - (26 \times 3)]^2}{41(3 - 1)} = 16.70$$


SPSS Results: Assessment

Structure Matrix
                     Function 1   Function 2
Delivery Speed        .545*       -.081
Price Level          -.040         .915*
Service               .487         .703*
Price Flexibility     .520        -.523*
Product Quality      -.365         .395*
Manufacturer Image    .032         .297*
Salesforce Image     -.012         .196*
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function

Note:
• The "*" indicates the variables that are likely to dominate the particular function.
• There are 2 functions because there are 3 groups.


SPSS Results: Assessment

Canonical Discriminant Function Coefficients
                     Function 1   Function 2
Delivery Speed       -.324         1.055
Price Level          -.431         1.770
Price Flexibility     .900        -.144
Manufacturer Image    .285         .162
Service               2.259       -1.463
Salesforce Image     -.323        -.149
Product Quality      -.255         .111
(Constant)           -10.145      -3.830
Unstandardized coefficients

These coefficients are used in the discriminant functions to calculate the discriminant scores, which are used to classify the individuals.

Discriminant Functions:
Z1 = -10.145 - .324X1 - .431X2 + .900X3 + .285X4 + 2.259X5 - .323X6 - .255X7
Z2 = -3.830 + 1.055X1 + 1.770X2 - .144X3 + .162X4 - 1.463X5 - .149X6 + .111X7


SPSS Results: Assessment

Territorial Map
[Figure: territorial map with Canonical Discriminant Function 1 on the horizontal axis and Canonical Discriminant Function 2 on the vertical axis, both running from -6.0 to 6.0. The map shows the regions assigned to Group 1, Group 2 and Group 3, with the group centroids marked by *.]

