
Statistics for Business and Economics, 7th Edition, by Paul Newbold: Chapter 14


Statistics for
Business and Economics
7th Edition

Chapter 14
Analysis of Categorical Data

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall



Chapter Goals
After completing this chapter, you should be able to:


• Use the chi-square goodness-of-fit test to determine whether data fit specified probabilities
• Perform tests for the Poisson and normal distributions
• Set up a contingency analysis table and perform a chi-square test of association
• Use the sign test for paired or matched samples
• Recognize when and how to use the Wilcoxon signed rank test for paired or matched samples


Chapter Goals
(continued)

After completing this chapter, you should be able to:

• Use a sign test for a single population median
• Apply a normal approximation for the Wilcoxon signed rank test
• Know when and how to perform a Mann-Whitney U-test
• Explain Spearman rank correlation and perform a test for association



Nonparametric Statistics


• Nonparametric statistics make fewer restrictive assumptions about data levels and underlying probability distributions
  - Population distributions may be skewed
  - The level of data measurement may be only ordinal or nominal



14.1 Goodness-of-Fit Tests

• Does sample data conform to a hypothesized distribution?
• Examples:
  - Do sample results conform to specified expected probabilities?
  - Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
  - Do measurements from a production process follow a normal distribution?



Chi-Square Goodness-of-Fit Test
(continued)


• Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
• Sample data for 10 days per day of week:

Day          Sum of calls for this day
Monday       290
Tuesday      250
Wednesday    238
Thursday     257
Friday       265
Saturday     230
Sunday       192
Total        1722


Logic of Goodness-of-Fit Test


• If calls are uniformly distributed, the 1722 calls would be expected to be equally divided across the 7 days:

\[ \frac{1722}{7} = 246 \text{ expected calls per day if uniform} \]

• Chi-Square Goodness-of-Fit Test: test to see if the sample results are consistent with the expected results
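The expected-frequency step can be sketched in a few lines of Python. This is only an illustration: the helper name `expected_counts` is hypothetical, and the observed counts are the call totals from the slides. Each expected frequency is the sample total multiplied by the hypothesized probability of the category, which reduces to 1722 / 7 under the uniform hypothesis.

```python
def expected_counts(total, probabilities):
    """Return expected frequencies E_i = total * p_i under the null hypothesis."""
    # The hypothesized category probabilities must form a distribution.
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    return [total * p for p in probabilities]


# Observed calls for Monday..Sunday, taken from the slide.
observed = [290, 250, 238, 257, 265, 230, 192]
total_calls = sum(observed)

# Uniform null hypothesis: every day of the week has probability 1/7.
uniform = [1 / 7] * 7
expected = expected_counts(total_calls, uniform)
print(total_calls, [round(e, 2) for e in expected])
```

The same helper would also cover non-uniform hypotheses, since any list of category probabilities that sums to 1 can be passed in.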




Observed vs. Expected Frequencies

Day          Observed O_i   Expected E_i
Monday       290            246
Tuesday      250            246
Wednesday    238            246
Thursday     257            246
Friday       265            246
Saturday     230            246
Sunday       192            246
TOTAL        1722           1722



Chi-Square Test Statistic
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform


• The test statistic is

\[ \chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \qquad (\text{where d.f.} = K - 1) \]

where:
K = number of categories
O_i = observed frequency for category i
E_i = expected frequency for category i
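As a rough sketch, the statistic can be computed directly from its definition. The function name `chi_square_statistic` is an illustrative choice, not something from the text; the data are the observed call counts from the slides, with the expected count under the uniform hypothesis.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all K categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed calls for Monday..Sunday, taken from the slide.
observed = [290, 250, 238, 257, 265, 230, 192]

# Uniform hypothesis: the total is split equally across the K categories.
expected = [sum(observed) / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1  # K - 1 degrees of freedom
print(round(stat, 2), df)  # prints: 23.05 6
```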


The Rejection Region
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform
\[ \chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \]

• Reject H0 if \chi^2 > \chi^2_{\alpha} (with K - 1 degrees of freedom)
[Figure: chi-square density; do not reject H0 to the left of \chi^2_{\alpha}, reject H0 in the upper tail of area \alpha to its right]



Chi-Square Test Statistic
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform
\[ \chi^2 = \frac{(290 - 246)^2}{246} + \frac{(250 - 246)^2}{246} + \cdots + \frac{(192 - 246)^2}{246} = 23.05 \]

• K - 1 = 6 (7 days of the week), so use 6 degrees of freedom: \chi^2_{.05} = 12.5916
• Conclusion: \chi^2 = 23.05 > \chi^2_{.05} = 12.5916, so reject H0 and conclude that the distribution is not uniform
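The decision can also be checked with an upper-tail probability instead of a table lookup. The sketch below uses only the standard library and relies on a closed form that holds when the degrees of freedom are even: P(X > x) = exp(-x/2) * sum of (x/2)^j / j! for j = 0, ..., df/2 - 1. The function name `chi2_upper_tail_even_df` is hypothetical, and the restriction to even df is a limitation of this sketch, not of the test.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an EVEN number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


alpha = 0.05
p_value = chi2_upper_tail_even_df(23.05, 6)       # test statistic from the slide
tail_at_critical = chi2_upper_tail_even_df(12.5916, 6)  # should be close to alpha
reject_h0 = p_value < alpha
print(reject_h0)
```

Comparing the p-value with alpha gives the same conclusion as comparing 23.05 with the critical value 12.5916.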
[Figure: chi-square density with \alpha = .05; do not reject H0 below \chi^2_{.05} = 12.5916, reject H0 above it]


14.2 Goodness-of-Fit Tests, Population Parameters Unknown

Idea:
• Test whether data follow a specified distribution (such as binomial, Poisson, or normal) . . .
• . . . without assuming the parameters of the distribution are known
• Use sample data to estimate the unknown population parameters



Goodness-of-Fit Tests, Population
Parameters Unknown
(continued)


• Suppose that a null hypothesis specifies category probabilities that depend on the estimation (from the data) of m unknown population parameters
• The appropriate goodness-of-fit test is the same as in the previous section . . .

\[ \chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \]

• . . . except that the number of degrees of freedom for the chi-square random variable is

\[ \text{Degrees of freedom} = K - m - 1 \]

where K is the number of categories
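A small sketch of the degrees-of-freedom rule follows. The helper name and the category count of 6 are hypothetical, chosen only to illustrate the arithmetic. Fitting a Poisson distribution estimates one parameter (the mean, so m = 1), while fitting a normal distribution estimates two (mean and variance, so m = 2).

```python
def gof_degrees_of_freedom(num_categories, num_estimated_params=0):
    """Degrees of freedom K - m - 1 for a chi-square goodness-of-fit test."""
    df = num_categories - num_estimated_params - 1
    if df < 1:
        raise ValueError("too few categories for the number of estimated parameters")
    return df


# No estimated parameters: the 7-day uniform example has 7 - 0 - 1 = 6.
df_uniform = gof_degrees_of_freedom(7)

# Hypothetical K = 6 categories, to show the effect of estimating parameters.
df_poisson = gof_degrees_of_freedom(6, num_estimated_params=1)  # mean estimated
df_normal = gof_degrees_of_freedom(6, num_estimated_params=2)   # mean and variance estimated
print(df_uniform, df_poisson, df_normal)  # prints: 6 4 3
```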



14.3 Test of Normality

• The assumption that data follow a normal distribution is common in statistics
• Normality was assessed in prior chapters (for example, with normal probability plots in Chapter 5)
• Here, a chi-square test is developed



Test of Normality
(continued)


• Two population parameters can be estimated using sample data:

\[ \text{Skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3} \]

\[ \text{Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} \]

• For a normal distribution, Skewness = 0 and Kurtosis = 3
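The two formulas translate directly into code. The sketch below makes one assumption that the slide does not state: s is taken to be the sample standard deviation with an n - 1 denominator. The function name and the five data values are hypothetical and serve only as an illustration.

```python
import math


def sample_skewness_kurtosis(data):
    """Return (skewness, kurtosis) using sums of cubed and fourth-power deviations."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    # ASSUMPTION: s is the sample standard deviation with an n - 1 denominator.
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("all observations are identical")
    skewness = sum((x - mean) ** 3 for x in data) / (n * s ** 3)
    kurtosis = sum((x - mean) ** 4 for x in data) / (n * s ** 4)
    return skewness, kurtosis


# Hypothetical, perfectly symmetric data: the skewness should be 0.
skewness, kurtosis = sample_skewness_kurtosis([1, 2, 3, 4, 5])
print(skewness, round(kurtosis, 3))
```

For symmetric data the cubed deviations cancel, so the skewness is zero regardless of which denominator is used for s.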



Jarque-Bera
Test for Normality


• Consider the null hypothesis that the population distribution is normal
• The Jarque-Bera test for normality is based on the closeness of the sample skewness to 0 and the sample kurtosis to 3
• The test statistic is

\[ JB = n \left[ \frac{(\text{Skewness})^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24} \right] \]

• As the number of sample observations becomes very large, this statistic has a chi-square distribution with 2 degrees of freedom
• The null hypothesis is rejected for large values of the test statistic
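A minimal sketch of the statistic and its large-sample p-value is given below. Both function names are illustrative. The p-value uses the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability exp(-x/2), so it is only appropriate when the sample is very large.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """JB = n * [skewness^2 / 6 + (kurtosis - 3)^2 / 24]."""
    return n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)


def jb_asymptotic_p_value(jb):
    """Upper-tail probability of a chi-square(2) variable: P(X > jb) = exp(-jb / 2)."""
    return math.exp(-jb / 2)


# A sample whose skewness and kurtosis match the normal values exactly gives JB = 0.
jb_normal_shape = jarque_bera(100, 0.0, 3.0)
print(jb_normal_shape, jb_asymptotic_p_value(jb_normal_shape))  # prints: 0.0 1.0
```

Larger departures of skewness from 0 or kurtosis from 3 increase JB and shrink the p-value, which matches the rule of rejecting for large values.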



Jarque-Bera
Test for Normality
(continued)


• The chi-square approximation is close only for very large sample sizes
• If the sample size is not very large, the Bowman-Shelton test statistic is compared to significance points from text Table 14.9

Sample size N   10% point   5% point
20              2.13        3.26
30              2.49        3.71
40              2.70        3.99
50              2.90        4.26
75              3.09        4.27
100             3.14        4.29
125             3.31        4.34
150             3.43        4.39
200             3.48        4.43
250             3.54        4.61
300             3.68        4.60
400             3.76        4.74
500             3.91        4.82
800             4.32        5.46
∞               4.61        5.99



Example: Jarque-Bera
Test for Normality



• The average daily temperature has been recorded for 200 randomly selected days, with sample skewness 0.232 and kurtosis 3.319
• Test the null hypothesis that the true distribution is normal

\[ JB = n \left[ \frac{(\text{Skewness})^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24} \right] = 200 \left[ \frac{(0.232)^2}{6} + \frac{(3.319 - 3)^2}{24} \right] = 2.642 \]

• From Table 14.9 the 10% critical value for n = 200 is 3.48, so there is not sufficient evidence to reject the hypothesis that the population is normal
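The worked example can be reproduced as a quick check. In the sketch below, the dictionary holds the finite-sample significance points listed for Table 14.9 in these slides; the names `TABLE_14_9` and `jb_decision` are hypothetical, and the lookup deliberately handles only tabulated sample sizes rather than guessing at interpolation.

```python
# Significance points (10% point, 5% point) by sample size, as listed for Table 14.9.
TABLE_14_9 = {
    20: (2.13, 3.26), 30: (2.49, 3.71), 40: (2.70, 3.99), 50: (2.90, 4.26),
    75: (3.09, 4.27), 100: (3.14, 4.29), 125: (3.31, 4.34), 150: (3.43, 4.39),
    200: (3.48, 4.43), 250: (3.54, 4.61), 300: (3.68, 4.60), 400: (3.76, 4.74),
    500: (3.91, 4.82), 800: (4.32, 5.46),
}


def jb_decision(n, skewness, kurtosis, level="10%"):
    """Return (JB, critical value, reject?) for a tabulated sample size n."""
    if level not in ("10%", "5%"):
        raise ValueError("level must be '10%' or '5%'")
    if n not in TABLE_14_9:
        raise KeyError("sample size is not tabulated in this sketch")
    jb = n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)
    ten_percent, five_percent = TABLE_14_9[n]
    critical = ten_percent if level == "10%" else five_percent
    return jb, critical, jb > critical


# Temperature example from the slide: n = 200, skewness 0.232, kurtosis 3.319.
jb, critical, reject = jb_decision(200, 0.232, 3.319)
print(round(jb, 3), critical, reject)  # prints: 2.642 3.48 False
```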




14.3 Contingency Tables

• Used to classify sample observations according to a pair of attributes
• Also called a cross-classification or cross-tabulation table
• Assume r categories for attribute A and c categories for attribute B
  - Then there are (r x c) possible cross-classifications



r x c Contingency Table

                     Attribute B
Attribute A   1      2      ...    c      Totals
1             O11    O12    ...    O1c    R1
2             O21    O22    ...    O2c    R2
.             .      .             .      .
.             .      .             .      .
.             .      .             .      .
r             Or1    Or2    ...    Orc    Rr
Totals        C1     C2     ...    Cc     n



Test for Association


• Consider n observations tabulated in an r x c contingency table
• Denote by Oij the number of observations in the cell that is in the ith row and the jth column
• The null hypothesis is
  H0: No association exists between the two attributes in the population
• The appropriate test is a chi-square test with (r-1)(c-1) degrees of freedom



Test for Association
(continued)



• Let R_i and C_j be the row and column totals
• The expected number of observations in the cell in row i and column j, given that H0 is true, is

\[ E_{ij} = \frac{R_i C_j}{n} \]

• A test of association at significance level \alpha is based on the chi-square distribution and the following decision rule:

\[ \text{Reject } H_0 \text{ if } \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} > \chi^2_{(r-1)(c-1),\,\alpha} \]
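The expected counts and the test statistic for an r x c table can be sketched as follows. The function name `chi_square_association` is an illustrative choice. The counts used for the demonstration are the hand preference by gender counts from this chapter's contingency table example (12 and 108 for females, 24 and 156 for males).

```python
def chi_square_association(table):
    """Return (chi-square statistic, degrees of freedom, expected counts) for an r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]                                # R_i
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]      # C_j
    n = sum(row_totals)

    statistic = 0.0
    expected = []
    for i in range(n_rows):
        expected_row = []
        for j in range(n_cols):
            e_ij = row_totals[i] * col_totals[j] / n                        # E_ij = R_i * C_j / n
            expected_row.append(e_ij)
            statistic += (table[i][j] - e_ij) ** 2 / e_ij
        expected.append(expected_row)

    df = (n_rows - 1) * (n_cols - 1)
    return statistic, df, expected


# Rows: Female, Male. Columns: Left, Right (counts from the chapter's example).
observed = [[12, 108], [24, 156]]
statistic, df, expected = chi_square_association(observed)
print(round(statistic, 4), df)
```

The statistic would then be compared with the chi-square critical value for (r-1)(c-1) degrees of freedom at the chosen significance level.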


Contingency Table Example
Left-Handed vs. Gender
• Dominant Hand: Left vs. Right
• Gender: Male vs. Female
H0: There is no association between hand preference and gender
H1: Hand preference is not independent of gender




Contingency Table Example
(continued)

Sample results organized in a contingency table:

sample size = n = 300:
• 120 Females, 12 were left handed
• 180 Males, 24 were left handed

              Hand Preference
Gender     Left    Right    Total
Female     12      108      120
Male       24      156      180
Total      36      264      300



Logic of the Test
H0: There is no association between
hand preference and gender
H1: Hand preference is not independent of gender


• If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
• The two proportions above should be the same as the proportion of left-handed people overall
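The proportions in question can be computed from the sample counts given earlier. This is a sketch with illustrative variable names. It also shows the count of left-handed people each group would contain if the overall proportion applied to both groups, which is what H0 implies.

```python
# Counts from the contingency table: left-handed people and group sizes.
left_handed = {"Female": 12, "Male": 24}
group_size = {"Female": 120, "Male": 180}

# Sample proportion of left-handed people within each gender.
proportions = {g: left_handed[g] / group_size[g] for g in left_handed}

# Overall proportion of left-handed people in the sample.
overall = sum(left_handed.values()) / sum(group_size.values())

# Under H0, each group is expected to contain overall * (group size) left-handed people.
expected_left = {g: overall * group_size[g] for g in group_size}

print({g: round(p, 4) for g, p in proportions.items()}, round(overall, 4))
print({g: round(e, 1) for g, e in expected_left.items()})
```

The sample proportions differ (about 0.10 for females and 0.13 for males against 0.12 overall); the chi-square test assesses whether a gap of that size is plausible under H0.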
