Ebook Statistics for business and economics (12/E): Part 2

CHAPTER 12

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

CONTENTS

STATISTICS IN PRACTICE: UNITED WAY

12.1 TESTING THE EQUALITY OF POPULATION PROPORTIONS FOR THREE OR MORE POPULATIONS
     A Multiple Comparison Procedure

12.2 TEST OF INDEPENDENCE

12.3 GOODNESS OF FIT TEST
     Multinomial Probability Distribution
     Normal Probability Distribution

Copyright 2014 Nelson Education Ltd. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content
may be suppressed from the eBook and/or eChapter(s). Nelson Education reserves the right to remove additional content at any time if subsequent rights restrictions require it.




STATISTICS in PRACTICE

UNITED WAY*

ROCHESTER, NEW YORK

United Way of Greater Rochester is a nonprofit organization dedicated to improving the quality of life for all
people in the seven counties it serves by meeting the
community’s most important human care needs.
The annual United Way/Red Cross fund-raising campaign funds hundreds of programs offered by more than
200 service providers. These providers meet a wide variety
of human needs—physical, mental, and social—and serve
people of all ages, backgrounds, and economic means.
The United Way of Greater Rochester decided to
conduct a survey to learn more about community perceptions of charities. Focus-group interviews were held
with professional, service, and general worker groups
to obtain preliminary information on perceptions. The
information obtained was then used to help develop the
questionnaire for the survey. The questionnaire was
pretested, modified, and distributed to 440 individuals.
A variety of descriptive statistics, including frequency distributions and crosstabulations, were provided from the data collected. An important part of the
analysis involved the use of chi-square tests of independence. One use of such statistical tests was to determine
whether perceptions of administrative expenses were
independent of the occupation of the respondent.
The hypotheses for the test of independence were:

H0: Perception of United Way administrative
expenses is independent of the occupation of
the respondent.
Ha: Perception of United Way administrative
expenses is not independent of the occupation
of the respondent.
Two questions in the survey provided categorical data
for the statistical test. One question obtained data on
perceptions of the percentage of funds going to administrative expenses (up to 10%, 11–20%, and 21% or more).
The other question asked for the occupation of the
respondent.
The test of independence led to rejection of the
null hypothesis and to the conclusion that perception
of United Way administrative expenses is not independent of the occupation of the respondent. Actual administrative expenses were less than 9%, but 35% of the
respondents perceived that administrative expenses were
21% or more. Hence, many respondents had inaccurate
perceptions of administrative expenses. In this group,
production-line, clerical, sales, and professional-technical
employees had the more inaccurate perceptions.
The community perceptions study helped United
Way of Rochester develop adjustments to its programs
and fund-raising activities. In this chapter, you will
learn how tests such as the one described here are conducted.

[Photo: United Way programs meet the needs of children as well as adults. © Jim West/Alamy.]

*The authors are indebted to Dr. Philip R. Tyler, marketing consultant to
the United Way, for providing this Statistics in Practice.

In Chapters 9, 10, and 11 we introduced methods of statistical inference for hypothesis tests
about the means, proportions, and variances of one and two populations. In this chapter,
we introduce three additional hypothesis-testing procedures that expand our capacity for
making statistical inferences about populations.
The test statistic used in conducting the hypothesis tests in this chapter is based on the
chi-square (χ²) distribution. In all cases, the data are categorical. These chi-square tests are
versatile and expand hypothesis testing with the following applications.
1. Testing the equality of population proportions for three or more populations
2. Testing the independence of two categorical variables
3. Testing whether a probability distribution for a population follows a specific historical or theoretical probability distribution

We begin by considering hypothesis tests for the equality of population proportions for
three or more populations.

12.1 Testing the Equality of Population Proportions for Three or More Populations

In Section 10.2 we introduced methods of statistical inference for population proportions
with two populations where the hypothesis test conclusion was based on the standard
normal (z) test statistic. We now show how the chi-square (χ²) test statistic can be used to
make statistical inferences about the equality of population proportions for three or more
populations. Using the notation

p1 = population proportion for population 1
p2 = population proportion for population 2
. . .
and
pk = population proportion for population k

the hypotheses for the equality of population proportions for k ≥ 3 populations are as
follows:

H0: p1 = p2 = . . . = pk
Ha: Not all population proportions are equal

If the sample data and the chi-square test computations indicate H0 cannot be rejected, we
cannot detect a difference among the k population proportions. However, if the sample data
and the chi-square test computations indicate H0 can be rejected, we have the statistical
evidence to conclude that not all k population proportions are equal; that is, one or more
population proportions differ from the other population proportions. Further analyses can
be done to conclude which population proportion or proportions are significantly different
from others. Let us demonstrate this chi-square test by considering an application.

Organizations such as J.D. Power and Associates use the proportion of owners likely to
repurchase a particular automobile as an indication of customer loyalty for the automobile.
An automobile with a greater proportion of owners likely to repurchase is concluded to have
greater customer loyalty. Suppose that in a particular study we want to compare the customer loyalty for three automobiles: Chevrolet Impala, Ford Fusion, and Honda Accord.
The current owners of each of the three automobiles form the three populations for the
study. The three population proportions of interest are as follows:
p1 = proportion likely to repurchase an Impala for the population of Chevrolet Impala owners
p2 = proportion likely to repurchase a Fusion for the population of Ford Fusion owners
p3 = proportion likely to repurchase an Accord for the population of Honda Accord owners


TABLE 12.1  SAMPLE RESULTS OF LIKELY TO REPURCHASE FOR THREE POPULATIONS
            OF AUTOMOBILE OWNERS (OBSERVED FREQUENCIES)

WEB file: AutoLoyalty

                              Automobile Owners
Likely to
Repurchase     Chevrolet Impala     Ford Fusion     Honda Accord     Total
Yes                   69                120              123           312
No                    56                 80               52           188
Total                125                200              175           500

The hypotheses are stated as follows:
H0: p1 = p2 = p3
Ha: Not all population proportions are equal

In studies such as these, we often use the same sample size for each population. We have
chosen different sample sizes in this example to show that the chi-square test is not
restricted to equal sample sizes for each of the k populations.

To conduct this hypothesis test we begin by taking a sample of owners from each of the
three populations. Thus we will have a sample of Chevrolet Impala owners, a sample of
Ford Fusion owners, and a sample of Honda Accord owners. Each sample provides
categorical data indicating whether the respondents are likely or not likely to repurchase the
automobile. The data for samples of 125 Chevrolet Impala owners, 200 Ford Fusion
owners, and 175 Honda Accord owners are summarized in the tabular format shown in
Table 12.1. This table has two rows for the responses Yes and No and three columns, one
corresponding to each of the populations. The observed frequencies are summarized in the
six cells of the table corresponding to each combination of the likely to repurchase
responses and the three populations.
Using Table 12.1, we see that 69 of the 125 Chevrolet Impala owners indicated that
they were likely to repurchase a Chevrolet Impala. One hundred and twenty of the 200
Ford Fusion owners and 123 of the 175 Honda Accord owners indicated that they were
likely to repurchase their current automobile. Also, across all three samples, 312 of the
500 owners in the study indicated that they were likely to repurchase their current automobile. The question now is how do we analyze the data in Table 12.1 to determine if the
hypothesis H0: p1 = p2 = p3 should be rejected?
The data in Table 12.1 are the observed frequencies for each of the six cells that represent the six combinations of the likely to repurchase response and the owner population. If
we can determine the expected frequencies under the assumption H0 is true, we can use the
chi-square test statistic to determine whether there is a significant difference between the
observed and expected frequencies. If a significant difference exists between the observed
and expected frequencies, the hypothesis H0 can be rejected and there is evidence that not
all the population proportions are equal.
Expected frequencies for the six cells of the table are based on the following rationale.
First, we assume that the null hypothesis of equal population proportions is true. Then we
note that in the entire sample of 500 owners, a total of 312 owners indicated that they were
likely to repurchase their current automobile. Thus, 312/500 = .624 is the overall sample
proportion of owners indicating they are likely to repurchase their current automobile. If
H0: p1 = p2 = p3 is true, .624 would be the best estimate of the proportion responding likely
to repurchase for each of the automobile owner populations. So if the assumption of H0 is true,
we would expect .624 of the 125 Chevrolet Impala owners, or .624(125) = 78 owners to indicate they are likely to repurchase the Impala. Using the .624 overall sample proportion, we
would expect .624(200) = 124.8 of the 200 Ford Fusion owners and .624(175) = 109.2
of the Honda Accord owners to respond that they are likely to repurchase their respective
model of automobile.
Let us generalize the approach to computing expected frequencies by letting eij denote the expected frequency for the cell in row i and column j of the table. With this notation, now reconsider the expected frequency calculation for the response of likely to
repurchase Yes (row 1) for Chevrolet Impala owners (column 1), that is, the expected
frequency e11.
Note that 312 is the total number of Yes responses (row 1 total), 125 is the total sample
size for Chevrolet Impala owners (column 1 total), and 500 is the total sample size.
Following the logic in the preceding paragraph, we can show

$$e_{11} = \left(\frac{\text{Row 1 Total}}{\text{Total Sample Size}}\right)(\text{Column 1 Total}) = \left(\frac{312}{500}\right)125 = (.624)125 = 78$$

Starting with the first part of the above expression, we can write

$$e_{11} = \frac{(\text{Row 1 Total})(\text{Column 1 Total})}{\text{Total Sample Size}}$$

Generalizing this expression shows that the following formula can be used to provide the
expected frequencies under the assumption H0 is true.
EXPECTED FREQUENCIES UNDER THE ASSUMPTION H0 IS TRUE

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Total Sample Size}} \tag{12.1}$$

Using equation (12.1), we see that the expected frequency of Yes responses (row 1) for
Honda Accord owners (column 3) would be e13 = (Row 1 Total)(Column 3 Total)/(Total
Sample Size) = (312)(175)/500 = 109.2. Use equation (12.1) to verify the other expected
frequencies are as shown in Table 12.2.
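
As a rough illustration of equation (12.1), the following Python sketch recomputes the expected frequencies from the observed counts in Table 12.1. The function name `expected_frequencies` and the list-of-lists layout are illustrative choices made here, not something the text prescribes.

```python
# Observed frequencies from Table 12.1
# (rows: Yes, No; columns: Impala, Fusion, Accord).
observed = [
    [69, 120, 123],
    [56, 80, 52],
]


def expected_frequencies(table):
    """Return e_ij = (row i total)(column j total) / (total sample size)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(("Yes", "No"), expected):
    print(label, [round(e, 1) for e in row])
```

The printed rows should agree with the Yes and No rows of Table 12.2.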
The test procedure for comparing the observed frequencies of Table 12.1 with the
expected frequencies of Table 12.2 involves the computation of the following chi-square
statistic:

CHI-SQUARE TEST STATISTIC

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \tag{12.2}$$

where
fij = observed frequency for the cell in row i and column j
eij = expected frequency for the cell in row i and column j under the assumption
H0 is true
Note: In a chi-square test involving the equality of k population proportions, the
above test statistic has a chi-square distribution with k − 1 degrees of freedom provided the expected frequency is 5 or more for each cell.



TABLE 12.2  EXPECTED FREQUENCIES FOR LIKELY TO REPURCHASE FOR THREE
            POPULATIONS OF AUTOMOBILE OWNERS IF H0 IS TRUE

                              Automobile Owners
Likely to
Repurchase     Chevrolet Impala     Ford Fusion     Honda Accord     Total
Yes                   78               124.8            109.2          312
No                    47                75.2             65.8          188
Total                125               200              175            500

The chi-square test presented in this section is always a one-tailed test with the rejection of H0
occurring in the upper tail of the chi-square distribution.

Reviewing the expected frequencies in Table 12.2, we see that the expected frequency is
at least five for each cell in the table. We therefore proceed with the computation of the chi-square test statistic. The calculations necessary to compute the value of the test statistic are
shown in Table 12.3. In this case, we see that the value of the test statistic is χ² = 7.89.
In order to understand whether or not χ² = 7.89 leads us to reject H0: p1 = p2 = p3, you
will need to understand and refer to values of the chi-square distribution. Table 12.4 shows
the general shape of the chi-square distribution, but note that the shape of a specific
chi-square distribution depends upon the number of degrees of freedom. The table shows
the upper tail areas of .10, .05, .025, .01, and .005 for chi-square distributions with up to
15 degrees of freedom. This version of the chi-square table will enable you to conduct the
hypothesis tests presented in this chapter.
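
The test statistic computation of equation (12.2) can be sketched in a few lines of Python. This is only an illustration under the assumption that the observed counts are stored as a list of rows; the helper name `chi_square_statistic` is hypothetical. It also reports the smallest expected frequency so the "5 or more" condition can be checked.

```python
# Observed frequencies from Table 12.1
# (rows: Yes, No; columns: Impala, Fusion, Accord).
observed = [
    [69, 120, 123],
    [56, 80, 52],
]


def chi_square_statistic(table):
    """Return (chi-square statistic, smallest expected frequency)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, f_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # equation (12.1)
            statistic += (f_ij - e_ij) ** 2 / e_ij     # one term of equation (12.2)
            smallest_expected = min(smallest_expected, e_ij)
    return statistic, smallest_expected


chi_sq, min_expected = chi_square_statistic(observed)
print(f"chi-square = {chi_sq:.2f}")
print(f"smallest expected frequency = {min_expected:.1f}")
```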
Since the expected frequencies shown in Table 12.2 are based on the assumption that
H0: p1 = p2 = p3 is true, observed frequencies, fij, that are in agreement with expected
frequencies, eij, provide small values of (fij − eij)² in equation (12.2). If this is the case, the
value of the chi-square test statistic will be relatively small and H0 cannot be rejected. On
the other hand, if the differences between the observed and expected frequencies are large,
values of (fij − eij)² and the computed value of the test statistic will be large. In this case,
the null hypothesis of equal population proportions can be rejected. Thus a chi-square test
for equal population proportions will always be an upper tail test with rejection of H0 occurring when the test statistic is in the upper tail of the chi-square distribution.

TABLE 12.3  COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE TEST OF EQUAL
            POPULATION PROPORTIONS

                                                                                       Squared Difference
                             Observed      Expected                    Squared         Divided by
Likely to      Automobile    Frequency     Frequency     Difference    Difference      Expected Frequency
Repurchase?    Owner         (fij)         (eij)         (fij − eij)   (fij − eij)²    (fij − eij)²/eij
Yes            Impala           69            78.0          −9.0           81.00            1.04
Yes            Fusion          120           124.8          −4.8           23.04            0.18
Yes            Accord          123           109.2          13.8          190.44            1.74
No             Impala           56            47.0           9.0           81.00            1.72
No             Fusion           80            75.2           4.8           23.04            0.31
No             Accord           52            65.8         −13.8          190.44            2.89
Total                          500           500                                       χ² = 7.89

TABLE 12.4  SELECTED VALUES OF THE CHI-SQUARE DISTRIBUTION

[Figure: chi-square distribution, with the upper tail area or probability to the right of χ²]

Degrees                       Area in Upper Tail
of Freedom      .10        .05        .025       .01        .005
 1             2.706      3.841      5.024      6.635      7.879
 2             4.605      5.991      7.378      9.210     10.597
 3             6.251      7.815      9.348     11.345     12.838
 4             7.779      9.488     11.143     13.277     14.860
 5             9.236     11.070     12.832     15.086     16.750
 6            10.645     12.592     14.449     16.812     18.548
 7            12.017     14.067     16.013     18.475     20.278
 8            13.362     15.507     17.535     20.090     21.955
 9            14.684     16.919     19.023     21.666     23.589
10            15.987     18.307     20.483     23.209     25.188
11            17.275     19.675     21.920     24.725     26.757
12            18.549     21.026     23.337     26.217     28.300
13            19.812     22.362     24.736     27.688     29.819
14            21.064     23.685     26.119     29.141     31.319
15            22.307     24.996     27.488     30.578     32.801

We can use the upper tail area of the appropriate chi-square distribution and the p-value
approach to determine whether the null hypothesis can be rejected. In the automobile brand
loyalty study, the three owner populations indicate that the appropriate chi-square
distribution has k − 1 = 3 − 1 = 2 degrees of freedom. Using row two of the chi-square
distribution table, we have the following:

Area in Upper Tail     .10       .05       .025      .01       .005
χ² Value (2 df)       4.605     5.991     7.378     9.210     10.597

χ² = 7.89 falls between 7.378 and 9.210.

We see the upper tail area at χ² = 7.89 is between .025 and .01. Thus, the corresponding
upper tail area or p-value must be between .025 and .01. With p-value ≤ .05, we reject
H0 and conclude that the three population proportions are not all equal and thus there is
a difference in brand loyalties among the Chevrolet Impala, Ford Fusion, and Honda
Accord owners. Minitab or Excel procedures provided in Appendix F can be used to show
χ² = 7.89 with 2 degrees of freedom yields a p-value = .0193.
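
The p-value quoted above can also be checked without Minitab or Excel. The sketch below relies on a standard fact that holds only for 2 degrees of freedom: the upper tail area of the chi-square distribution at x equals exp(−x/2). The function name `upper_tail_area_2df` is an illustrative assumption, and the statistic is recomputed from Table 12.1 so that the unrounded value is used.

```python
import math

# Observed frequencies from Table 12.1
# (rows: Yes, No; columns: Impala, Fusion, Accord).
observed = [[69, 120, 123], [56, 80, 52]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Unrounded chi-square statistic from equations (12.1) and (12.2).
chi_sq = sum(
    (f - r * c / n) ** 2 / (r * c / n)
    for r, row in zip(row_totals, observed)
    for c, f in zip(col_totals, row)
)


def upper_tail_area_2df(x):
    """Upper tail area P(X >= x) for a chi-square variable with 2 df only."""
    return math.exp(-x / 2.0)


p_value = upper_tail_area_2df(chi_sq)
print(f"chi-square = {chi_sq:.4f}, p-value = {p_value:.4f}")
```

For other degrees of freedom this closed form does not apply and a statistical package or a numerical routine would be needed.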

Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the chi-square test
statistic is χ² = 5.991. The upper tail rejection region becomes

Reject H0 if χ² ≥ 5.991

With 7.89 ≥ 5.991, we reject H0. Thus, the p-value approach and the critical value approach
provide the same hypothesis-testing conclusion.
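
As a small check on the critical value approach, the sketch below reproduces row two of Table 12.4. It again uses the 2-degrees-of-freedom special case: solving exp(−x/2) = α gives the critical value −2 ln α. The function name `critical_value_2df` is hypothetical, and the approach does not carry over to other degrees of freedom.

```python
import math


def critical_value_2df(alpha):
    """Chi-square value with upper tail area alpha when df = 2."""
    return -2.0 * math.log(alpha)


for level in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(f"alpha = {level:<5}  critical value = {critical_value_2df(level):.3f}")

chi_sq = 7.89   # test statistic from Table 12.3
alpha = 0.05
reject_h0 = chi_sq >= critical_value_2df(alpha)
print("Reject H0" if reject_h0 else "Do not reject H0")
```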
Let us summarize the general steps that can be used to conduct a chi-square test for the
equality of the population proportions for three or more populations.

A CHI-SQUARE TEST FOR THE EQUALITY OF POPULATION PROPORTIONS
FOR k ≥ 3 POPULATIONS
1. State the null and alternative hypotheses
   H0: p1 = p2 = . . . = pk
   Ha: Not all population proportions are equal
2. Select a random sample from each of the populations and record the observed
   frequencies, fij, in a table with 2 rows and k columns
3. Assume the null hypothesis is true and compute the expected frequencies, eij
4. If the expected frequency, eij, is 5 or more for each cell, compute the test
   statistic:

   $$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Rejection rule:
   p-value approach: Reject H0 if p-value ≤ α
   Critical value approach: Reject H0 if χ² ≥ χ²α
   where the chi-square distribution has k − 1 degrees of freedom and α is the
   level of significance for the test.
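
The five steps above can be sketched as a single Python function. This is a minimal illustration, not the procedure used by Minitab or Excel: the names `chi2_upper_tail` and `equal_proportions_test` are hypothetical, and the upper tail area is approximated with the standard series for the regularized lower incomplete gamma function so that only the standard library is needed.

```python
import math


def chi2_upper_tail(x, df):
    """Approximate P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:          # sum the series until terms are negligible
        term *= z / (a + n)
        total += term
        n += 1
    lower_tail = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower_tail)


def equal_proportions_test(table, alpha=0.05):
    """Chi-square test of H0: p1 = p2 = ... = pk for a 2 x k table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]  # step 3
    if min(min(row) for row in expected) < 5:                         # step 4 condition
        raise ValueError("an expected frequency is below 5")
    statistic = sum(
        (f - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for f, e in zip(obs_row, exp_row)
    )
    df = len(col_totals) - 1
    p_value = chi2_upper_tail(statistic, df)
    return statistic, df, p_value, p_value <= alpha                   # step 5


observed = [[69, 120, 123], [56, 80, 52]]  # Table 12.1
statistic, df, p_value, reject_h0 = equal_proportions_test(observed)
print(f"chi-square = {statistic:.2f}, df = {df}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Raising an error when an expected frequency falls below 5 is one possible design choice; a caller might instead prefer a warning.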


A Multiple Comparison Procedure
We have used a chi-square test to conclude that the population proportions for the three populations of automobile owners are not all equal. Thus, some differences among the population proportions exist and the study indicates that customer loyalties are not all the same for
the Chevrolet Impala, Ford Fusion, and Honda Accord owners. To identify where the
differences between population proportions exist, we can begin by computing the three
sample proportions as follows:
Brand Loyalty Sample Proportions
Chevrolet Impala     p̄1 = 69/125 = .5520
Ford Fusion          p̄2 = 120/200 = .6000
Honda Accord         p̄3 = 123/175 = .7029

Since the chi-square test indicated that not all population proportions are equal, it is
reasonable for us to proceed by attempting to determine where differences among the
population proportions exist. For this we will rely on a multiple comparison procedure that
can be used to conduct statistical tests between all pairs of population proportions. In the following, we discuss a multiple comparison procedure known as the Marascuilo procedure.
This is a relatively straightforward procedure for making pairwise comparisons of all pairs
of population proportions. We will demonstrate the computations required by this multiple
comparison test procedure for the automobile customer loyalty study.
We begin by computing the absolute value of the pairwise difference between sample
proportions for each pair of populations in the study. In the three-population automobile
brand loyalty study we compare populations 1 and 2, populations 1 and 3, and then populations 2 and 3 using the sample proportions as follows:
Chevrolet Impala and Ford Fusion:      |p̄1 − p̄2| = |.5520 − .6000| = .0480
Chevrolet Impala and Honda Accord:     |p̄1 − p̄3| = |.5520 − .7029| = .1509
Ford Fusion and Honda Accord:          |p̄2 − p̄3| = |.6000 − .7029| = .1029
In a second step, we select a level of significance and compute the corresponding critical
value for each pairwise comparison using the following expression.
CRITICAL VALUES FOR THE MARASCUILO PAIRWISE COMPARISON
PROCEDURE FOR k POPULATION PROPORTIONS
For each pairwise comparison compute a critical value as follows:

$$CV_{ij} = \sqrt{\chi^2_{\alpha}}\,\sqrt{\frac{\bar{p}_i(1 - \bar{p}_i)}{n_i} + \frac{\bar{p}_j(1 - \bar{p}_j)}{n_j}} \tag{12.3}$$

where
χ²α = chi-square with a level of significance α and k − 1 degrees of freedom
p̄i and p̄j = sample proportions for populations i and j
ni and nj = sample sizes for populations i and j
Using the chi-square distribution in Table 12.4, k − 1 = 3 − 1 = 2 degrees of freedom,
and a .05 level of significance, we have χ².05 = 5.991. Now using the sample proportions
p̄1 = .5520, p̄2 = .6000, and p̄3 = .7029, the critical values for the three pairwise comparison tests are as follows:

Chevrolet Impala and Ford Fusion

$$CV_{12} = \sqrt{5.991}\,\sqrt{\frac{.5520(1 - .5520)}{125} + \frac{.6000(1 - .6000)}{200}} = .1380$$

Chevrolet Impala and Honda Accord

$$CV_{13} = \sqrt{5.991}\,\sqrt{\frac{.5520(1 - .5520)}{125} + \frac{.7029(1 - .7029)}{175}} = .1379$$

Ford Fusion and Honda Accord

$$CV_{23} = \sqrt{5.991}\,\sqrt{\frac{.6000(1 - .6000)}{200} + \frac{.7029(1 - .7029)}{175}} = .1198$$

If the absolute value of any pairwise sample proportion difference |p̄i − p̄j| exceeds its
corresponding critical value, CVij, the pairwise difference is significant at the .05 level of
significance and we can conclude that the two corresponding population proportions are
different. The final step of the pairwise comparison procedure is summarized in Table 12.5.

TABLE 12.5  PAIRWISE COMPARISON TESTS FOR THE AUTOMOBILE BRAND LOYALTY STUDY

                                                                   Significant if
Pairwise Comparison                   |p̄i − p̄j|      CVij        |p̄i − p̄j| > CVij
Chevrolet Impala vs. Ford Fusion        .0480         .1380       Not significant
Chevrolet Impala vs. Honda Accord       .1509         .1379       Significant
Ford Fusion vs. Honda Accord            .1029         .1198       Not significant

The conclusion from the pairwise comparison procedure is that the only significant
difference in customer loyalty occurs between the Chevrolet Impala and the Honda Accord.
Our sample results indicate that the Honda Accord had a greater population proportion of
owners who say they are likely to repurchase the Honda Accord. Thus, we can conclude
that the Honda Accord (p̄3 = .7029) has a greater customer loyalty than the Chevrolet
Impala (p̄1 = .5520).
The results of the study are inconclusive as to the comparative loyalty of the Ford Fusion.
While the Ford Fusion did not show significantly different results when compared to the
Chevrolet Impala or Honda Accord, a larger sample may have revealed a significant difference
between Ford Fusion and the other two automobiles in terms of customer loyalty. It is not
uncommon for a multiple comparison procedure to show significance for some pairwise
comparisons and yet not show significance for other pairwise comparisons in the study.
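
The Marascuilo computations for the automobile study can be sketched in Python as follows. The function name `marascuilo` and the dictionary layout are illustrative assumptions. The critical chi-square value 5.991 is taken from Table 12.4, and because the code works with unrounded sample proportions, the critical values it prints may differ from the hand computations in the fourth decimal place.

```python
import math
from itertools import combinations

# (number of Yes responses, sample size) for each population, from Table 12.1.
samples = {
    "Chevrolet Impala": (69, 125),
    "Ford Fusion": (120, 200),
    "Honda Accord": (123, 175),
}
CHI2_CRITICAL = 5.991  # Table 12.4: upper tail area .05, k - 1 = 2 df


def marascuilo(samples, chi2_critical):
    """Return (name_i, name_j, |p_i - p_j|, CV_ij, significant) for every pair."""
    results = []
    for (name_i, (yes_i, n_i)), (name_j, (yes_j, n_j)) in combinations(samples.items(), 2):
        p_i, p_j = yes_i / n_i, yes_j / n_j
        difference = abs(p_i - p_j)
        critical = math.sqrt(chi2_critical) * math.sqrt(
            p_i * (1 - p_i) / n_i + p_j * (1 - p_j) / n_j   # equation (12.3)
        )
        results.append((name_i, name_j, difference, critical, difference > critical))
    return results


results = marascuilo(samples, CHI2_CRITICAL)
for name_i, name_j, difference, critical, significant in results:
    verdict = "Significant" if significant else "Not significant"
    print(f"{name_i} vs. {name_j}: diff = {difference:.4f}, CV = {critical:.4f}, {verdict}")
```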
NOTES AND COMMENTS
1. In Chapter 10, we used the standard normal distribution and the z test statistic to conduct
   hypothesis tests about the proportions of two populations. However, the chi-square test
   introduced in this section can also be used to conduct the hypothesis test that the proportions
   of two populations are equal. The results will be the same under both test procedures and the
   value of the test statistic χ² will be equal to the square of the value of the test statistic z. An
   advantage of the methodology in Chapter 10 is that it can be used for either a one-tailed or a
   two-tailed hypothesis about the proportions of two populations whereas the chi-square test in
   this section can be used only for two-tailed tests. Exercise 12.6 will give you a chance to
   use the chi-square test for the hypothesis that the proportions of two populations are equal.
2. Each of the k populations in this section had two response outcomes, Yes or No. In effect,
   each population had a binomial distribution with parameter p the population proportion
   of Yes responses. An extension of the chi-square procedure in this section applies when
   each of the k populations has three or more possible responses. In this case, each population
   is said to have a multinomial distribution. The chi-square calculations for the
   expected frequencies, eij, and the test statistic, χ², are the same as shown in expressions
   (12.1) and (12.2). The only difference is that the null hypothesis assumes that the
   multinomial distribution for the response variable is the same for all populations. With r
   responses for each of the k populations, the chi-square test statistic has (r − 1)(k − 1)
   degrees of freedom. Exercise 12.8 will give you a chance to use the chi-square test to
   compare three populations with multinomial distributions.
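
The relationship χ² = z² for two populations can be checked numerically. The sketch below uses hypothetical counts (60 Yes out of 150 and 90 Yes out of 180) that do not come from the text, and it assumes the z statistic is computed with the pooled estimate of p. The `chi_square` helper is also an illustrative name; it accepts any r × k table and returns (r − 1)(k − 1) degrees of freedom, as described in note 2.

```python
import math


def chi_square(table):
    """Return (statistic, df) for an r x k table of counts; df = (r - 1)(k - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = sum(
        (f - r * c / n) ** 2 / (r * c / n)
        for r, row in zip(row_totals, table)
        for c, f in zip(col_totals, row)
    )
    return statistic, (len(row_totals) - 1) * (len(col_totals) - 1)


def pooled_z(yes1, n1, yes2, n2):
    """z statistic for H0: p1 = p2 using the pooled estimate of p (an assumption)."""
    p1, p2 = yes1 / n1, yes2 / n2
    pooled = (yes1 + yes2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))


# Hypothetical data, not taken from the text.
yes1, n1 = 60, 150
yes2, n2 = 90, 180
table = [[yes1, yes2], [n1 - yes1, n2 - yes2]]

chi_sq, df = chi_square(table)
z = pooled_z(yes1, n1, yes2, n2)
print(f"z = {z:.4f}, z squared = {z * z:.4f}, chi-square = {chi_sq:.4f}, df = {df}")
```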


Exercises

Methods

1. (SELF test) Use the sample data below to test the hypotheses
   H0: p1 = p2 = p3
   Ha: Not all population proportions are equal
   where pi is the population proportion of Yes responses for population i. Using a .05 level
   of significance, what is the p-value and what is your conclusion?

                     Populations
   Response       1       2       3
   Yes           150     150      96
   No            100     150     104

2. (SELF test) Reconsider the observed frequencies in exercise 1.
   a. Compute the sample proportion for each population.
   b. Use the multiple comparison procedure to determine which population proportions
      differ significantly. Use a .05 level of significance.

Applications
3.

The sample data below represent the number of late and on time flights for Delta, United,
and US Airways (Bureau of Transportation Statistics, March 2012).
Airline

Flight
Late
On Time

a.
b.
c.

SELF test

4.

Delta
39
261

United
51
249

US Airways
56
344

Formulate the hypotheses for a test that will determine if the population proportion of
late flights is the same for all three airlines.
Conduct the hypothesis test with a .05 level of significance. What is the p-value and
what is your conclusion?
Compute the sample proportion of late flights for each airline. What is the overall
proportion of late flights for the three airlines?


Benson Manufacturing is considering ordering electronic components from three different
suppliers. The suppliers may differ in terms of quality in that the proportion or percentage
of defective components may differ among the suppliers. To evaluate the proportion of
defective components for the suppliers, Benson has requested a sample shipment of 500
components from each supplier. The number of defective components and the number of
good components found in each shipment are as follows.
                    Supplier
Component       A       B       C
Defective      15      20      40
Good          485     480     460

a. Formulate the hypotheses that can be used to test for equal proportions of defective
   components provided by the three suppliers.
b. Using a .05 level of significance, conduct the hypothesis test. What is the p-value and
   what is your conclusion?
c. Conduct a multiple comparison test to determine if there is an overall best supplier or
   if one supplier can be eliminated because of poor quality.

5. Kate Sanders, a researcher in the department of biology at IPFW University, studied the
effect of agriculture contaminants on the stream fish population in Northeastern Indiana
(April 2012). Specially designed traps collected samples of fish at each of four stream
locations. A research question was, Did the differences in agricultural contaminants found
at the four locations alter the proportion of the fish population by gender? Observed
frequencies were as follows.
               Stream Locations
Gender       A      B      C      D
Male        49     44     49     39
Female      41     46     36     44

a. Focusing on the proportion of male fish at each location, test the hypothesis that the
   population proportions are equal for all four locations. Use a .05 level of significance.
   What is the p-value and what is your conclusion?
b. Does it appear that differences in agricultural contaminants found at the four locations
   altered the fish population by gender?

Exercise 6 shows a chi-square test can be used when the hypothesis is about the equality of two population proportions.

6. A tax preparation firm is interested in comparing the quality of work at two of its regional
offices. The observed frequencies showing the number of sampled returns with errors and
the number of sampled returns that were correct are as follows.

              Regional Office
Return      Office 1    Office 2
Error          35          27
Correct       215         273

a. What are the sample proportions of returns with errors at the two offices?
b. Use the chi-square test procedure to see if there is a significant difference between
   the population proportion of error rates for the two offices. Test the null hypothesis
   H0: p1 = p2 with a .10 level of significance. What is the p-value and what is your
   conclusion? Note: We generally use the chi-square test of equal proportions when
   there are three or more populations, but this example shows that the same chi-square
   test can be used for testing equal proportions with two populations.
c. In Section 10.2, a z test was used to conduct the above test. Either a $\chi^2$ test statistic
   or a z test statistic may be used to test the hypothesis. However, when we want to make
   inferences about the proportions for two populations, we generally prefer the z test
   statistic procedure. Refer to the Notes and Comments at the end of this section and
   comment on why the z test statistic provides the user with more options for inferences
   about the proportions of two populations.

7. Social networking is becoming more and more popular around the world. Pew Research
Center used a survey of adults in several countries to determine the percentage of adults
who use social networking sites (USA Today, February 8, 2012). Assume that the results
for surveys in Great Britain, Israel, Russia, and United States are as follows.

                                   Country
Use Social
Networking Sites    Great Britain    Israel    Russia    United States
Yes                      344           265       301          500
No                       456           235       399          500

a. Conduct a hypothesis test to determine whether the proportion of adults using social
   networking sites is equal for all four countries. What is the p-value? Using a .05 level
   of significance, what is your conclusion?
b. What are the sample proportions for each of the four countries? Which country has the
   largest proportion of adults using social networking sites?
c. Using a .05 level of significance, conduct multiple pairwise comparison tests among
   the four countries. What is your conclusion?

Exercise 8 shows a chi-square test can also be used for multiple population tests when the categorical response variable has three or more outcomes.

8. A manufacturer is considering purchasing parts from three different suppliers. The parts
received from the suppliers are classified as having a minor defect, having a major defect,
or being good. Test results from samples of parts received from each of the three suppliers
are shown below. Note that any test with these data is no longer a test of proportions for
the three supplier populations because the categorical response variable has three
outcomes: minor defect, major defect, and good.
                     Supplier
Part Tested        A       B       C
Minor Defect      15      13      21
Major Defect       5      11       5
Good             130     126     124

Using the data above, conduct a hypothesis test to determine if the distribution of defects
is the same for the three suppliers. Use the chi-square test calculations as presented in
this section with the exception that a table with r rows and c columns results in a chi-square test statistic with (r – 1)(c – 1) degrees of freedom. Using a .05 level of significance, what is the p-value and what is your conclusion?

12.2 Test of Independence
An important application of a chi-square test involves using sample data to test for the independence of two categorical variables. For this test we take one sample from a population and record the observations for two categorical variables. We will summarize the data
by counting the number of responses for each combination of a category for variable 1 and
a category for variable 2. The null hypothesis for this test is that the two categorical
variables are independent. Thus, the test is referred to as a test of independence. We will
illustrate this test with the following example.
A beer industry association conducts a survey to determine the preferences of beer
drinkers for light, regular, and dark beers. A sample of 200 beer drinkers is taken with each
person in the sample asked to indicate a preference for one of the three types of beers: light,
regular, or dark. At the end of the survey questionnaire, the respondent is asked to provide information on a variety of demographics including gender: male or female. A research question of interest to the association is whether preference for the three types of beer is
independent of the gender of the beer drinker. If the two categorical variables, beer preference

and gender, are independent, beer preference does not depend on gender and the preference
for light, regular, and dark beer can be expected to be the same for male and female beer
drinkers. However, if the test conclusion is that the two categorical variables are not independent, we have evidence that beer preference is associated or dependent upon the gender of
the beer drinker. As a result, we can expect beer preferences to differ for male and female beer
drinkers. In this case, a beer manufacturer could use this information to customize its promotions and advertising for the different target markets of male and female beer drinkers.
The hypotheses for this test of independence are as follows:
H0: Beer preference is independent of gender
Ha: Beer preference is not independent of gender
The sample data will be summarized in a two-way table with beer preferences of light,
regular, and dark as one of the variables and gender of male and female as the other variable. Since an objective of the study is to determine if there is a difference between the beer
preferences for male and female beer drinkers, we consider gender an explanatory variable
and follow the usual practice of making the explanatory variable the column variable in the
data tabulation table. The beer preference is the categorical response variable and is shown
as the row variable. The sample results of the 200 beer drinkers in the study are summarized in Table 12.6.
The sample data are summarized based on the combination of beer preference and
gender for the individual respondents. For example, 51 individuals in the study were males
who preferred light beer, 56 individuals in the study were males who preferred regular beer,
and so on. Let us now analyze the data in the table and test for independence of beer preference and gender.
First of all, since we selected a sample of beer drinkers, summarizing the data for each
variable separately will provide some insights into the characteristics of the beer drinker
population. For the categorical variable gender, we see 132 of the 200 in the sample were
male. This gives us the estimate that 132/200 = .66, or 66%, of the beer drinker population
is male. Similarly we estimate that 68/200 = .34, or 34%, of the beer drinker population is
female. Thus male beer drinkers appear to outnumber female beer drinkers approximately
2 to 1. Sample proportions or percentages for the three types of beer are

Prefer Light Beer      90/200 = .450, or 45.0%
Prefer Regular Beer    77/200 = .385, or 38.5%
Prefer Dark Beer       33/200 = .165, or 16.5%

Across all beer drinkers in the sample, light beer is preferred most often and dark beer is
preferred least often.
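
The marginal proportions above can be reproduced with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the variable names are assumptions made for the example, and the counts are the observed frequencies of Table 12.6.

```python
# Observed frequencies from Table 12.6.
# Each entry is (male count, female count) for one beer preference.
observed = {
    "Light": (51, 39),
    "Regular": (56, 21),
    "Dark": (25, 8),
}

n = sum(sum(counts) for counts in observed.values())
male_total = sum(counts[0] for counts in observed.values())
female_total = sum(counts[1] for counts in observed.values())

# Share of the sample in each gender category.
gender_share = {"Male": male_total / n, "Female": female_total / n}

# Share of the sample preferring each type of beer (row total / n).
preference_share = {beer: sum(counts) / n for beer, counts in observed.items()}

print(gender_share)
print(preference_share)
```

These are descriptive summaries of each variable taken separately; they say nothing yet about whether the two variables are independent.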
TABLE 12.6  SAMPLE RESULTS FOR BEER PREFERENCES OF MALE AND FEMALE
BEER DRINKERS (OBSERVED FREQUENCIES)

WEB file: BeerPreference

                        Gender
Beer Preference    Male    Female    Total
Light                51       39        90
Regular              56       21        77
Dark                 25        8        33
Total               132       68       200

Let us now conduct the chi-square test to determine if beer preference and gender
are independent. The computations and formulas used are the same as those used for the
chi-square test in Section 12.1. Utilizing the observed frequencies in Table 12.6 for row
i and column j, $f_{ij}$, we compute the expected frequencies, $e_{ij}$, under the assumption that
the beer preferences and gender are independent. The computation of the expected
frequencies follows the same logic and formula used in Section 12.1. Thus the expected
frequency for row i and column j is given by

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}} \qquad (12.4)$$

TABLE 12.7  EXPECTED FREQUENCIES IF BEER PREFERENCE IS INDEPENDENT
OF THE GENDER OF THE BEER DRINKER

                        Gender
Beer Preference    Male      Female    Total
Light               59.40     30.60       90
Regular             50.82     26.18       77
Dark                21.78     11.22       33
Total              132        68         200

For example, $e_{11}$ = (90)(132)/200 = 59.40 is the expected frequency for male beer drinkers
who would prefer light beer if beer preference is independent of gender. Show that
equation (12.4) can be used to find the other expected frequencies shown in Table 12.7.
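
As a check on equation (12.4), the short Python sketch below computes every expected frequency from the row and column totals of Table 12.6. The list-of-lists layout and the variable names are assumptions chosen for the illustration, not part of the text's procedure.

```python
# Observed frequencies from Table 12.6: one row per beer preference,
# columns are male and female.
observed = [
    [51, 39],  # Light
    [56, 21],  # Regular
    [25, 8],   # Dark
]

row_totals = [sum(row) for row in observed]        # totals for Light, Regular, Dark
col_totals = [sum(col) for col in zip(*observed)]  # totals for Male, Female
n = sum(row_totals)                                # sample size

# Equation (12.4): e_ij = (row i total)(column j total) / sample size
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

Each row of `expected` should agree with the corresponding row of Table 12.7, and the expected frequencies in any row or column add up to the same total as the observed frequencies.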
Following the chi-square test procedure discussed in Section 12.1, we use the following
expression to compute the value of the chi-square test statistic.
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \qquad (12.5)$$

With r rows and c columns in the table, the chi-square distribution will have (r – 1)(c – 1) degrees of freedom provided the expected frequency is at least 5 for each cell. Thus, in this
application we will use a chi-square distribution with (3 – 1)(2 – 1) = 2 degrees of freedom.
The complete steps to compute the chi-square test statistic are summarized in Table 12.8.
TABLE 12.8  COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE TEST
OF INDEPENDENCE BETWEEN BEER PREFERENCE AND GENDER

                                                                           Squared Difference
                        Observed     Expected                 Squared      Divided by
Beer                    Frequency    Frequency   Difference   Difference   Expected Frequency
Preference   Gender     $f_{ij}$     $e_{ij}$    $(f_{ij} - e_{ij})$   $(f_{ij} - e_{ij})^2$   $(f_{ij} - e_{ij})^2 / e_{ij}$
Light        Male          51         59.40        -8.40        70.56          1.19
Light        Female        39         30.60         8.40        70.56          2.31
Regular      Male          56         50.82         5.18        26.83           .53
Regular      Female        21         26.18        -5.18        26.83          1.02
Dark         Male          25         21.78         3.22        10.37           .48
Dark         Female         8         11.22        -3.22        10.37           .92
             Total        200        200                                   $\chi^2$ = 6.45

We can use the upper tail area of the chi-square distribution with 2 degrees of freedom
and the p-value approach to determine whether the null hypothesis that beer preference
is independent of gender can be rejected. Using row two of the chi-square distribution
table shown in Table 12.4, we have the following:

Area in Upper Tail        .10      .05      .025     .01      .005
$\chi^2$ Value (2 df)     4.605    5.991    7.378    9.210    10.597

                                      $\chi^2$ = 6.45

Because $\chi^2$ = 6.45 falls between 5.991 and 7.378, the corresponding upper tail area or p-value must be between .05 and .025. With p-value ≤ .05, we reject H0
and conclude that beer preference is not independent of the gender of the beer drinker. Stated
another way, the study shows that beer preference can be expected to differ for male and female beer drinkers. Minitab or Excel procedures provided in Appendix F can be used to show
$\chi^2$ = 6.45 with two degrees of freedom yields a p-value = .0398.
Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the chi-square test
statistic is $\chi^2_{.05}$ = 5.991. The upper tail rejection region becomes

Reject H0 if $\chi^2 \ge 5.991$

With 6.45 ≥ 5.991, we reject H0. Again we see that the p-value approach and the critical
value approach provide the same conclusion.
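
Both approaches can be checked numerically without statistical software. The Python sketch below recomputes the test statistic of equation (12.5) from Table 12.6 and then relies on a shortcut that is an assumption of this example rather than part of the text: for exactly 2 degrees of freedom, the upper tail area of the chi-square distribution at x is exp(-x/2), so the critical value for level α is -2 ln(α). The shortcut does not hold for other degrees of freedom.

```python
import math

observed = [[51, 39], [56, 21], [25, 8]]  # Table 12.6: Light, Regular, Dark

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Equation (12.5): sum of (f_ij - e_ij)^2 / e_ij over all cells.
chi_sq = sum(
    (f - e) ** 2 / e
    for f_row, e_row in zip(observed, expected)
    for f, e in zip(f_row, e_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

alpha = 0.05
# Valid only because df == 2 here.
p_value = math.exp(-chi_sq / 2)
critical_value = -2 * math.log(alpha)

print(round(chi_sq, 2), df)
print(round(p_value, 4), p_value <= alpha)
print(round(critical_value, 3), chi_sq >= critical_value)
```

The two comparisons, p-value against α and test statistic against critical value, always lead to the same decision because they are two ways of locating the same point in the upper tail.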
While we now have evidence that beer preference and gender are not independent, we
will need to gain additional insight from the data to assess the nature of the association between these two variables. One way to do this is to compute the probability of the beer preference responses for males and females separately. These calculations are as follows:
Beer Preference    Male                         Female
Light              51/132 = .3864, or 38.64%    39/68 = .5735, or 57.35%
Regular            56/132 = .4242, or 42.42%    21/68 = .3088, or 30.88%
Dark               25/132 = .1894, or 18.94%     8/68 = .1176, or 11.76%
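
The gender-specific proportions are simply each observed frequency divided by its column total. A minimal Python sketch follows; the variable names and the nested dictionary `by_gender` are assumptions made for the illustration.

```python
# Observed frequencies from Table 12.6 as (male, female) pairs.
observed = {"Light": (51, 39), "Regular": (56, 21), "Dark": (25, 8)}

male_total = sum(m for m, _ in observed.values())
female_total = sum(f for _, f in observed.values())

# Proportion of each gender preferring each type of beer.
by_gender = {
    "Male": {beer: m / male_total for beer, (m, _) in observed.items()},
    "Female": {beer: f / female_total for beer, (_, f) in observed.items()},
}

for beer in observed:
    print(f"{beer:8s} male {by_gender['Male'][beer]:.4f}  "
          f"female {by_gender['Female'][beer]:.4f}")
```

Dividing by the column totals rather than the overall sample size is what makes the male and female distributions directly comparable even though the sample contains about twice as many males.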

The bar chart for male and female beer drinkers of the three kinds of beer is shown in
Figure 12.1.
FIGURE 12.1  BAR CHART COMPARISON OF BEER PREFERENCE BY GENDER
[Bar chart: horizontal axis Beer Preference (Light, Regular, Dark); vertical axis Probability (0 to 0.7); separate bars for Male and Female]


What observations can you make about the association between beer preference and
gender? For female beer drinkers in the sample, the highest preference is for light beer at
57.35%. For male beer drinkers in the sample, regular beer is most frequently preferred at
42.42%. While female beer drinkers have a higher preference for light beer than males, male
beer drinkers have a higher preference for both regular beer and dark beer. Data visualization through bar charts such as shown in Figure 12.1 is helpful in gaining insight as to how
two categorical variables are associated.
Before we leave this discussion, we summarize the steps for a test of independence.

CHI-SQUARE TEST FOR INDEPENDENCE OF TWO CATEGORICAL VARIABLES

1. State the null and alternative hypotheses.
   H0: The two categorical variables are independent
   Ha: The two categorical variables are not independent
2. Select a random sample from the population and collect data for both variables for every element in the sample. Record the observed frequencies, $f_{ij}$, in
   a table with r rows and c columns.
3. Assume the null hypothesis is true and compute the expected frequencies, $e_{ij}$.
4. If the expected frequency, $e_{ij}$, is 5 or more for each cell, compute the test statistic:

   $$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Rejection rule:
   p-value approach:        Reject H0 if p-value ≤ α
   Critical value approach: Reject H0 if $\chi^2 \ge \chi^2_{\alpha}$
   where the chi-square distribution has (r – 1)(c – 1) degrees of freedom and α
   is the level of significance for the test.

The expected frequencies must all be 5 or more for the chi-square test to be valid.

This chi-square test is also a one-tailed test with rejection of H0 occurring in the upper tail of a chi-square distribution with (r – 1)(c – 1) degrees of freedom.
Finally, if the null hypothesis of independence is rejected, summarizing the probabilities as
shown in the above example will help the analyst determine where the association or
dependence exists for the two categorical variables.

Exercises


Methods

SELF test

9. The following table contains observed frequencies for a sample of 200. Test for independence of the row and column variables using α = .05.

                  Column Variable
Row Variable      A      B      C
P                20     44     50
Q                30     26     30


10. The following table contains observed frequencies for a sample of 240. Test for independence of the row and column variables using α = .05.

                  Column Variable
Row Variable      A      B      C
P                20     30     20
Q                30     60     25
R                10     15     30

Applications

SELF test

11. A Bloomberg Businessweek subscriber study asked, “In the past 12 months, when traveling for business, what type of airline ticket did you purchase most often?” A second question asked if the type of airline ticket purchased most often was for domestic or
international travel. Sample data obtained are shown in the following table.
                       Type of Flight
Type of Ticket      Domestic    International
First class             29            22
Business class          95           121
Economy class          518           135

a. Using a .05 level of significance, is the type of ticket purchased independent of the
   type of flight? What is your conclusion?
b. Discuss any dependence that exists between the type of ticket and type of flight.

WEB file: WorkforcePlan

12. A Deloitte employment survey asked a sample of human resource executives how their
company planned to change its workforce over the next 12 months (INC. Magazine,
February 2012). A categorical response variable showed three options: The company plans
to hire and add to the number of employees, the company plans no change in the number
of employees, or the company plans to lay off and reduce the number of employees.
Another categorical variable indicated if the company was private or public. Sample data
for 180 companies are summarized as follows.
                          Company
Employment Plan        Private    Public
Add Employees             37        32
No Change                 19        34
Lay Off Employees         16        42

a. Conduct a test of independence to determine if the employment plan for the next
   12 months is independent of the type of company. At a .05 level of significance, what
   is your conclusion?
b. Discuss any differences in the employment plans for private and public companies
   over the next 12 months.

13. Health insurance benefits vary by the size of the company (Atlanta Business Chronicle,
December 31, 2010). The sample data below show the number of companies providing
health insurance for small, medium, and large companies. For purposes of this study, small
companies are companies that have fewer than 100 employees. Medium-sized companies
have 100 to 999 employees, and large companies have 1000 or more employees. The
questionnaire sent to 225 employees asked whether or not the employee had health insurance and then asked the employee to indicate the size of the company.

                      Size of the Company
Health Insurance    Small    Medium    Large
Yes                   36       65        88
No                    14       10        12

a. Conduct a test of independence to determine whether health insurance coverage is
   independent of the size of the company. What is the p-value? Using a .05 level of
   significance, what is your conclusion?
b. A newspaper article indicated employees of small companies are more likely to lack
   health insurance coverage. Use percentages based on the above data to support this
   conclusion.

WEB file: AutoQuality

14. A vehicle quality survey asked new owners a variety of questions about their recently
purchased automobile (J.D. Power and Associates, March 2012). One question asked for
the owner’s rating of the vehicle using categorical responses of average, outstanding, and
exceptional. Another question asked for the owner’s education level with the categorical
responses some high school, high school graduate, some college, and college graduate.
Assume the sample data below are for 500 owners who had recently purchased an
automobile.
                                  Education
Quality Rating    Some HS    HS Grad    Some College    College Grad
Average              35         30           20              60
Outstanding          45         45           50              90
Exceptional          20         25           30              50

a. Use a .05 level of significance and a test of independence to determine if a new
   owner’s vehicle quality rating is independent of the owner’s education. What is the
   p-value and what is your conclusion?
b. Use the overall percentage of average, outstanding, and exceptional ratings to comment
   upon how new owners rate the quality of their recently purchased automobiles.

15. The Wall Street Journal Corporate Perceptions Study 2011 surveyed readers and asked how
each rated the quality of management and the reputation of the company for over 250 worldwide corporations. Both the quality of management and the reputation of the company were
rated on an excellent, good, and fair categorical scale. Assume the sample data for 200
respondents below applies to this study.
                            Reputation of Company
Quality of Management    Excellent    Good    Fair
Excellent                    40        25       5
Good                         35        35      10
Fair                         25        10      15

a. Use a .05 level of significance and test for independence of the quality of management
   and the reputation of the company. What is the p-value and what is your conclusion?
b. If there is a dependence or association between the two ratings, discuss and use probabilities to justify your answer.


16. As the price of oil rises, there is increased worldwide interest in alternate sources of energy.
A Financial Times/Harris Poll surveyed people in six countries to assess attitudes toward
a variety of alternate forms of energy (Harris Interactive website, February 27, 2008). The
data in the following table are a portion of the poll’s findings concerning whether people
favor or oppose the building of new nuclear power plants.
                                                     Country
                            Great                                               United
Response                    Britain    France    Italy    Spain    Germany    States
Strongly favor                141       161       298      133       128        204
Favor more than oppose        348       366       309      222       272        326
Oppose more than favor        381       334       219      311       322        316
Strongly oppose               217       215       219      443       389        174

a. How large was the sample in this poll?
b. Conduct a hypothesis test to determine whether people’s attitude toward building new
   nuclear power plants is independent of country. What is your conclusion?
c. Using the percentage of respondents who “strongly favor” and “favor more than
   oppose,” which country has the most favorable attitude toward building new nuclear
   power plants? Which country has the least favorable attitude?

17. The National Sleep Foundation used a survey to determine whether hours of sleep per night
are independent of age (Newsweek, January 19, 2004). A sample of individuals was asked
to indicate the number of hours of sleep per night with categorical options: fewer than
6 hours, 6 to 6.9 hours, 7 to 7.9 hours, and 8 hours or more. Later in the survey, the
individuals were asked to indicate their age with categorical options: age 39 or younger
and age 40 or older. Sample data follow.
                            Age Group
Hours of Sleep      39 or younger    40 or older
Fewer than 6              38              36
6 to 6.9                  60              57
7 to 7.9                  77              75
8 or more                 65              92

a. Conduct a test of independence to determine whether hours of sleep are independent of
   age. Using a .05 level of significance, what is the p-value and what is your conclusion?
b. What is your estimate of the percentages of individuals who sleep fewer than 6 hours,
   6 to 6.9 hours, 7 to 7.9 hours, and 8 hours or more per night?

18. On a syndicated television show the two hosts often create the impression that they
strongly disagree about which movies are best. Each movie review is categorized as Pro
(“thumbs up”), Con (“thumbs down”), or Mixed. The results of 160 movie ratings by the
two hosts are shown here.
                 Host B
Host A      Con    Mixed    Pro
Con          24       8      13
Mixed         8      13      11
Pro          10       9      64

Use a test of independence with a .01 level of significance to analyze the data. What is your
conclusion?
12.3 Goodness of Fit Test
In this section we use a chi-square test to determine whether a population being sampled
has a specific probability distribution. We first consider a population with a historical multinomial probability distribution and use a goodness of fit test to determine if new sample
data indicate there has been a change in the population distribution compared to the historical distribution. We then consider a situation where an assumption is made that a population has a normal probability distribution. In this case, we use a goodness of fit test to
determine if sample data indicate that the assumption of a normal probability distribution
is or is not appropriate. Both tests are referred to as goodness of fit tests.

Multinomial Probability Distribution
The multinomial probability
distribution is an extension
of the binomial probability
distribution to the case
where there are three or
more outcomes per trial.

With a multinomial probability distribution, each element of a population is assigned to
one and only one of three or more categories. As an example, consider the market share
study being conducted by Scott Marketing Research. Over the past year, market shares for
a certain product have stabilized at 30% for company A, 50% for company B, and 20% for
company C. Since each customer is classified as buying from one of these companies, we
have a multinomial probability distribution with three possible outcomes. The probability
for each of the three outcomes is as follows.

pA = probability a customer purchases the company A product
pB = probability a customer purchases the company B product
pC = probability a customer purchases the company C product

The sum of the probabilities for a multinomial probability distribution equals 1.

Using the historical market shares, we have a multinomial probability distribution with
pA = .30, pB = .50, and pC = .20.
Company C plans to introduce a “new and improved” product to replace its current
entry in the market. Company C has retained Scott Marketing Research to determine
whether the new product will alter or change the market shares for the three companies.
Specifically, the Scott Marketing Research study will introduce a sample of customers to
the new company C product and then ask the customers to indicate a preference for the company A product, the company B product, or the new company C product. Based on the
sample data, the following hypothesis test can be used to determine if the new company
C product is likely to change the historical market shares for the three companies.
H0: pA = .30, pB = .50, and pC = .20
Ha: The population proportions are not pA = .30, pB = .50, and pC = .20
The null hypothesis is based on the historical multinomial probability distribution for the
market shares. If sample results lead to the rejection of H0, Scott Marketing Research will
have evidence to conclude that the introduction of the new company C product will change
the market shares.
Let us assume that the market research firm has used a consumer panel of 200 customers. Each customer was asked to specify a purchase preference among the three alternatives: company A’s product, company B’s product, and company C’s new product. The
200 responses are summarized here.

                 Observed Frequency
Company A’s      Company B’s      Company C’s
Product          Product          New Product
    48               98               54


We now can perform a goodness of fit test that will determine whether the sample of 200
customer purchase preferences is consistent with the null hypothesis. Like other chi-square
tests, the goodness of fit test is based on a comparison of observed frequencies with the
expected frequencies under the assumption that the null hypothesis is true. Hence, the next
step is to compute expected purchase preferences for the 200 customers under the
assumption that H0: pA = .30, pB = .50, and pC = .20 is true. Doing so provides the
expected frequencies as follows.

                   Expected Frequency
Company A’s        Company B’s         Company C’s
Product            Product             New Product
200(.30) = 60      200(.50) = 100      200(.20) = 40

Note that the expected frequency for each category is found by multiplying the
sample size of 200 by the hypothesized proportion for the category.
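
In code, this step is a single multiplication per category. The Python sketch below is illustrative only; the dictionary keys and variable names are assumptions made for the example.

```python
# Hypothesized (historical) market shares under H0.
hypothesized = {"Company A": 0.30, "Company B": 0.50, "Company C": 0.20}
n = 200  # size of the consumer panel

# Expected frequency = sample size x hypothesized proportion.
expected = {company: n * p for company, p in hypothesized.items()}

for company, e in expected.items():
    print(company, round(e, 2))
```

Because the hypothesized proportions sum to 1, the expected frequencies necessarily sum to the sample size, which is a quick check on the arithmetic.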
The goodness of fit test now focuses on the differences between the observed frequencies and the expected frequencies. Whether the differences between the observed and expected frequencies are “large” or “small” is a question answered with the aid of the
following chi-square test statistic.

TEST STATISTIC FOR GOODNESS OF FIT

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i} \qquad (12.6)$$

where
$f_i$ = observed frequency for category i
$e_i$ = expected frequency for category i
$k$ = the number of categories

Note: The test statistic has a chi-square distribution with k − 1 degrees of freedom
provided that the expected frequencies are 5 or more for all categories.
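
Equation (12.6) can be applied to the Scott Marketing Research data in a few lines of Python. This is a sketch under stated assumptions: the variable names are illustrative, and the upper tail area uses the shortcut exp(-x/2), which is exact only when the chi-square distribution has 2 degrees of freedom, as it does here with k = 3 categories.

```python
import math

observed = [48, 98, 54]           # companies A, B, C (new product)
proportions = [0.30, 0.50, 0.20]  # hypothesized shares under H0

n = sum(observed)
expected = [n * p for p in proportions]

# The chi-square approximation requires every expected frequency to be 5 or more.
if min(expected) < 5:
    raise ValueError("every expected frequency must be 5 or more")

# Equation (12.6): sum of (f_i - e_i)^2 / e_i over the k categories.
chi_sq = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = len(observed) - 1

# Upper tail area; this closed form is valid only for df == 2.
upper_tail_area = math.exp(-chi_sq / 2)

print(round(chi_sq, 2), df, round(upper_tail_area, 4))
```

Only large differences between observed and expected frequencies inflate the statistic, which is why the upper tail area is the relevant probability.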

The test for goodness of fit
is always a one-tailed test
with the rejection occurring
in the upper tail of the
chi-square distribution.

Let us continue with the Scott Marketing Research example and use the sample data to test
the hypothesis that the multinomial population has the market share proportions pA = .30,
pB = .50, and pC = .20. We will use an α = .05 level of significance. We proceed by using the
observed and expected frequencies to compute the value of the test statistic. With the expected
frequencies all 5 or more, the computation of the chi-square test statistic is shown in Table 12.9.
Thus, we have χ² = 7.34.
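The arithmetic behind this value can be sketched in a few lines of Python. The snippet assumes the observed frequencies 48, 98, and 54 and the expected frequencies 60, 100, and 40 given for the example; the variable names are illustrative.

```python
observed = [48, 98, 54]     # fi: sample purchase preferences for A, B, C
expected = [60, 100, 40]    # ei: expected frequencies under H0

# Each category contributes (fi - ei)^2 / ei to the test statistic
contributions = [(f - e) ** 2 / e for f, e in zip(observed, expected)]
chi_square = sum(contributions)

for label, f, e, c in zip("ABC", observed, expected, contributions):
    print(f"Company {label}: fi - ei = {f - e:4d}, contribution = {c:.2f}")
print(f"chi-square = {chi_square:.2f}")
```

Company C contributes 4.90 of the total 7.34, so most of the disagreement with the hypothesized shares comes from that category.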
TABLE 12.9  COMPUTATION OF THE CHI-SQUARE TEST STATISTIC FOR THE SCOTT MARKETING
RESEARCH MARKET SHARE STUDY

Category     Hypothesized   Observed    Expected    Difference   Squared       Squared Difference
             Proportion     Frequency   Frequency   (fi − ei)    Difference    Divided by
                            (fi)        (ei)                     (fi − ei)²    Expected Frequency
                                                                               (fi − ei)²/ei
Company A    .30            48          60          −12          144           2.40
Company B    .50            98          100         −2           4             0.04
Company C    .20            54          40          14           196           4.90
Total                       200                                                χ² = 7.34

We will reject the null hypothesis if the differences between the observed and expected
frequencies are large. Thus the test of goodness of fit will always be an upper tail test. We
can use the upper tail area for the test statistic and the p-value approach to determine
whether the null hypothesis can be rejected. With k − 1 = 3 − 1 = 2 degrees of freedom,
row two of the chi-square distribution table in Table 12.4 provides the following:

Area in Upper Tail    .10      .05      .025     .01      .005
χ² Value (2 df)       4.605    5.991    7.378    9.210    10.597
                               χ² = 7.34 (falls between 5.991 and 7.378)

The test statistic χ² = 7.34 is between 5.991 and 7.378. Thus, the corresponding upper tail
area or p-value must be between .05 and .025. With p-value ≤ .05, we reject H0 and conclude that the introduction of the new product by company C will alter the historical market shares. Minitab or Excel procedures provided in Appendix F can be used to show
χ² = 7.34 provides a p-value = .0255.
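The p-value can also be checked with the Python standard library in this particular case. The sketch below relies on a fact not stated in the text: with exactly 2 degrees of freedom, the chi-square distribution is an exponential distribution with mean 2, so its upper tail area is exp(−x/2). The function name is illustrative, and the shortcut does not apply to other degrees of freedom, where statistical software or a table is needed.

```python
import math

def chi_square_upper_tail_2df(x):
    """Upper tail area P(chi-square >= x) for exactly 2 degrees of freedom.

    Valid only for 2 degrees of freedom, where the tail area is exp(-x / 2).
    """
    return math.exp(-x / 2)

p_value = chi_square_upper_tail_2df(7.34)
reject_h0 = p_value <= 0.05   # alpha = .05

print(f"p-value = {p_value:.4f}")   # prints "p-value = 0.0255"
print("Reject H0:", reject_h0)
```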
Instead of using the p-value, we could use the critical value approach to draw the same
conclusion. With α = .05 and 2 degrees of freedom, the critical value for the test statistic
is $\chi^2_{.05}$ = 5.991. The upper tail rejection rule becomes

Reject H0 if χ² ≥ 5.991

With 7.34 > 5.991, we reject H0. The p-value approach and critical value approach provide
the same hypothesis testing conclusion.
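For the 2 degrees of freedom used here, the critical value can also be recovered without a table. The short derivation below relies on a fact not stated in the text: with exactly 2 degrees of freedom, the upper tail area of the chi-square distribution is $e^{-x/2}$. It does not extend to other degrees of freedom.

$$P(\chi^2 \ge x) = e^{-x/2} = \alpha \quad\Longrightarrow\quad \chi^2_{\alpha} = -2\ln\alpha, \qquad \chi^2_{.05} = -2\ln(.05) \approx 5.991$$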
Now that we have concluded the introduction of a new company C product will alter
the market shares for the three companies, we are interested in knowing more about how
the market shares are likely to change. Using the historical market shares and the sample
data, we summarize the data as follows:

Company    Historical Market Share (%)    Sample Data Market Share (%)
A          30                             48/200 = .24, or 24
B          50                             98/200 = .49, or 49
C          20                             54/200 = .27, or 27

The historical market shares and the sample market shares are compared in the bar chart
shown in Figure 12.2. This data visualization process shows that the new product will likely
increase the market share for company C. Comparisons for the other two companies indicate
that company C’s gain in market share will hurt company A more than company B.
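The comparison can be made concrete with a small Python sketch that converts the sample counts to shares and reports the change from the historical share in percentage points. The variable names are illustrative, and the sample shares are point estimates from 200 customers rather than confirmed market outcomes.

```python
historical = {"A": 0.30, "B": 0.50, "C": 0.20}   # historical market shares
observed = {"A": 48, "B": 98, "C": 54}           # sample purchase preferences

n = sum(observed.values())
sample_share = {c: observed[c] / n for c in observed}

# Change in share, in percentage points, relative to the historical share
change_points = {
    c: round(100 * (sample_share[c] - historical[c]), 1) for c in observed
}

for c in observed:
    print(f"Company {c}: historical {historical[c]:.0%}, "
          f"sample {sample_share[c]:.0%}, change {change_points[c]:+.1f} points")
```

The sample shows company A down 6 points and company B down 1 point against a 7-point gain for company C, which matches the reading of the bar chart.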

FIGURE 12.2  BAR CHART OF MARKET SHARES BY COMPANY BEFORE AND AFTER
THE NEW PRODUCT FOR COMPANY C

[Bar chart. Horizontal axis: Company (A, B, C). Vertical axis: Probability (0 to 0.6). Series: Historical Market Share; After New Product C.]

Let us summarize the steps that can be used to conduct a goodness of fit test for a
hypothesized multinomial population distribution.
MULTINOMIAL PROBABILITY DISTRIBUTION GOODNESS OF FIT TEST

1. State the null and alternative hypotheses.
H0: The population follows a multinomial probability distribution with
specified probabilities for each of the k categories
Ha: The population does not follow a multinomial distribution with the
specified probabilities for each of the k categories
2. Select a random sample and record the observed frequencies fi for each
category.
3. Assume the null hypothesis is true and determine the expected frequency ei in
each category by multiplying the category probability by the sample size.
4. If the expected frequency ei is at least 5 for each category, compute the value
of the test statistic.

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

5. Rejection rule:
p-value approach: Reject H0 if p-value ≤ α
Critical value approach: Reject H0 if $\chi^2 \ge \chi^2_{\alpha}$
where α is the level of significance for the test and there are k − 1 degrees of
freedom.
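The steps above can be sketched as a single Python function that uses the critical value approach. This is a minimal illustration rather than a definitive implementation: the function name, argument names, and returned dictionary are assumptions, and the critical value is expected to be read from a chi-square table for the chosen α and k − 1 degrees of freedom.

```python
def multinomial_gof_test(observed, probabilities, critical_value):
    """Chi-square goodness of fit test, critical value approach.

    observed:       observed frequencies fi, one per category
    probabilities:  hypothesized category probabilities under H0
    critical_value: upper tail chi-square value for the chosen alpha and
                    k - 1 degrees of freedom, taken from a chi-square table
    """
    if len(observed) != len(probabilities):
        raise ValueError("observed and probabilities must have the same length")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")

    n = sum(observed)
    expected = [n * p for p in probabilities]        # step 3
    if min(expected) < 5:                            # step 4 requirement
        raise ValueError("every expected frequency should be at least 5")

    statistic = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return {
        "degrees_of_freedom": len(observed) - 1,
        "statistic": statistic,
        "reject_h0": statistic >= critical_value,    # step 5
    }

# Scott Marketing Research example with alpha = .05 and 2 degrees of freedom
result = multinomial_gof_test([48, 98, 54], [0.30, 0.50, 0.20], critical_value=5.991)
print(result)
```

Raising an error when an expected frequency falls below 5 is one possible design choice; combining adjacent categories before running the test is another way to meet the requirement.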

Normal Probability Distribution
The goodness of fit test for a normal probability distribution is also based on the use of the
chi-square distribution. In particular, observed frequencies for several categories of sample
data are compared to expected frequencies under the assumption that the population has a
normal probability distribution. Because the normal probability distribution is continuous,
we must modify the way the categories are defined and how the expected frequencies are
computed. Let us demonstrate the goodness of fit test for a normal distribution by considering the job applicant test data for Chemline, Inc., shown in Table 12.10.

TABLE 12.10  CHEMLINE EMPLOYEE APTITUDE TEST SCORES FOR 50 RANDOMLY
CHOSEN JOB APPLICANTS

71    66    61    65    54    93
60    86    70    70    73    73
55    63    56    62    76    54
82    79    76    68    53    58
85    80    56    61    61    64
65    62    90    69    76    79
77    54    64    74    65    65
61    56    63    80    56    71
79    84
Chemline hires approximately 400 new employees annually for its four plants located
throughout the United States. The personnel director asks whether a normal distribution applies for the population of test scores. If such a distribution can be used, the distribution
would be helpful in evaluating specific test scores; that is, scores in the upper 20%, lower
40%, and so on, could be identified quickly. Hence, we want to test the null hypothesis that
the population of test scores has a normal distribution.
Let us first use the data in Table 12.10 to develop estimates of the mean and standard
deviation of the normal distribution that will be considered in the null hypothesis. We use
the sample mean x̄ and the sample standard deviation s as point estimators of the mean and
standard deviation of the normal distribution. The calculations follow.

$$\bar{x} = \frac{\sum x_i}{n} = \frac{3421}{50} = 68.42$$

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{5310.0369}{49}} = 10.41$$

Using these values, we state the following hypotheses about the distribution of the job
applicant test scores.
H0: The population of test scores has a normal distribution with mean 68.42
and standard deviation 10.41
Ha: The population of test scores does not have a normal distribution with
mean 68.42 and standard deviation 10.41

WEB file: Chemline
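The point estimates used in these hypotheses can be checked with a short Python sketch. The list below is transcribed from the 50 scores in Table 12.10 as they appear here, and the variable names are illustrative.

```python
import math

# The 50 aptitude test scores from Table 12.10, read row by row
scores = [
    71, 66, 61, 65, 54, 93,
    60, 86, 70, 70, 73, 73,
    55, 63, 56, 62, 76, 54,
    82, 79, 76, 68, 53, 58,
    85, 80, 56, 61, 61, 64,
    65, 62, 90, 69, 76, 79,
    77, 54, 64, 74, 65, 65,
    61, 56, 63, 80, 56, 71,
    79, 84,
]

n = len(scores)
mean = sum(scores) / n                               # sample mean
sum_sq_dev = sum((x - mean) ** 2 for x in scores)    # sum of squared deviations
s = math.sqrt(sum_sq_dev / (n - 1))                  # sample standard deviation

print(f"n = {n}, sum = {sum(scores)}, mean = {mean:.2f}, s = {s:.2f}")
```

Computed directly from the scores as listed, the sum of squared deviations comes to about 5314.18 rather than the printed 5310.0369. Either value gives s = 10.41 after rounding, so the hypotheses are unaffected.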

With a continuous
probability distribution,
establish intervals such that
each interval has an
expected frequency of five
or more.


The hypothesized normal distribution is shown in Figure 12.3.
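If this hypothesized distribution turns out to be a reasonable description of the test scores, cutoffs such as the upper 20% or lower 40% mentioned by the personnel director could be read from it directly. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). It assumes the normal model with mean 68.42 and standard deviation 10.41 holds, which is exactly what the goodness of fit test is meant to assess.

```python
from statistics import NormalDist

# Hypothesized distribution from H0 (assumed to hold for this illustration)
test_scores = NormalDist(mu=68.42, sigma=10.41)

# Score exceeded by about 20% of applicants (80th percentile)
upper_20_cutoff = test_scores.inv_cdf(0.80)
# Score below which about 40% of applicants fall (40th percentile)
lower_40_cutoff = test_scores.inv_cdf(0.40)

print(f"Upper 20% begins near {upper_20_cutoff:.1f}")
print(f"Lower 40% ends near {lower_40_cutoff:.1f}")
```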
With the continuous normal probability distribution, we must use a different procedure for
defining the categories. We need to define the categories in terms of intervals of test scores.
Recall the rule of thumb for an expected frequency of at least five in each interval or
category. We define the categories of test scores such that the expected frequencies will be
at least five for each category. With a sample size of 50, one way of establishing categories

FIGURE 12.3  HYPOTHESIZED NORMAL DISTRIBUTION OF TEST SCORES
FOR THE CHEMLINE JOB APPLICANTS

[Normal curve with mean 68.42 and standard deviation 10.41.]
