Chapter 11
Inference About a Population
11.1 Inference About a Population Mean When the Population Standard Deviation is Unknown
Recall: By the central limit theorem, when σ is known, $\bar{x}$ is normally distributed if:
• the sample is drawn from a normal population, or
• the population is not normal but the sample is sufficiently large.
When $\sigma^2$ is unknown, we use $s^2$ instead, and the resulting statistic has the t-distribution:
$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \;\longrightarrow\; t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
Which distribution to use for inference about μ?
• σ known
    n ≤ 30, population near normal: Z-dist
    n ≤ 30, population non-normal: ?
    n > 30: Z-dist
• σ unknown
    n ≤ 30, population near normal: $t_{n-1}$-dist
    n ≤ 30, population non-normal: ?
    n > 30: Z-dist
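The decision tree above can be written as a small lookup. This is only a sketch: the function name and its return labels are illustrative assumptions, and the "?" result mirrors the branches the tree leaves open.

```python
def sampling_distribution(sigma_known: bool, n: int, near_normal: bool) -> str:
    """Return the distribution named by the decision tree.

    "?" marks the branches the tree leaves open
    (small sample drawn from a non-normal population).
    """
    if n > 30:
        return "Z"          # large sample: Z-dist on both branches
    if not near_normal:
        return "?"          # small sample, non-normal population
    return "Z" if sigma_known else "t(n-1)"


print(sampling_distribution(sigma_known=False, n=25, near_normal=True))
```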
Using the t-table
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
The Student- t distribution is mound-shaped, and symmetrical around zero.
[Figure: Student t density curves centered at 0, one with n1 degrees of freedom and one with n2 > n1 degrees of freedom]
Example 1: The productivity of newly hired trainees is studied. It is believed that trainees can
process and distribute more than 450 packages per hour within one week of hiring.
Can we conclude that this belief is correct if the mean productivity observed for 50 trainees is
460.38 packages per hour and the standard deviation is 38.83?
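Step 1: H0: µ = 450; H1: µ > 450
Step 2: α = 0.05
Step 3: n = 50, use $t_{n-1} = t_{49}$
Step 4: Rejection region: $t \ge t_{\alpha, n-1} = t_{.05,49} \cong t_{.05,50} = 1.676$ (critical value; cf. 1.645 for the Z-distribution)
Step 5: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} = 1.89$
Step 6: Since 1.89 > 1.676, reject the null hypothesis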
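The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch: the function name is an illustrative assumption, and the critical value is entered by hand from the t-table rather than computed.

```python
import math


def one_sample_t(x_bar, mu0, s, n):
    """Return the t statistic (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))


# Sample figures from Example 1.
t_stat = one_sample_t(x_bar=460.38, mu0=450, s=38.83, n=50)

# Table value t_{.05,50}, used as an approximation to t_{.05,49}.
t_crit = 1.676
reject_h0 = t_stat >= t_crit

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```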
Confidence interval estimator of µ when σ is unknown:
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$, d.f. = n − 1
Example 2: An investor is trying to estimate the return on investment in companies that won
quality awards last year. A random sample of 83 such companies yields
$\bar{x} = 15.02$, $s^2 = 68.98$.
Construct a 95% confidence interval for the mean return.
$s = \sqrt{68.98} = 8.31$
$\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} \cong 15.02 \pm 1.990 \frac{8.31}{\sqrt{83}} = [13.205, 16.835]$
where $t_{.025,82} \cong t_{.025,80} = 1.990$.
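The interval in Example 2 can be reproduced as follows. This is a sketch under stated assumptions: `t_interval` is an illustrative helper name, and the critical value 1.990 is taken from the t-table (80 degrees of freedom) rather than computed. Because the code keeps s unrounded, its limits may differ from the hand calculation in the third decimal place.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


s = math.sqrt(68.98)  # sample standard deviation from s^2 = 68.98
lower, upper = t_interval(x_bar=15.02, s=s, n=83, t_crit=1.990)

print(f"[{lower:.3f}, {upper:.3f}]")
```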
Checking the required conditions:
The Student t distribution is robust, which means that if the population is non-normal, the results
of the t-test and confidence interval estimate are still valid provided that the population is “not
extremely non-normal”.
[Figure: two histograms of sample data, one with bins from 400 to 575 and one with bins from −4 to 30]
Example 4: Assume the content of a can of Bubbly cola is normally distributed. Design a test to
see whether its manufacturer adequately fills its 1-liter (i.e., 1,000 ml) bottles.
H0: μ = 1,000 ml (adequately filled)
HA: μ < 1,000 ml (inadequately filled; one-sided test)
α = .10 (this might be just a classroom project)
n = 25 (small sample size) → use the $t_{24}$ distribution
Rejection region: reject H0 at α = .10 if $t < -t_{.10,24} = -1.32$ (critical value)
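Once a sample is collected, the decision rule of this test design could be applied as sketched below. The function name is an illustrative assumption, and the fill volumes are hypothetical numbers made up purely to exercise the code, not data from the example. The default critical value −1.32 is only appropriate for n = 25 and α = .10.

```python
import math
from statistics import mean, stdev


def left_tail_t_test(sample, mu0=1000.0, t_crit=-1.32):
    """Return (t, reject) for H0: mu = mu0 against HA: mu < mu0.

    t_crit defaults to -t_{.10,24}, so it assumes len(sample) == 25.
    """
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, t < t_crit


# Hypothetical fill volumes in ml (invented for illustration, n = 25).
fills = [990, 991, 992, 993, 994] * 5
t, reject = left_tail_t_test(fills)
print(f"t = {t:.2f}, reject H0: {reject}")
```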
11.2 Inference About a Population Variance
The test statistic is
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
If the population is normally distributed, it has a Chi-squared distribution with df = n − 1: $\chi^2_{n-1}$
[Figure: the Chi-squared distribution]
Testing the population variance – Left hand tail test
Example 3: A container-filling machine is considered to fill 1 liter containers consistently if the
variance of the filling is less than 1 cc (.001 liter). A random sample of 25 1-liter fills was taken,
and $s^2 = .6333$. Do these data support the belief that the variance is less than 1 cc at the 5%
significance level?
Step 1: H0: $\sigma^2 = 1$; H1: $\sigma^2 < 1$
Step 2: α = 0.05
Step 3: n = 25, use $\chi^2_{n-1} = \chi^2_{24}$
Step 4: Rejection region: $\chi^2 \le \chi^2_{1-\alpha, n-1} = \chi^2_{.95, 24} = 13.85$ (critical value)
Step 5: $\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(25-1)(.6333)}{1} = 15.20$
Step 6: Since 15.20 > 13.85, do not reject the null hypothesis
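The left-tail variance test of Example 3 can be checked numerically. This is a minimal sketch: the helper name is an illustrative assumption, the statistic uses n − 1 = 24 degrees of freedom, and the critical value 13.85 is entered from the Chi-squared table rather than computed.

```python
def chi2_statistic(s2, sigma0_sq, n):
    """Return the test statistic (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s2 / sigma0_sq


# Figures from Example 3: s^2 = .6333, hypothesized variance 1, n = 25.
chi2 = chi2_statistic(s2=0.6333, sigma0_sq=1.0, n=25)

# Table value chi^2_{.95,24}; reject H0 only if the statistic is at or below it.
chi2_crit = 13.85
reject_h0 = chi2 <= chi2_crit

print(f"chi2 = {chi2:.2f}, reject H0: {reject_h0}")
```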
Testing the population variance – Right hand tail test; Two tail test
A right hand tail test:
H0: $\sigma^2$ = value
H1: $\sigma^2$ > value
Rejection region: $\chi^2 \ge \chi^2_{\alpha, n-1}$
A two tail test:
H0: $\sigma^2$ = value
H1: $\sigma^2$ ≠ value
Rejection region: $\chi^2 \le \chi^2_{1-\alpha/2, n-1}$ or $\chi^2 \ge \chi^2_{\alpha/2, n-1}$
Estimating the population variance
From the following probability statement
$P(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}) = 1 - \alpha$
we have (by substituting $\chi^2 = (n-1)s^2/\sigma^2$)
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
This is the confidence interval for $\sigma^2$ at the 100(1 − α)% confidence level.
Example 4: Estimate the variance of fills in example 3 with 99% confidence.
$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
$\frac{?}{45.5585} < \sigma^2 < \frac{?}{9.88623}$
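One way to carry out this estimate is sketched below, using s² = .6333 and n = 25 from Example 3 and the two Chi-squared values shown in the denominators. The function and parameter names are illustrative assumptions; the critical values are entered by hand rather than computed.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Return (LCL, UCL) for sigma^2.

    chi2_upper is chi^2_{alpha/2, n-1} and chi2_lower is
    chi^2_{1-alpha/2, n-1}; dividing by the larger value gives the lower limit.
    """
    numerator = (n - 1) * s2
    return numerator / chi2_upper, numerator / chi2_lower


# 99% confidence: alpha/2 = .005, with 24 degrees of freedom.
lcl, ucl = variance_interval(s2=0.6333, n=25,
                             chi2_upper=45.5585, chi2_lower=9.88623)

print(f"{lcl:.4f} < sigma^2 < {ucl:.4f}")
```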
11.4 Inference About a Population Proportion
When the population consists of nominal or categorical data, the only inference we can make is
about the proportion of occurrence of a certain value.
The statistic used when making inference about ‘p’ is:
$\hat{p} = \frac{x}{n}$
where
x − the number of successes.
n − sample size.
Under certain conditions, [np > 5 and n(1-p) > 5], $\hat{p}$ is approximately normally distributed, with
$\mu = p$ and $\sigma^2 = p(1-p)/n$.
Test statistic for p
$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
where np > 5 and n(1 − p) > 5
Interval estimator for p (1-α confidence level)
$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$
provided $n\hat{p} > 5$ and $n(1-\hat{p}) > 5$
Example 11.5 (Predicting the winner on election day): Voters are asked by a certain network to
participate in an exit poll in order to predict the winner on election day. Based on the data
presented in Xm12.5.xls (where 1=Democrat, and 2=Republican), can the network conclude that
the Republican candidate will win the state college vote?
Step 1: H0: p = .5; H1: p > .5
Step 2: α = 0.05
Step 3: n = 765, use the Z-dist.
Step 4: Rejection region: $z \ge z_{\alpha} = z_{.05} = 1.645$ (critical value)
Step 5: $z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$
Step 6: Reject the null hypothesis
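The test in Example 11.5 can be checked with Python's standard library. This is a sketch, not the method used on the slides: `statistics.NormalDist` supplies the standard normal quantile and tail area, the helper name is an illustrative assumption, and the p-value is an addition for illustration.

```python
import math
from statistics import NormalDist


def proportion_z(p_hat, p0, n):
    """Return z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


# Figures from Example 11.5: sample proportion .532, n = 765, H0: p = .5.
z = proportion_z(p_hat=0.532, p0=0.5, n=765)

std_normal = NormalDist()                 # mean 0, standard deviation 1
z_crit = std_normal.inv_cdf(1 - 0.05)     # right-tail critical value
p_value = 1 - std_normal.cdf(z)           # right-tail area beyond z
reject_h0 = z >= z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```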
Estimating the Proportion
Example (marketing application): In a survey of 2000 TV viewers at 11:40 p.m. on a certain night,
226 indicated they watched “The Tonight Show”. Estimate the number of TVs tuned to The Tonight
Show on a typical night, if there are 100 million potential television sets. Use a 95% confidence
level.
$\hat{p} = 226/2000 = .113$, $1 - \hat{p} = .887$
$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} = .113 \pm 1.96 \sqrt{.113(.887)/2000} = .113 \pm .014$
Multiplying the limits .099 and .127 by 100 million sets gives an estimate of between 9.9 million
and 12.7 million TVs.
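The interval for the proportion, and its conversion to a number of television sets, can be sketched as follows. The function name is an illustrative assumption, and the z value comes from `statistics.NormalDist` instead of the rounded table value 1.96, so results may differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist


def proportion_interval(x, n, confidence=0.95):
    """Return (p_hat, half_width) for p_hat +/- z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat, z * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat, half_width = proportion_interval(x=226, n=2000)

total_sets = 100_000_000  # potential television sets, from the example
low_sets = (p_hat - half_width) * total_sets
high_sets = (p_hat + half_width) * total_sets

print(f"p_hat = {p_hat:.3f} +/- {half_width:.3f}")
print(f"between {low_sets / 1e6:.1f} and {high_sets / 1e6:.1f} million sets")
```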
Flowchart of Techniques
Describe a Population
Data type?
• Interval → Type of descriptive measurement?
    Central location: t test & estimator of µ
    Variability: $\chi^2$ test & estimator of $\sigma^2$
• Nominal: z test & estimator of p