Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
When you have completed this chapter, you will be able to:
1. Compute and interpret the range, the mean deviation, the variance, the standard deviation, and the coefficient of variation of ungrouped data
2. Compute and interpret the range, the variance, and the standard deviation from grouped data
3. Explain the characteristics, uses, advantages, and disadvantages of each measure
4. Understand Chebyshev’s theorem and the normal or empirical rule, as it relates to a set of observations
5. Compute and interpret percentiles, quartiles and the interquartile range
6. Construct and interpret box plots
7. Compute and describe the coefficient of skewness and kurtosis of a data distribution
Terminology
Range
…is the difference between the largest and the smallest value.
Only two values are used in its calculation.
It is influenced by an extreme value.
It is easy to compute and understand.
Mean Deviation
…is the arithmetic mean of the absolute values of the deviations from the arithmetic mean.
\[ MD = \frac{\sum |x - \mu|}{N} \]
All values are used in the calculation.
It is not unduly influenced by large or small values.
The absolute values are difficult to manipulate.
The weights of a sample of crates
containing books for the bookstore
(in kg) are:
103 97 101 106 103
Find the range and the mean deviation.
Find the mean weight
\[ \mu = \frac{\sum x}{N} = \frac{103 + 97 + 101 + 106 + 103}{5} = \frac{510}{5} = 102 \]
Find the range
\[ 106 - 97 = 9 \]
Find the mean deviation
\[ MD = \frac{\sum |x - \mu|}{N} = \frac{|103 - 102| + \dots + |103 - 102|}{5} = \frac{1 + 5 + 1 + 4 + 1}{5} = \frac{12}{5} = 2.4 \]
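The crate example can be checked with a few lines of Python. This is only a sketch: the helper names `value_range` and `mean_deviation` are chosen here for illustration and do not come from the slides.

```python
# Crate weights (kg) from the example above.
weights = [103, 97, 101, 106, 103]

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

print(value_range(weights))     # 9
print(mean_deviation(weights))  # 2.4
```

Note that the range uses only two of the five weights, while the mean deviation uses all of them.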
Variance
…is the arithmetic mean of the squared deviations from the arithmetic mean.
All values are used in the calculation.
Because every value is used, it is less dominated by a single extreme value than the range is, although squaring gives large deviations extra weight.
The units are awkward…the square of the original units.
Computing the Variance
Formula … for a Population
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
Formula … for a Sample
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
The ages of the Dunn family are:
2, 18, 34, 42
What are the population mean and variance?
\[ \mu = \frac{\sum x}{N} = \frac{96}{4} = 24 \]
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{(2 - 24)^2 + \dots + (42 - 24)^2}{4} = \frac{944}{4} = 236 \]
Population Standard Deviation
… is the square root of the population variance
From previous example…
\[ \sigma = \sqrt{\sigma^2} = \sqrt{236} = 15.36 \]
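As a quick check of the Dunn family figures, the population formulas can be written out directly. The function name `population_variance` is an illustrative choice, not something defined in the slides.

```python
import math

# Ages of the Dunn family, treated as a complete population.
ages = [2, 18, 34, 42]

def population_variance(values):
    """Mean of the squared deviations from the population mean (divides by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

variance = population_variance(ages)  # 944 / 4
std_dev = math.sqrt(variance)         # square root of the variance

print(variance)           # 236.0
print(round(std_dev, 2))  # 15.36
```

The variance is in squared years, while the standard deviation is back in the original units (years).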
Example
The hourly wages earned by a sample of five students are:
$7, $5, $11, $8, $6.
Find the mean, variance, and standard deviation.
\[ \bar{x} = \frac{\sum x}{n} = \frac{37}{5} = 7.40 \]
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{4} = 5.30 \]
\[ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 \]
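The wage example can also be reproduced with Python's standard `statistics` module, shown here as one possible check rather than as the method used in the slides. `statistics.variance` and `statistics.stdev` divide by n − 1 (sample formulas), while `statistics.pvariance` divides by N (population formula).

```python
import statistics

# Hourly wages ($) for the sample of five students.
wages = [7, 5, 11, 8, 6]

mean = statistics.mean(wages)      # sum of the values divided by n
s2 = statistics.variance(wages)    # sample variance, divides by n - 1
s = statistics.stdev(wages)        # sample standard deviation

# For comparison only: the population formula divides by N instead.
sigma2 = statistics.pvariance(wages)

print(mean, round(s2, 2), round(s, 2), round(sigma2, 2))
```

On this data the sample variance is about 5.30 and the population formula gives about 4.24, which shows how much the choice of divisor matters for a small sample.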
The Mean of Grouped Data
From chapter 3…
A sample of ten movie theatres in a metropolitan area tallied the total number of movies showing last week.
Compute the mean number of movies showing per theatre.
Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)
1 to under 3      1                2                     2
3 to under 5      2                4                     8
5 to under 7      3                6                     18
7 to under 9      1                8                     8
9 to under 11     3                10                    30
Total             10                                     66
Formula
\[ \bar{x} = \frac{\sum fx}{n} = \frac{66}{10} = 6.6 \]
Now: Compute the variance and standard deviation.
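A minimal sketch of the grouped-mean calculation follows. It assumes each class is stored as a `(lower limit, upper limit, frequency)` tuple, a representation chosen here for illustration, and it uses the class midpoint to stand in for every value in the class.

```python
# (lower limit, upper limit, frequency) for each class in the movie-theatre table.
classes = [(1, 3, 1), (3, 5, 2), (5, 7, 3), (7, 9, 1), (9, 11, 3)]

def grouped_mean(classes):
    """Approximate the mean as sum(f * midpoint) / sum(f)."""
    n = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / n

print(grouped_mean(classes))  # 6.6
```

Because the raw counts inside each class are unknown, the result is an estimate of the mean rather than an exact value.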
Sample Variance for Grouped Data
The formula for the sample variance for grouped data is:
\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} \]
where f is the class frequency and x is the class midpoint
Movies Showing    Frequency (f)    Class Midpoint (x)    (f)(x)    (x²)f
1 to under 3      1                2                     2         4
3 to under 5      2                4                     8         32
5 to under 7      3                6                     18        108
7 to under 9      1                8                     8         64
9 to under 11     3                10                    30        300
Total             10                                     66        508
\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{508 - \frac{66^2}{10}}{9} = \frac{72.4}{9} = 8.04 \]
The variance is 8.04
The standard deviation is
\[ s = \sqrt{8.04} = 2.8 \]
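The grouped-data computation can be sketched in Python using the same frequencies and midpoints. The function name `grouped_sample_variance` is hypothetical; the body follows the shortcut formula with the sums Σfx and Σfx².

```python
import math

# Class frequencies and midpoints from the movie-theatre table.
frequencies = [1, 2, 3, 1, 3]
midpoints = [2, 4, 6, 8, 10]

def grouped_sample_variance(freqs, mids):
    """Sample variance for grouped data: (sum fx^2 - (sum fx)^2 / n) / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, mids))        # 66 for this table
    sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, mids))  # 508 for this table
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

s2 = grouped_sample_variance(frequencies, midpoints)
s = math.sqrt(s2)

print(round(s2, 2), round(s, 1))  # 8.04 2.8
```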
Interpretation and Uses of the Standard Deviation
Chebyshev’s Theorem:
For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least:
\[ 1 - \frac{1}{k^2} \]
where k is any constant greater than 1
Suppose that a wholesale plumbing supply company has a group of 50 sales vouchers from a particular day.
The amounts of these vouchers are:
[table of the 50 voucher amounts]
How well does this data set fit Chebyshev’s Theorem?
Solution
Step 1
Determine the mean and standard deviation of the sample
Mean = $319
SD = $101.78
Step 2
Input k = 2 into Chebyshev’s theorem
\[ 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4} \]
i.e. At least .75 of the observations will fall within 2 SD of the mean.
Step 3
Using the mean and SD, find the range of data values within 2 SD of the mean
\[ (\bar{x} - 2s,\ \bar{x} + 2s) = (319 - 2(101.78),\ 319 + 2(101.78)) = (115.44,\ 522.56) \]
Now, go back to the sample data, and see what proportion of the values fall between 115.44 and 522.56
Proportion of the values that fall between 115.44 and 522.56
We find that 48/50 or 96% of the data values are in this range, certainly at least 75% as the theorem suggests!
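The three steps of the Chebyshev check can be sketched as small Python helpers. The function names are illustrative assumptions. Since the 50 voucher amounts are not listed here, `proportion_within` is defined but not run on them; only the bound and the interval are evaluated, using the stated mean and SD.

```python
def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def interval(mean, sd, k):
    """Lower and upper limits of mean +/- k standard deviations."""
    return (mean - k * sd, mean + k * sd)

def proportion_within(values, mean, sd, k):
    """Observed share of values that fall inside mean +/- k standard deviations."""
    low, high = interval(mean, sd, k)
    return sum(low <= v <= high for v in values) / len(values)

print(chebyshev_bound(2))             # 0.75
low, high = interval(319, 101.78, 2)
print(round(low, 2), round(high, 2))  # 115.44 522.56
```

Given the full list of voucher amounts, `proportion_within(amounts, 319, 101.78, 2)` would return the observed share to compare against the bound; the slides report 48/50 for that figure.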
Interpretation and Uses of the Standard Deviation
Empirical Rule:
For any symmetrical, bell-shaped distribution:
…About 68% of the observations will lie within 1s of the mean
…About 95% of the observations will lie within 2s of the mean
…Virtually all the observations will be within 3s of the mean
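The percentages in the empirical rule match the normal distribution, which is the usual model for a symmetrical, bell-shaped distribution. As a sketch under that assumption, the proportion within k standard deviations of the mean is erf(k/√2), which Python's `math.erf` can evaluate. The function name below is an illustrative choice.

```python
import math

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(normal_proportion_within(k), 4))
```

The three values are roughly 0.6827, 0.9545, and 0.9973, in line with "about 68%", "about 95%", and "virtually all". For k = 2 this is much tighter than the Chebyshev guarantee of 75%, because Chebyshev's theorem makes no assumption about the shape of the distribution.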
Bell-Shaped Curve
[Figure: bell-shaped curve showing the relationship between the standard deviation and the mean]