Chapter 3
Statistical measures
Measures of center and location
Measures of variation/dispersion
Summary
Statistical measures
Center and location:
- Mean (arithmetic, weighted, geometric)
- Mode, Median
- Percentile, Quartile
Variation/Dispersion:
- Range
- Variance
- Standard deviation
- Coefficient of variation
Part B
Measures of variation/dispersion
1. Range
2. Mean deviation
3. Variance
4. Standard deviation
5. Coefficient of variation
1. The range
The range is defined as the numerical difference
between the smallest and largest values of the
items in a set or distribution
Formula:
R = largest value – smallest value
Example
Ages of two groups of people in a survey:
Group A: 20, 30, 40, 50, 60
Group B: 38, 39, 40, 41, 42
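The range formula can be applied directly to the two groups. The sketch below is only an illustration; the function name `value_range` is a hypothetical helper, not something defined in these slides.

```python
def value_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)

group_a = [20, 30, 40, 50, 60]
group_b = [38, 39, 40, 41, 42]

range_a = value_range(group_a)  # 60 - 20 = 40
range_b = value_range(group_b)  # 42 - 38 = 4
print(range_a, range_b)
```

Both groups have the same mean age (40), yet the range of Group A is ten times that of Group B.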
Advantages and disadvantages of the range
Advantages:
Disadvantages:
Implication
2. The mean deviation
The mean deviation is a measure of dispersion
that gives the average absolute difference (i.e. ignoring minus
signs) between each item and the mean.
Formula:
- For a data set
\[ \bar{d} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} \]
Formulae
- For a frequency distribution
\[ \bar{d} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{\sum_{i=1}^{k} f_i} \]
Example
Group A: 20, 30, 40, 50, 60
Group B: 38, 39, 40, 41, 42
\[ \bar{d}_A = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} \]
\[ \bar{d}_B = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} \]
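As a check on the hand calculation, the mean deviation of each group can be computed with a few lines of Python. This is a minimal sketch; `mean_deviation` is a hypothetical helper name chosen for this example.

```python
def mean_deviation(values):
    """Average absolute difference between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

group_a = [20, 30, 40, 50, 60]
group_b = [38, 39, 40, 41, 42]

d_a = mean_deviation(group_a)  # (20 + 10 + 0 + 10 + 20) / 5 = 12.0
d_b = mean_deviation(group_b)  # (2 + 1 + 0 + 1 + 2) / 5 = 1.2
print(d_a, d_b)
```

On average, the ages in Group A lie 12 years from the mean, while those in Group B lie only 1.2 years from it.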
Example
The data in the table below relate to the productivity (kg/person) of 100 workers in a small factory. What is the mean deviation?

Productivity (kg/person)    Number of workers
< 10                          7
10 – 20                      18
20 – 30                      25
30 – 35                      20
35 – 40                      18
≥ 40                         12
Total                       100
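One possible way to work this example is to represent each class by its midpoint and apply the frequency-distribution formula. The table does not state limits for the two open-ended classes, so the sketch below assumes 0 – 10 for the first class and 40 – 45 for the last; these bounds are assumptions made for illustration, and a different choice changes the result.

```python
# Class midpoints x_i. The bounds 0-10 and 40-45 for the open-ended classes
# are ASSUMPTIONS for this illustration; the table does not state them.
midpoints = [5.0, 15.0, 25.0, 32.5, 37.5, 42.5]
workers = [7, 18, 25, 20, 18, 12]  # frequencies f_i from the table

total = sum(workers)
mean = sum(f * x for f, x in zip(workers, midpoints)) / total
mean_dev = sum(f * abs(x - mean) for f, x in zip(workers, midpoints)) / total
print(f"mean = {mean:.2f}, mean deviation = {mean_dev:.2f}")
```

Under these assumed class limits the mean productivity is about 27.65 kg/person and the mean deviation about 9.05 kg/person.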
Characteristics of the mean deviation
- A better measure of dispersion than the range
- Useful for comparing the variability between distributions
- Can be complicated to calculate in practice if the mean is anything other than a whole number
3. Variance
Variance is another statistical measure of
dispersion.
It is defined as the average of the squared
deviations of the data values from their
mean.
Formula:
For a set of values
\[ \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} \]
or
\[ \sigma^2 = \frac{\sum_{i=1}^{n} x_i^2}{n} - (\bar{x})^2 = \overline{x^2} - (\bar{x})^2 \]
The mean of the squares less the square of the mean
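The second form follows from expanding the square in the first. A short derivation, using only the symbols already defined and writing \overline{x^2} for the mean of the squared values:

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2
          = \frac{1}{n}\sum_{i=1}^{n} x_i^2
            - 2\bar{x}\cdot\frac{1}{n}\sum_{i=1}^{n} x_i
            + (\bar{x})^2 \\
         &= \overline{x^2} - 2(\bar{x})^2 + (\bar{x})^2
          = \overline{x^2} - (\bar{x})^2
\end{aligned}
```

The middle term simplifies because \frac{1}{n}\sum x_i is the mean \bar{x} itself.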
For a frequency distribution
\[ \sigma^2 = \frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{\sum_{i=1}^{k} f_i} \]
or
\[ \sigma^2 = \frac{\sum_{i=1}^{k} x_i^2 f_i}{\sum_{i=1}^{k} f_i} - (\bar{x})^2 = \overline{x^2} - (\bar{x})^2 \]
The mean of the squares less the square of the mean
Example
Group A: 20, 30, 40, 50, 60
Group B: 38, 39, 40, 41, 42
\[ \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} \]
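The two groups can be used to check that both variance formulas give the same answer. The sketch below is illustrative only; the function names `variance` and `variance_shortcut` are hypothetical helpers.

```python
def variance(values):
    """Variance as the mean of squared deviations from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def variance_shortcut(values):
    """Same quantity: the mean of the squares less the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2

group_a = [20, 30, 40, 50, 60]
group_b = [38, 39, 40, 41, 42]

var_a = variance(group_a)  # (400 + 100 + 0 + 100 + 400) / 5 = 200.0
var_b = variance(group_b)  # (4 + 1 + 0 + 1 + 4) / 5 = 2.0
print(var_a, variance_shortcut(group_a))
print(var_b, variance_shortcut(group_b))
```

Note that both functions divide by n, as in the formulas above, rather than by n - 1.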
Example
The data in the table below relate to the productivity (kg/person) of 100 workers in a small factory. What is the variance?

Productivity (kg/person)    Number of workers
< 10                          7
10 – 20                      18
20 – 30                      25
30 – 35                      20
35 – 40                      18
≥ 40                         12
Total                       100
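A possible way to work this example uses class midpoints in the frequency-distribution formulas. Since the open-ended classes have no stated limits, the sketch assumes 0 – 10 for the first class and 40 – 45 for the last; these bounds are assumptions for illustration only.

```python
# Midpoints x_i; bounds 0-10 and 40-45 for the open-ended classes are ASSUMED.
midpoints = [5.0, 15.0, 25.0, 32.5, 37.5, 42.5]
workers = [7, 18, 25, 20, 18, 12]  # frequencies f_i from the table

total = sum(workers)
mean = sum(f * x for f, x in zip(workers, midpoints)) / total

# Definition: frequency-weighted mean of squared deviations.
var_direct = sum(f * (x - mean) ** 2 for f, x in zip(workers, midpoints)) / total

# Shortcut: mean of the squares less the square of the mean.
mean_sq = sum(f * x * x for f, x in zip(workers, midpoints)) / total
var_shortcut = mean_sq - mean ** 2

print(f"{var_direct:.4f} {var_shortcut:.4f}")
```

Under these assumed limits both forms give a variance of roughly 115.10, in units of (kg/person) squared.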
Characteristics of the variance
- A better measure of dispersion than the range
- More complicated to calculate, since the deviations are squared
- The unit of the variance is the square of the unit of the data, so it is not directly meaningful
4. Standard deviation
Standard deviation is defined as the square root
of the variance.
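Formula
For a set of values
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \]
or
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - (\bar{x})^2} = \sqrt{\overline{x^2} - (\bar{x})^2} \]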
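For a frequency distribution
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{k} (x_i - \bar{x})^2 f_i}{\sum_{i=1}^{k} f_i}} \]
or
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{k} x_i^2 f_i}{\sum_{i=1}^{k} f_i} - (\bar{x})^2} = \sqrt{\overline{x^2} - (\bar{x})^2} \]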
Example
Group A: 20, 30, 40, 50, 60
Group B: 38, 39, 40, 41, 42
\[ \sigma = \sqrt{\sigma^2} \]
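Taking the square root of each group's variance gives its standard deviation. A minimal sketch follows; `std_dev` is a hypothetical helper name, and it uses the divisor n as in the formulas above.

```python
import math

def std_dev(values):
    """Standard deviation: square root of the variance (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)

group_a = [20, 30, 40, 50, 60]
group_b = [38, 39, 40, 41, 42]

sd_a = std_dev(group_a)  # sqrt(200), about 14.14
sd_b = std_dev(group_b)  # sqrt(2), about 1.41
print(round(sd_a, 2), round(sd_b, 2))
```

Unlike the variance, these results are in the same unit as the data (years of age).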
Example
The data in the table below relate to the productivity (kg/person) of 100 workers in a small factory. What is the standard deviation?

Productivity (kg/person)    Number of workers
< 10                          7
10 – 20                      18
20 – 30                      25
30 – 35                      20
35 – 40                      18
≥ 40                         12
Total                       100
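The frequency-distribution formula can be wrapped in a small reusable function. In this sketch, `grouped_std_dev` is a hypothetical helper, and the midpoints of the two open-ended classes rest on assumed limits of 0 – 10 and 40 – 45, which the table does not state.

```python
import math

def grouped_std_dev(midpoints, frequencies):
    """Standard deviation of a frequency distribution given class midpoints."""
    total = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / total
    variance = sum(f * (x - mean) ** 2
                   for f, x in zip(frequencies, midpoints)) / total
    return math.sqrt(variance)

# Midpoints x_i; bounds 0-10 and 40-45 for the open-ended classes are ASSUMED.
midpoints = [5.0, 15.0, 25.0, 32.5, 37.5, 42.5]
workers = [7, 18, 25, 20, 18, 12]

sd_productivity = grouped_std_dev(midpoints, workers)
print(f"{sd_productivity:.2f}")
```

Under these assumed limits the standard deviation comes to roughly 10.73 kg/person.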
Characteristics of Standard Deviation
Can be regarded as one of the most useful and
appropriate measures of dispersion.
For distributions that are not too skewed (roughly bell-shaped):
- about 99.7% of the data items should lie within three
standard deviations of the mean
- about 95% of the data items should lie within two
standard deviations of the mean
- about 68% of the data items should lie within one
standard deviation of the mean
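These percentages can be checked against the normal distribution, which is the usual model behind this rule of thumb. The sketch below assumes an exactly normal distribution (an idealisation; real data will only match approximately) and uses `statistics.NormalDist` from the Python standard library. The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# ASSUMPTION: the data follow a normal distribution exactly.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The computed shares are close to 68%, 95% and 99.7%, which is where the rounded figures above come from.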