
Numerical Methods and DATA COLLECTION AND SAMPLING


Descriptive Statistics:
Numerical Methods



4.1 Measures of Central Location

❧ The central data point reflects the locations of all the actual data points.
❧ How?
With one data point, clearly the central location is at the point itself.
With two data points, the central location should fall in the middle between them (in order to reflect the location of both of them).


4.1 Measures of Central Location


If a third data point appears in the center, the measure of central location will remain in the center.
But if the third data point appears on the left-hand side of the midrange, it should "pull" the central location to the left.


4.1 Measures of Central Location

As more and more data points are added, the
central location moves (left and right) as required
in order to reflect the effects of all the points.



The Arithmetic Mean (average)

• This is the most popular and useful measure of central location

$$\text{Mean} = \frac{\text{Sum of the measurements}}{\text{Number of measurements}}$$


The Arithmetic Mean

Sample mean (n is the sample size):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Population mean (N is the population size):

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$



The Arithmetic Mean
Example 1
Find the mean rate of return for a portfolio equally invested in five stocks with the following annual rates of return:
11.2%, 8.07%, 5.55%, 13.7%, 21%.

Solution

$$\bar{x} = \frac{11.2 + 8.07 + 5.55 + 13.7 + 21}{5} = \frac{59.52}{5} = 11.904\%$$
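The calculation in Example 1 can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the slides.

```python
# Annual rates of return (in percent) for the five stocks in Example 1.
returns = [11.2, 8.07, 5.55, 13.7, 21.0]


def arithmetic_mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)


mean_return = arithmetic_mean(returns)
print(f"Mean rate of return: {mean_return:.3f}%")  # Mean rate of return: 11.904%
```

The five returns sum to 59.52, so dividing by 5 gives 11.904%.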



3. Geometric mean

• A specialized measure, used to find the average growth rate, or rate of change of a variable over time
• Example:
The number of students attending the music class last Tuesday was 160. This Tuesday, the number is expected to
increase by 15%.
How many of them are likely to attend this Tuesday?


3. Geometric mean

The number of students likely to attend this Tuesday
= 160*(100+15)%
= 160*(1+0.15) = 184 (students)

Here 160 is the number of students last Tuesday, and 15% (or 0.15) is the growth rate (rate of change).


3. Geometric mean

• Formula:
- Step 1: Express the rate of change (R) as (1+R)
- Step 2: Calculate the geometric mean using the formula:
(i) Simple geometric mean: applied when each rate of change appears once only

$$R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$$


3. Geometric mean

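- Step 2: Calculate the geometric mean

(ii) Weighted geometric mean: applied when some rates of change appear more than once. If the rate R_i appears f_i times (i = 1, ..., k) and n = f_1 + f_2 + ... + f_k, then

$$R_g = \sqrt[n]{(1+R_1)^{f_1}(1+R_2)^{f_2}\cdots(1+R_k)^{f_k}} - 1 = \sqrt[n]{\prod_{i=1}^{k}(1+R_i)^{f_i}} - 1$$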


Example

The number of employees in a small bank over the period 2000-2006 is presented in the table below:

Year              2000  2001  2002  2003  2004  2005  2006
No. of employees   200   220   250   262   284   300   312

What is the average rate of change in the number of employees?


Example
Year              2000  2001   2002   2003   2004   2005   2006
No. of employees   200   220    250    262    284    300    312
(1+R)                -   1.1  1.136  1.048  1.084  1.056   1.04


Example


The average rate of change:

$$R_g = \sqrt[6]{1.1 \times 1.136 \times 1.048 \times 1.084 \times 1.056 \times 1.04} - 1 = 0.077 \approx 7.7\%$$
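The two steps of the simple geometric mean can be sketched in Python using the employee counts from the table. The function name `simple_geometric_mean_rate` is an illustrative choice, not part of the slides, and `math.prod` assumes Python 3.8 or later.

```python
import math

# Number of employees for each year, 2000 through 2006.
employees = [200, 220, 250, 262, 284, 300, 312]


def simple_geometric_mean_rate(levels):
    """Average rate of change across consecutive observations."""
    # Step 1: express each year-over-year change as a factor (1 + R).
    factors = [later / earlier for earlier, later in zip(levels, levels[1:])]
    # Step 2: take the n-th root of the product of the factors, minus 1.
    n = len(factors)
    return math.prod(factors) ** (1 / n) - 1


rate = simple_geometric_mean_rate(employees)
print(f"Average rate of change: {rate:.3f}")  # Average rate of change: 0.077
```

Because the factors telescope, the product equals 312/200, so the same result follows from the sixth root of 1.56.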


Example

Year             2000  2001  2002  2003  2004  2005  2006  2007  2008  2009
Growth rate (%)    10    25    15    10    10    10    10    15    25    15



Example

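Average growth rate

The rate 10% appears 5 times, 15% appears 3 times, and 25% appears 2 times, so n = 5 + 3 + 2 = 10:

$$R_g = \sqrt[\sum_{i=1}^{k} f_i]{\prod_{i=1}^{k}(1+R_i)^{f_i}} - 1 = \sqrt[5+3+2]{1.10^{5} \times 1.15^{3} \times 1.25^{2}} - 1 = 0.14 \approx 14\%$$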
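The weighted geometric mean for the ten growth rates can be sketched as follows. This is one possible way to do it: the frequencies are counted with `collections.Counter`, and the function name `weighted_geometric_mean_rate` is hypothetical.

```python
from collections import Counter

# Growth rates (in percent) for 2000 through 2009, as listed in the example.
growth_rates_pct = [10, 25, 15, 10, 10, 10, 10, 15, 25, 15]


def weighted_geometric_mean_rate(rates_pct):
    """Geometric mean rate when some rates occur more than once."""
    freq = Counter(rates_pct)   # f_i: how many times each rate appears
    n = sum(freq.values())      # n = f_1 + ... + f_k
    product = 1.0
    for rate_pct, f in freq.items():
        product *= (1 + rate_pct / 100) ** f
    return product ** (1 / n) - 1


freq = Counter(growth_rates_pct)
rate = weighted_geometric_mean_rate(growth_rates_pct)
print(dict(freq))
print(f"Average growth rate: {rate:.3f}")
```

The result is roughly 0.144, which matches the slide's rounded figure of about 14%.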


Characteristics of the mean

• It is representative of the data set.
• It takes every single value into account, so it is likely to be affected by extreme values.
• It can be used to compare data sets of different sizes.


The Median

• The median of a set of measurements is the value that falls in the middle when the
measurements are arranged in order of magnitude.

• When determining the median pay attention to the number of observations (k).



If k is odd:
Median = the number at the ((k+1)/2)th location of the ordered array.

If k is even:
Median = the average of the two numbers in the middle (the numbers at the (k/2)th and the [(k/2)+1]th locations of the ordered array).



The Median
Example 2
The salaries of seven employees were recorded (in 1000s): 28, 60, 26, 32, 30, 26, 29.
Find the median salary.

Odd number of observations
Ordered array: 26, 26, 28, 29, 30, 32, 60
There are seven salaries (k = 7).
The ((k+1)/2)th salary of the ordered array is the number at the (7+1)/2 = 4th location.
The median is 29.

Suppose an additional salary of $31,000 is added to the group of salaries recorded before. Find the median salary.

Even number of observations
Ordered array: 26, 26, 28, 29, 30, 31, 32, 60
There are eight salaries (k = 8).
The two salaries in the middle are 29 (in the (k/2) = 4th location) and 30 (in the [(k/2)+1] = 5th location).
The median is the average of these two numbers: 29.5.
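The odd and even rules for the median can be sketched in Python and checked against the salary data of Example 2. The function `median` below is a hand-written illustration of the rule; Python's standard `statistics.median` gives the same answers.

```python
import statistics

# Salaries of the seven employees, in thousands of dollars.
salaries = [28, 60, 26, 32, 30, 26, 29]


def median(values):
    """Middle value of the ordered array (average of the two middle values if k is even)."""
    ordered = sorted(values)
    k = len(ordered)
    mid = k // 2
    if k % 2 == 1:
        # k odd: the ((k+1)/2)th value, which is index k // 2 when counting from 0.
        return ordered[mid]
    # k even: average of the (k/2)th and the [(k/2)+1]th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median(salaries))         # 29
print(median(salaries + [31]))  # 29.5
print(statistics.median(salaries + [31]))  # 29.5
```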


The Mode

• The Mode of a set of measurements is the value that occurs most frequently.
• A Set of data may have one mode (or modal class), or two or more modes.

For large data sets the modal class is much more relevant than a single-value mode.

[Figure: the modal class]



The Mode

• Example 3
- The manager of a men's clothing store observes the waist size (in inches) of trousers sold last week: 31, 34, 36, 33, 28, 34, 30, 34, 32, 40.
- The mode of this data set is 34 in.

This information seems to be valuable (for example, for the design of a new display in the store), much more than "the median is 33.5 in."
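The mode and median quoted for the waist sizes can be reproduced with a short sketch. The helper `modes` is an illustrative name; it returns a list so that data sets with two or more modes are handled as well.

```python
import statistics
from collections import Counter

# Waist sizes (in inches) of trousers sold last week, from Example 3.
waist_sizes = [31, 34, 36, 33, 28, 34, 30, 34, 32, 40]


def modes(values):
    """All values that occur with the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(modes(waist_sizes))              # [34]
print(statistics.median(waist_sizes))  # 33.5
```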




Relationship among Mean, Median, and Mode

• If a distribution is symmetrical, the mean, median and mode coincide.
• If a distribution is not symmetrical, and is skewed to the left or to the right, the three measures differ.

[Figure: a positively skewed distribution ("skewed to the right"); labels from left to right: Mode, Median, Mean]

[Figure: a negatively skewed distribution ("skewed to the left"); labels from left to right: Mean, Median, Mode]


Using the Mean, Median, and Mode

• The mean is very sensitive to extreme values, but it is used in most statistical analyses.

• The median is not affected by extreme values; however, it does not reflect all the values in the data set, only the location of the observation in the middle.

• The mode should be used mainly for categorical data.
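The different sensitivity of the mean and the median to extreme values can be illustrated with the salary data from Example 2. This is a small sketch: dropping the $60,000 salary is an assumption made here for illustration, not a step taken in the slides.

```python
import statistics

# Salaries from Example 2, in thousands of dollars; 60 is the extreme value.
salaries = [28, 60, 26, 32, 30, 26, 29]
without_extreme = [s for s in salaries if s != 60]

mean_all = sum(salaries) / len(salaries)
mean_trimmed = sum(without_extreme) / len(without_extreme)
median_all = statistics.median(salaries)
median_trimmed = statistics.median(without_extreme)

print(f"mean:   {mean_all:.1f} -> {mean_trimmed:.1f}")      # mean:   33.0 -> 28.5
print(f"median: {median_all:.1f} -> {median_trimmed:.1f}")  # median: 29.0 -> 28.5
```

Removing the single extreme salary moves the mean by 4.5 but the median by only 0.5.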


4.2 Measures of Variability

• Measures of central location fail to tell the whole story about the distribution.
• A question of interest still remains unanswered:

How much are the values of a given set spread
out around the mean value?

