
A Textbook of Computer Based Numerical and Statistical Techniques, part 55


COMPUTER BASED NUMERICAL AND STATISTICAL TECHNIQUES
$= \dfrac{50 \times 225}{100} = 112.5$
Since n is large, the test statistic is
$$Z = \sqrt{2\chi^2} - \sqrt{2n - 1} \sim N(0, 1)$$
Now,
$$Z = \sqrt{225} - \sqrt{99} = 15 - 9.95 = 5.05$$
Since $|Z| > 3$, it is significant at all levels of significance; hence $H_0$ is rejected and we conclude that $\sigma \neq 10$.
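The large-sample approximation used above can be checked numerically. The following is a minimal Python sketch that assumes the values appearing in this example ($\chi^2 = 112.5$, $n = 50$); the function name is illustrative and not part of the text.

```python
import math

def chi2_normal_approx(chi2: float, n: int) -> float:
    """Large-sample statistic Z = sqrt(2*chi2) - sqrt(2n - 1), approximately N(0, 1)."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * n - 1.0)

# Values taken from the example: chi-square = 112.5 with n = 50.
z = chi2_normal_approx(112.5, 50)
print(round(z, 2))   # sqrt(225) - sqrt(99)
print(abs(z) > 3)    # the rejection rule applied in the text
```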
Example 2. It is believed that the precision (as measured by the variance) of an instrument is no more than 0.16. Write down the null and alternative hypotheses for testing this belief. Carry out the test at the 1% level, given 11 measurements of the same subject on the instrument:
2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, 2.5
[B.U. (2006), Kanpur (2007)]
Sol. Null Hypothesis, $H_0: \sigma^2 = 0.16$
Alternative Hypothesis, $H_1: \sigma^2 > 0.16$
Computation of Sample Variance
$X$      $X - \bar{X}$    $(X - \bar{X})^2$
2.5      -0.01            0.0001
2.3      -0.21            0.0441
2.4      -0.11            0.0121
2.3      -0.21            0.0441
2.5      -0.01            0.0001
2.7      +0.19            0.0361
2.5      -0.01            0.0001
2.6      +0.09            0.0081
2.6      +0.09            0.0081
2.7      +0.19            0.0361
2.5      -0.01            0.0001
$\bar{X} = \dfrac{27.6}{11} = 2.51$, $\quad \sum (X - \bar{X})^2 = 0.1891$
Under the null hypothesis $H_0: \sigma^2 = 0.16$, the test statistic is:
$$\chi^2 = \frac{nS^2}{\sigma^2} = \frac{\sum (X - \bar{X})^2}{\sigma^2} = \frac{0.1891}{0.16} = 1.182$$
which follows the $\chi^2$-distribution with d.f. (11 - 1) = 10.
TESTING OF HYPOTHESIS
Since the calculated value of $\chi^2$ is less than the tabulated value 23.2 of $\chi^2$ for 10 d.f. at the 1% level of significance, it is not significant. Hence $H_0$ may be accepted and we conclude that the data are consistent with the hypothesis that the precision of the instrument is 0.16.
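The computation in Example 2 can be reproduced with a short Python sketch. It assumes the same statistic as the solution, $\chi^2 = \sum (X - \bar{X})^2 / \sigma^2$, and takes the critical value 23.2 from the text rather than computing it; the function and variable names are illustrative.

```python
def chi2_variance_statistic(data, sigma0_sq):
    """Return (chi-square statistic, degrees of freedom) for H0: variance = sigma0_sq."""
    n = len(data)
    mean = sum(data) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in data)  # sum of (X - mean)^2
    return sum_sq_dev / sigma0_sq, n - 1

measurements = [2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, 2.5]
chi2, df = chi2_variance_statistic(measurements, 0.16)

CRITICAL_1PCT_10DF = 23.2  # tabulated value quoted in the text
reject_h0 = chi2 > CRITICAL_1PCT_10DF
print(round(chi2, 3), df, reject_h0)
```

Because the sketch keeps the unrounded mean, the statistic agrees with the hand calculation to three decimal places.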
(ii) Chi-Square Test of Goodness of Fit: The $\chi^2$ test is an approximate test for large values of n. It enables us to ascertain how well a theoretical distribution fits an empirical distribution, i.e., a distribution obtained from sample data. If the calculated value of $\chi^2$ is less than the table value at a specified level of significance, the fit is considered to be good; if the calculated value is greater than the table value, the fit is considered to be poor. Generally we take the 5% level of significance.
Example 3. The following table shows the distribution of digits in numbers chosen at random from a telephone directory:
Digits        0     1     2    3    4     5    6     7    8    9
Frequency  1026  1107   997  966  1075  933  1107  972  964  853

Test whether the digits may be taken to occur equally frequently in the directory.
Sol. Null Hypothesis $H_0$: The digits in the directory occur equally frequently, i.e., there is no significant difference between the observed and expected frequencies.
Under $H_0$, the expected frequency of each digit is $\dfrac{10000}{10} = 1000$.
To find the value of $\chi^2$:
$O_i$              1026   1107   997   966   1075   933   1107   972   964    853
$E_i$              1000   1000  1000  1000   1000  1000   1000  1000  1000   1000
$(O_i - E_i)^2$     676  11449     9  1156   5625  4489  11449   784  1296  21609
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{58542}{1000} = 58.542$$
Conclusion. The tabulated value of $\chi^2$ at the 5% level of significance for 9 d.f. is 16.919. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected,
i.e., there is a significant difference between the observed and theoretical frequencies,
i.e., the digits in the directory do not occur equally frequently.
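The goodness-of-fit statistic of Example 3 can be sketched in Python as follows. This is only an illustration: the helper name is hypothetical, the critical value 16.919 is copied from the text, and the frequency of digit 3 is taken as 966, the value consistent with the total of 10,000 and with the squared deviation 1156 used in the solution.

```python
def chi2_goodness_of_fit(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed digit frequencies for digits 0..9 (digit 3 assumed to be 966, see lead-in).
observed = [1026, 1107, 997, 966, 1075, 933, 1107, 972, 964, 853]
total = sum(observed)
expected = [total / len(observed)] * len(observed)  # equal frequencies under H0

chi2 = chi2_goodness_of_fit(observed, expected)
df = len(observed) - 1
CRITICAL_5PCT_9DF = 16.919  # tabulated value quoted in the text
print(round(chi2, 3), df, chi2 > CRITICAL_5PCT_9DF)
```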
Example 4. The following table gives the number of aircraft accidents that occurred on the various days of the week. Find whether the accidents are uniformly distributed over the week.
Days               Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.
No. of accidents    14    16     8     12     11      9    14
(Given: the values of chi-square significant at 5, 6, 7 d.f. are respectively 11.07, 12.59, 14.07 at the 5% level of significance.)
Sol. Here we set up the null hypothesis that the accidents are uniformly distributed over the week.
Under the null hypothesis, the expected frequencies of the accidents on each of the days would be:
Days               Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.  Total
No. of accidents    12    12    12     12     12     12    12     84
$$\chi^2 = \frac{(14-12)^2}{12} + \frac{(16-12)^2}{12} + \frac{(8-12)^2}{12} + \frac{(12-12)^2}{12} + \frac{(11-12)^2}{12} + \frac{(9-12)^2}{12} + \frac{(14-12)^2}{12}$$
$$= \frac{1}{12}(4 + 16 + 16 + 0 + 1 + 9 + 4) = \frac{50}{12} = 4.17$$
The number of degrees of freedom
= Number of observations - Number of independent constraints
= 7 - 1 = 6
The tabulated $\chi^2_{0.05}$ for 6 d.f. = 12.59
Since the calculated $\chi^2$ is much less than the tabulated value, it is not significant and we accept the null hypothesis. Hence we conclude that the accidents are uniformly distributed over the week.
Example 5. Records taken of the number of male and female births in 800 families having four
children are as follows:
No. of male births 0 1 2 3 4
No. of female births 4 3 2 1 0
No. of families 32 178 290 236 94
Test whether the data are consistent with the hypothesis that the Binomial law holds and the chance
of male birth is equal to that of female birth, namely p = q = 1/2.
Sol. $H_0$: The data are consistent with the hypothesis of equal probability for male and female births, i.e., p = q = 1/2.
We use the Binomial distribution to calculate the theoretical frequencies, given by:
$$N(r) = N \times P(X = r)$$
where N is the total frequency and N(r) is the number of families with r male children:
$$P(X = r) = {}^{n}C_r \, p^r q^{n-r}$$
where p and q are the probabilities of male and female births and n is the number of children.
N(0) = No. of families with 0 male children $= 800 \times {}^4C_0 \left(\dfrac{1}{2}\right)^4 = 800 \times 1 \times \dfrac{1}{2^4} = 50$
$N(1) = 800 \times {}^4C_1 \left(\dfrac{1}{2}\right)^1 \left(\dfrac{1}{2}\right)^3 = 200$; $\quad N(2) = 800 \times {}^4C_2 \left(\dfrac{1}{2}\right)^2 \left(\dfrac{1}{2}\right)^2 = 300$
$N(3) = 800 \times {}^4C_3 \left(\dfrac{1}{2}\right)^3 \left(\dfrac{1}{2}\right)^1 = 200$; $\quad N(4) = 800 \times {}^4C_4 \left(\dfrac{1}{2}\right)^4 \left(\dfrac{1}{2}\right)^0 = 50$
Observed frequency $O_i$        32     178    290     236    94
Expected frequency $E_i$        50     200    300     200    50
$(O_i - E_i)^2$                324     484    100    1296  1936
$(O_i - E_i)^2 / E_i$         6.48    2.42  0.333    6.48  38.72
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 54.433$$
Conclusion. The table value of $\chi^2$ at the 5% level of significance for 5 - 1 = 4 d.f. is 9.49.
Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected,
i.e., the data are not consistent with the hypothesis that the Binomial law holds with the chance of a male birth equal to that of a female birth.
Since the fitting is Binomial with p specified, the degrees of freedom are $\nu = n - 1$, where n is the number of classes, i.e., $\nu = 5 - 1 = 4$.
Example 6. A survey of 320 families with 5 children each revealed the following distribution:
No. of boys        5    4    3    2    1    0
No. of girls       0    1    2    3    4    5
No. of families   14   56  110   88   40   12
Is this result consistent with the hypothesis that male and female births are equally probable?

Sol. Let us set up the null hypothesis that the data are consistent with the hypothesis of equal probability for male and female births. Then under the null hypothesis:
p = Probability of male birth $= \dfrac{1}{2} = q$
p(r) = Probability of 'r' male births in a family of 5
$$= \binom{5}{r} p^r q^{5-r} = \binom{5}{r} \left(\frac{1}{2}\right)^5$$
The frequency of r male births is given by:
$$f(r) = N \cdot p(r) = 320 \times \binom{5}{r} \times \left(\frac{1}{2}\right)^5 = 10 \times \binom{5}{r} \qquad (1)$$

Substituting r = 0, 1, 2, 3, 4, 5 successively in (1), we get the expected frequencies as follows:
$f(0) = 10 \times 1 = 10$, $\quad f(1) = 10 \times {}^5C_1 = 50$
$f(2) = 10 \times {}^5C_2 = 100$, $\quad f(3) = 10 \times {}^5C_3 = 100$
$f(4) = 10 \times {}^5C_4 = 50$, $\quad f(5) = 10 \times {}^5C_5 = 10$
Calculations for $\chi^2$
Observed          Expected          $(O - E)^2$    $(O - E)^2 / E$
Frequencies (O)   Frequencies (E)
 14                10                16            1.6000
 56                50                36            0.7200
110               100               100            1.0000
 88               100               144            1.4400
 40                50               100            2.0000
 12                10                 4            0.4000
Total 320         320                              7.1600
$$\therefore \chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right] = 7.16$$
Tabulated $\chi^2_{0.05}$ for 6 - 1 = 5 d.f. is 11.07.
Since the calculated value of $\chi^2$ is less than the tabulated value, it is not significant at the 5% level of significance and hence the null hypothesis of equal probability for male and female births may be accepted.
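The binomial fit of Example 6 can be reproduced with a small Python sketch. It assumes p = 1/2 as in the null hypothesis and uses the standard-library function `math.comb`; the helper name and variable names are illustrative.

```python
from math import comb

def binomial_expected(total, n, p):
    """Expected frequencies total * C(n, r) * p^r * (1 - p)^(n - r) for r = 0..n."""
    return [total * comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]

# Observed number of families with r = 0, 1, ..., 5 boys (the table lists them from 5 down to 0).
observed = [12, 40, 88, 110, 56, 14]
expected = binomial_expected(sum(observed), 5, 0.5)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
print(expected)
print(round(chi2, 2), df)
```

With p fixed by the hypothesis, no parameter is estimated from the data, so only one degree of freedom is lost.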
Example 7. Fit a Poisson distribution to the following data and test the goodness of fit:
X:    0    1    2   3   4   5   6
f:  275   72   30   7   5   2   1
Sol. Mean of the given distribution is:
$$\bar{X} = \frac{\sum f_i x_i}{N} = \frac{189}{392} = 0.482$$
In order to fit a Poisson distribution to the given data, we take the mean (parameter) m of the Poisson distribution equal to the mean of the given distribution, i.e., we take
$$m = \bar{X} = 0.482$$
The frequency of r successes is given by the Poisson law as:
$$f(r) = N p(r) = 392 \times \frac{e^{-0.482} (0.482)^r}{r!}; \quad r = 0, 1, 2, \ldots, 6$$
Now, $f(0) = 392 \times e^{-0.482}$ = 392 × Antilog [-0.482 log e]
= 392 × Antilog [-0.482 × log 2.7183]   [$\because$ e = 2.7183]
= 392 × Antilog [-0.482 × 0.4343]
= 392 × Antilog [-0.2093]
= 392 × Antilog [$\bar{1}.7907$] = 392 × 0.6176 = 242.1
$f(1) = m \times f(0) = 0.482 \times 242.1 = 116.69$
$f(2) = \dfrac{m}{2} \times f(1) = 0.241 \times 116.69 = 28.12$
$f(3) = \dfrac{m}{3} \times f(2) = \dfrac{0.482}{3} \times 28.12 = 4.518$
$f(4) = \dfrac{m}{4} \times f(3) = \dfrac{0.482}{4} \times 4.518 = 0.544$
$f(5) = \dfrac{m}{5} \times f(4) = \dfrac{0.482}{5} \times 0.544 = 0.052$
$f(6) = \dfrac{m}{6} \times f(5) = \dfrac{0.482}{6} \times 0.052 = 0.004$

Hence the theoretical Poisson frequencies correct to one decimal place are as given below:
X                       0      1      2     3    4    5    6   Total
Expected Frequency   242.1  116.7   28.1   4.5  0.5  0.1   0    392
CALCULATIONS FOR CHI-SQUARE
Observed             Expected                       (O - E)   $(O - E)^2$   $(O - E)^2 / E$
Frequency (O)        Frequency (E)
275                  242.1                           32.9      1082.41        4.471
 72                  116.7                          -44.7      1998.09       17.121
 30                   28.1                            1.9         3.61        0.128
7 + 5 + 2 + 1 = 15   4.5 + 0.5 + 0.1 + 0 = 5.1        9.9        98.01       19.217
Total 392            392.0                                                   40.937
$$\therefore \chi^2 = \sum \frac{(O - E)^2}{E} = 40.937$$
Degrees of freedom = 7 - 1 - 1 - 3 = 2 (one d.f. is lost for the constraint on the total, one for estimating m from the data, and three for pooling the last four classes).
The tabulated value of $\chi^2$ for 2 degrees of freedom at the 5% level of significance is 5.99.
Conclusion: Since the calculated value of $\chi^2$ (40.937) is much greater than 5.99, it is highly significant. Hence we say that the Poisson distribution is not a good fit to the given data.
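The Poisson fit of Example 7 can be sketched in Python as below. The sketch assumes the same recurrence $f(r) = \frac{m}{r} f(r-1)$ and the same pooling of the classes X = 3, 4, 5, 6 as the solution; the names `poisson_expected` and `POOL_FROM` are illustrative. Because it keeps unrounded values of m and of the expected frequencies, the statistic comes out close to, but not exactly equal to, the hand-computed 40.937.

```python
import math

def poisson_expected(total, mean, max_x):
    """Expected frequencies total * e^(-m) * m^r / r! for r = 0..max_x,
    built with the recurrence f(r) = (m / r) * f(r - 1)."""
    freqs = [total * math.exp(-mean)]
    for r in range(1, max_x + 1):
        freqs.append(freqs[-1] * mean / r)
    return freqs

xs = [0, 1, 2, 3, 4, 5, 6]
observed = [275, 72, 30, 7, 5, 2, 1]
n_total = sum(observed)
mean = sum(x * f for x, f in zip(xs, observed)) / n_total  # 189 / 392
expected = poisson_expected(n_total, mean, max(xs))

# Pool the sparse tail classes (X >= 3), following the grouping used in the text.
POOL_FROM = 3
obs_groups = observed[:POOL_FROM] + [sum(observed[POOL_FROM:])]
exp_groups = expected[:POOL_FROM] + [sum(expected[POOL_FROM:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_groups, exp_groups))
# d.f. = classes - 1 (total) - 1 (estimated mean) - classes lost by pooling
df = len(xs) - 1 - 1 - (len(xs) - len(obs_groups))
print(round(mean, 3), df, chi2 > 5.99)
```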
Example 8. A die is thrown 276 times and the results of these throws are given below:
No. appeared on the die    1    2    3    4    5    6
Frequency                 40   32   29   59   57   59
Test whether the die is biased or not.
Sol. Null Hypothesis $H_0$: Die is unbiased.
Under this $H_0$, the expected frequency for each face is $\dfrac{276}{6} = 46$.
To find the value of $\chi^2$:
$O_i$               40    32    29    59    57    59
$E_i$               46    46    46    46    46    46
$(O_i - E_i)^2$     36   196   289   169   121   169
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{980}{46} = 21.30$$
Conclusion: The tabulated value of $\chi^2$ at the 5% level of significance for (6 - 1 = 5) d.f. is 11.07. Since the calculated value of $\chi^2$ = 21.30 > 11.07, the tabulated value, $H_0$ is rejected,
i.e., the die is not unbiased, or the die is biased.
Example 9. The theory predicts that the proportions of beans in the four groups $G_1$, $G_2$, $G_3$, $G_4$ should be in the ratio 9 : 3 : 3 : 1. In an experiment with 1600 beans the numbers in the four groups were 882, 313, 287 and 118. Does the experimental result support the theory?
Sol. $H_0$: The experimental result supports the theory, i.e., there is no significant difference between the observed and theoretical frequencies. Under $H_0$, the theoretical frequencies can be calculated as follows:
$E(G_1) = \dfrac{1600 \times 9}{16} = 900$; $\quad E(G_2) = \dfrac{1600 \times 3}{16} = 300$;
$E(G_3) = \dfrac{1600 \times 3}{16} = 300$; $\quad E(G_4) = \dfrac{1600 \times 1}{16} = 100$.
To calculate the value of $\chi^2$:
Observed frequency $O_i$      882      313      287     118
Expected frequency $E_i$      900      300      300     100
$(O_i - E_i)^2 / E_i$        0.36   0.5633   0.5633    3.24
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 4.7266$$
Conclusion: The table value of $\chi^2$ at the 5% level of significance for 3 d.f. is 7.815. Since the calculated value of $\chi^2$ is less than the tabulated value, $H_0$ is accepted, i.e., the experimental result supports the theory.
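When the hypothesis specifies a ratio rather than equal frequencies, the expected counts are obtained by splitting the total in that ratio. A minimal Python sketch for Example 9 follows; the helper name is illustrative and the critical value 7.815 is copied from the text.

```python
def expected_from_ratio(total, ratio):
    """Split `total` into expected frequencies proportional to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

observed = [882, 313, 287, 118]
expected = expected_from_ratio(sum(observed), [9, 3, 3, 1])

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
CRITICAL_5PCT_3DF = 7.815  # tabulated value quoted in the text
print(expected, round(chi2, 4), chi2 < CRITICAL_5PCT_3DF)
```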
(iii) $\chi^2$ test as a test of Attributes: Let us consider two attributes A and B, A divided into r classes $A_1, A_2, \ldots, A_r$ and B divided into s classes $B_1, B_2, \ldots, B_s$. Such a classification, in which attributes are divided into more than two classes, is known as manifold classification. The various cell frequencies can be expressed in the following table, known as an r × s manifold contingency table. Here $(A_i)$ is the number of persons possessing the attribute $A_i$, $(B_j)$ is the number of persons possessing the attribute $B_j$, and $(A_iB_j)$ is the number of persons possessing both the attributes $A_i$ and $B_j$ [i = 1, 2, ..., r; j = 1, 2, ..., s].
$$\sum_{i=1}^{r} (A_i) = \sum_{j=1}^{s} (B_j) = N, \text{ the total frequency.}$$
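The contingency table for r × s is given below:
$$
\begin{array}{c|ccccc|c}
B \backslash A & A_1 & A_2 & A_3 & \cdots & A_r & \text{Total} \\ \hline
B_1 & (A_1B_1) & (A_2B_1) & (A_3B_1) & \cdots & (A_rB_1) & (B_1) \\
B_2 & (A_1B_2) & (A_2B_2) & (A_3B_2) & \cdots & (A_rB_2) & (B_2) \\
B_3 & (A_1B_3) & (A_2B_3) & (A_3B_3) & \cdots & (A_rB_3) & (B_3) \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
B_s & (A_1B_s) & (A_2B_s) & (A_3B_s) & \cdots & (A_rB_s) & (B_s) \\ \hline
\text{Total} & (A_1) & (A_2) & (A_3) & \cdots & (A_r) & N
\end{array}
$$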

The problem is to test whether the two attributes A and B under consideration are independent or not. Under the null hypothesis that the attributes are independent, the theoretical cell frequencies are calculated as follows.
$P(A_i)$ = Probability that a person possesses the attribute $A_i$ $= \dfrac{(A_i)}{N}$, i = 1, 2, ..., r
$P(B_j)$ = Probability that a person possesses the attribute $B_j$ $= \dfrac{(B_j)}{N}$, j = 1, 2, ..., s
$P(A_iB_j)$ = Probability that a person possesses both attributes $A_i$ and $B_j$ $= \dfrac{(A_iB_j)}{N}$
If $(A_iB_j)_0$ is the expected number of persons possessing both the attributes $A_i$ and $B_j$, then
$$(A_iB_j)_0 = N \cdot P(A_iB_j) = N \, P(A_i) \, P(B_j) = N \cdot \frac{(A_i)}{N} \cdot \frac{(B_j)}{N} = \frac{(A_i)(B_j)}{N}$$
(since A and B are independent).
Therefore
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{\left[ (A_iB_j) - (A_iB_j)_0 \right]^2}{(A_iB_j)_0}$$
which is distributed as a $\chi^2$ variate with (r - 1)(s - 1) d.f.
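The expected cell frequencies and the resulting statistic for a general contingency table can be sketched in Python as follows. The function name is illustrative, and the 2 × 3 table of counts is hypothetical data chosen only to exercise the formula; it does not come from the text.

```python
def contingency_chi2(table):
    """Chi-square test of independence for a table given as a list of rows.

    Expected cell frequency = (row total * column total) / grand total;
    d.f. = (number of rows - 1) * (number of columns - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 3 table of cell frequencies (not from the text).
table = [[10, 20, 30],
         [20, 20, 20]]
chi2, df = contingency_chi2(table)
print(round(chi2, 4), df)
```

The calculated value would then be compared with the tabulated $\chi^2$ for the returned degrees of freedom, exactly as in the goodness-of-fit examples.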
Some Remarkable points:
1. For a 2 × 2 contingency table where the frequencies are $\begin{array}{c|c} a & b \\ \hline c & d \end{array}$, $\chi^2$ can be calculated from the independent frequencies as
$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(b + d)(a + c)}$$
2. If the contingency table is not 2 × 2, then the above formula for calculating $\chi^2$ cannot be used. Hence, we have another formula for calculating the expected frequency, $(A_iB_j)_0 = \dfrac{(A_i)(B_j)}{N}$, i.e., the expected frequency in each cell is
$$\frac{\text{Product of column total and row total}}{\text{whole total}}$$
3. If $\begin{array}{c|c} a & b \\ \hline c & d \end{array}$ is the 2 × 2 contingency table with two attributes, $Q = \dfrac{ad - bc}{ad + bc}$ is called the coefficient of association. If the attributes are independent then $\dfrac{a}{b} = \dfrac{c}{d}$.
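The 2 × 2 shortcut formula and the coefficient of association from the points above can be illustrated with a short Python sketch. The cell values a, b, c, d are hypothetical, and the function names are illustrative; the cell-by-cell computation is included only to show that it agrees with the shortcut.

```python
def chi2_2x2(a, b, c, d):
    """Shortcut: N (ad - bc)^2 / [(a+b)(c+d)(b+d)(a+c)] with N = a + b + c + d."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (b + d) * (a + c))

def coefficient_of_association(a, b, c, d):
    """Q = (ad - bc) / (ad + bc); Q = 0 when a/b = c/d."""
    return (a * d - b * c) / (a * d + b * c)

def chi2_cellwise(a, b, c, d):
    """Same statistic from expected cell frequencies (row total * column total / N)."""
    n = a + b + c + d
    cells = [(a, (a + b) * (a + c)), (b, (a + b) * (b + d)),
             (c, (c + d) * (a + c)), (d, (c + d) * (b + d))]
    return sum((obs - prod / n) ** 2 / (prod / n) for obs, prod in cells)

a, b, c, d = 10, 20, 30, 40  # hypothetical cell frequencies
shortcut = chi2_2x2(a, b, c, d)
cellwise = chi2_cellwise(a, b, c, d)
q = coefficient_of_association(a, b, c, d)
print(round(shortcut, 4), round(cellwise, 4), q)
```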
Remark: Yates's Correction: In a 2 × 2 table, if the frequency of a cell is small, we make Yates's correction to make $\chi^2$ continuous.
Decrease by $\dfrac{1}{2}$ those cell frequencies which are greater than the expected frequencies, and increase by $\dfrac{1}{2}$ those which are less than expectation. This will not affect the marginal totals. This correction is known as Yates's correction for continuity.
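After Yates's correction,
$$\chi^2 = \frac{N \left( bc - ad - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc < 0$$
$$\chi^2 = \frac{N \left( ad - bc - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc > 0$$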
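Both cases of the corrected formula can be written with an absolute value, $N(|ad - bc| - N/2)^2$ in the numerator. The Python sketch below uses that form and, as a check, also applies the ±1/2 adjustment to each cell directly. The cell values are hypothetical and the function names are illustrative.

```python
def chi2_yates(a, b, c, d):
    """Yates-corrected chi-square: N (|ad - bc| - N/2)^2 / [(a+c)(b+d)(c+d)(a+b)]."""
    n = a + b + c + d
    return n * (abs(a * d - b * c) - n / 2) ** 2 / ((a + c) * (b + d) * (c + d) * (a + b))

def chi2_yates_cellwise(a, b, c, d):
    """Same correction applied cell by cell: move each observed value 1/2 toward its expectation."""
    n = a + b + c + d
    cells = [(a, (a + b) * (a + c) / n), (b, (a + b) * (b + d) / n),
             (c, (c + d) * (a + c) / n), (d, (c + d) * (b + d) / n)]
    total = 0.0
    for obs, exp in cells:
        # Decrease cells above expectation by 1/2, increase cells below by 1/2.
        adjusted = obs - 0.5 if obs > exp else obs + 0.5
        total += (adjusted - exp) ** 2 / exp
    return total

a, b, c, d = 10, 20, 30, 40  # hypothetical cell frequencies
corrected = chi2_yates(a, b, c, d)
cellwise = chi2_yates_cellwise(a, b, c, d)
print(round(corrected, 4), round(cellwise, 4))
```

The cell-by-cell version assumes no observed value equals its expectation and that every deviation exceeds 1/2, which holds for the hypothetical data used here.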
Example 10. (2 × 2 contingency table). For the 2 × 2 table,
$$\begin{array}{c|c} a & b \\ \hline c & d \end{array}$$
prove that the chi-square test of independence gives
$$\chi^2 = \frac{N (ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}, \quad N = a + b + c + d \qquad (1)$$
[Guwahati Univ. B.Sc., 2002]
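Sol. Under the hypothesis of independence of attributes,
$$E(a) = \frac{(a + b)(a + c)}{N}, \quad E(b) = \frac{(a + b)(b + d)}{N}, \quad E(c) = \frac{(a + c)(c + d)}{N} \quad \text{and} \quad E(d) = \frac{(b + d)(c + d)}{N}$$
$$
\begin{array}{cc|c}
a & b & a + b \\
c & d & c + d \\ \hline
a + c & b + d & N
\end{array}
$$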
$$\therefore \chi^2 = \frac{[a - E(a)]^2}{E(a)} + \frac{[b - E(b)]^2}{E(b)} + \frac{[c - E(c)]^2}{E(c)} + \frac{[d - E(d)]^2}{E(d)} \qquad (2)$$

$$a - E(a) = a - \frac{(a + b)(a + c)}{N} = \frac{a(a + b + c + d) - (a^2 + ac + ab + bc)}{N} = \frac{ad - bc}{N}$$

×