HANOI UNIVERSITY
FACULTY OF MANAGEMENT AND TOURISM
-o0o-
STATISTICS FOR ECONOMICS
Is there any difference
in the number of
students per teacher over years?
Tutor: Ms. Lê Thị Ngọc Tú
Tutorial: 4 – AC 09
Tutorial time: Tuesday – 12.30 – 14.00
Group members:
Nguyễn Huyền Trang - 0904010116
Trần Thu Hằng - 0904010030
Nguyễn Thị Tươi - 0904010112
Lê Thị Minh Thành - 0904010098
Nguyễn Thị Huệ - 0904010043
Nguyễn Thị Hồng Nga - 0904010077
Trần Thi Thanh Vân - 0604040183
Nguyễn Thị Mai - 0904010065
TABLE OF CONTENTS
Scenario
I. Methodology
1. Data collection
2. Approach
II. Analysis and discussion
3. Check the required condition
1.1. Normality
1.2. Variances equality
4. Hypothesis testing
2.1. Testing block means
2.2. Testing treatment means
5. Discussion of finding
III. Limitation
IV. Recommendation and conclusion
6. Recommendation
7. Conclusion
Reference
Appendixes
A. Calculating the sample variance
B. Check the variances equality
C. ANOVA (using Excel)
D. Histograms
E. Data from GSO
Case study - ANOVA
Scenario
In recent years, as demand for human resources has risen, a growing number of
universities have planned to open new faculties and to increase student admissions for
these popular sectors. However, a mismatch between student enrollment and the number
of teachers/lecturers has a large effect on the quality of education and training. To
examine this issue, our group decided to test whether the number of students per teacher
differed between 2005 and 2009 (specifically in 2005, 2007 and 2009) using a statistical
technique (two-way ANOVA). The available data are blocked into six main regions of
Vietnam. The results show that during this five-year period, despite changes in the
numbers of both students and teachers, the number of students per teacher stayed nearly
the same, which leads to our conclusion that there is no evidence of a difference among
the three years.
I. Methodology
1. Data collection
The objective is to test whether the number of students per teacher in Viet Nam has
changed in recent years; specifically, we conduct the test over three years: 2005, 2007
and 2009. Because the data are quantitative and we compare more than two populations,
we decided to use the analysis of variance. The data were collected from the Vietnam
General Statistics Office website (shown in Appendix E).
However, many other factors may affect the result of our test, so the variability
within the samples might be large. To reduce the variation within each year, we
organized the data into blocks before testing. We used the six regions of Vietnam as
blocks: Red River delta, Northern midlands & mountainous, Northern Central and Central
Coastal, Highlands, South East, and Mekong River delta. Because it was too difficult to
cover every province in those areas, we used Excel to select one province at random to
represent each region. The six selected provinces are Hai Phong, Son La, Da Nang, Kon
Tum, Dong Nai and Kien Giang. Thus, the test has six blocks (the six regions) and three
treatments (the three years). The experimental design is a randomized block design in
which the treatments are the years 2005, 2007 and 2009.
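The random choice of one province per region was done in Excel; the same idea can be sketched in Python. This is only an illustration: the province lists below are a shortened subset of the names in Appendix E, and the function name `pick_one_per_region` and the seed value are our own assumptions, so the output will not necessarily match the six provinces actually drawn.

```python
import random

# Illustrative subset of the provinces listed in Appendix E (NOT the full list).
provinces_by_region = {
    "Red River delta": ["Hà Nội", "Hải Phòng", "Thái Bình", "Nam Định"],
    "Northern midlands & mountainous": ["Hà Giang", "Sơn La", "Lạng Sơn", "Phú Thọ"],
    "Northern Central and Central Coastal": ["Thanh Hóa", "Nghệ An", "Đà Nẵng", "Bình Định"],
    "Highlands": ["Kon Tum", "Gia Lai", "Đắk Lắk", "Lâm Đồng"],
    "South East": ["Bình Phước", "Tây Ninh", "Đồng Nai", "TP. Hồ Chí Minh"],
    "Mekong River delta": ["Long An", "Cần Thơ", "Kiên Giang", "Cà Mau"],
}


def pick_one_per_region(regions, seed=None):
    """Return one randomly chosen province for each region (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return {region: rng.choice(names) for region, names in regions.items()}


# The seed is arbitrary; it is used here only so the sketch is reproducible.
selection = pick_one_per_region(provinces_by_region, seed=2009)
for region, province in selection.items():
    print(f"{region}: {province}")
```

Each region contributes exactly one province, so the result always has six entries, one per block.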
After collecting the data, the following table of students per teacher was produced:
Region                             2005          2007          2009
Red River delta                    23.04452467   28.10416667   28.43558606
Northern midlands & mountainous    21.81818182   31.32592593   10.34782609
North Central and Central Coast    45.16666667   33.19047619   27.26348748
Highlands                          24.68253968   12.05464481   38.86703383
South East                         19.89583333   25.53491436   37.99269006
Mekong River delta                 14.95890411   8.356495468   11.10789474
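Each value in the table is the number of students divided by the number of teachers in the selected province. A minimal sketch of that arithmetic follows; the (teachers, students) pairs are the 2007 figures as we read them from Appendix E and matched to the ratios in the table, and the helper name `students_per_teacher` is our own.

```python
# (teachers, students) for 2007, as read from Appendix E and matched to the
# ratios reported in the table above (assumed pairing).
counts_2007 = {
    "Hải Phòng": (1776, 49913),
    "Sơn La": (405, 12687),
    "Kiên Giang": (331, 2766),
}


def students_per_teacher(teachers, students):
    """Return the students-per-teacher ratio (hypothetical helper)."""
    if teachers <= 0:
        raise ValueError("number of teachers must be positive")
    return students / teachers


ratios_2007 = {
    province: students_per_teacher(teachers, students)
    for province, (teachers, students) in counts_2007.items()
}

for province, ratio in ratios_2007.items():
    print(f"{province}: {ratio:.4f}")
```

The three results agree with the 2007 column for the Red River delta, the Northern midlands & mountainous region and the Mekong River delta.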
2. Approach
To determine whether the number of students per teacher differs over the three
years, we must first check the required conditions for the F-test of two-way ANOVA: the
random variable is normally distributed and the population variances are equal. We
check each condition in turn.
II. Analysis and discussion
3. Check the required condition
1.1. Normality
As the histograms in Appendix D show, the three samples do not look normal.
However, with only six observations per year the histograms are not conclusive, so in
order to use two-way ANOVA we assume that all three populations are normally
distributed.
1.2. Variances equality
Since the best estimator of a population variance is the sample variance, we applied
the F-test to compare the variability of two populations (the largest versus the smallest
sample variance, shown in Appendix B). At α = 5%, the test does not reject the hypothesis
of equal variances. Therefore, we proceed as if the variances are equal.
As for its applicability, two-way ANOVA is a procedure that tests whether
differences exist among two or more population means. It measures how much variation
is attributable to differences among populations and how much to differences within
populations. A randomized block design reduces the within-treatment variation so that
differences among the treatment means are easier to detect. However, the technique only
tests whether a difference exists; it does not indicate which population means exceed
the others.
After calculating the sample variances (shown in Appendix A), the largest is that of
2009 and the smallest is that of 2007, so we use the F-test to make an inference about
those two population variances, taking 2007 as population 1 and 2009 as population 2.
1. Testing hypothesis:
$H_0: \sigma_1^2 / \sigma_2^2 = 1$
$H_A: \sigma_1^2 / \sigma_2^2 \neq 1$
2. Test statistic:
$F = s_1^2 / s_2^2$ is F-distributed with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$
3. Significance level: α = 0.05
4. Decision rule:
With six observations in each year, ν1 = ν2 = 5.
Reject Ho if F > Fα/2, ν1, ν2 = F.025, 5, 5 = 7.15 or F < F1-α/2, ν1, ν2 = 1/F.025, 5, 5 = 0.14
5. Value of test statistic:
As shown in Appendix B: F = 0.69
6. Conclusion:
Since 0.14 < F = 0.69 < 7.15, we do not reject Ho.
Therefore, there is not enough evidence to conclude that the population variances differ.
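The variance comparison can be reproduced with Python's standard library. This is a sketch, not the Excel procedure used in Appendix B: the two samples are the 2007 and 2009 columns of the data table, each with six observations and therefore 5 degrees of freedom, and the critical value 7.15 is taken from a standard F table (it is not computed here, which is an assumption of this example).

```python
import statistics

# Students per teacher in the six selected provinces (from the data table).
ratio_2007 = [28.10416667, 31.32592593, 33.19047619,
              12.05464481, 25.53491436, 8.356495468]
ratio_2009 = [28.43558606, 10.34782609, 27.26348748,
              38.86703383, 37.99269006, 11.10789474]

# Two-tailed test at alpha = 0.05 with 5 and 5 degrees of freedom.
# ASSUMPTION: 7.15 is read from an F table, not computed by this script.
F_UPPER = 7.15
F_LOWER = 1 / F_UPPER

var_2007 = statistics.variance(ratio_2007)  # sample variance (divides by n - 1)
var_2009 = statistics.variance(ratio_2009)
f_stat = var_2007 / var_2009                # smallest over largest, as in Appendix B

reject_h0 = f_stat > F_UPPER or f_stat < F_LOWER

print(f"s1^2 = {var_2007:.4f}, s2^2 = {var_2009:.4f}, F = {f_stat:.4f}")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```

The computed variances and F ratio agree with the values reported in Appendixes A and B.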
4. Hypothesis testing
2.1. Testing block means
1. Testing hypothesis:
Ho: Block means are all equal
Ha: At least two block means differ
2. Test statistic:
$F = \frac{MSB}{MSE}$ is F-distributed with ν1 = b – 1 and ν2 = n – k – b + 1
3. Significance level: α = 0.05
4. Decision rule:
Reject Ho if F > Fα, b – 1, n – k – b + 1 = F.05, 5, 10 = 3.33
5. Value of test statistic:
As shown in the ANOVA table (Appendix C): F = 1.98995
6. Conclusion:
Since F = 1.98995 < 3.33, we do not reject Ho.
Therefore, there is not enough evidence to conclude that the block means differ.
This suggests that blocking by region removed little of the variability in this sample;
the two-way ANOVA can still be conducted.
2.2. Testing treatment means
1. Testing hypothesis:
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: At least two treatment means differ
2. Test statistic:
$F = \frac{MST}{MSE}$ is F-distributed with ν1 = k – 1 and ν2 = n – k – b + 1
3. Significance level: α = 0.05
4. Decision rule:
Reject Ho if F > Fα, k – 1, n – k – b + 1 = F.05, 2, 10 = 4.10
[Figure: density f(F) of the F distribution; the rejection region lies to the right of 4.10, and the observed value 0.11241 falls to its left.]
5. Value of test statistic:
As shown in the ANOVA table (Appendix C): F = 0.11241
6. Conclusion:
Since F = 0.11241 < 4.10, we do not reject Ho.
Hence, there is not sufficient evidence to conclude that differences exist among
the three years.
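The sums of squares behind both F statistics can be recomputed from the data table. The sketch below is a plain-Python version of a two-factor ANOVA without replication, not the Excel tool used in Appendix C; the variable names are our own, and the critical values are left to the F table as in the text.

```python
# Students per teacher: one row per block (region), one column per year.
ratios = {
    "Red River delta":                 [23.04452467, 28.10416667, 28.43558606],
    "Northern midlands & mountainous": [21.81818182, 31.32592593, 10.34782609],
    "North Central and Central Coast": [45.16666667, 33.19047619, 27.26348748],
    "Highlands":                       [24.68253968, 12.05464481, 38.86703383],
    "South East":                      [19.89583333, 25.53491436, 37.99269006],
    "Mekong River delta":              [14.95890411, 8.356495468, 11.10789474],
}
years = [2005, 2007, 2009]

rows = list(ratios.values())
b = len(rows)    # number of blocks
k = len(years)   # number of treatments
n = b * k        # total number of observations

grand_mean = sum(sum(row) for row in rows) / n
block_means = [sum(row) / k for row in rows]
treatment_means = [sum(row[j] for row in rows) / b for j in range(k)]

# Partition the total variation: SS(Total) = SSB + SST + SSE.
ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
ss_blocks = k * sum((m - grand_mean) ** 2 for m in block_means)
ss_treatments = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_error = ss_total - ss_blocks - ss_treatments

df_blocks, df_treatments, df_error = b - 1, k - 1, n - k - b + 1

ms_blocks = ss_blocks / df_blocks
ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error

f_blocks = ms_blocks / ms_error
f_treatments = ms_treatments / ms_error

print(f"Blocks:     SS = {ss_blocks:.4f}, df = {df_blocks}, F = {f_blocks:.5f}")
print(f"Treatments: SS = {ss_treatments:.4f}, df = {df_treatments}, F = {f_treatments:.5f}")
print(f"Error:      SS = {ss_error:.4f}, df = {df_error}")
```

The block and treatment F values can then be compared with 3.33 and 4.10 respectively, exactly as in the decision rules above.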
5. Discussion of finding
The hypothesis tests show that there is not enough evidence to reject the null
hypothesis that the ratio of students per teacher in Vietnam did not change over the
five-year period. The data analysis also found no evidence of a difference among the
block means, which represent the six main regions of Vietnam, so the ratio appears
balanced across these areas.
Without the test, people might think that the ratio of students per teacher
increases over the years because of the growth in student numbers in Vietnam. Because
of the high demand for high-quality human resources to meet the challenges of economic
growth, many universities and colleges have increased admissions year by year. Aware
of this, educational institutions have planned to recruit more teachers to keep up with
the increase in students and to maintain or improve teaching quality. This helps explain
why the number of students per teacher has not changed over the years. However,
compared with the world standard (15-20 students/teacher) and the goal of the Ministry
of Education and Training (20 students/teacher), the current ratio in Vietnam is still
much higher, at 28 students/teacher. Therefore, we need to increase the number of
teachers to improve the quality of our country's education. In addition, teachers'
quality (degrees, teaching skills, etc.), which directly affects education quality,
deserves attention.
III. Limitation
Although we did our best, the test has some limitations that may reduce its accuracy:
• Lack of information: it was difficult to find data covering longer intervals
(5-year periods instead of the 1-year periods we showed previously). One-year
periods may be too short to reflect changes in the number of professors
accurately. As a result, our conclusions may not be very exact.
• Rejection regions: we chose α = 0.05. Because we did not reject the null
hypotheses, a Type II error is possible. However, we believe it does not affect
our result much.
• Time consuming: because the assumptions must be checked before testing, we
spent a lot of time checking the normality of the populations and the equality of
their variances. The histograms (Appendix D) were not conclusive, so normality
had to be assumed. We also tested the block means (SSB) to check for
differences between blocks.
• Normality: to carry out the two-way ANOVA above, we assumed that the three
populations are normally distributed.
IV. Recommendation and conclusion
6. Recommendation
In recent years, the number of university students has increased continuously.
As we expected, the ratio of students to professors did not change from year to year,
which suggests that enrollment growth has not strongly affected the quality of teaching
and studying. However, we still have some recommendations to improve that quality.
+ Reinforcing highly qualified professors: as the number of students in
universities increases, it puts a lot of pressure on education. A shortage of highly
qualified teachers is inevitable. Therefore, strengthening the pool of highly qualified
professors is the first priority.
+ Motivating teachers: teachers should be supported in further study and given
suitable compensation. In addition, good relationships between teachers and their
students should be encouraged. This would reduce the number of teachers who quit their
jobs.
+ Changing from traditional classes to new models: at Hanoi University, for
example, students and teachers attend five lectures and five tutorials each week.
Consequently, professors and students have extra time for self-study.
+ Flexible time: both teachers and students can take part in social activities,
voluntary events and part-time jobs to gain practical experience and soft skills such as
communication skills. In addition, universities can provide adequate facilities and
equipment for teaching.
7. Conclusion
In conclusion, this report addressed a statistical question: whether the number of
students per teacher differed over the five-year period from 2005 to 2009 in six main
regions of Vietnam: Red River delta, Northern midlands & mountainous, Northern Central
and Central Coastal, Highlands, South East, and Mekong River delta. The findings show
no evidence of a difference in the number of students per teacher over the years in the
regions indicated above. This suggests that Vietnamese universities have been able to
recruit enough teachers to keep pace with society's needs and with the increase in
enrolment targets over the years. However, we still offer some recommendations to
improve the education system, as shown in our report.
While we were conducting the research, some limitations arose that may make the
result less accurate. In addition, because of the characteristics of the ANOVA test and
time constraints, we cannot show the whole picture of the issue, for example, the trend
in enrolment targets or changes in teaching methods and class models. If our report has
aroused other researchers' interest in this topic, we hope that future studies will be
conducted over a longer time scale, with more detailed data, and with further knowledge
of statistics.
Reference
• General Statistics Office, Number of teachers, students in universities and colleges by
province.
Appendixes
A. Calculating the sample variance
SUMMARY                                          Count   Sum       Average   Variance
Red river delta                                  3       79.5843   26.5281   9.12889
Northern midlands and mountains areas            3       63.4919   21.164    110.341
Northern Central area and Central coastal area   3       105.621   35.2069   83.1804
Central highlands                                3       75.6042   25.2014   179.928
South East                                       3       83.4234   27.8078   85.7486
Mekong river delta                               3       34.4233   11.4744   10.9987

2005                                             6       149.567   24.9278   109.518
2007                                             6       138.567   23.0944   107.965
2009                                             6       154.015   25.6691   156.604
B. Check the variances equality
F-Test Two-Sample for Variances
                       Variable 1     Variable 2
Mean                   23.09443724    25.66908638
Variance               107.9649341    156.6043958
Observations           6              6
df                     5              5
F                      0.6894119
P(F<=f) one-tail       0.346572174
F Critical one-tail    0.1980069
Since 0.198 < F = 0.689 < 5.05, we do not reject Ho.
There is not enough evidence to conclude that the two population variances differ.
C. ANOVA (using Excel)
Source of Variation   SS         df   MS        F         P-value   F crit
Rows                  932.8621   5    186.572   1.98995   0.16583   3.32583
Columns               21.07898   2    10.5395   0.11241   0.89479   4.10282
Error                 937.5724   10   93.7572
Total                 1891.514   17
D. Histograms
• Population 1: 2005
Bin     Frequency
10      0
20      2
30      3
40      0
50      1
More    0
• Population 2: 2007
Bin     Frequency
10      1
20      1
30      2
40      2
50      0
More    0
• Population 3: 2009
Bin     Frequency
10      0
20      2
30      2
40      2
50      0
More    0
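The frequency tables above can be reproduced by counting how many ratios fall into each bin. The sketch below assumes the convention that a value belongs to the first bin whose upper limit is greater than or equal to it, with everything above 50 counted as "More"; the function name `bin_frequencies` is our own.

```python
from bisect import bisect_left

# Upper limits of the bins used in the tables; values above 50 go to "More".
BIN_EDGES = [10, 20, 30, 40, 50]


def bin_frequencies(values, edges=BIN_EDGES):
    """Count values per bin, where a bin holds values <= its upper limit
    and > the previous limit (ASSUMED binning convention)."""
    counts = [0] * (len(edges) + 1)  # last slot is the "More" bin
    for value in values:
        counts[bisect_left(edges, value)] += 1
    return counts


# Students per teacher for the six selected provinces, by year.
ratios_by_year = {
    2005: [23.04452467, 21.81818182, 45.16666667,
           24.68253968, 19.89583333, 14.95890411],
    2007: [28.10416667, 31.32592593, 33.19047619,
           12.05464481, 25.53491436, 8.356495468],
    2009: [28.43558606, 10.34782609, 27.26348748,
           38.86703383, 37.99269006, 11.10789474],
}

frequencies = {year: bin_frequencies(values) for year, values in ratios_by_year.items()}

for year, counts in frequencies.items():
    print(year, counts)
```

With only six observations per year, each table has very few counts per bin, which is why the histograms cannot settle the normality question on their own.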
E. Data from GSO
Whole country
Red river delta
Num
1 Hà Nội
2 Hà Tây
3 Vĩnh Phúc
4 Bắc Ninh
5
6
7
8
Quảng Ninh
Hải Dương
Hải Phòng
Hưng Yên
9 Thái Bình
10 Hà Nam
11 Nam Định
12 Ninh Bình
Northern midlands and mountain areas
1 Hà Giang
2 Cao Bằng
3 Bắc Kạn
4 Tuyên Quang
2007
Teacher Student
192843
61321
6
25384
791671
S/T
16476
606207
36.793336
1404
29435
20.9651
536
17704
522
2008
Teacher Student
167570
60651
0
25310
695089
17065
529211
33.029851
568
18384
7624
14.605364
632
11676
896
8100
9.0401786
811
9272
761
1776
624
9677
49913
22875
12.716163
28.104167
36.658654
848
1862
907
13437
51070
22195
621
8409
13.541063
612
7222
118
3922
33.237288
268
3668
1517
27081
17.851681
1504
27590
133
4863
71
724
112385
2134
5.443609
30.056338
233
5702
65
1364
105105
1001
107
1410
13.17757
110
1734
212
2080
9.8113208
45
967
80
530
6.625
73
925
S/T
31.01148
5
32.36619
7
18.47468
4
11.43279
9
15.845519
27.427497
24.470783
11.80065
4
13.68656
7
18.34441
5
5.8540773
15.4
15.76363
6
21.488889
12.67123
3
2009
Teacher Student
S/T
65115
1796174
26409
725976
18083
541671
29.954709
646
19576
30.303406
543
14530
26.758748
870
10277
11.812644
876
1894
963
13312
53857
24067
15.196347
28.435586
24.991693
613
8450
13.784666
315
4070
12.920635
1372
34802
25.365889
234
5978
71
1364
120033
1441
5.8290598
97
1571
16.195876
45
688
15.288889
73
905
12.39726
20.295775
5 Lào Cai
6 Yên Bái
7 Thái Nguyên
8 Lạng Sơn
9 Bắc Giang
10 Phú Thọ
11 Lai Châu
12 Sơn La
13 Hòa Bình
Northern Central area and Central
coastal area
1 Thanh Hóa
2 Nghệ An
3
4
5
6
7
8
Hà Tĩnh
Quảng Bình
Quảng Trị
Thừa Thiên-Huế
Đà Nẵng
Quảng Nam
9 Quảng Ngãi
10 Bình Định
11 Phú Yên
97
1917
19.762887
81
1552
70
2437
829
70666
11.842857
28.997128
109
2929
935
69822
148
1252
8.4594595
166
883
228
3592
15.754386
223
2333
725
10519
14.508966
1112
9959
124
2547
20.540323
187
2838
405
12687
31.325926
417
10226
159
2222
13.974843
185
1930
9601
316394
9640
268741
700
16646
23.78
808
15276
1282
41358
32.26053
1134
40293
162
1172
7.2345679
157
2555
138
78
1952
2394
650
4889
1272
97154
79458
3771
35.427536
16.307692
49.771516
33.190476
5.8015385
148
79
2009
2785
537
4952
1171
52141
82229
6984
403
5553
13.779156
280
5769
609
27751
45.568144
628
19825
329
4192
12.741641
241
4693
19.16049
4
8.5779817
23.83817
5.319277
1
10.46188
3
8.9559353
15.17647
1
24.522782
10.43243
2
18.90594
1
35.53174
6
16.27388
5
33.459459
14.822785
25.953708
29.525673
13.005587
20.60357
1
31.56847
1
19.47302
9
81
714
8.8148148
111
3019
1264
75433
11.387387
24.986088
166
3188
19.204819
244
3001
12.29918
1031
13820
13.404462
214
2869
13.406542
23
238
10.347826
471
11706
24.853503
332
3195
10866
292413
26.910823
830
16022
19.303614
1325
39175
29.566038
167
148
80
2076
3135
2854
5039
1246
56599
90889
17.08982
34.047297
15.575
27.263487
28.991707
634
10616
16.744479
375
6270
16.72
696
22994
33.037356
12 Khánh Hòa
13 Ninh Thuận
14 Bình Thuận
Central highlands
1 Kon Tum
2 Gia Lai
3 Đắk Lắk
4 Đắk Nông
5 Lâm Đồng
South East
1 Bình Phước
2 Tây Ninh
3 Bình Dương
4 Đồng Nai
5 Bà Rịa - Vũng Tàu
6 TP. Hồ Chí Minh
Mekong river delta
1 Long An
2 Tiền Giang
3 Bến Tre
4 Trà Vinh
5 Vĩnh Long
6 Đồng Tháp
7 An Giang
8 Kiên Giang
9 Cần Thơ
724
30423
42.020718
651
28795
54
126
1853
183
111
450
565
847
1908
54774
2206
1163
14021
8976
15.685185
15.142857
53
130
1178
90
100
457
558
3500
45317
1539
1415
13278
544
28408
52.220588
531
29085
15381
97
84
761
759
549900
766
805
20824
19381
7.8969072
9.5833333
27.363995
25.534914
13720
109
77
527
607
447998
952
662
13409
19558
251
5171
20.601594
335
7808
13429
502953
37.452752
12065
405609
4239
84
215
178
216
572
103312
1295
3622
1506
5072
12563
15.416667
16.846512
8.4606742
23.481481
21.963287
5101
77
315
170
413
853
113450
1309
4940
1559
5179
12834
438
15400
35.159817
344
10785
384
8327
21.684896
482
8360
331
2766
8.3564955
384
3226
1523
47008
30.865397
1662
57411
12.054645
10.477477
31.157778
15.886726
44.23195
1
10.528302
26.923077
17.1
14.15
29.054705
54.77401
1
8.733945
8.5974026
25.444023
32.220758
23.30746
3
33.61864
9
17
15.68254
9.1705882
12.539952
15.045721
31.35174
4
17.34439
8
8.401041
7
34.54332
370
6287
16.991892
852
53
125
1271
190
103
491
30733
446
3243
49400
2984
1570
15761
36.071596
8.4150943
487
29085
59.722793
15318
105
77
883
684
485285
879
904
15529
25987
8.3714286
11.74026
17.586636
37.99269
304
7684
25.276316
13265
434302
32.740445
5273
161
325
166
472
469
123067
3762
5879
1803
5535
14212
23.36646
18.089231
10.861446
11.726695
30.302772
412
12321
29.90534
514
10767
20.947471
380
4221
11.107895
1816
53766
29.606828
38.867034
15.705263
15.242718
32.099796
10 Hậu Giang
11 Sóc Trăng
12 Bạc Liêu
13 Cà Mau
43
797
18.534884
48
1326
105
2097
19.971429
156
2784
101
2083
20.623762
101
2557
49
776
15.836735
96
1180
1
27.625
17.84615
4
25.31683
2
12.29166
7
126
3625
28.769841
171
2989
17.479532
170
2546
14.976471
91
1641
18.032967