VIETNAM NATIONAL UNIVERSITY, HA NOI
UNIVERSITY OF LANGUAGES & INTERNATIONAL STUDIES
FACULTY OF POST-GRADUATE STUDIES

VŨ THANH HÒA

AN EVALUATIVE STUDY ON
THE CURRENT FINAL ACHIEVEMENT TESTS FOR NON-ENGLISH
MAJORS AT QUANG NINH TEACHER TRAINING COLLEGE
(EVALUATING THE END-OF-TERM TESTS FOR
NON-LANGUAGE MAJORS AT QUANG NINH TEACHER TRAINING COLLEGE)

M.A. MINOR PROGRAMME THESIS

Field: English Teaching Methodology
Code: 60 140 111

Supervisor: Đỗ Thị Thanh Hà, Ph.D

HANOI – 2016


CANDIDATE’S STATEMENT
-----------***-----------

I, Vu Thanh Hoa, hereby certify that this minor thesis entitled

AN EVALUATIVE STUDY ON THE CURRENT FINAL ACHIEVEMENT
TESTS FOR NON-ENGLISH MAJORS AT QUANG NINH TEACHER
TRAINING COLLEGE

is entirely the result of my own work for the Degree of Master at the University of
Languages and International Studies, Vietnam National University, Hanoi, and that
this thesis has not been submitted for any degree at any other university or
institution.

ACKNOWLEDGEMENTS

I would like to express my gratitude to all those who have given me the opportunity
and motivation to complete this thesis.
First and foremost, I am deeply indebted to my supervisor, Dr. Do Thi Thanh Ha,
for her wholehearted guidance, invaluable suggestions, thoughtful comments and
useful materials during the writing of this thesis.
Second, I would like to acknowledge my debt of gratitude to the staff members
of the Faculty of Post-Graduate Studies and the lecturers at the University of Languages
and International Studies, Vietnam National University, Hanoi, for their valuable
lectures, which laid the foundation for this thesis, and for their knowledge and
sympathy.
Third, special thanks go to the teachers and the non-English majors at
Quang Ninh Teacher Training College who took part in the research. Without their
participation and cooperation, I would not have been able to complete this research paper.
Fourth, I am grateful to the librarians at ULIS for their constant help, thanks
to which I was able to access all the materials needed to complete the thesis.
Last but not least, I would like to express my appreciation to my family and
my friends, who have continuously given me support and
encouragement in completing this challenging work.

Hanoi, 2016

ABSTRACT
The study was intended to evaluate the current final achievement tests
for non-English majors at QNTTC from the perspectives of the teachers and
non-English majors at QNTTC. In addition, the study investigated how the current
final achievement tests for non-English majors at QNTTC aligned with the CEFR.
The study was carried out by means of two sets of survey questionnaires and an
analysis of the current final achievement tests at QNTTC against the CEFR, using
several software tools.
From the students' perspective, some items in these tests, such as those on phonetics,
vocabulary and grammatical structures, were too difficult for them.
The teachers found that the current final achievement tests at QNTTC were not
very reasonable because they omitted two skills, speaking and listening, and because the
writing section did not test some useful skills such as writing letters, writing cards and
creating stories.
The analysis of the alignment between the current final achievement tests at
QNTTC and the CEFR showed that most of the vocabulary and test items in these
tests ranged from level A2 to level B1, and that some of them were at level A2
and thus did not reach the target level, B1. Moreover, the current final achievement
tests at QNTTC differed from an international test (PET) in terms of length and
constructs.
The study will hopefully contribute to test making at QNTTC by providing an
example of an evaluation of the current final achievement tests for non-English
majors at QNTTC and of the alignment between these tests and the CEFR.

TABLE OF CONTENTS
ABSTRACT
LIST OF ABBREVIATIONS
LIST OF TABLES
CHAPTER 1. INTRODUCTION
1.1 Rationale of the study
1.2 Aims of the study
1.3 Research questions
1.4 Scope of the study
1.5 Significance of the study
1.6 Methodology
1.7 Outline of the thesis
CHAPTER 2. LITERATURE REVIEW
2.1 Basic concepts of testing/language testing
2.2 The role of testing in teaching and learning
2.3 Types of tests according to test purpose
2.3.1 Diagnostic tests
2.3.2 Placement tests
2.3.3 Proficiency tests
2.3.4 Achievement tests
2.4 Criteria of a good test
2.4.1 Validity
2.4.2 Reliability
2.4.3 Practicality
2.4.4 Discrimination
2.4.4.1 Item difficulty
2.4.4.2 Item discrimination
2.5 The CEFR
2.5.1 What is the CEFR?
2.5.2 Levels of the CEFR
2.6 Target level for the non-English majors
2.7 Review of related studies
2.8 Summary of Chapter 2
CHAPTER 3. METHODOLOGY
3.1 Setting of the study
3.1.1 English teaching and learning of non-English majors at QNTTC
3.1.2 Brief description of the materials used for non-English majors at QNTTC
3.1.3 The testing practice at QNTTC
3.2 Informants
3.3 Data collection instruments
3.4 The alignment framework
3.5 Data collection and data analysis procedure
3.6 Summary of Chapter 3
CHAPTER 4. FINDINGS AND DISCUSSION
4.1 The current tests at QNTTC
4.1.1 Students’ comments on the existing tests
4.1.2 Students’ opinions towards the improvement of the tests
4.1.3 Teachers’ comments on the existing tests
4.1.4 Teachers’ opinions towards the improvement of the tests
4.2 The alignment between the current tests at QNTTC and the tests according to the CEFR
4.2.1 In terms of their constructs
4.2.2 In terms of contents
4.3 Summary of Chapter 4
CHAPTER 5. CONCLUSION
5.1 Summary of the study
5.2 Concluding remarks
5.3 Limitations and suggestions for further study
REFERENCES
APPENDIXES
APPENDIX 1: KEY POINTS FOR THIRD SEMESTER
APPENDIX 2: QUESTIONNAIRE FOR NON-ENGLISH MAJORS AT QNTTC
APPENDIX 2 (VIETNAMESE VERSION): SURVEY QUESTIONNAIRE FOR NON-ENGLISH MAJORS AT QUANG NINH TEACHER TRAINING COLLEGE
APPENDIX 3: QUESTIONNAIRE FOR TEACHERS AT QNTTC
APPENDIX 4: FINAL ACHIEVEMENT TEST FOR NON-ENGLISH MAJORS
APPENDIX 5: FINAL ACHIEVEMENT TEST FOR NON-ENGLISH MAJORS


LIST OF ABBREVIATIONS
1. QNTTC: Quang Ninh Teacher Training College
2. TTI: Teacher training institution
3. CEFR: The Common European Framework of Reference for Languages: Learning, Teaching, Assessment
4. TOEFL: Test of English as a Foreign Language
5. IELTS: International English Language Testing System
6. TOEIC: Test of English for International Communication
7. KNLNN: Khung năng lực ngoại ngữ (Foreign Language Competency Framework)
8. RMM: Pearson Reading Maturity Metric


LIST OF TABLES
Table 2.1: Common European Framework of Reference (CEFR)
Table 4.1: Analysis of the degree of difficulty of the test items in phonetics in Test number 1
Table 4.2: Analysis of the degree of difficulty of the test items in phonetics in Test number 2
Table 4.3: Analysis of the degree of difficulty of the test items in grammar and vocabulary in Test number 1
Table 4.4: Analysis of the degree of difficulty of the test items in grammar and vocabulary in Test number 2
Table 4.5: Analysis of the degree of difficulty of the reading texts in Test number 1
Table 4.6: Analysis of the degree of difficulty of the reading texts in Test number 2
Table 4.7: The analysis of question items of the reading texts of Tests number 1 and 2
Table 4.8: The comparison between Tests number 1 and 2
Table 4.9: Comparison of the length between the current tests at QNTTC and the reading – writing tests of PET


LIST OF FIGURES
Figure 4.1: Students’ accomplishment of a test
Figure 4.2: Students’ difficulty/difficulties when doing the test
Figure 4.3: Students’ interests in test items
Figure 4.4: Students’ comments and suggestions
Figure 4.5: Teachers’ making the English achievement test
Figure 4.6: Teachers’ attitudes toward the current test
Figure 4.7: Teachers’ comments on the current test
Figure 4.8: Changes?


CHAPTER 1. INTRODUCTION
1.1 Rationale of the study
Nowadays, English has become increasingly important as a means of global
communication. In the process of global integration, Vietnam has recognized the
importance of English language learning and teaching; thus, English has been
widely used in many fields and has become a compulsory subject at many
schools and universities.
Quang Ninh Teacher Training College (QNTTC) was established in 1959 and is now
considered the oldest institution providing undergraduate teacher education
in Quang Ninh. In 1991, the organization was restructured from four provincial
teacher training institutions (TTI): Quang Ninh Early Childhood TTI, Quang Ninh
Primary TTI, Quang Ninh Education Management TTI and Quang Ninh Lower
Secondary TTI. Aware of the importance of English, the college
authorities have paid due attention to improving the quality of English teaching
and learning.
In teaching and learning in general, and in foreign language teaching and
learning in particular, testing and assessment play a significant role. The
importance of language testing is recognised by virtually all professionals in
language education. Teachers' work should not be confined to teaching alone,
without testing. Testing is of special importance in a highly competitive
educational system, as it is not only an indirect stimulus to learning but also plays a
crucial role in determining an individual’s success or failure, with
direct implications for his or her future career. In the World Yearbook of Education 1969,
Lauwerys and Seaton state: “Thus, testing is an important tool in educational
research and for programme evaluation, and may even throw light on both the
nature of language proficiency and language learning.”

Nga (1997:1) shares the same idea: “Tests are assumed to be powerful determiners
of what happens in classrooms and it is commonly claimed that they affect
teaching and learning activities both directly and indirectly”.
It cannot be denied that testing is an important part of the teaching and learning
process, but has it been given enough attention? Having taught English to
students at a high school and then at QNTTC for 5 years, the author of this study
has designed tests for both English majors and non-English majors. She has also
administered and marked these tests. Her teaching experience shows that some
problems remain to be solved, such as the test content, the gap
between what is tested and what is taught, and the reuse of tests from year to year and
from class to class. As a result, tests may lack validity and reliability. Hughes
(1990:1) also comments on recent language testing: “It cannot be
denied that a great deal of language testing is of very poor quality. Too often
language tests have a harmful effect on teaching and learning and too often they
fail to measure accurately whatever it is they are intended to measure”. Moreover,
teachers frequently lack formal training in educational measurement techniques
and tend to be alienated from the testing process.
A well-designed test is necessary for all language learners, whatever their
level. In view of the problems mentioned above, achievement tests for the
non-English majors at QNTTC should be designed to
ensure accuracy and fairness for all students, so that the tests can produce good
backwash on the teachers’ teaching and give students satisfaction and
encouragement in their studies. These reasons encouraged me to conduct the study
“An evaluative study on the current final achievement tests for non-English
majors at Quang Ninh Teacher Training College”.
1.2 Aims of the study
The study aims at evaluating the current final achievement tests at QNTTC. To
achieve this aim, the following objectives are established:
1. To evaluate the current final achievement tests for non-English majors from
the perspectives of the teachers and non-English majors at QNTTC.
2. To investigate the alignment of the current final achievement tests at
QNTTC with the Common European Framework of Reference for
Languages: Learning, Teaching, Assessment (CEFR).

1.3 Research questions
In order to achieve the above aims of the study, the following questions will be
addressed:
1. How do English teachers and non-English majors at QNTTC evaluate the
current final achievement tests for non-English majors at QNTTC?
2. How do the current tests align with the CEFR?
1.4 Scope of the study
As the title “An evaluative study on the current final achievement tests for
non-English majors at Quang Ninh Teacher Training College” suggests, this study is
limited to the following issues:
- This study aims only to evaluate the existing testing situation at QNTTC
from the viewpoint of two groups of stakeholders, the teachers and the students.
- This study is limited to evaluating the final achievement tests for non-English
majors.
- This study focuses on evaluating the constructs of the final achievement tests at
QNTTC and of the tests based on the CEFR (PET).
- This study is a detailed survey at QNTTC. Therefore, the findings of the study are
not intended to be generalized to other school contexts. Indeed, the findings may not
apply beyond the actual participants in this particular study.
1.5 Significance of the study
The findings of the thesis serve as support for the improvement of the tests for
non-English majors at QNTTC. Practically, the findings benefit both
teachers and learners at QNTTC through the experience of reflection. It is also hoped
that the thesis will contribute to the development of testing
at QNTTC in general and of testing for non-English majors at
QNTTC in particular.
1.6 Methodology
The above aims are to be achieved by means of:
(1) A survey questionnaire administered to 30 non-English majors at QNTTC to
gather their comments on and evaluation of the existing final achievement tests
for non-English majors, as well as their suggestions for
improving the testing situation and the language tests at QNTTC.
(2) A survey questionnaire administered to 10 teachers of the English Faculty of
QNTTC to gather their comments on the existing final achievement tests
for non-English majors and their suggestions for improving the situation.
(3) An analysis of the contents and constructs of the current final achievement
tests at QNTTC to determine the alignment of these tests with the CEFR.
Besides the surveys and the analysis, further information and data needed for the study
were gathered by other methods such as formal and informal discussions with
students and teachers, as well as critical reading. Moreover, the study employed a
combination of qualitative and quantitative methods, including cross-tabulation
and statistical analysis of the results of the survey questionnaires and
an analysis of the degree of difficulty of the current final achievement tests at QNTTC
in accordance with the CEFR.
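To illustrate what cross-tabulating questionnaire results involves, the following is a minimal sketch in Python. It is not the procedure or software actually used in this study. The function names, the field names ("group", "difficulty") and the five sample responses are hypothetical and serve only to show how answers can be counted by respondent group and converted into row percentages.

```python
from collections import Counter


def cross_tabulate(rows, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter((row[row_key], row[col_key]) for row in rows)
    row_labels = sorted({row[row_key] for row in rows})
    col_labels = sorted({row[col_key] for row in rows})
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


def row_percentages(table):
    """Convert the counts in each row into percentages of that row's total."""
    result = {}
    for label, cells in table.items():
        total = sum(cells.values())
        result[label] = {
            c: round(100 * n / total, 1) if total else 0.0
            for c, n in cells.items()
        }
    return result


# Hypothetical questionnaire answers (illustrative only, not the study's data).
responses = [
    {"group": "student", "difficulty": "too difficult"},
    {"group": "student", "difficulty": "too difficult"},
    {"group": "student", "difficulty": "appropriate"},
    {"group": "teacher", "difficulty": "appropriate"},
    {"group": "teacher", "difficulty": "too difficult"},
]

table = cross_tabulate(responses, "group", "difficulty")
shares = row_percentages(table)
print(table)
print(shares)
```

Each row of the resulting table corresponds to one respondent group and each column to one answer option, so the opinions of students and teachers on the same question can be compared side by side.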
1.7 Outline of the thesis
The thesis is divided into five chapters:
- Chapter 1: Introduction. This chapter provides the author’s reasons for choosing
the topic, together with the aims, research questions, scope, significance, methodology and outline
of the study.
- Chapter 2: Literature review. This chapter, the most theoretical one, looks at
the background knowledge on language testing, such as the basic concepts of
language testing, the role of testing, types of tests according to test purpose and the
criteria of a good test, as well as the CEFR and the target level for non-English
majors.
- Chapter 3: Methodology. This chapter discusses the methodology and presents a
detailed analysis of the setting, including English teaching and learning at QNTTC,
a brief description of the materials used for non-English majors and the current
testing situation at QNTTC; the informants; the data collection instruments; and the data
collection and data analysis procedure.
- Chapter 4: Findings and discussion. This chapter discusses the major findings of the thesis,
covering the actual English teaching and learning context, the
current tests at QNTTC and the alignment between these tests and the tests
according to the CEFR.
- Chapter 5: Conclusion. This chapter reviews the study and
suggests further research.

CHAPTER 2. LITERATURE REVIEW
2.1 Basic concepts of testing/language testing
The importance of language testing cannot be denied and is recognized by all
professionals. Language tests are considered valuable tools for providing
information about language teaching. They provide evidence of the results
of learning and instruction and of the effectiveness of teaching, as well as information for
both teachers and students to make decisions.
For these reasons, testing should be part of language teaching and one of the main
aspects of methodology. Many definitions of testing from different points of view
have been given.
According to Allen (1974:313), a test is a measuring device which we use when we
want to compare an individual with other individuals who belong to the same group.
Carroll (1968:40) defines it as follows: “A psychological and educational test is a procedure
designed to elicit certain behavior from which one can make inferences about
certain characteristics of an individual”. Brown (1971:8) takes a different
view, defining a test as “a systematic procedure for measuring an individual’s
behavior”. Penny Ur (1996:33) provides the following definition of a test: “Test is
an activity whose main purpose is to convey (usually to the tester) how well the
testees know or can do something”. Moore (1992:138) proposes: “evaluation is an
essential tool for teachers because it gives them feedback concerning what the
students have learned and indicates what should be done next in the learning
process. Evaluation helps us to understand students better, their abilities, interests,
attitudes, and needs in order to better teach and motivate them”. However, Brown
(1994a:373) stresses that tests are seen by learners as dark clouds hanging over their
heads, upsetting them with thunderous anxiety as they anticipate the lightning bolts
of questions they do not know and, worst of all, a flood of disappointment if they do not
make the grade. Read (1983:3) describes a language test as a sample of
linguistic performance or a demonstration of language proficiency. Nga (1999:2)
also states that “Test most commonly refers to a set of items or questions designed
to be presented to one or more students under specified conditions”. Broughton
(1990:1) thinks the word “test” is much more complicated, with at least three quite
distinct meanings. The first meaning refers to a carefully prepared measuring
instrument. The second refers to what is usually “a short quick teacher-devised
activity” carried out in the classroom and used by the teacher as the basis of an
ongoing assessment. Assessment is the process of documenting knowledge, skills,
attitudes and beliefs, usually in measurable terms. The goal of assessment is to
make improvements, as opposed to simply passing judgment. In an educational context,
assessment is the process of describing, collecting, recording, scoring, and
interpreting information about learning. It may include a test, but also includes
methods such as observations, interviews, behavior monitoring, etc. The last meaning is
that “of an item within a larger test, part of a test battery, or even sometimes what is
often called a question in an examination”. Harrison (1983a:1) notes that a test should be
seen as a natural extension of classroom work, providing teachers and students with useful
information that can serve as a basis for improvement, rather than as a necessary but
unpleasant imposition from outside the classroom. This means that a test is a useful tool
for measuring learners’ ability in a given situation, especially in the classroom.
In short, testing is an effective means of measuring and assessing students’
language knowledge and skills. The term “testing” is defined
differently by different test researchers. It can be understood narrowly as the use of means requiring
students to respond to questions or tests that are designed to focus on a particular
aspect of learning, or perceived rather broadly as a process of assessment
consisting of different stages such as preparation, data collection and evaluation.
2.2 The role of testing in teaching and learning
In the past, testing and teaching tended to be treated separately. Many applied
linguistics researchers and professional test designers, however, share the idea that language testing
plays a decisive part in language teaching in general and language learning in
particular.
Heaton (1988:5) states that “teaching and testing in some ways are so interwoven
and interdependent that it is very difficult to tease apart. Both testing and teaching are
so closely interrelated that it is virtually impossible to work in either field without
being constantly concerned with the other”.
Heaton (1988:5) also emphasizes that tests may be constructed primarily as
devices to reinforce learning and motivate the students or as a means of assessing
the students’ performance in the language. In the former case, testing is geared to
the teaching, whereas in the latter case, teaching is often geared largely to the
testing.
However, testing has both good and bad effects on teaching. Hughes (1989:1)
shares this point of view: “Backwash can be harmful or beneficial”. He states that
if the content of the test is in accordance with the content and teaching method
of the course being followed, the test can have a beneficial effect on the teaching
process. Otherwise, it is likely to have a harmful effect.
In short, testing and teaching activities cannot be separated from each other,
from the programme or from the objectives of the course. Testing may influence
teaching in either good or bad ways.
2.3 Types of tests according to test purpose
Language tests are developed for many purposes, and so there are many
types of language tests. Since language tests have different purposes and the
information obtained from tests is used for different types of decisions, this section
gives a brief description of some types of tests according to test purpose.
2.3.1 Diagnostic tests
Hughes (1990:13) states: “Diagnostic tests are used to identify students’ strengths
and weaknesses. They are intended primarily to ascertain what further teaching is
necessary”.
Brown, H.D. (1994b:112) shares this point of view, noting that diagnostic tests
focus on the strengths and weaknesses of each individual with respect to the instructional
objectives, for the purpose of correcting deficiencies “before it is too late”.
In addition, Brown (1994b:259) comments on this type of test as
follows: “A diagnostic test is designed to diagnose a particular aspect of a particular
language.” Moreover, Harrison (1983b) states that this kind of test is used,
for example, at the end of a unit in the course-book or after a lesson designed to teach
one particular point.
From these definitions, it is clear that the main purpose of diagnostic tests is to
identify test-takers’ strengths and weaknesses in the language, to explain
the problems found, and to indicate what treatment can be assigned to foster
achievement by promoting strengths and eliminating weaknesses.
2.3.2 Placement tests
According to Hughes (1990:14): “placement tests are intended to provide
information which will help to place students at the stage of the teaching program
most appropriate to their abilities. Typically, they are used to assign students to
classes at different levels.” In other words, they are used to assign students to classes
according to their abilities so that they can start a course at approximately the same
level as the other students in the class. As a rule, the results of placement tests
are needed quickly so that teaching may begin (Harrison, 1983b:4).
2.3.3 Proficiency tests
According to Brown (1995), proficiency tests are originated from the hope to
determine how much of a given language their students have learned and retained,
which focus on overall language ability without reference to any particular
9


programme (and its objectives, teaching and materials). Likewise, a proficiency
test looks to the future situation of language use without necessarily any reference
to the previous process of teaching (McNamara 2000:7).
Hughes (1990:9) states that “Proficiency tests are designed to measure people’s
ability in language regardless of any training they may have had in that language.”
That is to say the content of a proficiency test is not based on the content or
objectives of any language course test takers may have followed. It is rather based
on a specification of what they have to be able to do in the language to meet the
requirement of their future aims.
Other test specialists, such as Carroll and Hall (1985), Harrison (1983a) and
Henning (1987), share the view that proficiency tests help both teachers and
learners know whether the learners are able to follow a particular course or
have to take some pre-departure training. Popular tests such as TOEFL and
IELTS are used to test students’ proficiency for study in English-speaking
countries. In Vietnam, proficiency tests were formerly set at levels A, B and C
and are now set at levels A1, A2, B1, B2, C1 and C2 according to the CEFR or
Vietnam’s English competence framework.
2.3.4 Achievement tests
According to Hughes (1990:10), “in contrast to proficiency tests, achievement tests
are directly related to language courses, their purpose being to establish how
successful individual students, groups of students, or the courses themselves have
been in achieving objectives”. Achievement tests are commonly used at schools of
all levels and are of great importance in evaluating the language knowledge and
skills students have acquired during the English teaching and learning process.
McNamara (2000:6) states that “achievement tests are associated with the process
of instruction. Achievement tests accumulate evidence during, or at the end of a
course of study in order to see whether and where progress has been made in terms
of the goals of learning. Achievement tests should support the teaching to which
they relate. An achievement test may be self-enclosed in the sense that it may not
bear any direct relationship to language use in the world outside the classroom (it
may focus on knowledge of particular points of grammar or vocabulary, for
example).” Brown (1994b:259) shares McNamara’s viewpoint, “an achievement
test is related directly to classroom lessons, units or even a total curriculum.”
Achievement tests are divided into two basic types according to the time of
administration, namely progress achievement tests and final achievement tests.
(1) Progress achievement tests
Progress achievement tests (criterion-referenced or objective-referenced), as the
name suggests, are intended to measure the progress that learners are making.
Since “progress” means progress in achieving course objectives, these tests should
be related to those objectives. They should make a clear progression towards the
final achievement test based on course objectives. They are usually carried out to
measure the extent to which students have mastered what has been taught in the
classroom. Thanks to the results of these tests, teachers are able to find out and
diagnose areas not properly mastered by students during the course, which need
remedial action. Moreover, these tests give students a good chance to practise and
perform the target language they have learnt in a positive and effective manner
with confidence. They are also considered a preparatory step that makes students
familiar with the test.
(2) Final achievement tests
Final achievement tests are given at the end of the course. They may be written or
administered by ministries of education, official examining boards, or by members
of teaching institutions. Clearly, the content of these tests must be related to the
courses with which they are concerned, but the nature of this relationship is still a
matter of disagreement amongst language testers. These tests give teachers a good
chance to judge the success of their teaching and to identify students’ weaknesses.
Hughes (1990:10) divides them into two kinds depending on the approach used. The
syllabus-content test is one whose content is based directly on a detailed course
syllabus or on the books and other materials used, whereas the syllabus-objective
test is based on course objectives, so it is well suited to measuring students’
ability to meet those objectives. However, it may work against the teaching because
this approach copes with testing problems rather than what students have achieved.
2.4 Criteria of a good test
As mentioned before, testing may have good or bad effects on teaching, so before
making tests, test designers often ask themselves these questions: How do we
design a test that can test all the language skills? Who is it for? Is it suitable for all
of them? What is it meant to test? How do we know that it is a good one? Does this
test reach the target level?
In order to construct a good test, teachers have to take into consideration
various factors such as the purpose of the test, the course content and, above all,
students’ background. In addition to these factors, good tests must possess four
characteristics, namely validity, reliability, practicality and discrimination.
According to a number of leading scholars in testing, such as Valette (1977),
Harrison (1983a), Carroll and Hall (1985), Henning (1987), Heaton (1988),
Hughes (1990) and Brown (1994a), all good tests possess these four
characteristics. These characteristics will be critically reviewed below.
2.4.1 Validity
Validity is certainly the most important single characteristic of a test. If not valid,
even a reliable test is not worth much. Carmen (1995) states that “a test is
valid if it measures what you want to measure”. Hughes (1989) shares the
same idea: “a test is said to have validity if it measures accurately what it is
intended to measure”. According to Aik (1983:2), “a test is said to
be valid if it is relevant to the aims and purposes of the areas of learning on which
it is set”. In this sense, validity of the test and purposes of the course syllabus are
closely related.
There are different kinds of validity, such as face validity, content validity,
criterion-related validity, construct validity, empirical validity and predictive
validity, but among them content validity, face validity and criterion-related
validity are the most important.
Content validity refers to the correspondence between the content of the test and
the content of the materials to be tested. Of course, a test cannot include all the
elements of the content to be tested. Nevertheless, the content of the test should be
a reasonable and representative sample of the total content to be tested. In Read’s
opinion (1983:6), the most relevant type of validity for classroom testing is content
validity, which means that the content of the test should reflect the content and
objectives of the syllabus that is being followed. Anastasi (1982:131)
defines content validity as “essentially the systematic examination of the test
content to determine whether it covers a representative sample of the behavior
domain to be measured.” She offers a set of useful guidelines for establishing
content validity:
- The behavior domain to be tested must be systematically analyzed to make
certain that major aspects are covered by the test items in the correct proportions;
- The domain under consideration should be fully described in advance, rather than
being defined after the test has been prepared;
- Content validity depends on the relevance of the individual’s test responses to
the behavior area under consideration, rather than on the apparent relevance of
item content.
From the above concepts, it is obvious that the content of a test is the main concern
in achieving its content validity.
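The first guideline, that major aspects be covered in the correct proportions, can be checked mechanically once a test specification exists. The following is only a minimal sketch of such a check. The syllabus areas, their target weights, the item labels and the tolerance of 0.10 are all hypothetical; none of them comes from Anastasi or from the tests evaluated in this study.

```python
from collections import Counter


def coverage_gaps(target_weights, item_areas, tolerance=0.10):
    """Compare the share of test items per content area with target shares.

    target_weights: dict mapping area name -> intended share (0.0 to 1.0).
    item_areas: list with one area label per test item.
    Returns {area: (target, actual)} for every area whose actual share
    differs from its target by more than `tolerance`.
    """
    counts = Counter(item_areas)
    total = len(item_areas)
    gaps = {}
    # Look at areas named in the specification and areas found in the test.
    for area in sorted(set(target_weights) | set(counts)):
        target = target_weights.get(area, 0.0)
        actual = counts.get(area, 0) / total if total else 0.0
        if abs(actual - target) > tolerance:
            gaps[area] = (target, round(actual, 2))
    return gaps


# Hypothetical specification and a hypothetical 20-item test.
syllabus = {"grammar": 0.30, "vocabulary": 0.20, "reading": 0.30, "writing": 0.20}
items = ["grammar"] * 10 + ["vocabulary"] * 4 + ["reading"] * 6

for area, (target, actual) in coverage_gaps(syllabus, items).items():
    print(f"{area}: target {target:.0%}, actual {actual:.0%}")
```

In this invented example grammar is over-represented and writing is not tested at all, which is the kind of imbalance a content validity review is meant to reveal. A count of items is only a rough indicator; it does not replace expert judgment about what each item actually requires of the test-taker.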
By contrast, face validity refers to the extent to which the physical appearance of
the test corresponds to what it is claimed to measure. Anastasi (1982:136) points out
that face validity is not validity in the technical sense; it refers, not to what the test
actually measures, but to what it appears to measure to those who take it, the
administrative personnel who decide on its use and other technically untrained
observers. Face validity is supported by the judgment that a test is appealing to
laymen (students, administrators, etc.). Hughes (1990), in “Testing for Language
Teachers”, states: “a test is said to have face validity if it looks as if it measures
what it is supposed to measure”. In other words, tests should be based on the course
content and methodological teaching approaches.
Criterion-related validity refers to the correspondence between the results of the
test in question and the results obtained from an outside criterion. The outside
criterion is usually a measurement device for which the validity is already
established. In contrast to face validity and content validity, which are determined
subjectively, criterion-related validity is established quite objectively.
In short, validity is a “must” for testers to take into consideration when they
construct a language test.
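The correspondence described for criterion-related validity is commonly expressed as a correlation coefficient between scores on the test in question and scores on the outside criterion. The source does not name a statistic, so the use of Pearson's r here is an assumption, and the scores below are invented purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five students: the test under study (0-10 scale)
# and an already validated outside measure (0-100 scale).
test_scores = [5, 6, 7, 8, 9]
criterion_scores = [50, 62, 68, 81, 90]

r = pearson_r(test_scores, criterion_scores)
print(f"criterion-related validity coefficient: r = {r:.2f}")
```

A coefficient close to 1 would suggest that the test ranks students in much the same way as the established measure. With so few students the figure is unstable, so a real study would need a larger sample.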
2.4.2 Reliability
Reliability is one of the most important characteristics of all tests in general, and
language tests in particular. In fact, an unreliable test is worth nothing. It is of
primary importance in the use of both public achievement and proficiency tests and
classroom tests. An appreciation of the various factors affecting reliability is
important for teachers at the very outset, since many teachers tend to regard
tests as infallible measuring instruments and fail to realize that even the best test is
indeed a somewhat imprecise instrument with which to measure skills.
Two things need to be considered about reliability: the consistency of candidates’
performance and the consistency of scoring. The former is affected by several factors
such as the number of questions, test administration and test instructions.
Moore (1992:110) states that “reliability refers to the consistency with which
a measurement device measures some target behavior or trait. To put it another