DESIGNING AN ACHIEVEMENT TEST FOR FIRST-YEAR NON-ENGLISH MAJOR STUDENTS AT SON LA TEACHERS' TRAINING COLLEGE



CHAPTER 1: INTRODUCTION
1.1 RATIONALE
The importance of language testing is recognized by virtually all professionals in the field of language education. It is of special importance in an educational system that is highly competitive, as testing is not only an indirect stimulus to learning but also plays a crucial role in determining the success or failure of an individual's career, with direct implications for his or her future earning power. "Thus, testing is an important tool in educational research and for programme evaluation, and may even throw light on both the nature of language proficiency and language learning" (Lauwerys and Scanlon, 1969).

Likewise, testing plays a very important role in the process of teaching and learning a foreign language. Language testing is one of the most important ways to evaluate what students acquire when they learn a foreign language. Through tests teachers know not only the success or failure of learners but also how well the learners use what they have been taught. Moreover, the learners know what they gain, what they can apply, and what they cannot. Moore (1992, p.138) states: "Evaluation is an essential tool for teachers because it gives them feedback concerning what the students have learned and indicates what should be done next in the learning process. Evaluation helps you to better understand students, their abilities, interests, attitudes and needs in order to better teach and motivate them." Nga (1997, p.1) reaches the same conclusion: "Tests are assumed to be powerful determiners of what happens in classrooms and it is commonly claimed that they affect teaching and learning activities both directly and indirectly."

Therefore, testing is an important part of the teaching and learning process; but has it been
given adequate attention and careful study yet? Test researchers (Hughes, 1989; Brown,
1995; Read, 1982; Hai, 1999; Tuyet, 1999) in general claim that unfortunately tests have
got a bad rap in recent years and not without reason. More often than not, tests are seen by
learners “as dark clouds hanging over their heads, upsetting them with thunderous anxiety
as they anticipate the lightning bolts of questions they do not know and worst of all a flood


of disappointment if they do not make the grade” (Brown, 1994a: p.373). Hughes (1989,
p.1) makes another comment on recent language testing: “It cannot be denied that a great
deal of language testing is of very poor quality. Too often language tests have a harmful
effect on teaching and learning and too often they fail to measure accurately whatever it is they are intended to measure." This is coupled with the fact that teachers frequently lack formal training in educational measurement techniques and tend to be alienated from the testing process. They regard it as a necessary evil, an intrusion on their regular instructional activities.
At present, English tests at Son La Teachers’ Training College (STTC) have the following
characteristics:
- Testing has not been given appropriate attention and careful study.
- Its role in teaching and learning has not been fully recognized.
- Most language teachers think that teachers should be responsible for making tests because testing is one part of the teaching and learning activities that students have to pass.
- There has been a tendency to use commercial (ready-made) tests rather than teacher-made tests, since commercial tests are very convenient and do not take much time to prepare. As a result, these selected tests may not be relevant to the objectives of the course.
- Test content is sometimes found to be unrelated to the objectives of the course, and very often many test items in some tests have not been dealt with in class.
- Students have complained that there is still a big gap between what is taught and what is tested. An instance of this is when tests designed for the pre-intermediate level are given to students at the elementary level. They are so difficult that only a few students can complete them. Therefore, such tests are neither valid nor reliable.
- Tests are used exclusively for grading, and no feedback about the tests is given.
- There has been no discarding of bad tests or bad items. Some items are found to be so difficult that few testees can answer them, whereas other items are so easy that all testees can obtain the correct answers. Such items should be discarded or replaced.
- Moreover, because the writing and reading comprehension tests at the college are designed entirely with multiple-choice techniques, students can easily cheat by asking for and copying answers from their classmates.
- Apart from those carefully designed tests, some others are still of poor quality and do not accurately measure the students' real ability. Perhaps the test writer only pays attention to the fulfillment of his/her duty, which is to give tests, rather than to the effectiveness of the tests. Those tests often fail to measure accurately whatever they are intended to measure.
- Finally, some of the tests may lack reliability because, for the sake of confidentiality, they are not pre-tested anywhere. Indeed, in the name of "confidentiality", test designers are often asked to write tests at short notice, only shortly before the tests are administered. In such circumstances, who can say for sure that the required standards and criteria will be met by the test writers?
Therefore, a well-designed test is necessary for every language level, especially at college level, since this is the elementary level, which aims at acquiring survival English and at diagnosing students' aptitudes in the course and what they have to study to improve both their knowledge and skills. In this minor thesis, the author draws on the theory of testing and the local testing situation to propose a sample achievement test for first-year students who have been taught the student's book New Headway English Course (Elementary level) from Unit 1 to Unit 8.
1.2 SCOPE OF THE STUDY
The study focuses on the existing situation at Son La Teachers' Training College. A sample test is designed covering only reading and writing skills, with a focus on grammar, vocabulary, reading and writing. The study provides data collected and analyzed from the achievement test for first-year non-English major students. Moreover, the teachers' and students' comments on the test and their suggestions for its improvement will be presented in this thesis.

1.3 AIMS OF THE STUDY
The aim of the study is to report on research examining the current testing situation and language tests for non-English majors at STTC, with particular emphasis on analyzing the results of the sample test and the teachers' and students' comments on the test and their suggestions for its improvement. The specific aims of the study are:
1. To investigate the STTC teachers' and students' evaluation of the sample test concerning its content, time allowance and format.
2. To investigate the teachers' and students' suggestions for improving the testing situation and language tests at STTC.
3. To propose an achievement test construction for first-year students at STTC, on the basis of which a sample test will be designed.
4. To offer some practical recommendations for improving the testing situation at STTC.
1.4 METHODS OF THE STUDY
In order to achieve the above aims, a study has been carried out with the following approach. Based on the theory and principles of language testing and the major characteristics of a good test, especially achievement tests, the author analyzes the results of the sample test and of a survey questionnaire administered to 10 English teachers at STTC. Other methods, such as interviews, informal discussions with students and teachers, and classroom testing observation, are also employed to obtain further information.
1.5 RESEARCH QUESTIONS
The research questions of the study are as follows:
1. What should be done to improve the English testing situation for the first-year
students at STTC?
2. Which test components are considered appropriate for the English Achievement
test construction at STTC?

1.6 DESIGN OF THE STUDY
The minor thesis is organized into four chapters.
Chapter one is the introduction, consisting of the rationale, the aims, the method, the research questions and the design of the study.
Chapter two presents the literature review on the basic concepts of testing, types of tests, the characteristics of good tests, test items, and test item types for language components and language skills.
Chapter three, which is the main part of the study, presents the analysis of the findings from the test design and some brief comments from teachers and testees.
Chapter four offers some suggestions for improving the test and a summary of the research.



CHAPTER 2: LITERATURE REVIEW
2.1 BASIC CONCEPTS OF TESTING
According to Brown (1994: p.252), "A test, in plain or ordinary words, is a method of measuring a person's ability or knowledge in a given area." Moore (1992: p.138) proposes that evaluation is an essential tool for teachers because it gives them feedback concerning what the students have learned and indicates what should be done next in the learning process. Evaluation helps us to understand students better, their abilities, interests, attitudes, and needs in order to better teach and motivate them. However, Brown (1994, p.373) stresses that tests are seen by learners as dark clouds hanging over their heads, upsetting them with thunderous anxiety as they anticipate the lightning bolts of questions they do not know and, worst of all, a flood of disappointment if they do not make the grade. Read (1983, p.3) shares this idea, saying that a language test is a sample of linguistic performance or a demonstration of language proficiency. In other words, a test is not simply a set of items that can be objectively marked; it can also involve a subjective evaluation of spoken and written performance with the assistance of a checklist, a rating scale, or a set of performance criteria. Nga (1992, p.2) also confirms that tests commonly refer to a set of items or questions designed to be presented to one or more students under specified conditions. Harrison (1986, p.1) notes that a test may be a natural extension of classroom work, providing teachers and students with useful information that can serve as a basis for improvement, or a necessary but unpleasant imposition from outside the classroom. That means a test is a useful tool to measure learners' ability in a certain situation, especially in the classroom.
2.2 TYPES OF TESTS
2.2.1 Proficiency Tests
According to Hughes (1990:9), “Proficiency tests are designed to measure people’s ability
in a language regardless of any training they may have had in that language.” That is to say
the content of a proficiency test is not based on the content or objectives of any language course that test takers may have followed. It is rather based on a specification of what they have to be able to do in the language in order to meet the requirements of their future aims.
Other test specialists, such as Carroll and Hall (1985), Harrison (1986) and Henning (1987), share the view that a proficiency test helps both teachers and learners know whether the learners are able to follow a particular course or whether they need some pre-departure training. Other popular proficiency tests, such as TOEFL and IELTS, are used to assess students' proficiency for their study in English-speaking countries. In Vietnam proficiency tests are offered at different levels, namely A, B, and C, for workers, engineers, teachers, architects, etc.
2.2.2 Achievement Tests
As mentioned above, not many teachers are interested in proficiency tests since they are not based on any particular course book. Hughes (1990:10) states: "In contrast to proficiency tests, achievement tests are directly related to language courses, their purpose being to establish how successful individual students, groups of students, or the courses themselves have been in achieving objectives". Achievement tests are usually carried out after a course, on the group of learners who took the course. Sharing Hughes' view of achievement tests, Brown (1994:259) suggests: "An achievement test is related directly to classroom lessons, units or even total curriculum". Achievement tests, in his opinion, "are limited to a particular material covered in a curriculum within a particular time frame." Another useful comment on achievement tests, offered by Finocchiaro and Sako (1983:15), is that achievement or attainment tests are widely employed in language teaching institutions. They are used to measure the amount or degree of control of discrete language and cultural items and of integrated language skills acquired by the students within a specific period of instruction in a specific course. In his book, Harrison (1983:7) notes: "an achievement test looks back over a longer period of learning than the diagnostic test, for example, a year's work, or even a variety of different courses." He also points out that achievement tests are intended to show the standard which the students have reached in relation to other students at the same level.
There are two kinds of achievement tests: final achievement tests and progress achievement tests.



Final achievement tests are those administered at the end of a course of study. They may
be written and administered by ministries of education, official examining boards, or by
members of teaching institutions. Clearly, the content of these tests must be related to the
courses with which they are concerned, but the nature of this relationship is still a matter of
disagreement amongst language testers.
According to some testing experts, the content of a final achievement test should be based
directly on a detailed course syllabus or on the books and other material used. This has
been referred to as the syllabus-content approach. It has an obvious appeal, since the
test only contains what it is thought that the students have actually encountered, and thus
can be considered, in this respect at least, a fair test. The disadvantage of this type is that if
the syllabus is badly designed, or the books and other materials are badly chosen, then the

results of a test can be very misleading. Successful performance on the test may not truly
indicate successful achievement of course objectives.
The alternative approach is to base the test content directly on the objectives of the course, which has a number of advantages. Firstly, it compels course designers to be explicit about course objectives. Secondly, test takers show how far they have achieved those objectives. This in turn puts pressure on those who are responsible for the syllabus and for the selection of books and materials to ensure that these are consistent with the course objectives. Tests based on course objectives work against the perpetuation of poor teaching practice, something which course-content-based tests, almost as if part of a conspiracy, fail to do. It is the author's belief that test content based on course objectives is much preferable: it provides more accurate information about individual and group achievement, and is likely to promote a more beneficial backwash effect on teaching.
Progress achievement tests, as the name suggests, are intended to measure the progress that learners are making. Since 'progress' here means progress towards achieving course objectives, these tests should also be related to those objectives. They should mark a clear progression towards the final achievement test based on course objectives. Then, if the syllabus and teaching methods are appropriate to these objectives, progress tests based on short-term objectives will fit well with what has been taught. If not, there will be pressure to create a better fit. If it is the syllabus that is at fault, it is the tester's responsibility to make clear that it is there that change is needed, not in the tests.



In addition, while more formal achievement tests require careful preparation, teachers should feel free to devise their own ways of making a rough check on students' progress and of keeping learners on their toes. Since such informal tests will not form part of formal assessment procedures, their construction and scoring need not be geared so strictly towards the intermediate objectives on which the more formal progress achievement tests are based. However, they can reflect the particular 'route' that an individual teacher is taking towards the achievement of objectives.
2.2.3 Diagnostic Tests

According to Hughes (1990:13), “Diagnostic tests are used to identify students’ strengths
and weaknesses. They are intended primarily to ascertain what further teaching is
necessary”. Brown (1994:259) proposes, “A diagnostic test is designed to diagnose a
particular aspect of a particular language." Harrison (1983) remarks that this kind of test is used at the end of a unit in the course book or after a lesson designed to teach one particular point. Such a test makes it reasonably straightforward to find out which skills are applied well or badly by the learners. There is a disadvantage, however, as it is not so easy to obtain a detailed analysis of a learner's command of grammatical structures. In order to be sure of this, we would need a number of examples of the choice the student made between the two structures in every context which we thought was significantly different and important enough to warrant obtaining information about. Tests of this kind still need a tremendous amount of work to produce. Whether or not they become generally available will depend on the willingness of individuals to write them and of publishers to distribute them.
2.2.4 Placement tests
According to Hughes (1990:14), “Placement tests are intended to provide information
which will help to place students at the stage of the teaching programme most appropriate
to their abilities. Typically, they are used to assign students to classes at different levels.”
In other words, we use placement tests to place pupils into classes according to their ability
so that they can start a course approximately at the same level as the other students in the group.
2.2.5 Progress Tests
A progress test is designed to measure the extent to which the students have mastered the material taught in the classroom. It is based on the language programme which the students have been following and is just as much an assessment of the teacher's own work as of the students' own learning. Results obtained from progress tests enable the teacher to become more familiar with the work of each of the students and with the progress of the class in general. It also aims at stimulating learning and reinforcing what has been taught. Good performances may act as a means of encouraging the students, and even poor performances may act as an incentive to more work.
According to Baker (1989, p.103), the frequent use of the progress test, as a goad to encourage application on the part of the learners, can also in theory serve as a basis for decisions on course content, learner placement and future course design. He also concludes that the results of a progress test can be used as an indication of the parts of the course content which have not been mastered by numbers of students and thus need remedial action. Moreover, a properly written progress test, sampling correctly from the course content, can be a pointer for learners to which parts of the course need more attention, and for course designers to which parts of the course have not been effective. Meanwhile, Khoa (1999, p.13) states: "A progress test is an 'on-the-way' achievement test, which is linked to the specific content of a particular set of teaching materials or a particular course of instruction."
Progress tests are prepared by a teacher and given at the end of a chapter, a course, or a term. They may also be regarded as similar in nature to achievement tests but narrower and much more specific in scope. These tests help the teacher to judge the degree of his or her success in teaching and to identify the weaknesses of the learners. The use of progress tests is gaining ground in many universities and colleges in Vietnam nowadays. They are part of what is generally known as "continuous assessment", a process of assessment which takes into consideration the results scored by students when they did their progress tests.
2.2.6 Direct versus Indirect Tests
It is pointed out by Hughes (1990:15) that direct testing requires the candidate to perform precisely the skills that we wish to measure. If we want to know how well candidates can write compositions, we ask them to write compositions. If we want to know how well they pronounce words, we ask them to speak. The tasks, and the texts which are used, should be as authentic as possible. In fact, the tasks can never be fully authentic. Nevertheless, the effort is to make them as realistic as possible. Direct testing is easier to design when it is intended to measure the productive skills of speaking and writing, since the very acts of speaking and writing provide us with information about the candidate's ability. With listening and reading it is necessary to get candidates not only to listen or read but also to demonstrate that they have done this successfully. He also indicates several attractions of direct testing. Firstly, provided that teachers are clear about the abilities they want to assess, it is relatively straightforward to create the conditions which will elicit the behaviour on which to base judgements. Secondly, in his opinion, at least in the case of the productive skills, the assessment and interpretation of students' performance is quite straightforward. Thirdly, there is likely to be a helpful backwash effect, since practice for the test involves the practice of the skills that we want to encourage.
By contrast, indirect testing tries to measure the abilities that "underlie" the skills in which we are interested (Hughes, 1990:15). One section of the TOEFL is considered an indirect measure of writing ability: the candidate has to identify which of the underlined elements is erroneous or inappropriate in formal Standard English. Another example of indirect testing is Lado's (1961) proposed method of testing pronunciation ability by a paper-and-pencil test in which the candidate has to identify pairs of words which rhyme with each other. The main problem with indirect tests is that the relationship between performance on them and performance of the skills in which we are usually interested tends to be rather weak in strength and uncertain in nature. We do not know enough about the component parts of composition writing to predict composition writing ability accurately from scores on tests that measure the abilities which we believe underlie it. We may construct tests of grammar, vocabulary, discourse markers, handwriting, and punctuation. Still, we will not be able to predict accurately scores on compositions, even if we make sure of the representativeness of the composition scores by taking many samples.
2.2.7 Discrete Point versus Integrative Testing
According to Hughes (1990:16), "Discrete point testing refers to the testing of one element at a time, item by item", which means the test involves a series of items and each item tests a particular grammatical structure. On the contrary, integrative testing requires the candidate to combine many language elements in the completion of a task, such as writing a composition, taking notes while listening to a lecture, taking a dictation, or completing a cloze passage. Henning (1987) shares with Hughes the idea that discrete point tests will usually be indirect, while integrative tests will tend to be direct. However, some integrative testing methods, such as the cloze procedure, are indirect. Similarly, he stresses that the distinction between discrete point and integrative tests was originated by Carroll (1961). Discrete point tests are designed to measure knowledge or performance in very restricted areas of the target language. On the other hand, integrative tests are said to tap a greater variety of language abilities. Moreover, Henning (1987) offers examples of integrative tests such as random cloze, dictation, oral interview, and oral imitation tasks.
2.2.8 Norm-Referenced versus Criterion-Referenced Testing
Imagine that a reading test is administered to an individual student. When teachers use questions to see how the students performed on the test, they may be given two kinds of answers. The first kind would be that the student obtained a score that placed her or him in the top ten per cent of candidates who have taken that test, or in the bottom five per cent; or that she or he did better than sixty per cent of those who took it. Hughes (1990:17) defines: "A test which is designed to give this kind of information is said to be norm-referenced." According to Henning (1987), a norm-referenced test must have been administered to a large sample of people. For the purposes of language testing and testing in general, norm-referenced tests have both strengths and weaknesses. On the positive side, comparisons can easily be made with the performance or achievement of a large population of students. On the negative side, norm-referenced tests are usually valid only with the population on which they have been normed.
Criterion-referenced tests are not without their share of weaknesses. The objectives of criterion-referenced tests are often too limited and restrictive (Henning, 1987: 7). The purpose of criterion-referenced tests is to classify people according to whether or not they are able to perform some task or set of tasks satisfactorily. Moreover, the test must match teaching objectives closely. Despite these limitations, in the field of language measurement criterion-referenced tests possess two positive virtues: they are helpful in clarifying objectives, and they motivate students by setting standards in terms of what they can do.
2.2.9 Objective Testing versus Subjective Testing



The difference between objective testing and subjective testing is a matter of scoring. If no judgment is required on the part of the scorer, then the scoring is objective. A multiple-choice test, with the correct responses unambiguously identified, would be a case in point. If judgment is called for, the scoring is said to be subjective. There are different degrees of subjectivity in testing. The impressionistic scoring of a composition may be considered more subjective than the scoring of short answers in response to questions on a reading task. In Oller's view (1979), many tests, such as cloze tests, "lie somewhere between subjectivity and objectivity". As a result, many testers seek objectivity in scoring not only for the sake of objectivity itself, but also for the greater reliability it brings.
2.2.10 Communicative Language Testing
In recent years, in parallel with the development of communicative language teaching (CLT), communicative language testing has been the focus of a great deal of research on language testing. Discussions have centered on the desirability of measuring the ability to take part in acts of communication. In sum, it is assumed that the main function of language is to enable people to communicate with each other in society. As a result, testing language ability is nothing but testing communicative ability (including reading and listening, the two receptive skills necessary for communication, which is a two-way process) (Khoa, 1999). Communicative language testing may embrace a number of testing approaches such as direct versus indirect testing, objective versus subjective testing, etc.
Based upon the theory that language ability is a complex and multifaceted construct, Bachman (1991, p.678) proposes the following characteristics of communicative tests: "First, such tests create an "information gap," requiring test takers to process complementary information through the use of multiple sources of input. Test takers, for example, might be required to perform a writing task that is based on input from both a short recorded lecture and a reading passage on the same topic. A second characteristic is that of task dependency, with tasks in one section of the test building upon the content of earlier sections, including the test taker's answers to those sections. Third, communicative tests can be characterized by their integration of test tasks and content within a given domain of discourse. Finally, communicative tests attempt to measure a much broader range of language abilities, including knowledge of cohesion, functions, and sociolinguistic appropriateness, than did earlier tests, which tended to focus on the formal aspects of language: grammar, vocabulary, and pronunciation."
2.3 CHARACTERISTICS OF A GOOD TEST
In order to make a well-designed test, teachers have to take into consideration various factors such as the purpose of the test, the content of the syllabus, the students' background and so on. In addition to these factors, test characteristics play a very important role in constructing a good test. According to a number of leading scholars in testing, such as Valette (1977), Harrison (1983), Weir (1990), Carroll and Hall (1985), and Brown (1994), all good tests have four main characteristics: validity, reliability, practicality, and discrimination.
2.3.1 Validity
2.3.1.1 Construct validity
Construct validity is defined by Anastasi (1982: 144) as "the extent to which the test may be said to measure a theoretical construct or trait. Each construct is developed to explain and organize observed response consistencies. It derives from establishing inter-relationships among behavioral measures, focusing on a broader, more enduring and more abstract kind of behavioral description. Construct validation requires the gradual accumulation of information from a variety of sources. Any data throwing light on the nature of the trait under consideration and the conditions affecting its development and manifestations are grist for this validity mill."
Construct validity is viewed from a purely statistical perspective in much of the recent American literature (Bachman and Palmer, 1981). It is seen principally as a matter of the a posteriori statistical validation of whether a test has measured a construct that has a reality independent of other constructs.
2.3.1.2 Content validity
The more a test simulates the dimensions of observable performance and accords with what is known about that performance, the more likely it is to have content and construct validity. According to Kelly (1978:8), content validity seems "an almost completely overlapping concept" with construct validity, and for Moller (1982:68) the distinction between construct and content validity is not always apparent in tests of language proficiency. Anastasi (1982: 131) defines content validity as "essentially the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured." She offers some useful guidelines for establishing content validity:
- The behavior domain to be tested must be systematically analyzed to make certain that major aspects are covered by the test items in the correct proportions;
- The domain under consideration should be fully described in advance, rather than being defined after the test has been prepared;
- Content validity depends on the relevance of the individual's test responses rather than on the apparent relevance of item content.
2.3.1.3 Face validity
Anastasi (1982:136) points out that face validity is not validity in the technical sense; it refers not to what the test actually measures, but to what it appears to measure to those who take it, the administrative personnel who decide on its use, and other technically untrained observers. Fundamentally, the question of face validity concerns rapport and public relations. Lado (1961), Davies (1968), Ingram (1977), Palmer (1981), and Bachman and Palmer (1981) have all discounted the value of face validity. If a test does not have face validity, though, it may not be acceptable to the students taking it or the teachers using it. If the students do not accept it as valid, their adverse reaction to it may mean that they do not perform in a way that truly reflects their ability. Anastasi (1982:136) takes a similar line: "Certainly if test content appears irrelevant, inappropriate, silly or childish, the result will be poor cooperation, regardless of the actual validity of the test. Especially in adult testing, it is not sufficient for a test to be objectively valid. It also needs face validity to function effectively in practical situations."
2.3.1.4 Backwash validity
Language teachers operating in a communicative framework normally attempt to equip students with skills that are judged relevant to present or future needs, and to the extent that tests are designed to reflect these, the closer the relationship between the test and the teaching that precedes it, the more the test is likely to enhance construct validity. A suitable criterion for judging communicative tests in the future might well be the degree to which they satisfy students, teachers, and future users of test results, as judged by some systematic attempt to gather data on the perceived validity of the test. If the first stage, with its emphasis on construct, content, face and backwash validities, is bypassed, the resulting procedures may not suit the purpose for which the test was intended.
2.3.1.5 Criterion-related validity
The concept is concerned with the extent to which test scores correlate with a suitable external criterion of performance. Criterion-related validity consists of two types (Davies, 1977): concurrent validity, where the test scores are correlated with another measure of performance, usually an older, established test, taken at the same time (Kelly, 1978; Davies, 1983), and predictive validity, where test scores are correlated with some future criterion of performance (Bachman and Palmer, 1981).

2.3.2 Reliability
Reliability is a necessary characteristic of any good test. It is of primary importance in the use of proficiency tests as well as of public achievement and classroom tests. An appreciation of the various factors affecting reliability is important for the teacher at the very outset, since many teachers tend to regard tests as infallible measuring instruments and fail to realize that even the best test is indeed a somewhat imprecise instrument with which to measure skills.
A fundamental criterion against which any language test has to be judged is its reliability. The concern here is with how far we can depend on the results that a test produces. Three aspects of reliability are usually taken into account. The first concerns the consistency of scoring among different markers. The second is the concern of the tester with how to enhance the agreement between markers by establishing, and maintaining adherence to, explicit guidelines for the conduct of this marking. The third aspect of reliability is that of the parallel forms of a test that may need to be devised. The concept of reliability is particularly important when language tests within the communicative paradigm are considered. Moreover, Davies (1968) stresses that reliability is the first essential for any test, but for certain kinds of language test it may be very difficult to achieve.
2.3.3 Discrimination
Another important feature of a test is its capacity to discriminate among the different candidates and to reflect the differences in the performances of the individuals in the group. The extent of the need to discriminate will vary depending on the purpose of the test. In many classroom tests, for example, the teacher will be much more concerned with finding out how well the pupils have mastered the syllabus and will hope for a cluster of marks around the 80 and 90 per cent brackets. Nevertheless, there may be occasions on which the teacher requires a test to discriminate to some degree in order to assess relative abilities and locate areas of difficulty. For this purpose, the items in the test should be spread over a wide range of difficulty levels, as follows:

- Extremely easy items
- Very easy items
- Easy items
- Fairly easy items
- Items below average difficulty level
- Items of average difficulty level
- Items above average difficulty level
- Fairly difficult items
- Difficult items
- Very difficult items
- Extremely difficult items.
2.3.4 Practicability
A test must be practicable; in other words, it must be fairly straightforward to administer. The most obvious practical considerations concerning a test are often overlooked. Firstly, the length of time available for the administration of the test is frequently misjudged, even by experienced test writers, especially when the whole test consists of a number of sub-tests. Another practical consideration concerns the answer sheets and the stationery used. The use of answer sheets, however, greatly facilitates marking and is strongly recommended when large populations of test takers are being tested. The question of practicability is not confined solely to oral tests; such written tests as situational composition and controlled writing tests depend not only on the availability of qualified markers who can make valid judgments concerning the use of language, etc., but also on the length of time available for the scoring of the test. A final point concerns the presentation of the test paper itself: where possible, it should be printed or typewritten and appear neat, tidy, and aesthetically pleasing.
2.4 TEST ITEMS FOR READING SKILL, WRITING SKILL, GRAMMAR, AND
VOCABULARY
2.4.1 Test items

Tests usually consist of a series of items. Cohen (1992: 488) defines an item as "a specific task to perform" which "can test one or more points or objectives. For example, an item may test one point, such as the meaning of a given word, or several points, such as an item that tests the ability to obtain facts from a passage and then make inferences based on the facts." He also suggests that "sometimes an integrative item is really more a procedure than an item, as in the case of a free composition which could have a number of objectives". Furthermore, he stresses that the objectivity of an item is determined by the way it is scored. A multiple-choice item, for example, is objective in that there is only one right answer. He also points out that a free composition may be more subjective in nature if the scorer does not look for any one right answer but rather for a series of factors, namely creativity, cohesion and coherence, grammar, and mechanics.
Item types for testing comprehension are ordering tasks, open-ended comprehension questions and answers, dichotomous items, summary writing, note-taking, guessing the meaning of unfamiliar words from the context, making inferences, information transfer, multiple-choice items, jumbled sentences, jumbled paragraphs, completion exercises, matching words (sentences), cursory reading, and gap-filling and cloze tests. Item types for testing writing skills are multiple-choice items, matching items, editing, dictation, short-answer items, summary writing, sentence transformation, free writing, compositions and essays, error-recognition items, and 'broken-sentence' items. Item types for testing grammar are multiple-choice items, completion items, matching items, and word transformation.
2.4.2 Language components and language skills.
Linguistics is the study of phonology, syntax, and semantics. The first, phonology, is concerned with the sounds of a language and the way in which these are structured into segments such as syllables and words. The second, syntax, is concerned with the way we string words together in phrases, clauses, and sentences to build well-formed sentences. The third, semantics, is concerned with the way we assign meaning to a certain unit of a language in order to communicate. Each of these has additional levels: phonology is supplemented by phonetics, the study of the physical characteristics of sounds; syntax by morphology, the study of the structure of words; and semantics by pragmatics, the study of the situational constraints on meaning. The language components we focus on in this minor thesis are grammar, vocabulary, and phonology. Grammar belongs to syntax, vocabulary belongs to semantics, and phonology is related to phonetics. In addition, the language skills which we want to test are reading and writing.
2.4.3 The test item types used to evaluate language components and language skills.
Table 1: Test item types for Reading and Writing skills and Grammar, Vocabulary

Reading: Multiple-choice items; Short-answer items; Cloze items; Word and sentence matching; Picture and sentence matching; True-False items; Completion items; Questions and answers; Split sentences; Reading comprehension (open-ended questions)

Writing: Sentence building; Sentence transformation; Sentence completion; Writing a letter of application; Eliciting a narrative from a series of pictures; Controlled writing tasks (a graph, plan or drawing); Free writing (letters, postcards, diaries, forms, directions, instructions)

Grammar and Usage: Multiple-choice items; Rearrangement items; Completion items; Transformation items; Reordering items; Error-recognition multiple-choice items; 'Broken sentence' items; Pairing and matching items; Cloze items

Vocabulary: Multiple-choice items; Matching items; Word formation; Items involving synonyms; Definitions (explaining the meaning of each word); Sentence completion; Gap filling; Context-based items


CHAPTER 3: THE STUDY
This chapter provides information about the current English teaching, learning and testing situation at STTC as the setting of the study. The data analyzed from the sample test results and the survey questionnaire are also presented in this chapter.
3.1 THE SUBJECTS AND THE CURRENT ENGLISH TEACHING, LEARNING
AND TESTING SITUATIONS AT STTC

3.1.1 Students and their backgrounds
Students who have been studying at STTC are selected from different districts in Son La province or from different provinces in the north of Laos. They come from different ethnic groups; most of them belong to ethnic minorities such as Thai, H'mong, Muong, Tay, Lao and Sinh Mun. They are all bilingual students, so English becomes their third language. Some of them come from urban areas where foreign language learning is paid special attention; some of them come from rural areas where foreign language learning is not paid much attention to.
However, most of them have been studying English for three years at school. Some of them have studied English for seven years (four years at secondary school and three years in high school). In fact, some of them come from very remote mountainous regions where there are no foreign language teachers, so they have never studied any foreign language before; only a few of them studied Russian as a foreign language at school. Consequently, students vary in their English background. The objective of the school curriculum is that after three or seven years of English, the students should have a general knowledge of grammar and an active vocabulary which they can use to talk about some familiar topics in the target language. Therefore, students entering STTC are of mixed levels of English. However, in general, they are all eager to study English and have shown themselves to be active and attentive learners in class. The problem for the teachers of English at STTC is how to teach and design appropriate tests to meet the demands of the qualification standards set at STTC.
3.1.2 The English teaching staff



The foreign language group consists of two sub-groups: the English language group (10 teachers) and the Chinese language group (1 teacher). All the English teachers have been trained in Vietnam and none of them has studied abroad. Most of them are quite young and have been teaching English for about eight years. The majority of the teachers have a BA degree, and half of them are taking post-graduate courses at the College of Foreign Languages, Hanoi National University. All the teachers are eager to teach English and take part in doing research to improve their knowledge as well as their teaching experience.
In teaching, they prefer using Vietnamese in class, as they find it easier to explain in Vietnamese. Moreover, the students' English speaking and listening ability is limited. Furthermore, most teachers are not qualified enough to conduct communicative activities in a foreign language lesson. Besides that, classrooms are not well equipped. Another point that should be mentioned here is that materials for reference and self-study are not sufficient for either students or teachers. Most of the English textbooks used are not authentic. In addition, newspapers and magazines, which are a good source of modern language reference, are not accessible.
However, the teachers are very enthusiastic and helpful to each other. Besides teaching English, they sometimes design and administer tests in the four skills to the students during the term. Through designing, administering and marking these tests, they have gained some experience in designing, evaluating and administering tests.
3.1.3 English teaching and learning at STTC
Son La Teachers' Training College is one of the colleges responsible for training teachers to work in preschools, primary schools and secondary schools in Son La province, Vietnam, and the northern provinces of Laos. The function of the English section is to teach students in the first and second years for three semesters (two semesters in the first year and the first semester of the second year).
English was first taught at STTC in 1997. Before that time no foreign language was taught there. English has since been taught because it is the language of the rapid development of the economy as well as of science and technology in the world in general and in Vietnam in particular. When English teaching started at the College, teachers faced a number of difficulties, such as finding a course book appropriate for STTC students and suitable teaching methodologies. Though there are a lot of books available in bookshops, it is really hard to identify a suitable one that can meet the demands of STTC students.
So far, over a period of about four years, the English course book has been changed. The first one was English in Focus by Nicholas Sampson and Nguyen Quoc Hung M.A. (published by Ho Chi Minh City Publishing House). After seven years of using this book, the teachers and students all realized that the book is written mainly in Vietnamese and that the structures and vocabulary presented are too simple. There are also not many activities giving students chances to practice language skills. As a result, the teachers and students felt that the book was boring. In addition, the differences between the objectives of the textbook and the objectives of the term tests were also a great obstacle for the teachers and students. The textbook concentrated on vocabulary while the test items focused much more on grammar, reading, writing and vocabulary.
Since 2005, STTC students have been provided with basic general English through learning New Headway Elementary by John and Liz Soars (2000, Oxford University Press). Skill practice is given in the ratio of 4:3:2:1 for Reading, Writing, Listening and Speaking. At STTC, students take a three-semester English course of 150 lesson periods (each lasting 45 minutes). The time frame and the units covered in each semester are as follows:
Table 2: Time frame and the units

Semester | Level | Time (45-minute lesson periods) | Units | Course book
First | Elementary | 45 lesson periods | 1-4 | New Headway Elementary
Second | Elementary | 45 lesson periods | 5-8 | New Headway Elementary
Third | Elementary | 60 lesson periods | 9-14 | New Headway Elementary

3.1.4 The objectives of the Elementary English course
According to the content of the course book, after studying 8 units students are expected to
have the following abilities:
Speaking: students are able to carry out simple conversations about everyday life, such as favourite seasons, differences between pictures, homes around the world, likes and dislikes, etc. They are expected to use simple sentences to exchange information as well as to make up questions and give appropriate responses within the vocabulary studied.
Listening: students are able to understand simple information about essential topics, and answer simple questions within the scope of the topics they have learnt.
Reading: students are expected to understand formal letters, essays, and short, simple articles.
Writing: students are able to write narratives and letters, and to describe photos, pictures or people.
3.1.5 Teaching material used for first-year students at STTC
As mentioned earlier in the scope of the study, this study focuses on the design of the final English achievement test for first-year non-English major students who have been learning English for 90 lesson periods, covering the first 8 units of New Headway Elementary. The key points of those 8 units are summarized below:
Table 3: The key points of those 8 units

Unit 1: Hello everybody! | Grammar: verb to be (am/is/are); possessive adjectives (my, your, his, her) | Vocabulary: countries; using a bilingual dictionary | Reading: reading comprehension (gap-filling) | Writing: write a short text about yourself

Unit 2: Meeting people | Grammar: verb to be, questions and negatives; negatives and short answers; possessive 's | Vocabulary: family; food and drink; opposite adjectives | Reading: reading comprehension (True-False sentences) | Writing: write questions for given answers

Unit 3: The world of work | Grammar: present simple; questions and negatives | Vocabulary: verbs; jobs | Reading: reading comprehension (match sentences with photographs; answer questions) | Writing: writing a letter

Unit 4: Take it easy! | Grammar: present simple | Vocabulary: verbs; leisure activities | Reading: reading comprehension (fill in the gaps; short answers; correct mistakes) | Writing: writing a letter

Unit 5: Where do you live? | Grammar: There is/are; some/any; How many?; this/that/these/those | Vocabulary: rooms; household goods; places | Reading: reading comprehension (short-answer items; True-False sentences) | Writing: describing where you live

Unit 6: Can you speak English? | Grammar: can/can't; could/couldn't; was/were | Vocabulary: countries and languages; parts of a plane | Reading: reading comprehension (short-answer items) | Writing: writing a letter of application for a job

Unit 7: Then and now | Grammar: past simple | Vocabulary: verbs | Reading: reading comprehension (gap-filling; mark True-False; write questions) | Writing: describing a holiday

Unit 8: How long ago? | Grammar: past simple; negatives and 'ago' | Vocabulary: relationships | Reading: reading comprehension (reading for specific information) | Writing: describing an old friend
3.2 THE CURRENT TESTING SITUATION AT STTC
Based on her experience of teaching English at STTC for nearly two years, the author has learned that testing is not a priority for teachers at STTC. Classroom teachers themselves design most language tests using a cut-and-paste method, which means they use available commercial tests to write tests without following any rules of testing. It is assumed that teachers who are able to teach are able to design a good test for their learners.
English has not always been regarded as an important subject; it is one of the general compulsory subjects, such as philosophy, political economy, psychology and computing, which are not included in final college examinations. Therefore, the teachers of the English section regard class progress tests as a means of estimating the students' learning results as well as of reinforcing their knowledge and motivating their learning. After about ten lessons students take a written test, which means that after fifteen weeks of study in each term they will have taken three tests. At the examinations, only one test is given to all the students of the same level. Objective tests such as multiple-choice items, matching items, cloze tests, etc. have been used in order to achieve high test reliability and discrimination among the test takers. In general, the English tests look reasonable for students; the test takers write their answers directly on the examination papers.
Since this textbook has already been used for the first year, some progress tests have been designed and administered by teachers of the English group. Most of them are familiar with training English major students, so the tests are thought to have validity, reliability and discrimination for those students but not for non-English majors.
Up to now, there has been no standard English test for students at STTC. Therefore, English tests are mainly taken from grammar books designed for learners at the elementary level, such as practical grammar usage exercise books, revision and test books, grammar in use, sentence building, sentence transformation and gap-filling books (published by Ho Chi Minh City Publishing House), etc. Thus, as mentioned earlier, the content of the tests is sometimes found to be rather unrelated to the course objectives and may lack the most important criteria for a good test concerning validity, format, practicality, and reliability. On the other hand, the content of both the progress tests and the achievement test for the first year at STTC is likely to fail to measure the language skills and language components adequately. As mentioned above, students do not have opportunities to practise speaking and listening skills, so there is no chance to test learners' speaking ability; instead, students' pronunciation is measured by a phonetics section of the test, which is not reasonable. The sample achievement test consists of five sections:
Section I: Grammar and Vocabulary (fifteen multiple-choice questions)
Section II: Grammar and Vocabulary (ten matching items)
Section III: Writing (sentence building with given words)
Section IV: Grammar and Vocabulary (ten multiple-choice questions)
Section V: Gap-filling
3.3 THE PROPOSED CONSTRUCTION OF THE ACHIEVEMENT TEST FOR THE FIRST-YEAR STUDENTS AT STTC
3.3.1 Test objectives
As mentioned above, achievement tests are directly related to the language course. By the time they take this test, students will have studied 8 units of the course book for the first year. The time allowance is 90 minutes. It is essential to design a test that is suited to what students have been taught and that satisfies the objectives of the course. The problem is that students are required to master four macro-skills but we are not able to test speaking and listening skills. The objectives of the achievement test are:
- To grade students.
- To elicit students' abilities in grammar and usage.
- To evaluate teachers’ methods.
3.3.2 The Paper Specification Grids for the 2nd Term Achievement Test
The test consists of 50 items which are divided into five sections.
Section I aims at checking the use of grammatical structures and vocabulary that students have studied by asking them to choose the best option from four given options. This part accounts for 30% of the total mark.
Section II asks students to match questions with suitable answers. The objective is to check students' grammatical structures and vocabulary. This part accounts for 20% of the total mark.

