

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
FACULTY OF POST-GRADUATE STUDIES




NGUYEN VAN BAC


A STUDY ON THE VALIDITY OF THE CURRENT FINAL
ENGLISH TEST FOR THE 2ND SEMESTER NON-ENGLISH
MAJORS AT HANOI UNIVERSITY OF INDUSTRY
(Nghiên cứu về tính xác thực của bài thi tiếng Anh cuối học kỳ thứ
hai hiện nay dành cho sinh viên không chuyên tiếng Anh tại trường
Đại học Công Nghiệp Hà Nội)

M.A. Minor Programme Thesis
Field: Methodology
Code: 60 14 10








Hanoi, 2010


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
FACULTY OF POST-GRADUATE STUDIES




NGUYEN VAN BAC


A STUDY ON THE VALIDITY OF THE CURRENT FINAL
ENGLISH TEST FOR THE 2ND SEMESTER NON-ENGLISH
MAJORS AT HANOI UNIVERSITY OF INDUSTRY
(Nghiên cứu về tính xác thực của bài thi tiếng Anh cuối học kỳ thứ
hai hiện nay dành cho sinh viên không chuyên tiếng Anh tại trường
Đại học Công Nghiệp Hà Nội)

M.A. Minor Programme Thesis
Field: Methodology
Code: 60 14 10
Supervisor: Pham Thi Hanh, M.A.








Hanoi, 2010


TABLE OF CONTENTS


CANDIDATE'S STATEMENT ................................................................. i
ACKNOWLEDGEMENT ....................................................................... ii
ABSTRACT ........................................................................................... iii
TABLE OF CONTENTS ........................................................................ iv
LIST OF ABBREVIATIONS .................................................................. vii
LIST OF TABLES AND CHARTS ......................................................... viii
CHAPTER 1: INTRODUCTION ............................................................ 1
1.1. RATIONALE .................................................................................. 1
1.2. SCOPE OF STUDY ........................................................................ 2
1.3. AIMS OF STUDY ........................................................................... 2
1.4. METHODS OF STUDY .................................................................. 3
1.5. RESEARCH QUESTIONS .............................................................. 3
1.6. DESIGN OF STUDY ...................................................................... 3
CHAPTER 2: LITERATURE REVIEW .................................................. 5
2.1. RELATIONSHIP BETWEEN LANGUAGE TESTING AND LANGUAGE TEACHING AND LEARNING ..... 5
2.2. LANGUAGE TESTING .................................................................. 6
2.2.1. Purpose of language testing .......................................................... 6
2.2.2. Types of language testing ............................................................. 7
2.2.3. The current trends in language testing .......................................... 9
2.3. QUALITIES OF A GOOD TEST .................................................... 10
2.3.1. Reliability ..................................................................................... 10
2.3.2. Validity ......................................................................................... 11
2.3.3. Practicality .................................................................................... 11
2.4. VALIDITY ...................................................................................... 12
2.4.1. Content or face validity ................................................................ 12
2.4.2. Response Validity ......................................................................... 13
2.4.3. Concurrent validity and predictive validity ................................... 13
2.4.4. Construct validity .......................................................................... 13
CHAPTER 3: THE STUDY .................................................................... 15
3.1. THE SUBJECT AND THE CONTEXT OF ENGLISH TEACHING AND LEARNING AT HAUI ..... 15
3.1.1. English teaching and learning context at HaUI ............................. 15
3.1.2. English testing for non-English majors at HaUI ........................... 17
3.1.3. Subject of the study ...................................................................... 18
3.1.3.1. Students ..................................................................................... 18
3.1.3.2. Teachers .................................................................................... 18
3.2. RESEARCH METHODS ................................................................ 19
3.2.1. Survey questionnaire ..................................................................... 19
3.2.2. Interview ....................................................................................... 21
3.2.3. Document analysis ........................................................................ 21
3.3. DATA COLLECTION PROCEDURE ............................................. 22
CHAPTER 4: FINDINGS AND DISCUSSIONS .................................... 23
4.1. DATA ANALYSIS .......................................................................... 23
4.1.1. Data analysis of students' and teachers' survey questionnaires and interviews ..... 23
4.1.2. Data analysis of students' scores ................................................... 32
4.2. DISCUSSIONS ............................................................................... 34
4.3. SUGGESTIONS FOR IMPROVING THE QUALITY OF THE CURRENT FINAL TEST FOR THE SECOND SEMESTER NON-ENGLISH MAJOR STUDENTS AT HAUI ..... 35
4.3.1. Take the students' language ability and knowledge into consideration ..... 35
4.3.2. Make clear instructions for the test composers .............................. 35
4.3.3. Determine objectives of the test .................................................... 36
4.3.4. Determine the content of the test .................................................. 36
CHAPTER 5: CONCLUSION ................................................................. 38
REFERENCES ........................................................................................ 40
Appendix 1A ........................................................................................... I
Appendix 1B ........................................................................................... IV
Appendix 2A ........................................................................................... VII
Appendix 2B ........................................................................................... X
Appendix 3 .............................................................................................. XIII
Appendix 4 .............................................................................................. XIV
Appendix 5 .............................................................................................. XV
Appendix 6 .............................................................................................. XVII

LIST OF ABBREVIATIONS



1. EC: Economic and Computer Science Students Group
2. ESP: English for Specific Purposes
3. HaUI: Hanoi University of Industry
4. LT: Language Testing
5. L1: Mother tongue
6. OM: Other Major Students Group
7. SLA: Second Language Acquisition
8. TESOL: Teaching English to Speakers of Other Languages
9. TOEIC: Test of English for International Communication


LIST OF TABLES AND CHARTS



Table 1: A framework for language assessments
Table 2: The syllabus for the second semester
Table 3: Factors to consider in writing items and tasks
Table 4: Key points presented in the course book New Headway Pre-Intermediate
Table 5: List of test scores selected from the second semester final test
Chart 1: Students' opinions on the time allowance of the final test
Chart 2: Teachers' comments on the time allowance of the test
Chart 3: Appropriateness of the final test in students' opinion
Chart 4: Teachers' comments on the test appropriateness
Chart 5: Test items that best measure students' true ability, in students' perception
Chart 6: Teachers' opinions on the test items that best measure their students' true ability
Chart 7: Students' comments on the level of the Grammar and Vocabulary test
Chart 8: Teachers' comments on the Grammar and Vocabulary test
Chart 9: Students' comments on the difficulty level of the Reading Comprehension test
Chart 10: Teachers' comments on the difficulty level of the Reading Comprehension test
Chart 11: Students' comments on the appropriateness of the Writing test
Chart 12: Teachers' comments on the Writing test
Chart 13: Students' comments on the Listening Comprehension test
Chart 14: Teachers' comments on the construct of the Listening Comprehension test





CHAPTER 1: INTRODUCTION

1.1. RATIONALE
With the rise of globalization, English has been proving its importance in most areas, including science, technology, telecommunications, media, culture and international relations. In Vietnam, English teaching and learning have drawn a great deal of attention not only from teachers and students but from the whole society.
English is a non-major subject for most students of Hanoi University of Industry (HaUI). However, it receives more attention and time than any other basic subject taught in the training program. On average, students learn English for 5 semesters, four of which are devoted to General English, while in the last semester students learn English for Specific Purposes (ESP). Compared with other universities in Vietnam, the time allocated to the English program at HaUI is among the longest.
To evaluate students' English level and achievement at Hanoi University of Industry, testing is an essential tool. Testing and assessment are considered to shed light on both the nature of language proficiency and language learning; in other words, tests can provide an assessment of students' ability to use the language. In each semester, students have to take three progress tests and one final achievement test. In the final achievement test, listening, speaking, reading, grammar knowledge and writing each account for 20% of the total mark.
Although many tests are used in the teaching process, as mentioned above, it is recognized that these tests may still not accurately evaluate students' ability to use English. Some students perform excellently in class, but their test results are not satisfactory, and vice versa. Other teachers at Hanoi University of Industry also hold this view, and they often complain that the current final achievement test for the second semester does not reflect the true language competence of their students. Some students and teachers share the view that what is taught in the program is not included in the test; therefore, it seems not to measure students' achievement of the course or their expected linguistic skills and knowledge. Through discussions with other teachers, I have found that test writers often take the test items from other sources rather than basing them on the course book and the syllabus given at the beginning of the course.

Another reason for choosing this topic is that test evaluation and assessment at Hanoi University of Industry appear not to receive proper attention. As a teacher of English, I have been involved in designing many kinds of tests for non-English major students at HaUI, yet there have been no formal discussions, no systematic and comprehensive assessments, and no research on the appropriateness of these tests.
For the above-mentioned reasons, I have decided to choose the research topic: “A study on the validity of the current final English test for the 2nd semester non-English majors at Hanoi University of Industry.” It is hoped that this study will be helpful to English teachers in the English Faculty of Hanoi University of Industry who often participate in designing the progress tests and final achievement exams.

1.2. SCOPE OF STUDY
The scope of this minor thesis is limited to examining the validity of the current final test for second-semester non-English major students at Hanoi University of Industry.
Due to the limitations of time, the author could not send the questionnaires to all non-English major students of Hanoi University of Industry. However, to obtain a broad view from the teachers and students of Hanoi University of Industry about the validity of the final test, the author gave the questionnaires to students of 5 faculties: the Economic Faculty, the Chemistry Technology Faculty, the Electronic Technology Faculty, the Mechanical Technology Faculty, and the Electrical Technology Faculty. The students questioned are all university students; college students are not covered. The author also could not conduct the survey and interviews with all the teachers of the English Department; instead, he selected experienced teachers who are regularly involved in designing tests for non-English major students and those who are currently teaching first-year students in their 2nd semester.

1.3. AIMS OF STUDY
The study aims at investigating the validity of the current final achievement test for the 2nd semester non-English major students at Hanoi University of Industry.
The specific aims of the research are as follows:
- to investigate the appropriateness of the current final test for the 2nd semester non-English majors in terms of time allowance, difficulty level and test content;
- to find out the teachers' and students' comments on the test's validity;
- to provide some suggestions for improving the test in terms of its validity.

1.4. METHODS OF STUDY
The study employs a combination of qualitative and quantitative methods to achieve the aims mentioned above.
The author bases the study on a review of the literature on the theory and principles of language testing, the characteristics of a good test, and test reliability and validity in order to gain an overview of language testing. Synthesizing the literature helps the author build a framework for the study.
Questionnaires are sent to teachers and students involved in teaching and learning in the second semester to collect information on their views of the test's validity.
In addition, informal interviews and discussions are also carried out with the teachers of English and their students to gain more information on the appropriateness of the test.

1.5. RESEARCH QUESTIONS
The study is conducted to find the answers to the following research questions:

1. Does the final test for non-English major students in the 2nd semester give a true picture of the students' English competence, in the view of teachers and students?
2. Does the test measure what it purports to measure (i.e. is it valid)?
3. How can the test be made valid? In what ways should the current final test be improved?

1.6. DESIGN OF STUDY
The study consists of five chapters, organized as follows:
Chapter 1 - Introduction - provides the background to the study, identifies the problems, and states the aims, purpose and significance of the study, the scope, the methods, the research questions and the design of the study.
Chapter 2 - Literature review - presents a review of related literature that provides the theoretical background of testing and evaluation in general and test validity in particular. This review also provides an overview of other studies related to testing and evaluation, especially the evaluation of tests in terms of their validity.
Chapter 3 - The study - provides information about the subjects of the study. It then describes the data collection instruments and the data collection procedure. The rationale for choosing these data collection instruments is also provided.
Chapter 4 - Findings and discussions - analyses and discusses the data collected to reveal the results and the validity of the final 2nd semester exam for non-English major students at HaUI. The causes of any problems and some implications for effective final achievement tests are also presented.
Chapter 5 - Conclusion - summarizes the major findings, which are hoped to point to appropriate ways of enhancing the validity of final achievement tests for non-English majors. Limitations of the study and suggestions for further research are also given in this chapter.


CHAPTER 2: LITERATURE REVIEW

2.1. RELATIONSHIP BETWEEN LANGUAGE TESTING AND LANGUAGE
TEACHING AND LEARNING
Shohamy (2000) introduced three dimensions of potential contributions of LT to SLA: (1) defining the construct of language ability; (2) applying LT findings to test SLA hypotheses; and (3) providing SLA researchers with quality criteria for tests and tasks. Shohamy also identified three dimensions of potential contributions of SLA to LT: (1) identifying language components for elicitation and assessment criteria; (2) proposing tasks for assessing language; and (3) informing language testers about differences and accommodating these differences.
Regarding the relationship between language testing and language teaching and learning, Hughes (1989) argued that it has both negative and positive sides. He believed that "too often language tests have a harmful effect on teaching and they fail to measure accurately whatever it is they are intended to measure" (1989: 1). He argues that good teaching does not necessarily produce good language tests and vice versa.
Bachman (1990) supported the notion that language testing has positive effects on language teaching and learning. In his words, "advances in language testing are stimulated by advances in our understanding of the processes of language acquisition and language teaching" (1990: 3). Regarding the use of language tests in educational programs, he stated that "the fundamental use of testing in an educational program is to provide information for making decisions, that is, for evaluation" (Bachman, 1990: 54). He suggested that through language testing we can evaluate learners' achievement in each learning period and evaluate our own teaching methods, and that language tests can provide input into the language teaching process. It is also believed that language teaching and learning give raters and test makers more information and more input for designing and improving achievement tests.
In short, language testing and language teaching and learning are closely interrelated. Teaching and learning provide a rich source of language materials for tests and, in turn, testing reinforces, improves and encourages the teaching and learning process. As Heaton (1988: 5) puts it, "teaching and testing are so closely related that it is virtually impossible to work in either field without being constantly concerned with the other."

2.2. LANGUAGE TESTING
2.2.1. Purpose of language testing
Shohamy (1985: 6) made a distinction between classroom tests and external tests. Classroom tests are written and administered by teachers, while external tests are designed and administered by an external agency. The purposes of classroom tests are to find out whether what was taught in the program was successfully acquired; to evaluate and improve instruction; to obtain information on students' progress and language knowledge; to help organize learning and teaching materials; to provide information for grades; to help diagnose students' strengths and weaknesses in the language; and to motivate students to learn. External tests, by contrast, evaluate proficiency; decide whether to accept students into a certain program; provide information for administrative decisions, such as special treatment for certain groups; assist in selection and grouping; help evaluate the curriculum; serve research purposes; and provide information for grading.
While sharing some ideas with Shohamy, Henning explains the purposes of language testing in a different way. According to him, language tests serve diagnosis and feedback, screening and selection, placement, program evaluation, the provision of research criteria, and the assessment of attitudes and socio-psychological differences (Henning, 1987, pp. 1-4).
He states that the most common aim of a language test is to find out strengths and weaknesses in students' learning. In this sense, the use of diagnostic tests provides critical information to the student, teacher and administrator that should make the learning process more efficient.
Language tests can also be used to decide whether students should be allowed to participate in a particular program of instruction. To make fair selections and decisions, the tests must be accurate in the sense that they must provide information that is both reliable and valid.
Another use of tests is placement. In this sense, tests are used to identify a student's performance level and to place them at a suitable level of instruction.
In addition, tests are often used to provide information about the effectiveness of programs of instruction. In this sense, group mean or average scores are of greater interest than the isolated scores of individual students.
Language test scores can also be used to provide research criteria, such as comparisons of methods and techniques of instruction, textbooks, or audio-visual aids. Finally, tests can be used to assess students' attitudes toward the target language, its people and their culture, which are essential elements of good language learning.

2.2.2. Types of language testing
Harrison (1991) introduces four types of language tests: placement, diagnostic, achievement and proficiency. Placement tests are designed to classify new students into appropriate groups so that they can start the course at the same level as the other students in the class. The placement test is concerned with the student's present standing, so it relates to general ability rather than specific points of learning.
Diagnostic tests are used for checking students' progress in learning particular elements of the course. The test may be given at the end of a unit in the course book or of a lesson designed to teach a particular point. The diagnostic test tries to find out how well the students have learnt particular material, and it is closely related to particular elements of the course which have just been taught.
Achievement tests look back over a longer period of learning than diagnostic tests. The test aims at showing the standard which the students have now reached in relation to other students at the same stage. Achievement tests cover a much wider range of material than diagnostic tests and relate to long-term rather than short-term objectives.
A proficiency test aims at assessing the student's ability to apply in actual situations what he or she has learnt. The test is not usually related to any particular course because it is concerned with the student's current standing in relation to his or her future needs.
The following is a summary of the types of language tests established by Harrison:





Placement
    Content: General reference forward to future learning
    Purpose: Grouping
    Considerations: Speed of results; variety of tests; interview
Diagnostic
    Content: Detailed reference back to class work
    Purpose: Motivation; remedial work
    Considerations: Short-term objectives; new examples of the materials taught
Achievement
    Content: General reference back to the course
    Purpose: Certification; comparison with others at the same stage
    Considerations: Decisions about sampling; similar material to that taught, in new contexts
Proficiency
    Content: Specific purposes; reference forward to particular applications of the language acquired
    Purpose: Evidence of ability to use language in practical situations
    Considerations: Definition of operational needs; authenticity; context; strategies for coping
Table 1: A framework for language assessments
(Source: Harrison, 1991: 5)
Henning (1987), however, develops different categories of language tests. He introduces seven groupings: objective vs. subjective tests; direct vs. indirect tests; discrete-point vs. integrative tests; aptitude, achievement and proficiency tests; criterion- or domain-referenced vs. norm-referenced or standardized tests; speed tests vs. power tests; and other test categories (Henning, 1987, pp. 4-9).
Objective and subjective tests are distinguished on the basis of the manner in which they are scored. An objective test may be scored by comparing examinee responses with an established set of accepted responses or a scoring key; an example of this kind of test is the multiple-choice test. A subjective test, on the other hand, may be scored by judgement based on the insight and expertise of the scorer; examples of this type would be free composition or cloze tests which permit all grammatically acceptable responses to systematic deletions from a context.
Direct tests are said to test language performance directly, whereas indirect tests tap true language performance indirectly. Direct tests usually take the form of spoken tests in which language use in real communication situations is rated. Indirect tests usually take the form of written tests such as multiple-choice or cloze tests.
Discrete-point tests, as a variety of diagnostic tests, are designed to measure knowledge or performance in very restricted areas of the target language. Integrative tests, on the other hand, are used to assess a greater variety of language abilities.
Aptitude tests are usually used to measure a person's suitability for a specific program of instruction or a particular kind of employment. Achievement tests are used to measure the extent of what students have already learnt. Proficiency tests are most often global measures of ability in a language or other content area.
For criterion- or domain-referenced tests, instruction is often designed after the tests themselves are created. Such tests must match the teaching objectives perfectly, and they are useful when objectives are under constant revision. Tests of this kind are also useful with small and/or unique groups for whom norms are not available. Norm-referenced or standardized tests, on the other hand, must have been administered to a large number of examinees from the target population; acceptable standards of achievement can only be established by reference to the mean or average score.
A speed test is one in which the items are easy but the time allowed is insufficient. In contrast, a power test includes difficult items, but the time allowed is sufficient.
Henning (1987) also mentions some other categories of tests, including examinations vs. quizzes, questionnaires, single-stage and multi-stage tests, and language skill tests vs. language feature tests.

2.2.3. The current trends in language testing
Shohamy (1985: 5) identifies three trends in language testing. The first is the transition from discrete-point tests to integrative tasks. In the past, language tests were often based on independent items, such as supplying the correct verb forms or selecting lexical elements. Nowadays, however, tests mostly aim at testing communicative competence, and the tasks are much broader, such as writing letters or comprehending a whole text with specific elements in it.
The second trend in language testing is the transition from indirect to direct or authentic tests. Previously, test methods were mainly indirect, that is, they bore little relation to real-life situations; direct or authentic tests, by contrast, are more similar to what the test takers will encounter in real language use.
The last trend in language testing mentioned by Shohamy (1985) is the transition from knowledge-type to performance-type tests. In performance-type tests, students or test takers have to apply their knowledge of the language by performing certain functions, such as actually speaking or actually writing.

2.3. QUALITIES OF A GOOD TEST
The three most important characteristics of a good test, according to Harrison (1991), are reliability, validity and practicality. According to Bachman and Palmer (1996), however, a test's usefulness can be determined by considering the following qualities: reliability, construct validity, authenticity, interactiveness, impact, and practicality. In this minor M.A. thesis, I will discuss three qualities of a good test, namely reliability, validity and practicality, with a special focus on validity in the next part of the thesis.


2.3.1. Reliability
A test is reliable if it consistently provides accurate measures of ability at all times, with different students and/or different testers. According to Harrison (1991: 10), "the reliability of the test is its consistency." He stressed that a student's score should be the same or nearly the same whichever version of the test the test taker takes, that the test taker should obtain the same result whether the test is marked by one person or another, and that a test should measure the same thing all the time. "There are three aspects to reliability: the circumstances in which the test is taken, the way in which it is marked and the uniformity of the assessment it makes" (Harrison, 1991: 11).
Henning (1987: 74) writes that "reliability is thus a measure of accuracy, consistency, dependability, or fairness of scores resulting from administration of a particular examination." He added that since reliability is concerned with accuracy of measurement, reliability increases when the error of measurement is minimized. Therefore, we should attend to the amount of error present in our measurement so that reliability can be quantified.
The term reliability, according to Bachman and Palmer (1996), refers to consistency of measurement. More specifically, they say that a reliable test score is consistent across different characteristics of the testing situation, and that if test scores are inconsistent, they provide no information about the ability being measured. Because it is impossible to eliminate inconsistencies altogether, we try to reduce variations in the test's task characteristics.
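
To illustrate how reliability can be quantified in practice, the following is a minimal sketch in Python that computes Cronbach's alpha, a common internal-consistency estimate, from a matrix of item scores. The scores below are invented purely for illustration and are not data from the HaUI test.

import numpy as np

def cronbach_alpha(item_scores):
    """Internal-consistency reliability estimate.

    item_scores: rows are examinees, columns are test items
    (e.g. 1/0 for correct/incorrect multiple-choice answers).
    """
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented scores of six examinees on five dichotomously scored items.
scores = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
])
print("Cronbach's alpha = %.2f" % cronbach_alpha(scores))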

2.3.2. Validity
A test is valid if it measures what it is intended to measure. According to Bachman and Palmer (1996), the term construct validity refers to the extent to which a given test score can be interpreted as an indicator of the abilities, or constructs, that we want to measure. However, no test is entirely valid, because validation is an ongoing process (Weir, 2005).
A test's reliability and validity are closely related. Any valid test is considered a reliable test; however, not all reliable tests can be considered valid (Alderson, 2000). More recently, according to Alderson (2000), "the term construct validity is used to refer to the general, overarching notion of validity". Therefore, the main focus in discussing the test's validity is construct validity, in addition to some issues regarding its content validity.
2.3.3. Practicality
A test must be well organized in advance in terms of time, space, classroom management, equipment and cost. "Practicality is the relationship between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities" (Bachman and Palmer, 1996: 36). They pointed out that this quality is unlike the others because it focuses on how the test is carried out. Moreover, Bachman and Palmer (1996) classified the resources concerned into three types: human resources, material resources, and time. Based on this definition, practicality can be assessed by the availability of the resources required to develop and conduct the test. Our judgement of a language test in this respect is therefore whether it is practical or impractical.

2.4. VALIDITY
According to Henning (1987), validity can be divided into empirical and non-empirical kinds. Non-empirical validity does not require the collection of data or the use of formulae (e.g. content or face validity, response validity), while the empirical kinds of validity usually involve recourse to mathematical formulae for the computation of a validity coefficient. The common kinds of empirical validity are concurrent and predictive validity. Another kind of validity mentioned by Henning is construct validity.
Sharing some ideas with Henning, Bachman (1990) also introduced five main types of validity: content validity, criterion validity, concurrent validity, predictive validity and construct validity. In this minor M.A. thesis, I classify the major types of validity based on the views of Henning (1987) and Bachman (1990).


2.4.1. Content or face validity
Testing specialists commonly consider content and face validity to be synonymous (Magnusson, 1967). Others, however, distinguish between them and hold that face validity, unlike content validity, is often determined impressionistically.
Content or face validity is intuitive and logical but usually lacks an empirical basis. As its name suggests, this kind of validity is concerned with whether or not the content of the test is sufficiently representative and comprehensive for the test to be a valid measure of what it is supposed to measure.
The test content must be selected carefully. For example, an achievement test's content should be bound to the content of instruction, which in turn is constrained by the instructional objectives.
According to Bachman (1990), there are two aspects of content validity: content relevance and content coverage. Content relevance requires the specification of the behavioral domain in question and the attendant specification of the task or test domain. Content coverage is the extent to which the tasks required in the test adequately represent the behavioral domain in question. Demonstrating that a test is relevant to and covers a given area of content or ability is therefore a necessary part of validation.




2.4.2. Response Validity
Response validity refers to the extent to which examinees respond in the manner expected by the test developer. It concerns the manner in which test takers respond and the instructions of the test. For example, if the test takers respond in an unreflective manner, their obtained scores may not represent their actual ability. Likewise, if the instructions of the test are unclear and the test format is unfamiliar to the examinees, their responses may not reflect their true ability. In both cases the test may be said to lack response validity.


2.4.3. Concurrent validity and predictive validity
“Concurrent validity is a kind of empirical criterion-related validity” (Henning, 1987). This kind of validity is based on collected data, with formulas applied to generate an actual numerical validity coefficient. The validity coefficient derived represents the strength of the relationship with some external criterion measure.
To validate a test of some particular ability in this way, one administers a recognized, reputable test of the same ability to the same persons concurrently, or within a few days of the administration of the test to be validated.
Bachman (1990) suggested that concurrent validation can examine differences in test performance among groups of individuals at different levels of language ability, or examine correlations among various measures of a given ability.
Predictive validity is closely related to concurrent validity. It is usually reported in the form of a correlation coefficient with some measure of success in the field or subject of interest. Predictive validity can tell us how well test scores can predict some future behaviour (Bachman, 1990: 250).
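
To make the idea of a numerical validity coefficient concrete, the sketch below computes the Pearson correlation between scores on the test being validated and scores on an external criterion measure taken by the same examinees; this correlation serves as the concurrent (or, with a later criterion, predictive) validity coefficient. The scores are hypothetical and are used only to illustrate the calculation.

import numpy as np

# Hypothetical scores of the same ten examinees on the test being
# validated and on a recognized external criterion test.
test_scores = np.array([55, 62, 70, 48, 81, 90, 66, 74, 59, 85])
criterion_scores = np.array([50, 60, 72, 45, 78, 88, 63, 70, 61, 80])

# The Pearson correlation coefficient is taken as the validity coefficient.
validity_coefficient = np.corrcoef(test_scores, criterion_scores)[0, 1]
print("Concurrent validity coefficient r = %.2f" % validity_coefficient)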

2.4.4. Construct validity
Construct validity is empirical in nature because it involves the gathering of data and the testing of hypotheses. However, unlike concurrent and predictive validity, it does not have any one particular validity coefficient associated with it.
According to Henning (1987), the purpose of construct validation is to provide evidence that the underlying theoretical constructs being measured are themselves valid. Construct validation usually begins with a psychological construct that is part of a formal theory, which enables certain predictions about how the construct variable will behave or be influenced under specified conditions. The construct is then tested under the conditions specified, and it is said to be valid if the hypothesized results occur and the hypotheses are supported.
Construct validity concerns the extent to which performance on tests is consistent with predictions that we make on the basis of a theory of abilities, or constructs (Bachman, 1990: 255). In order to examine construct validity, it is necessary to examine patterns of correlations among item scores and test scores, and between characteristics of items and tests and scores on items and tests; to analyse and model the processes underlying test performance; to study group differences; to study changes over time; or to investigate the effects of experimental treatments (Messick, 1989).
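
One simple starting point for the correlational analyses described above is to inspect the pattern of correlations among section scores: if all sections are meant to tap the same underlying construct of general English ability, they should correlate positively with one another. The sketch below uses invented section scores, not data from the HaUI test, and is intended only as an illustration of the procedure.

import numpy as np

# Invented section scores (rows are examinees) for five test sections.
sections = ["Grammar", "Reading", "Listening", "Writing", "Speaking"]
scores = np.array([
    [16, 14, 12, 13, 15],
    [10,  9,  8, 11, 10],
    [18, 17, 15, 16, 17],
    [12, 13, 10,  9, 11],
    [14, 12, 13, 14, 13],
    [ 8,  7,  9,  6,  8],
])

# Correlation matrix among section scores: consistently positive,
# moderate-to-high values are one piece of construct-related evidence.
corr = np.corrcoef(scores, rowvar=False)
for name, row in zip(sections, corr):
    print(name.ljust(10), " ".join("%5.2f" % r for r in row))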



CHAPTER 3: THE STUDY

3.1. THE SUBJECT AND THE CONTEXT OF ENGLISH TEACHING AND
LEARNING AT HAUI
3.1.1. English teaching and learning context at HaUI.
The English Faculty is one of the biggest faculties of Hanoi University of Industry. There are more than 150 teachers of English, who are divided into three divisions. One division is in charge of teaching English majors, another is in charge of teaching English to secondary and vocational students, and the biggest one teaches English to all college and university non-English major students. All students of Hanoi University of Industry study English as their foreign language.
According to the objectives given in the syllabus, the teaching aims of the English course for non-English major students in the second semester are stated as follows:
In general, the course consolidates the knowledge and skills students acquired at the elementary level (the 1st term) and raises students' General English to pre-intermediate level.
In detail, it aims to provide students with knowledge of vocabulary, grammar and pronunciation and to develop their listening, speaking, reading and writing skills based on natural and social science topics; to give students an orientation to the importance of English in their lives and future jobs; and to build and practise language learning skills as well as to develop their own thinking and ideas when communicating in English.
Grammar: Grammatical points are consolidated and extended in each unit. All principles related to the grammatical points in each lesson are practised effectively through group work and pair work.
Vocabulary: In this section, students have the chance to improve their vocabulary considerably. The vocabulary provided is mostly related to the topic of each unit.
Skill work: In this part, students improve and develop their listening, speaking, reading and writing skills. These skills are integrated, and this integration helps students develop their creativity, which brings about the best learning results.
Everyday English (communication focus): Students are equipped with some cultural knowledge of English-speaking countries and samples of everyday communication. In addition, students can enjoy songs and engaging conversation practice.

Writing: Students are able to write short paragraphs on topics related to each unit, such as their last holiday, future plans, or hometowns.

The syllabus of the General English course for second-semester non-English major students at HaUI is described in the following table:

No.   Title                                      Theory   Practice   Test
1     Unit 1: Getting to know you!                  4         4        -
2     Unit 2: The way we live                       4         4        -
3     Unit 3: It all went wrong                     4         4        -
4     Unit 4: Let's go shopping!                    4         4        -
5     Stop and check 1 + Progress test 1            -         1        1
6     Unit 5: What do you want to do?               4         4        -
7     Unit 6: Tell me! What's it like?              4         4        -
8     Unit 7: Famous couples                        4         4        -
9     Unit 8: Do's and don'ts                       4         4        -
10    Stop and check 2 + Mid-term test              -         1        1
11    Unit 9: Going places                          4         4        -
12    Unit 10: Scared to death                      4         4        -
13    Unit 11: Things that changed the world        4         4        -
14    Unit 12: Dreams and reality                   4         4        -
15    Stop and check 3 + Progress test 2            -         1        1
16    Unit 13: Earning a living                     4         4        -
17    Unit 14: Love you and leave you               4         4        -
18    Stop and check 4 + Revision                   -         2        -
      Total                                        56        61        3
(Time is given in periods.)
Table 2: The syllabus for the second semester

The course books employed by teachers and students of Hanoi University of Industry are the New Headway series by John and Liz Soars (2000) (Elementary, Pre-Intermediate and Intermediate), Talktime by Susan (2004), and TOEIC Analysts by Taylor (2006). For non-English majors in the second semester, the main course book is New Headway Pre-Intermediate (2000) by Liz and John Soars. In addition, students are recommended to use the reference book English Grammar in Use by R. Murphy. In the teaching process, teachers also use other materials to present and recycle the basic structures of English and to develop students' proficiency in using these structures in certain contexts. Focus is also placed on reinforcing and improving students' knowledge of vocabulary and their ability to communicate.
The major teaching points of the course book for the second semester are presented in Appendix 5.

3.1.2. English testing for non-English majors at HaUI
In each semester, students are required to take at least three progress tests and one final achievement test. From my teaching experience at HaUI, I find that testing is not the main concern of teachers: it has not been given proper attention or carefully studied in terms of its validity, reliability, format and practicality.
Within the scope of this thesis, the study focuses on investigating the validity of the final achievement English test (for the second semester) for non-English major students who have been learning English for 120 class hours covering all 14 units of New Headway Pre-Intermediate. Below is the format of the test administered to second-semester non-English majors, named Test 2, i.e. the final achievement test.
Test 2, with a time allowance of 60 minutes, has a total score of 100 points and consists of the following parts:
Section A (20 points): Grammar and Vocabulary. This section includes 20 multiple-choice questions and is worth 20 points.
Section B (20 points): Reading comprehension. This section contains 2 short reading passages with 10 multiple-choice questions.
Section C (20 points): Listening. In the listening section, students are required to listen to several short conversations or short talks and then answer the questions. There are 10 multiple-choice questions and 10 true/false questions, with 1 point for each correct answer.
Section D (20 points): Writing. Students have to complete 5 sentence-building questions by selecting the correct answers, and then write a short paragraph about one of the given topics.
Section E (20 points): Speaking. In the speaking section, students introduce themselves and then talk about one of the topics they have been assigned (see Appendix 4).
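
For clarity, the test blueprint described above can be summarised as a simple data structure. The sketch below merely records the section weights and tasks stated above and checks that the section scores add up to the 100-point total; it is an illustration, not part of the official test specification.

# Blueprint of Test 2 as described above: section name, points and tasks.
test_2 = {
    "A. Grammar and Vocabulary": {"points": 20, "tasks": "20 multiple-choice questions"},
    "B. Reading comprehension":  {"points": 20, "tasks": "2 passages, 10 multiple-choice questions"},
    "C. Listening":              {"points": 20, "tasks": "10 multiple-choice + 10 true/false questions"},
    "D. Writing":                {"points": 20, "tasks": "5 sentence-building questions + 1 short paragraph"},
    "E. Speaking":               {"points": 20, "tasks": "self-introduction + talk on an assigned topic"},
}

total = sum(section["points"] for section in test_2.values())
assert total == 100, "Section scores should add up to the 100-point total"
print("Total score: %d points; time allowance: 60 minutes" % total)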

3.1.3. Subject of the study
The subjects of this study are students and teachers from HaUI. This is the university that allocates the most time to EFL, with 6 periods each week. At the university, English is learned not as a major but as an instrument. Students are oriented towards taking the TOEIC test in their graduation examination, which will help them considerably in their future work.

3.1.3.1. Students
150 first-year students are selected from different classes of the Departments of Computer Science, Garment and Fashion Design, Chemistry, Economics, Mechanical Engineering, Electronic Engineering, and Electrical Engineering. Of these, students of Economics and Computer Science are considered to have better ability in using English, and they are required to take the TOEIC test in their graduation exam, while students of other departments are required to take a B-level test when they graduate. Their English, as classified by the placement test, is at elementary or pre-intermediate level. Most students studied English for between 3 and 7 years at lower and upper secondary school, while some learnt foreign languages other than English. As a result, their English proficiency varies considerably.
Since students of Economics and Computer Science are judged to be better users of English and will take the TOEIC course and the TOEIC test before graduating, while students of other majors will take another course, Talktime, and a B-level test as a requirement for graduation, the author divided the student population into two separate groups: the first group consists of students of Economics and Computer Science (75 students, hereinafter referred to as EC) and the second consists of students of other majors (75 students, hereinafter referred to as OM), all of whom participated in this survey.

3.1.3.2. Teachers
The English Department is one of the biggest departments of HaUI in terms of staff numbers. There are more than 140 teachers of English, who are in charge of teaching English to almost all students of HaUI, including vocational students, college students and university students. In this study, 15 teachers of English at HaUI are selected. They all
