Chapter 10:
Criteria and Test Types

A. Criteria
1. Validity
2. Reliability
3. Reliability versus validity
4. Discrimination
5. Administration/practicality
6. Test instructions to candidates
7. Backwash effects

1. Validity

Validity: the extent to which a test measures what it is supposed to
measure and nothing else (content)

Face validity

Content validity

Construct validity

Empirical validity


Face validity

If a test item looks right to other testers, teachers, moderators and
testees, it can be described as having face validity.

In the past, face validity was regarded by test writers simply as a
public relations exercise.

Now, designers of communicative tests regard face validity as the most
important of all types of validity.



Content validity

Depends on a careful analysis of the language being tested and of the
particular course objectives

When constructing tests, writers should first draw up a table of test
specifications (language skills, areas included…)


Construct validity

A test having construct validity is capable of measuring specific
characteristics in accordance with a theory of language behavior and
learning.

For example, a test consisting of multiple-choice items will lack
construct validity if the communicative approach is adopted during the
language course.


Empirical/statistical validity
This kind of validity is obtained by comparing the results of the test
with the results of some criterion measure, such as:

An existing test, known to be valid and given at the same time

The teacher's ratings or any other such form of independent assessment
given at the same time

When the criterion measure is taken at the same time as the test, this
is known as concurrent validity.


Empirical/statistical validity (continued)

Criterion measures taken later (predictive validity):

The subsequent (later) performance of the testees on a certain task,
measured by some valid test

The teacher's ratings or any other such form of independent assessment
given later


Summary (Validity)

The test situation and the technique used are important factors in
determining the overall validity of any test.

2. Reliability (definitions)

If a test administered to the same candidates on different occasions
produces the same results, it is reliable.

Reliability also denotes the extent to which the same marks/grades are
awarded if the same test papers are marked by
(i) two or more different examiners
(ii) the same examiner on different occasions
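
One simple way to look at marker reliability is to compare the marks two examiners award to the same scripts. The sketch below is only an illustration: the function name `marker_agreement`, the `tolerance` parameter and the marks themselves are assumptions made for this example, not part of the original slides.

```python
def marker_agreement(marks_a, marks_b, tolerance=0):
    """Compare two sets of marks awarded to the same scripts.

    Returns the share of scripts on which the marks differ by no more
    than `tolerance`, and the mean absolute difference between marks.
    """
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("both markers must mark the same non-empty set of scripts")
    diffs = [abs(a - b) for a, b in zip(marks_a, marks_b)]
    within = sum(1 for d in diffs if d <= tolerance)
    return {
        "agreement_rate": within / len(diffs),
        "mean_abs_difference": sum(diffs) / len(diffs),
    }


# Hypothetical marks from two examiners for the same five scripts.
examiner_1 = [14, 11, 17, 9, 15]
examiner_2 = [14, 12, 17, 7, 15]

result = marker_agreement(examiner_1, examiner_2)
print(result)
```

The same function can be used for case (ii) by passing the marks one examiner awarded on two different occasions.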

2. Reliability (affecting factors)

Reliability is affected by the size of the sample and the
administration of the test.

Other factors:
(1) test instructions (rubrics)
(2) personal factors such as motivation and illness
(3) scoring of the test (the most important factor; objective tests
overcome this problem of marker reliability)

2. Reliability (measuring methods)
(1) Re-administering the same test to the same group of candidates
after a lapse of time
(2) Administering parallel forms of the test to the same group (the
forms must be identical in the nature of their sampling, difficulty,
length and rubrics). If the correlation between the two tests is high,
the test can be termed reliable.
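
The correlation mentioned above can be estimated in several ways. A minimal sketch, assuming the Pearson product-moment coefficient is used and that each candidate has one score per sitting (the scores and the name `pearson_r` are invented for illustration):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two lists of scores for the same candidates."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of six candidates on two sittings (or two parallel forms).
first_sitting = [12, 15, 9, 18, 14, 11]
second_sitting = [13, 14, 10, 17, 15, 10]

r = pearson_r(first_sitting, second_sitting)
print(f"reliability coefficient: {r:.2f}")
```

A coefficient close to 1 indicates that candidates are ranked in much the same way on both occasions. What counts as "high" depends on the kind of test and is not specified in the slides.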

3. Reliability versus Validity

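The two chief criteria for evaluating any test (an ideal test should be
both valid and reliable)

The greater the reliability of a test, the less validity it usually
has. For example, a multiple-choice test can be marked very reliably
but may not measure the ability to communicate.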

4. Discrimination

An important feature of a test is its capacity:

(1) To discriminate among different candidates
(2) To reflect the differences in the
performances of individuals in a group

The extent of the need to discriminate will
vary depending on the purpose of the test
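
The slides do not say how discrimination is measured. One common approach in item analysis, shown here only as a sketch under that assumption, compares how many candidates in the top-scoring group and in the bottom-scoring group answer an item correctly: D = (correct in upper group - correct in lower group) / group size. The function name, the `fraction` parameter and the data are hypothetical.

```python
def discrimination_index(total_scores, item_correct, fraction=0.5):
    """Upper-lower discrimination index for one test item.

    total_scores: total test score of each candidate
    item_correct: True/False for each candidate on the item in question
    fraction:     share of candidates placed in each of the two groups
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("one item result is needed per candidate")
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5] so the groups do not overlap")
    group_size = int(len(total_scores) * fraction)
    if group_size < 1:
        raise ValueError("too few candidates for the chosen fraction")
    # Rank candidate positions from highest to lowest total score.
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    correct_upper = sum(1 for i in upper if item_correct[i])
    correct_lower = sum(1 for i in lower if item_correct[i])
    return (correct_upper - correct_lower) / group_size


# Hypothetical data: eight candidates, their totals, and their result on one item.
totals = [38, 35, 33, 30, 24, 22, 19, 15]
item = [True, True, True, False, True, False, False, False]

d = discrimination_index(totals, item)
print(d)  # 0.5
```

An index near +1 means the item separates stronger from weaker candidates well; an index near 0 means it does not discriminate; a negative value means weaker candidates did better on the item than stronger ones.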

5. Administration/Practicality

A test must be practicable, i.e. fairly straightforward to administer
(consider the length of time needed for administering the test,
collecting answer sheets and reading instructions).

Another practical consideration concerns
the answer sheets and the stationery used.

6. Test instructions to the candidates

All instructions should be clearly written.

Samples should be given.

Grammatical terminology should be
avoided.

7. Backwash effects

Def.: the influences of testing on teaching & learning


Positive backwash effect (e.g. reading tests leading to the development
of reading skills)

Negative backwash effect (e.g. objective tests reducing learners'
motivation)

Implications: tests influence the compilation of syllabuses and
language teaching programmes

B. Types of tests
1. Achievement /attainment tests
2. Proficiency tests
3. Aptitude tests
4. Diagnostic tests

1. Achievement /attainment tests

Class progress tests: the most widely used type of test

Achievement tests: formal tests


Class progress tests


Designed to measure the extent to which Ss
have mastered the material taught in the
classroom, allowing Ss to show what they
have mastered

Used as a teaching device: backwash effects
on teaching & motivation

Good tests encourage Ss to perform well and gain confidence.


Achievement tests

Intended to measure achievement on a large
scale, to show mastery of a particular syllabus

Standardized tests: pre-tested, items are
analysed & revised where necessary

A good achievement test should reflect the
particular approach to learning & teaching
adopted

2. Proficiency tests

Define a student's language proficiency with reference to a particular
task which he/she will be required to perform (e.g. TOEFL, TOEIC)

In no way related to any syllabus or
teaching programme

3. Aptitude tests

Designed to measure a student's probable performance in a foreign
language which he/she has not started to learn

Generally, seeking to predict Ss’ probable
strengths & weaknesses in learning a
foreign language by measuring
performance in an artificial language

4. Diagnostic tests

Achievement & proficiency tests:
frequently used for diagnostic purposes
such as diagnosing areas of difficulty Ss
may have so that appropriate remedial
action can be taken later.

Diagnostic testing: frequently carried out
for groups of Ss rather than for individuals
