
Workshop paper for the 8th Conference of Gifted High Schools of the DHĐBBB Region, 2015, English subject, Thai Binh Gifted High School


Thai Binh Gifted High School

Assessment and Testing in EFL Class at High School Level

by Nguyen Thi Hong Hung
A. INTRODUCTION
It is undeniable that English is an international language. In Viet Nam,
English has become more and more important. It is now commonly used in technology,
science, education, culture and economic activities. This leads to the increasing demand
for English learning and teaching throughout the country. There are a lot of language
centers in big cities. More importantly, English has been a compulsory subject in schools
and universities. More attention has been paid to English teaching and learning in gifted
schools, especially in gifted English classes. Nevertheless, teaching and learning English
successfully is not simple.

Besides teaching writing, reading, listening and speaking in English, teachers need to be
aware of the importance of assessing and testing. Testing and assessing help teachers and
students adjust their teaching and learning methods so that they can achieve the best
results. Within the limited scope of this paper, we would like to deal with issues related
to “assessment and testing in EFL at high school level”.


In this paper, we aim to present some theoretical knowledge related to assessment and
testing, together with different testing and assessment activities.
B. BODY
Part 1. Fundamentals in theories of assessment and testing
1.1 Assessment, testing, measurement, evaluation
It is important to distinguish between these four terms. Many people in applied
linguistics seem to make no difference, or at least no special distinction, between them.
However, these terms, although related, may have different goals. Below is the
distinction between them as discussed by Nitko (2011).
Assessment is a broad term defined as a process for obtaining information that is used
for making decisions about students, curricula and programs, and educational policy.
Test is a concept that is narrower than assessment. It is defined as an instrument or
systematic procedure for observing and describing one or more characteristics of a
student using either a numerical scale or a classification scheme.
Measurement is defined as a procedure for assigning numbers (usually called scores)
to a specific attribute or characteristic of a person in such a way that the numbers
describe the degree to which the person possesses the attribute.
Evaluation is defined as the process of making a value judgment about the worth of a
student’s products or performance.
Different definitions of these terms exist, but they all share one common feature:
the four concepts are interrelated within an assessment process.

ACTIVITY 1
Decide the truth or falsity of each of the following statements. Defend your
answer.

a. To make evaluations, one must use measurements.
b. To measure an important educational attribute of a student, one must use a test.
c. To evaluate a student, one must measure that student.
d. To test a student, one must measure that student.
e. Any piece of information a teacher obtains about a student is an assessment.
f. To evaluate a student, one must assess that student.
1.2. Purposes of assessment
Before designing any assessment, teachers need to ask themselves “what is the
purpose of assessment?” One way to answer the question is to thoroughly examine these
considerations: what kind of information you want to obtain from the assessment, and
how the information obtained from the assessment will be used. Nitko (2000) discussed
some of the decisions teachers may wish to consider.
a. Instructional management decisions
Your classroom is a decision-rich environment. You must make many decisions, including
planning instructional activities, placing students into learning sequences, monitoring
students’ progress, diagnosing students’ learning difficulties, providing students and
parents with feedback about achievements, evaluating teaching effectiveness, and
assigning grades to students. Hence, assessment can be used to make decisions on:
- Instructional diagnosis and remediation
- Feedback to students
- Feedback to the teacher
- Modeling learning targets
- Motivating students
- Assigning grades to students
b. Selection decisions
Assessment can be used to provide part of the information on which selection decisions
are based. For example, college admissions are often selection decisions: some
candidates are admitted and others are not.
c. Placement decisions
Placement decisions are made to assign students to different levels of the same general
type of instruction, education or work. For example, teachers may assess students to
decide whether they are placed in a fast-track or regular group.
d. Classification decisions
Sometimes we must make a decision that results in a person being assigned to one of
several different but unordered categories or programs. For example, students may be
assessed in order to be classified into groups of different learning styles or interests.
e. Counseling and guidance decisions
Assessment results are frequently used to assist students in exploring and choosing
careers and in directing them to prepare for the careers they select. In this case, a series of
assessments is combined, rather than relying on a single assessment result.
f. Credentialing and certification decisions
Credentialing and certification decisions are concerned with assuring that a student has
attained certain standards of learning.

ACTIVITY 2
Circle those that can be the purpose of assessment in the following box.
To grade
To punish students
To report the progress
To get promotion
To adjust your teaching
To compare two teaching methods
To pilot a new scheme
To motivate students
To identify areas for improvement
To raise money
To show off
To create learning opportunities
To end a program
To keep students in school

Now, reflect on your own assessing experience: why do you assess your students?
What kinds of decisions have you made, and in which assessing situations?
1.3. Types of assessment
List of terms used to describe and classify assessments
Criteria of classification | Types of assessments
By kind of item | Choice items (true-false, multiple choice, matching); completion items; short-answer items; essay items
By how student performance is scored | Objective assessment; subjective assessment
By degree of standardization | Standardized assessments; nonstandardized assessments
By administrative conditions | Individual assessments; group assessments
By the basis for interpreting scores | Norm-referencing; criterion-referencing
By the use of assessment result | Formative assessment; summative assessment; diagnostic assessment; placement/classification/selection assessment
By the time of assessment | Final assessment; continuous/on-going assessment
By the domain to be assessed | Product assessment; process assessment
By the language emphasis of the scoring | Verbal assessment; performance assessment

a. Objective versus subjective assessment
True-false and multiple-choice tests are said to be objective because, once the scoring
key is set, nearly everyone who scores a student’s responses arrives at the same result.
Essay items, portfolios, and performance assessments, on the other hand, have a history
of being scored differently by different persons, and differently by the same person on
different occasions. Because of this, they are said to be subjective methods of assessment.

b. Standardized versus nonstandardized assessment
Standardization can improve the objectivity of assessments as well as the validity of
interpreting the results. Standardization is the degree to which the observational
procedures, administrative procedures, equipment and materials, and scoring rules have
been fixed so that the same procedure occurs at different times and places, insofar as is
possible.
c. Norm versus criterion referencing
Norm-referencing interpretations describe assessed performance in terms of a person’s
position in a reference group that has been administered the assessment. For example,
you may report a student’s performance on a test as being ‘better than 80% of the class’.
This report expresses the student’s standing in a reference group, but it does not state
what the student knows or is able to perform. The reference group is called the norm
group. Criterion-referencing interpretations describe assessed performance in terms of
the kinds of tasks a person with a given score can do. It is important to note that both
kinds of interpretations are needed to understand how well a student is learning.
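The contrast between the two interpretations can be sketched in a few lines of Python. This is only an illustration: the class scores, the cut-off scores and the writing descriptors are invented for the example, and the function names are hypothetical rather than taken from Nitko.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percentage of the norm group scoring below `score`."""
    if not group_scores:
        raise ValueError("the norm group must contain at least one score")
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def can_do(score, descriptors):
    """Criterion-referenced view: tasks a student with this score can perform.

    `descriptors` is a list of (minimum_score, task) pairs; both the
    cut-offs and the task wording are hypothetical.
    """
    return [task for minimum, task in descriptors if score >= minimum]


# Hypothetical norm group: one class of ten students.
class_scores = [52, 58, 61, 64, 67, 70, 73, 75, 78, 85]

# Hypothetical criterion statements for a writing test.
writing_descriptors = [
    (80, "write a structured opinion essay"),
    (60, "write a short informal letter"),
    (40, "write simple sentences about familiar topics"),
]

student_score = 78
print(f"Better than {percentile_rank(student_score, class_scores):.0f}% of the class")
print("Can:", "; ".join(can_do(student_score, writing_descriptors)))
```

The first line of output places the student within the class but says nothing about ability; the second says what the student can do but nothing about standing. Reporting both gives the fuller picture the text calls for.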
d. Formative versus summative assessment
Formative assessment is designed to assist the learning process by providing feedback
to the learner, which can be used to highlight areas for further study and hence
improve future performance. Self and diagnostic assessments are types of formative
assessment with specific purposes. Summative assessment is for progression and/or
external purposes, given at the end of a course and designed to judge the student’s overall
performance.
Summative assessment is most useful for those external to the educative process who
wish to make decisions based on the information gathered. It generally provides a concise
summary of a student’s abilities, which can easily be understood as a pass/fail or a grade.
It is, however, not useful for communicating more complex data about a student’s
individual abilities, which could be used to inform further study and improve student
performance.
e. Final versus continuous/on-going assessment
Final assessment is the one taken at the end of a course, while continuous or on-going
assessment is scattered throughout the course. The primary advantage of final assessment
is that it is simple to organize and condenses the assessment process into a short space of
time. It means, however, that the timing of the examination becomes of great importance.
Illness at an unfortunate time can unduly influence the result. Moreover, final assessment
cannot be used for formative purposes. The main advantage of continuous or on-going
assessment is that both teachers and students obtain feedback from the process, which can
then be used to improve teaching and learning. Disadvantages include the
increased workload inherent in this mode of assessment, and difficulties associated
with students from different backgrounds tackling the same material and being assessed
in exactly the same way.
f. Product versus process assessment
With the rapidly changing nature of modern society, increased emphasis is being placed
on skills and abilities, rather than knowledge. It is therefore important to consider
whether you wish to assess the product of student learning, or the process undertaken.
Product-driven assessments are usually easier to create, as the assessment criteria seem to
be more tangible. They can also be more easily summarized. Process-based assessment,
however, can give more useful information about skills, and can therefore highlight for
students the importance of learning generalized techniques rather than specific
knowledge.
g. Verbal versus performance assessment
Verbal assessments call for observing the verbal responses of students: for example, how
well they can define words, explain their answers, or define similarities or differences
between concepts. Most school assessments are verbal because schools emphasize verbal
attributes.
Other assessments are crafted to elicit and observe nonverbal responses: assembling
objects, completing experiments, performing psychomotor activities, and so on. These are
called performance assessments. Although performance assessments emphasize
nonverbal responses, verbal ability or language ability is also necessary. When
assessing school learning targets, performance assessment focuses on a student’s ability to
apply and use knowledge from several areas to make something, produce a report, or
give a demonstration.

ACTIVITY 3
For questions 1-5, match the situations in which a teacher sets a test with the reasons for
assessment listed A-F. There is one extra option which you do not need to use.
1. The teacher has a new class. On the first day of the course, she sets a test which
covers some language points she expects the students to be familiar with and
others that she thinks the students may not know. The students do not prepare for
the test.
2. The teacher notices that his intermediate students are making careless mistakes
with basic question formation, which they should know. He announces that there
will be a test on this the following week. The students have time to prepare for the
test.
3. The students are going to take a public examination soon. The teacher gives them
an example paper to do under test conditions.
4. The teacher monitors students whenever they carry out speaking tasks and keeps
notes about each student.
5. The class has recently finished a unit of the course book which focused on the use
of the present perfect simple with ‘for’ and ‘since’. The teacher gives the class a
surprise test on this.

Reason for assessment
A. To familiarize students with the test format
B. To allow the teacher to plan an appropriate scheme of work
C. To show students how well they have learned specific language
D. To allow students to assess each other
E. To motivate the students to revise a particular language area
F. To assess students’ progress on a continuous basis

ACTIVITY 4
For questions 1-6, match the assessment aims with the assessment types listed A-G. There is
one extra option which you do not need to use.

Assessment aims
1. to put students into a class at the correct level
2. to identify how much the class already knows about particular language items
3. to give students a test on language taught in the latest unit of their course
4. to keep a record of students’ performance, based on work completed throughout the course
5. to help students evaluate their own progress
6. to see how well students perform at the end of a course

Assessment types
A. continuous assessment
B. placement tests
C. diagnostic tests
D. peer assessment
E. self-assessment
F. achievement tests
G. progress test

ACTIVITY 5
Decide whether each of the following is true of summative assessment (SA) or formative
assessment (FA). Write SA or FA in the appropriate box.
- Is the practice of building a cumulative record of student achievement
- Assists you to make judgments about student achievement at certain relevant
points in the learning process or unit of study (e.g. end of course, project,
semester, unit, year)
- Assists teachers in modifying or extending their programmes or adapting their
learning and teaching methods
- Can be used formally to measure the level of achievement of learning
outcomes (e.g. tests, labs, assignments, projects, presentations etc.)
- Is used to monitor students’ ongoing progress and to provide immediate and
meaningful feedback
- Is very applicable and helpful during early group work processes
- Can also be used to judge programme, teaching and/or unit of study
effectiveness (that is, as a form of evaluation)
- Usually takes place during day-to-day learning experiences and involves
ongoing informal observations throughout the term, course, semester, or unit
of study
1.4. Assessment OF Learning versus Assessment FOR Learning
During the last few decades, there has been increasing tension between assessment of
learning and assessment for learning, whose fundamental differences lie in their
assessment purposes. These are discussed in detail in Earl and Katz (2006).
Assessment OF learning

Assessment of learning refers to strategies designed to confirm what students know,
demonstrate whether or not they have met curriculum outcomes or the goals of their
individualized programs, or to certify proficiency and make decisions about students’
future programs or placements. It is designed to provide evidence of achievement to
parents, other educators, the students themselves, and sometimes to outside groups (e.g.,
employers, other educational institutions).
Assessment of learning is the assessment that becomes public and results in
statements or symbols about how well students are learning. It often contributes to pivotal
decisions that will affect students’ futures. It is important, then, that the underlying logic
and measurement of assessment of learning be credible and defensible.
Effective assessment of learning requires that teachers provide:
i. a rationale for undertaking a particular assessment of learning at a particular
point in time
ii. clear descriptions of the intended learning
iii. processes that make it possible for students to demonstrate their competence
and skill
iv. a range of alternative mechanisms for assessing the same outcomes
v. public and defensible reference points for making judgments; transparent
approaches to interpretation
vi. descriptions of the assessment process
vii. strategies for recourse in the event of disagreement about the decisions
Assessment FOR learning
Assessment for learning occurs throughout the learning process. It is designed to
make each student’s understanding visible, so that teachers can decide what they can do
to help students progress. In assessment for learning, teachers use assessment as an
investigative tool to find out as much as they can about what their students know and can
do, and what confusions, preconceptions, or gaps they may have.
Assessment Reform Group (2002) defines assessment for learning as the process of
seeking and interpreting evidence for use by learners and their teachers to find out where
the learners are in their learning process, where they need to go and how to get there.
They suggest ten principles of Assessment for Learning as follows.
i. Assessment for learning should be part of effective planning of teaching and
learning
A teacher’s planning should provide opportunities for both learner and teacher to
obtain and use information about progress towards learning goals. It also has to be flexible
to respond to initial and emerging ideas and skills. Planning should include strategies to
ensure that learners understand the goals they are pursuing and the criteria that will be
applied in assessing their work. How learners will receive feedback, how they will take
part in assessing their learning and how they will be helped to make further progress
should also be planned.
ii. Assessment for learning should focus on how students learn
The process of learning has to be in the minds of both learner and teacher when assessment
is planned and when the evidence is interpreted. Learners should become as aware of the
‘how’ of their learning as they are of the ‘what’.
iii. Assessment for learning should be recognized as central to classroom
practice
Much of what teachers and learners do in classrooms can be described as assessment.
That is, tasks and questions prompt learners to demonstrate their knowledge,
understanding and skills. What learners say and do is then observed and interpreted, and
judgments are made about how learning can be improved. These assessment processes
are an essential part of everyday classroom practice and involve both teachers and
learners in reflection, dialogue and decision making.
iv. Assessment for learning should be regarded as a key professional skill for
teachers
Teachers require the professional knowledge and skills to: plan for assessment; observe
learning; analyze and interpret evidence of learning; give feedback to learners and
support learners in self-assessment. Teachers should be supported in developing these
skills through initial and continuing professional development.
v. Assessment for learning should be sensitive and constructive because any
assessment has an emotional impact
Teachers should be aware of the impact that comments, marks and grades can have on
learners' confidence and enthusiasm and should be as constructive as possible in the
feedback that they give. Comments that focus on the work rather than the person are more
constructive for both learning and motivation.
vi. Assessment for learning should take account of the importance of learner motivation
Assessment that encourages learning fosters motivation by emphasizing progress and
achievement rather than failure. Comparison with others who have been more successful
is unlikely to motivate learners. It can also lead to their withdrawing from the learning
process in areas where they have been made to feel they are no good. Motivation can be
preserved and enhanced by assessment methods which protect the learner's autonomy,
provide some choice and constructive feedback, and create opportunity for self-direction.
vii. Assessment for learning should promote commitment to learning goals and
a shared understanding of the criteria by which they are assessed
For effective learning to take place, learners need to understand what it is they are trying
to achieve and want to achieve it. Understanding and commitment follow when learners
have some part in deciding goals and identifying criteria for assessing progress.
Communicating assessment criteria involves discussing them with learners using terms
that they can understand, providing examples of how the criteria can be met in practice
and engaging learners in peer- and self-assessment.
viii. Learners should receive constructive guidance about how to improve
Learners need information and guidance in order to plan the next steps in their learning.
Teachers should: pinpoint the learner's strengths and advise on how to develop them; be
clear and constructive about any weaknesses and how they might be addressed; provide
opportunities for learners to improve upon their work.
ix. Assessment for learning develops learners' capacity for self-assessment so
that they can become reflective and self-managing
Independent learners have the ability to seek out and gain new skills, new knowledge and
new understandings. They are able to engage in self-reflection and to identify the next
steps in their learning. Teachers should equip learners with the desire and the capacity to
take charge of their learning through developing the skills of self-assessment.
x. Assessment for learning should recognize the full range of achievements
of all learners

Assessment for learning should be used to enhance all learners’ opportunities to learn in
all areas of educational activity. It should enable all learners to achieve their best and to
have their efforts recognized.

ACTIVITY 6
Determine which of the types of assessment discussed in 1.3 are FOR or OF
learning. Put a tick in the appropriate column.
Types of assessment | OF learning | FOR learning
1.5. Test usefulness: Qualities of language tests
Bachman and Palmer (1996) discuss the following qualities of language tests.

a. Reliability
Reliability is defined as consistency of measurement. It means a reliable test score will
be consistent across different characteristics of the testing situation. In other words, if a
group of learners were to take the same assessment instrument on two occasions, their
results on these two occasions should be roughly the same, provided that the conditions are
constant. For a language test, reliability can be achieved through its size, specifically
through a large number of test items in the test or a large number of learners taking the test.
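As a rough illustration of the two ideas above, the sketch below estimates test-retest reliability as the Pearson correlation between two sittings of the same test, and uses the Spearman-Brown formula from classical test theory to show how lengthening a test is expected to raise reliability. The score lists, the function names and the assumption that added items are parallel to the existing ones are illustrative only and are not taken from Bachman and Palmer (1996).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each sitting")
    return cov / sqrt(var_x * var_y)


def spearman_brown(r, k):
    """Predicted reliability when test length is multiplied by k.

    Assumes the added items are parallel to the existing ones.
    """
    return k * r / (1 + (k - 1) * r)


# Hypothetical scores of six learners on two sittings of the same test.
first_sitting = [55, 62, 70, 48, 81, 66]
second_sitting = [57, 60, 73, 50, 79, 68]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"Test-retest reliability estimate: {retest_r:.2f}")

# Doubling a test whose reliability is 0.60, under the parallel-items assumption.
print(f"Predicted reliability after doubling: {spearman_brown(0.60, 2):.2f}")
```

A coefficient close to 1 indicates that learners were ranked almost identically on both occasions; how high is "high enough" depends on the stakes of the decision being made.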
b. Validity
Just as important as reliability is the question of validity. Does the assessed task
actually fulfill its purpose? Does it give you the information you want about the students?
Does the assessment enable you to make well-founded decisions?
As exemplified by Rust (2002), just because an exam question includes the instruction
‘analyze and evaluate’ does not actually mean that the skills of analysis and evaluation are
going to be assessed. They may be, if the student is presented with a case study scenario
and data they have never seen before. But if they can answer perfectly adequately by
regurgitating the notes they took from the lecture you gave on the subject, then little more
may be being assessed than the ability to memorize.
Validity has been extensively discussed in language testing. The validity of a test
is the extent to which it measures what it is supposed to measure and nothing else. In
other words, if a test measures the knowledge and skills of the learners that it sets out to
measure, it is valid. The reliability of a test ensures its consistency,
while validity ensures its meaningfulness.
There are five types of validity in a language test: face, predictive, concurrent,
content and construct validity.
• Face validity: if a test looks right to the lay eye (not too long, too short, too
complicated or too easy), it has face validity.
• Predictive validity: the predictive (or statistical or empirical) validity of a test is
obtained by comparing its results with the results of some criterion measure, such
as
- An existing test believed to be valid and given at the same time, or
- The teacher's ratings or other forms of independent assessment made at that time, or
- The subsequent performance of the test takers on a certain task, measured by
some valid test, etc.
The test is then a predictor if it shows this validity in relation to the future
criterion which it predicts, for example when scores on the entry test for the 10th form of a
senior secondary school are related to subsequent academic success.
• Concurrent validity: A statistical procedure establishes concurrent validity. The
scores are related to an acceptable criterion which is concurrent and quantifiable.
This validity is a useful check on a test's validity, and all types of tests can make use
of it. Both concurrent and predictive validities are established by statistical
correlation and need quantifiable criteria.
• Content validity: If face validity is an appeal to the lay observer (non-expert),
content validity is an appeal to the subject expert (tester, teacher). The expert uses
his own knowledge of the target language and of language tests to judge to what extent
the test provides a satisfactory sample of the syllabus, and which content of the test is
acceptable or unacceptable. Proficiency and especially achievement tests
depend heavily on content validity.
• Construct validity: A test, part of a test, or a testing technique is said to have
construct validity if it can be demonstrated that it measures just the ability that it
is supposed to measure. The word 'construct' refers to any underlying ability, or
trait, which is hypothesized in a theory of language ability.
c. Authenticity:
Bachman and Palmer (1996) defined authenticity as the degree to which a given language
test task's characteristics correspond to the features of a target language use task.
Authenticity relates a test's task to the domain to which we want our score
interpretations to generalize. It potentially affects test takers' perceptions of the test
and their performance.
d. Interactiveness
Interactiveness, according to Bachman and Palmer (1996), is the extent and type of
involvement of the test taker's individual characteristics in accomplishing a test task.
Does the test motivate students? Is the language used in the test's questions and
instructions appropriate for the students' level? Do the test items represent the language
used in the classroom as well as the target language? All these questions represent the
crucial elements that affect a test's interactiveness. Many recent views consider this notion
the core of language teaching and learning.
e. Impact:
According to Bachman and Palmer (1996), impact can be defined broadly in terms of the
various ways a test's use affects society, an educational system, and the individuals within
them. In general terms, a test operates at the macro level of a societal or educational system
while corresponding to individuals, i.e., test takers, at the micro level. An aspect of
impact that has been of particular interest to both language testing researchers and
practitioners is what is referred to as ‘washback’, which is defined as 'the effect of testing
on teaching and learning' (Hughes, 1992:1).
f. Practicality:

Bachman and Palmer (1996) define the quality of practicality as the relationship between the
resources that will be required in the design, development, and use of the test and the
resources that will be available for these activities. They illustrated that this quality is
unlike the others because it focuses on how the test is conducted. Moreover, Bachman
and Palmer (1996) classified the addressed resources into three types: human resources,
material resources, and time. Based on this definition, practicality can be measured by the
availability of the resources required to develop and conduct the test. Therefore, our
judgment of a language test is whether it is practical or impractical.

ACTIVITY 7
Some testing qualities are described in the following sentences (1-10). Choose the letters
(A-F) from the box and write them in the space provided. Some letters can be used MORE
THAN ONCE.
A. Reliability
B. Face validity
C. Construct validity
D. Content validity
E. Concurrent validity
F. Predictive validity

Testing situations | Qualities
1. Some MCQ questions in an IQ test had only three options.
2. A student got 45 in a math test and failed. He contested the test
score, and the test was re-marked by two other raters. One gave him 65 and
the other 75.
3. The program has communicative performance objectives but
tests students using multiple-choice grammar tests.
4. The scores during a student’s senior year of high school can
provide information about this student’s first-year college grade
point.
5. A researcher developing an IQ test might ask his friends and
relatives to read the questions and make their judgments.
6. An employment test is administered to a group of workers and
then the test scores are correlated with the ratings of the workers’
supervisors taken on the same day.
7. A pie chart taken from a web-based source was used in the
IELTS writing task but the colors of the chart were not clear and
students were confused about what each color represents.
8. A teacher wants to test her students’ writing skill, so she asks her
students to listen to a lecture and write a summary about it.
9. After the introduction of the new textbooks, teachers have taught
communicatively from Grade 6 to Grade 9, emphasizing speaking
and listening skills and fluency as well as accuracy. But on the
Grade 9 exam there is still no speaking or listening component
10. For the past 8 years the Grade 9 exam has used passages,
comprehension questions and grammar exercises taken directly from the book,
and students have prepared for the exam by memorizing the book. This year,
the Foreign Language Specialist writes the exam using parallel texts and
exercises, not taken directly from the book, without warning anyone.
Part 2: Classroom-based assessment and testing
2.1. Test Specifications

A test’s specifications provide an official statement about what the test tests and how it
tests it. The specifications are the blueprint to be followed by the test and item writers,
and they are essential in the establishment of the test’s construct validity. Alderson,
Clapham and Wall (1995) suggest a comprehensive framework for constructing test
specifications, which entails the following criteria:
1. What is the purpose of the test? Tests tend to fall into one of the following broad
categories: placement, progress, achievement, proficiency, and diagnostic.
2. What sort of learner will be taking the test – age, sex, level of proficiency/stage of
learning, first language, cultural background, country of origin, level and nature of
education, reason for taking the test, likely personal and, if applicable, professional
interests, likely levels of background (world) knowledge?
3. How many sections/papers should the test have, how long should they be and how will
they be differentiated – one three-hour exam, five separate two-hour papers, three 45
minute sections, reading tested separately from grammar, listening and writing integrated
into one paper, and so on?
4. What target language situation is envisaged for the test, and is this to be simulated in
some way in the test content and method?
5. What text types should be chosen – written and/or spoken? What should be the sources
of these, the supposed audience, the topics, the degree of authenticity? How difficult or
long should they be? What functions should be embodied in the texts – persuasion,
definition, summarizing, etc?
6. What language skills should be tested? Are enabling/micro skills specified, and should
items be designed to test these individually or in some integrated fashion? Are
distinctions made between items testing main idea, specific detail, inference?
7. What language elements should be tested? Is there a list of grammatical
structures/features to be included? Is the lexis specified in some way – frequency lists,
etc.? Are notions and functions, speech acts or pragmatic features specified?
8. What sorts of tasks are required – discrete point, integrative, simulated ‘authentic’,
objectively assessable?
9. What is the relative weight for each item – equal weighting, extra weighting for more
difficult items?
10. What test methods are to be used – multiple choice, gap filling, matching,
transformation, short-answer questions, picture description, role play with cue cards,
essay, structured writing?
11. What rubrics will be used as instructions for candidates? Will examples be required to
help candidates know what is expected? Should the criteria by which candidates will be
assessed be included in the rubric?
12. Which criteria will be used for assessment by markers? How important are accuracy,
appropriacy, spelling, length of utterance/script, etc.?

ACTIVITY 8
Study the objectives of a reading course targeted at a group of students at B2 level
(CEFR). Then examine the test specification and the test, and make some comments.
Course objectives
On completion of the course, the students are expected to partially meet B2 level of
the Common European Framework of Reference for Languages (CEFR). In addition,
students should have gained adequate knowledge related to certain business issues and
situations at intermediate level. Specifically, students will be able to:
• Acquire a sufficient range of language and knowledge related to business topics
such as employment, trade, quality control and customer service, business ethics,
leadership and innovation.
• Demonstrate fairly good use of reading skills such as skimming and scanning,
understanding examples and key details, and note-taking while reading texts that
consist mainly of high frequency everyday or business-related language.
• Tackle with confidence several reading question types, especially those in the
BEC Vantage Reading Test.
2.2. CITAS Software to evaluate a multiple-choice test
a. Test and Test Performance
- mean test score (indicates whether the test was easy or difficult)
- variance (how widely the scores are spread)
- histogram (distribution of test scores)
- standard error of measurement (SEM): the smaller the better; the observed score
plus/minus 2 SEM gives a range that we are about 95% confident contains the true score
- reliability (KR-20 or α): 0 ≤ α ≤ 1
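The test-level statistics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reproduction of how CITAS computes its output: the score matrix is hypothetical, the function names `summarize_test` and `score_band` are ours, the variance is taken as the population variance of total scores, and SEM is estimated with the common formula SD × √(1 − reliability).

```python
from math import sqrt
from statistics import mean, pvariance


def summarize_test(responses):
    """responses: one row per examinee, each a list of 0/1 item scores."""
    k = len(responses[0])                       # number of items
    totals = [sum(row) for row in responses]    # total score per examinee
    variance = pvariance(totals)                # population variance (assumption)
    # p-value of each item: proportion of examinees answering it correctly
    p = [mean(row[i] for row in responses) for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # KR-20 reliability for dichotomously scored items
    kr20 = (k / (k - 1)) * (1 - sum_pq / variance)
    # Standard error of measurement: SD * sqrt(1 - reliability)
    sem = sqrt(variance) * sqrt(1 - kr20)
    return {"mean": mean(totals), "variance": variance, "kr20": kr20, "sem": sem}


def score_band(observed, sem):
    """Range given by the observed score plus/minus 2 SEM."""
    return observed - 2 * sem, observed + 2 * sem


# Hypothetical data: 6 examinees, 4 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

stats = summarize_test(responses)
print(stats)
print(score_band(3, stats["sem"]))
```

With real class data, each row would hold one student's item scores taken from the answer sheets; a test on which every student gets the same total has zero variance, so KR-20 cannot be computed for it.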
b. Item Analysis
Item Difficulty
- p-value: proportion of examinees who answer the item correctly
- p < 0.50: difficult item, given a 0.25 chance of guessing correctly on a four-option MCQ
- p > 0.95: quite easy item
- average p ~ 0.60: difficult test
- p range ~ 0.70-0.80
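As a minimal sketch of the item difficulty figures above, the following Python code computes the p-value of each item and applies the two cut-offs mentioned in the list (p < 0.50 and p > 0.95). The data, the function names and the label "unflagged" for items between the two cut-offs are our own assumptions for illustration.

```python
def item_difficulty(responses):
    """Return the p-value (proportion correct) of each item."""
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    return [sum(row[i] for row in responses) / n for i in range(k)]


def describe(p):
    """Apply the cut-offs from the list above; 'unflagged' is our own label."""
    if p < 0.50:
        return "difficult"
    if p > 0.95:
        return "quite easy"
    return "unflagged"


# Hypothetical data: 6 examinees, 4 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

p_values = item_difficulty(responses)
labels = [describe(p) for p in p_values]
average_p = sum(p_values) / len(p_values)

for number, (p, label) in enumerate(zip(p_values, labels), start=1):
    print(f"Item {number}: p = {p:.2f} ({label})")
print(f"Average p = {average_p:.2f}")
```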
Item Discrimination
- differentiating examinees of high & low levels
- item-total correlation or point-biserial correlation (Rpbis): whether a student
answering the item correctly gets a high score
- Rpbis: the higher the better
- Rpbis negative → problems with the key, too attractive distractor(s), or too
hard/easy item
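The point-biserial correlation described above can be illustrated as the Pearson correlation between the 0/1 scores on one item and the examinees' total scores. The sketch below makes several assumptions: the data are hypothetical, the function name `point_biserial` is ours, and the item itself is included in the total score (an uncorrected item-total correlation), which may differ from what a given software package reports.

```python
from math import sqrt


def point_biserial(item_scores, totals):
    """Pearson correlation between 0/1 item scores and total scores."""
    n = len(totals)
    mean_item = sum(item_scores) / n
    mean_total = sum(totals) / n
    cov = sum((x - mean_item) * (t - mean_total)
              for x, t in zip(item_scores, totals))
    ss_item = sum((x - mean_item) ** 2 for x in item_scores)
    ss_total = sum((t - mean_total) ** 2 for t in totals)
    if ss_item == 0 or ss_total == 0:
        return None  # undefined when all examinees have the same score
    return cov / sqrt(ss_item * ss_total)


# Hypothetical data: 6 examinees (strongest first), 4 items.
# Item 4 is answered correctly mainly by the weakest examinees.
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]

totals = [sum(row) for row in responses]
r_values = [point_biserial([row[i] for row in responses], totals)
            for i in range(len(responses[0]))]
# Items with a negative correlation should be checked (key, distractors)
flagged = [i + 1 for i, r in enumerate(r_values) if r is not None and r < 0]

print(r_values)
print("Items to review:", flagged)
```

In this made-up data set only item 4 is flagged, which is the pattern a wrongly keyed item would typically produce.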
Distractor Analysis
Attractiveness of a distractor (a-value): proportion of examinees choosing that response;
watch out when a > p, i.e. when a distractor is chosen more often than the correct answer
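A distractor analysis of a single item can be sketched as follows. The answer choices, the key and the function name `distractor_analysis` are hypothetical; the code simply computes the a-value of each chosen distractor and marks those with a > p. Options that nobody chose do not appear in the report.

```python
from collections import Counter


def distractor_analysis(choices, key):
    """Return the item's p-value and, for each chosen distractor,
    a tuple (a-value, True if a > p)."""
    n = len(choices)
    counts = Counter(choices)
    p = counts[key] / n                     # proportion choosing the key
    report = {}
    for option, count in sorted(counts.items()):
        if option == key:
            continue
        a = count / n                       # proportion choosing this distractor
        report[option] = (a, a > p)
    return p, report


# Hypothetical answers of 10 examinees to one four-option item keyed "A"
choices = list("ABBBCBADBB")
p, report = distractor_analysis(choices, key="A")

print(f"p = {p:.2f}")
for option, (a, warning) in report.items():
    note = "  <- check this distractor and the key" if warning else ""
    print(f"{option}: a = {a:.2f}{note}")
```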
Sample of Output
2.3 Alternative assessment
Alternative assessment is an ongoing process involving the student and teacher in
making judgments about the student’s progress in language using non-conventional
strategies. Alternative assessments include performance assessment and continuous
assessment, which have been discussed in the previous section. This part introduces two
specific forms of alternative assessment: self assessment and portfolio assessment.
a. Self assessment
Self assessment is described as the process in which learners simultaneously create
and undergo the evaluation procedure, judging achievement in relation to themselves
against their own personal criteria, in accordance with their own objectives and learning
expectations (Henner-Stanchina and Holec, 1985). Some studies have shown that the
application of self assessment is limited in formal education, where learning outcomes
are fixed, end-of-course assessments are mostly used, classes are large, resources are
limited, and most students are passive learners who are not autonomous in the self
assessment process. However, self assessment enables students to become more active and
to realize their strengths and weaknesses. Systematic self assessment may be done in the
form of reflection. A simple example of self assessment by a 12-year-old student, done on
completion of a project, illustrates a young learner’s ability to reflect.

1 What skills have I practised?
Writing. When I take information of any book I learn new words and I write every
time. Reading. I must read the information if want to know things for the project.
2 What language have I learnt?
Some vocabulary (words) and some words I didn’t write well, now I write them better.
About grammar not much.
3 What other information have I learnt?
I have learnt a lot of thing about Incas, their life, their food, etc.
4 Disadvantages
Sometimes is boring, but not all the time and I don’t find more disadvantages.
5 Advantages
You learnt about the project (in my case about Incas). And to organize the work.

Below is another sample of self assessment, in the form of a continuous assessment card.

Name: Peter Anderson

Test No → | 1 | 2 | 3
Type of test and date | Interview, 21 January | Role-playing tasks, 19 February |
Self assessment | ‘I thought I could answer about half of the 10 questions satisfactorily. Weak on pronunciation’ | ‘Went very well. But there were a few words and phrases I didn’t remember (Important?)’ |
Test result | 7/10 | Good |
Comments (by teacher or learner) | ‘Slight underestimation. Pronunciation not too bad’ (Teacher); ‘Better than I thought’ (Student) | ‘You sounded a bit blunt, perhaps’ (Teacher); ‘Must practice polite phrases’ (Student) |

Continuous assessment card (Fulcher, 2010:72)
b. Portfolio assessment
A portfolio is a limited collection of a student’s work that is used either to present
the student’s best work(s) or to demonstrate the student’s educational growth over a given
time span. Portfolios, however, are not just a collection of finished work(s); they can
include biographies of work, range of work and reflections (Nitko, 2001; Gary, 1996;
Wolf, 1989). Teachers and researchers admit that portfolios are time-consuming and increase
workload, but agree that they can lead to important changes in the classroom.

ACTIVITY 9
Discuss with a partner and brainstorm how you can use alternative assessments to
assess your students’ language competence.
2.5. Framework for assessment
Classroom learning ranges from rote memorization of vocabulary, facts and
concepts to reasoning, critical thinking and problem solving. To help teachers identify and
assess different kinds of academic learning, several frameworks for assessment have been
developed. One of the most frequently used frameworks is Bloom’s taxonomy of the
cognitive domain. One of its strengths is that it is useful in developing
instructional objectives and assessment targets. However, it has become out of date, and
many experts have raised concerns about its hierarchical levels of cognitive development.
Suggested questions on the next pages for Bloom’s revised taxonomy can be used to
design test questions and instructional objectives.
C. CONCLUSION
Testing and assessment play an important part in improving student performance.
In order to make testing and assessment more effective, teachers need to think
carefully about developing strategies and techniques. It would also be advisable for
language teachers to join an exchange club where language issues can be
discussed, shared, experienced and applied.
