Scoring and Evaluating Learners: used in teaching language testing and assessment, English course | Học viện Nông nghiệp Việt Nam (Vietnam National University of Agriculture)
Language Testing for the ESL Classroom
UNIT 4 SCORING AND EVALUATING LEARNERS

Structure
4.0 Objectives
4.1 Introduction
4.2 Types of Measurements Used in Evaluation
4.2.1 Scores and Letter Grades Measurement
4.2.2 Criterion Based Measurement
4.3 Interpretation of Performance
4.3.1 Classroom Based Performance
4.3.2 Group Performance
4.4 Giving Feedback
4.4.1 Types of Feedback
4.4.2 Feedback to Promote Assessment for Learning
4.5 Unit Based Questions
4.6 Let Us Sum Up
4.7 Further Readings
4.8 Answers

4.0 OBJECTIVES
In this unit you will
• be introduced to the different ways of assessing learner performance based
on psychometric tests and performance based assessments;
• understand how to interpret scores and grades;
• look at the design of assessment criteria and critical issues in using the criteria;
• know about ways to provide feedback to learners to promote assessment for learning.

4.1 INTRODUCTION
In the previous three units we have dealt with tests and assessments that
classroom teachers use to track development over one academic year.
Administrators (e.g., school authorities) use the test results to certify or promote
learners to a higher grade. Furthermore, there are proficiency tests that can be
used for filtration or gate-keeping purposes as is done through entrance
examinations or job selection interviews.
But assessments have a larger role to play in education and have a great amount
of social value outside of the classroom context. It is because they have an
immediate impact on the career of the learners as well as a long-standing impact
on the attitudes of people towards how they value scores and the mechanism of
testing. This impact is washback and it can be either positive or negative.
To create a condition for positive washback within the school and higher
educational contexts, it is imperative that teachers understand how to use tests
and assessments for promoting learning. In the previous unit we have looked at
the ways to design test items to ethically and systematically obtain information
about learners’ understanding and use of English as a second language. Now in
this unit we focus on ways to assess learner abilities (i.e., proficiency levels)
and give them feedback so that the testing and assessment feeds into the
learning process such that the testing-learning cycles can be continual and formative processes.
In this unit, we will turn our attention to two important and related aspects of
testing and assessment. Firstly, how can classroom teachers gather meaningful
information about their learners based on their performances? To answer this
question we need to discuss ways to interpret test results and give feedback to
the students to further their learning process. Secondly, how can teachers use
the information gathered from tests to provide feedback so that learners know
which areas to improve? We will conclude the unit with the types of feedback
ESL teachers can provide to promote learning.

4.2 TYPES OF MEASUREMENTS USED IN EVALUATION
When a teacher offers a course, she will want to know what the learners who
have opted for that course have learned or gained from it. The
learning can be checked periodically, say monthly, and at the end of the course.
In schools and colleges, we follow a system of using periodic tests or formative
tests and the end term or summative test. The tests are usually in the pencil and
paper mode and have items to solve. Learner performance on the tests is
evaluated to understand the ‘amount’ of learning that has taken place. So the
evaluation stands for an ability of a learner at one point of time as a result of
the teaching input that the learner would have received from the course. This
also represents the effort that the learner has put in to obtain a specific result
from a test. But as teachers what we need to ask ourselves is whether the result
of a learner truly captures all that he/she has done to show that ability or
performance in the test? Remember that based on the test results, decisions like
promotion to a higher grade or certification for a proficiency level will be taken
and this will have future implications for the learner.
Another thorny issue in this area is how objective you are as an assessor.
Imagine that you have taught a six-month course on educational
psychology to a group of pre-service teachers of English and have conducted
two tests, one in the middle of the course and another at the end of the course.
If two other teachers also evaluate the answers that you have assessed, will the
assessments of all three evaluators be comparable? If they are, then the
principle of inter-rater reliability (discussed in Unit Two, section 2.4.3) will
have been fulfilled. But commonly there is not much consensus within a group
of assessors looking at the same set of answers, unless they have been trained
or given a common criterion to assess. Training is crucial to maintain
objectivity in assessment. So evaluation pertains to fulfilling two
principles of assessment: reliability and washback. In this section we look at
the various ways of judging learner responses to obtain reliability in assessment.

4.2.1 Scores and Letter Grades Measurement
Consider the two schools and the English tests conducted by each for grade VIII
learners of English. Do you notice any differences?

Larsen Grammar School
English Test (Final), Full Marks: 100
  Reading                  40 marks
  Grammar and Vocabulary   30 marks
  Writing                  30 marks

Maharishi Vidyalaya
English Test (Final), Full Marks: 100
  Reading                  20 marks
  Writing                  40 marks
  Interview                40 marks

There are two differences: (i) the components tested in each test are somewhat
different and (ii) the weight (percent of marks) assigned to each component varies in the two tests.
What we can infer from this is that the ‘test construct’, or what constitutes
proficiency in English, differs in the two schools. Hence the sub-parts tested and the
weight given to the sub-parts differ. So when you want to design a test, the
construct of language proficiency that you adopt will depend on the
context of your teaching, and the weight that you give to each sub-component
should be justified according to how important that component is in
the test that you wish to administer.
In fixed response type items one can follow the answer key and score learner
performance. If as a teacher you have taken care to have well-constructed items,
where the testing point is clear and the options (answer and distractors) are
valid, then there would be no issue of inter-rater reliability. However, in the
previous unit we have seen that designing MCQ items, especially the
distractors, is tricky! So reliability of MCQ items linked to the ‘test construct’
is a crucial point. In some instances, if the paper setter has not provided the key
and you are the one assessing, then you need to create the key and check with
one or two raters to ensure that there is no disagreement on the choice of correct
answers. Of course, if the options are found to be faulty after the test has been
administered, then the item(s) become invalid as they flout the reliability
principle of test design. Ideally the responses on such items should not be considered
for the total marks awarded to the learners.
Total scores that learners obtain from a test are the raw scores. They are
converted to percent scores or a letter grade. The choice of scale – percent score
or letter grade – depends on the institutional policy of evaluation. As percent
scores do not readily represent standardized proficiency levels, it is better to
convert percent into letter grades with a ten percent differential marking
scheme. The scale can be absolute or relative as shown in the figure below:

Absolute scale        Relative scale
A = 60-80%            A = first 25% of achievers
B = 59-40%            B = next 25% of achievers
C = 40-30%            C = next 25% of achievers
F = below 30%         D = last 25% of achievers

In the absolute scale, percent scores are rank ordered. Thereafter, learners are
given a letter grade based on the band they fall into. This system is applied
across disciplines in an institution.
In the relative scale, however, the letter grade is awarded based on within group
performance and the performance of a specific learner is considered vis-à-vis
the group ability displayed through the test. While the absolute scale has a
specific cut-off point for each grade, the relative scale is based on the
performance of a group and is therefore more context-dependent and relevant
for classroom teaching. We will come back to this point in section 4.3 of this unit.
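
The difference between the two scales can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cut-offs in `ABSOLUTE_BANDS` are one possible reading of the absolute scale above (the printed bands overlap at 40% and stop at 80%), and the eight learners and their percent scores are hypothetical.

```python
from typing import Dict

# Assumed cut-offs (one reading of the absolute scale): A from 60%, B from 40%,
# C from 30%, F below 30%.
ABSOLUTE_BANDS = [(60.0, "A"), (40.0, "B"), (30.0, "C")]


def absolute_grade(percent: float) -> str:
    """Return a letter grade from fixed cut-off points."""
    for cutoff, grade in ABSOLUTE_BANDS:
        if percent >= cutoff:
            return grade
    return "F"


def relative_grades(percents: Dict[str, float]) -> Dict[str, str]:
    """Grade each learner by rank within the group, in four equal bands."""
    ranked = sorted(percents, key=percents.get, reverse=True)
    letters = ["A", "B", "C", "D"]
    n = len(ranked)
    grades = {}
    for position, learner in enumerate(ranked):
        band = min(position * 4 // n, 3)  # quarter of the group the learner falls in
        grades[learner] = letters[band]
    return grades


# Hypothetical percent scores for eight learners.
percents = {"P": 72, "Q": 65, "R": 58, "S": 51, "T": 44, "U": 38, "V": 33, "W": 25}

absolute = {learner: absolute_grade(p) for learner, p in percents.items()}
relative = relative_grades(percents)
print(absolute["T"], relative["T"])  # B C
```

Learner T gets a B on the assumed absolute scale but a C on the relative scale, because the relative grade depends on how the rest of the group performed.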
When items are of the limited response or free response type, subjective or
impressionistic judgment might occur. So as a teacher-assessor you must
ensure that the scores you assign are linked to the ability of the learner. You
need to maintain objectivity in doing so. This can be achieved by using
criterion-based assessment, which we discuss in the next section.

4.2.2 Criterion Based Measurement
In evaluating responses that may have multiple appropriate answers like in free
response type items such as essay type items, there is every chance that the
score awarded to responses may emanate from a subjective understanding of
the teacher about a learner. The teacher may be biased towards the learner if the
language is fluent but content does not have a high standard. The teacher might
be reminded of the polite behavior of the learner and over-score the learner.
Such subjective judgments will cause a serious hindrance to the principle of
inter-rater reliability. So we need to know how to solve this problem.
To systematically assess free responses resulting from performance based
assessment like production of written or oral form of the language, teachers
should have a common understanding of which sub-components of language
and content will make for appropriate responses. So ‘systematic estimates’ of
performances according to (a) the task requirements and (b) your expectations
of what you have taught in a specific course should be drawn up and used for
evaluation. This is also known as ‘criterion based assessment’. It has two
aspects:
a) draw descriptions of levels of performance; and
b) include sub-components of skills/tasks in the description.
These will be the parameters based on which free responses may be assessed.
The criterion can be designed in two ways:
• a holistic criterion, in which a general description of the sub-components is given
together as a language ability at three levels; and
• an analytical criterion, in which each sub-component is described at three different levels.
The criterion can also be generic, one that can be used across tasks and a more
specialized one called task-specific. The use of CEFR (refer to Unit Three,
section 3.2) is a good starting point for understanding the use of criterion based
assessment, which otherwise might appear to be a bit complex for you at the beginning.
Let us consider the following letter writing task and the different types of
criteria that can be designed to assess responses to this task:

TASK: [Grades 8-10/lower intermediate]
You wish to subscribe to the magazine READER’S DIGEST for a year. Write a
letter in 150-200 words to the editor requesting him/her to give you the
subscription details. In your letter, you can ask about the subscription rate,
mode of payment, delivery and any other query that you may have. [10 marks]

Option 1: Impressionistic judgment
You will be graded on content, language and organization.

Option 2: Holistic criteria
Grade A (8-10 marks): Can write responses to everyday events, topics
related to general knowledge and communicate to
ask for and give information and express likes and
dislikes in an appropriate format; use language to
express most of the ideas clearly; and use simple to complex structures.
Grade B (5-7 marks): Can write responses to everyday events, topics
related to general knowledge, and can communicate
to ask for and give information and express likes
and dislikes. The format is not very clear and there are some
attempts to use language to express ideas in mostly simple structures.
Grade C (4-2 marks): Can attempt to write responses to everyday events
and communicate to ask for and give information
and express likes and dislikes. The use of format is
mostly absent and use of vocabulary is limited;
simple sentences are mostly used with some errors.

Option 3: Task-specific holistic criteria
CONTENT (5 marks): Enquires about subscription details, mode(s) of
payment, details of delivery, time to be taken, whom
to contact in case of problems.
LANGUAGE (3 marks): Uses vocabulary appropriate to express each language
function and a variety of sentence structures.
ORGANIZATION (2 marks): Begins with a formal address to the editor and
expresses interest about the magazine; presents all
enquiries about the subscription; concludes by
thanking the editor and intention to receive information at the earliest.

Option 4: Task-specific analytical criteria
CONTENT (refer to the content checklist for ideas)
  Grade A: Mentions more than 5 ideas (5 marks)
  Grade B: Mentions 3-4 ideas (4-3 marks)
  Grade C: Mentions less than 3 ideas (2-1 marks)
LANGUAGE
  Grade A: Uses vocabulary appropriate to the task and some variety in sentence formation. (3 marks)
  Grade B: Attempts to use vocabulary and structure to express meaning. (2 marks)
  Grade C: Vocabulary usage is not varied and sentence structure has lots of errors. (1 mark)
ORGANIZATION
  Grade A: Creates a letter format with formal address and closure. Attempts to follow the steps as mentioned in the content checklist. (2 marks)
  Grade B: Creates a letter format with formal address and closure. The steps as mentioned in the content checklist are somewhat followed. (1 mark)
  Grade C: Attempts to create a letter format and follows a few of the steps but not systematically. (1 mark)

Content checklist:
• formal address and expresses interest in the magazine
• enquires about subscription details,
• mode(s) of payment,
• details of delivery,
• time to be taken,
• whom to contact in case of problems
• expresses intent to receive information at the earliest
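
To see how analytical criteria turn into a total score, here is a minimal Python sketch. It assumes only the sub-component weights given above (content 5, language 3, organization 2); the function name `total_score` and the sample script are hypothetical and used purely for illustration.

```python
# Maximum marks per sub-component, taken from the task-specific criteria above.
MAX_MARKS = {"content": 5, "language": 3, "organization": 2}


def total_score(awarded: dict) -> int:
    """Sum sub-component marks after checking each against its maximum."""
    for component, maximum in MAX_MARKS.items():
        marks = awarded[component]
        if not 0 <= marks <= maximum:
            raise ValueError(f"{component}: {marks} is outside 0-{maximum}")
    return sum(awarded[component] for component in MAX_MARKS)


# Hypothetical script: Grade B content (4), Grade A language (3),
# Grade B organization (1).
script = {"content": 4, "language": 3, "organization": 1}
print(total_score(script))  # 8
```

Because the marks are recorded per sub-component, the same record that produces the total of 8 out of 10 also shows the learner where marks were lost, here mainly in organization.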
If you cannot manage to design or use such detailed criteria as listed under
options 2-4, you may begin by considering the sub-components and assess
performance taking into consideration all these aspects on a scale of 1-3 or 1-5.
It will make your assessment more nuanced than option 1 and equip you
better to capture learner performance following the model of assessment for learning.

Option 5: Evaluating using a checklist
Tick (√) one rating for each feature.
Features          5   4   3   2   1
Content
Grammar
Vocabulary use
Organization

In sum, both types of measurement, i.e. score-based and criterion-based,
give two types of learner estimates. In the next section we discuss the relevance
of each with regard to the use of language tests and assessments to promote
assessment for learning within the ESL classroom context.

Check Your Progress 1
Read the following two contexts. Identify what is measured and which kind of
measurement will be suitable in each and why.
Context P: For a job interview Mr. Mehta the CEO of an organic food industry
decides to use five of the employees in the organization to assess the candidates
who will be taken as new recruits. The candidates have participated in an
interview lasting 10 minutes, which will be assessed by the five employees.
This he does to maintain objectivity in evaluation.
Context Q: In a school in Hyderabad class nine learners will be given extra
training to appear in the prestigious Science Olympiad tests in Maths and
English. Both are MCQ tests and the English paper also has a short story writing
task. Three sections will receive the training and be evaluated by three teachers.
Finally the ten best performing learners will get a chance to appear for the Olympiads.

4.3 INTERPRETATION OF PERFORMANCE
High stakes assessments done for certification and entry into higher academia
or job selection follow the system of assessing using scores as percentage or
score bands represented as letter grades. Based on the cut-off marks, decisions
are taken whether a learner can be certified to have passed the matriculation
exam or is selected to study at a University or is recruited for a job. But does
the percent score or the letter grade capture all the effort the learner has put in
to get that score or grade? Does it not mask all the gradual improvements that
the learner might have experienced as a result of preparing for the exam? What
if the result is not truly indicative of what the learner was supposed to have
scored and has either been over scored or underscored? Also how does the result
relate to the communicative ability of the learner? Is the result displayed
through the score or letter grade successful in predicting future success or does
it fulfill ‘predictive validity’? All these questions need to be carefully
considered because one’s mental ability, when measured and represented
simply through a score or a letter grade, is not a realistic estimate because of its
static and absolute nature. The ability is likely to grow in future or may be better
captured through a different assessment type rather than a pencil and paper test.
But all the decisions taken about the learner are based on that static one-dimensional
score. So this creates negative washback.

4.3.1 Classroom Based Performance
Next let us consider the classroom scenario and the innumerable tests that you
as the language teacher may have conducted or would have to conduct in the
near future. Let us go back to the examples given in section 4.2.1. In the test
conducted by the teacher in Maharishi Vidyalaya a learner gets a total score of
70 marks. What does this mean for the learner – that he/she has fared well?
What does it mean for another learner who has got a score of 47 – that he/she
needs to improve? But what does the score actually reflect about the abilities of
these two learners? If they were to ask the teacher about their abilities, would
the teacher be able to answer? Not likely, because the total score does not really
give us an estimate of the learner performance in the sub-components which
were a part of the test construct. Also what if the learner who has got a lower
score did so because he/ she was not good at solving the item type used?
So as a teacher who has to understand and explain to the learner what the score
or letter grade represents, you must be able to describe the language ability with
respective sub-components and consider that things like item-type, topic
familiarity etc. might have influenced the performance negatively. To break up
the score into sub-components will give a better estimate of a learner’s ability.
Letter grades by themselves are better as they do not create an absolute estimate
but put learners in a range and thereby rule out unhealthy competition among
learners. Such absolute estimates are still acceptable for mathematics performance, as
it is based on problem solving ability, or athletic performance, which is based on speed.

4.3.2 Group Performance
Moving on from interpreting individual performance, which we did in section
4.3.1, let us now think of how a teacher can represent group behavior and why
she would need to do so. Firstly, this is necessary for reporting to the higher
authorities about group behavior. Furthermore, it is necessary if the teacher wants to use the
information from test performance for diagnostic purposes and understand to
what extent the syllabus taught has been learnt or what fraction of the various
groups has achieved the learning objectives of a course.
One way to represent group behavior is to rank-order the scores and calculate
the central tendency of the group performance, which can be done by calculating the
average or mean of the group performance. The mean scores can then be
converted to percent scores as they are easy to compare. Let us look at the
following data to understand how to calculate this. The following are the scores
obtained by the learners who took the test conducted by the teacher in the Maharishi Vidyalaya:

Student              Reading      Writing      Interview    Overall
                     (20 marks)   (40 marks)   (40 marks)   (100 marks)
S1                   15           30           30           75
S2                   15           20           25           60
S3                   20           30           30           80
S4                   10           15           15           40
S5                   15           25           25           65
S6                   20           20           20           60
S7                   15           25           30           70
S8                   10           25           30           65
S9                   20           30           35           85
S10                  15           25           30           70
Mean score           15.5         24.5         27           67
Percent mean score   77.5         61.25        67.5         67

Once the mean scores are calculated for the overall test and for each
sub-component, it gives a comprehensive picture to the teacher about group
behavior:
i) overall performance is satisfactory (67%);
ii) the reading section has been the most well attempted (77.5%);
iii) writing performance is fairly high (61.25%), but interview performance is better than writing (67.5%).
While reading is MCQ based, writing and interview tasks are free responses.
So the teacher has to maintain systematicity in assessing these two response
types. Thus, calculating mean scores (or average) and percent mean scores is a
measurement of group performance that gives more information to the teacher
and the higher authorities. The use of this information to further learning will
be discussed in the next section on ‘feedback’.
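
The mean and percent mean scores can be checked with a short Python sketch. The scores are those of the ten learners listed above; the function name `group_means` is only illustrative. Computed directly from the listed overall scores, which sum to 670, the overall mean comes to 67.

```python
from statistics import mean

# Maximum marks for each sub-component of the Maharishi Vidyalaya test.
MAX_MARKS = {"reading": 20, "writing": 40, "interview": 40}

# (reading, writing, interview) marks for each learner, from the table above.
SCORES = {
    "S1": (15, 30, 30), "S2": (15, 20, 25), "S3": (20, 30, 30),
    "S4": (10, 15, 15), "S5": (15, 25, 25), "S6": (20, 20, 20),
    "S7": (15, 25, 30), "S8": (10, 25, 30), "S9": (20, 30, 35),
    "S10": (15, 25, 30),
}


def group_means(scores):
    """Return {name: (mean score, percent mean score)} for each part and overall."""
    columns = list(zip(*scores.values()))  # one tuple of marks per sub-component
    result = {}
    for name, column in zip(MAX_MARKS, columns):
        average = mean(column)
        result[name] = (average, 100 * average / MAX_MARKS[name])
    overall = mean(sum(row) for row in scores.values())
    result["overall"] = (overall, 100 * overall / sum(MAX_MARKS.values()))
    return result


summary = group_means(SCORES)
for name, (average, percent) in summary.items():
    print(f"{name}: mean {average}, percent mean {percent}")
```

Dividing each mean by the maximum marks for that part is what makes the three sub-components comparable, since reading is marked out of 20 while writing and the interview are marked out of 40.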
Another piece of information about group behavior that you can look at is the range of
performance (highest to lowest scores) because this can give the distance of the
two scores, highest and lowest, from the mean score. Thereafter, you can divide
the learners into three bands of performance: first 30%, next 30% and the rest.
Based on who falls under which group, you can plan to give extra help to the
learners who are in the third category, while the rest of the learners can move on to the next unit.

Check Your Progress 2
Read the following descriptions and do the following:
i) identify what the problem is in assessing learners and how it will affect
them; also identify if some other principle of assessment is fulfilled/flouted;
ii) propose a solution for the problems you can identify.
Case 1: Two sections of class V receive the same test where the total score
allotted is 65 marks. The two class teachers evaluate the scripts of
their respective classes. One teacher reports the raw scores whereas
the second teacher reports the scores in percentage.
Case 2: A private language-teaching consultancy ‘Mehrus Language
Consultancy’ has branches in five Indian cities. They conduct
speaking tests as part of their language syllabus on a monthly basis
and at the end of the semester. In one such end semester test an
interview-based task was conducted in seven cities. Of the seven
evaluators, two used criterion-based assessment while the rest scored
the performances out of 20 marks.

4.4 GIVING FEEDBACK
Just as assessment is a way to track growth in learning, another important tool
in promoting learning is giving feedback to learners. Feedback is drawing a
realistic estimate of learners’ strengths and weaknesses so that they can plan for
areas to work on in future. While feedback is not a part of high-stakes
examinations, classroom assessments have the scope for being used as a
learning tool. So you need to build feedback into your teaching-assessment
cycle. But to give feedback you need to be systematic about your judgments
and align your feedback to the learning objectives. This is because giving
learners a generalized feedback that ‘you are good’ or ‘you need to improve’ would not be useful.

4.4.1 Types of Feedback
If we consider the example of performance in the English test conducted by the
Maharishi Vidyalaya (referred to in sections 4.2.1 and 4.3.2), the information
obtained from the group performance can be used to select which sub-group
requires the most help. Teachers can go back to individual performances and
section-wise explicitly point out the mistakes and provide the correct answer.
This is direct or explicit feedback. Teachers generally mark on the script with
a cross for a wrong answer or usage and write out the correct answer in the
margin. While this is common and easy for teachers to do, it does not help
learners to figure out how they can overcome the problems on their own as the
teacher has already given them the correct or appropriate answer.
Another form of feedback is where a teacher can underline the phrase or part
that is wrong and indicate in the margin with a question mark. Then they can
make learners understand the areas they have a problem in and help them work
out an appropriate answer (in case of limited and free responses) or correct
answer for MCQ items. If the teacher perceives patterns in problem areas, i.e.,
when a number of learners make errors in the same area, the teacher can also give
collective feedback by discussing and helping learners come up with the
correct/appropriate answer(s). This is called indirect or implicit feedback. This
is a better alternative to direct feedback as it looks at language learning as a
process not as a product and attempts to use assessment as a tool for learning.
But very few teachers attempt this kind of feedback. Though this process is time
consuming, it is better for you to adopt this style of feedback for classroom
assessments. Learners would also be happier to work out a solution on their
own. They would attend to learning deeply rather than only know what would
be the correct answer as is done in direct feedback. As a step ahead of this,
teachers can also plan for other items and tasks to give more focused input in
areas that learners have fared poorly – which could be in structure (or grammar
usage) or vocabulary or in writing.
Feedback can also be individual or collective based on the manner in which a
teacher decides to provide feedback. Each can be used according to the learner
needs: for instance for writing development individual feedback may be
beneficial; for grammar or reading, collective feedback may be more practical
for the teacher and useful to the learners.

4.4.2 Feedback to Promote Assessment for Learning
Now let us consider ways to give feedback to promote assessment for learning.
Let us consider the following scenario:
Radha conducts a grammar and vocabulary test out of 20 marks. The items are
a mix of MCQ items and fill-in-the blank, supply type (limited response type).
100 learners take the test. To give individual feedback to each learner is
difficult for Radha as a single teacher. She wants to give indirect feedback. So
she brings the responses to class and, keeping the names anonymous, picks out
error patterns in grammar and vocabulary to discuss in class. She puts up
lists of errors in one sub-area of grammar, tense uniformity (e.g., Yesterday the
woman gives a blanket to the poor man.), and asks the learners what is wrong
here and how it can be rectified. She writes the collective answer she gets
on the board and asks learners to make a note of it. In this manner she
draws up ten areas of error from the test performance. Later she also gives them
separate exercises on some of the error areas.
In the example above, Radha manages to give indirect feedback and in the
process promotes learning or uses assessment for learning.
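
With 100 scripts, picking out error patterns by eye is laborious. One possible aid, sketched below in Python, is to note an error area each time an error is found while marking and then count the areas. The error tags and the function name `error_areas` are hypothetical; the scenario itself does not say how Radha records the errors.

```python
from collections import Counter

# Hypothetical error tags noted while marking: one entry per error found.
tagged_errors = [
    "tense uniformity", "articles", "tense uniformity", "prepositions",
    "subject-verb agreement", "tense uniformity", "articles",
]


def error_areas(tags, top=10):
    """Return the most frequent error areas, most common first."""
    return [area for area, _ in Counter(tags).most_common(top)]


print(error_areas(tagged_errors, top=2))  # ['tense uniformity', 'articles']
```

The most frequent areas are natural candidates for collective discussion in class, while rarer errors may be better handled through individual feedback.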
Feedback on skills and elements might have to be given differently because
they relate to different aspects of language use and usage. While the skills can
only be assessed using texts of a certain length, grammar and vocabulary can
also be tested as isolated items and as MCQ items. A model of feedback for
grammar and vocabulary has already been discussed in the example provided
above. Let us now look at how feedback pertaining to each skill can be provided.

Feedback on reading and listening
Reading and listening are receptive skills, which have to be assessed using texts
of varying lengths, topics, and genre of writing (e.g. narrative, dialogues,
informative and so on). The assessment of these skills cannot trace any overt
behavior. Comprehension or internal behavior has to be tested. One way of
doing this is in the receptive manner where learners read or listen to texts and
solve MCQ items. So no writing or speaking is involved to express
comprehension. However, the latter modalities can be used if a teacher feels the
need to use these modalities to assess comprehension in an integrative manner
like reading leading onto writing or speaking. Teachers need to be clear about
the testing objectives or which sub-skills are to be assessed. They need to
identify which sub-skills learners seem to find difficult: for instance, local factual
information (to be identified from one paragraph) is easy but global factual
information (to be identified from across 2 or 3 parts or paragraphs of
the text) is difficult. So items pertaining to the latter sub-skill have to be
discussed collectively in indirect feedback form and learners need to be guided
to get the correct or appropriate answers. This will help them move from easier
levels of comprehension to more complex levels. The teacher-assessor can
provide this kind of fine-tuned and focused feedback during classroom assessments.
Feedback on writing and speaking
As writing and speaking are productive skills, they can be overtly
observed. For instance, when learners are asked to perform a story writing
task or participate in a role-play, they show writing and speaking
behavior respectively. While writing is tangible and permanent, speech
is gone once the task performance is over. So speaking assessment and
feedback are challenging because of the transient nature of speech. It is a good idea to record
the speech samples and then assess and provide feedback on the recordings.
For both writing and speaking feedback, teachers need to have a checklist of
sub-skills they want the learners to improve on. All the sub-skills cannot be
targeted through one or two tasks. Each sub-skill will require focused attention
over a period of time. So teachers can begin with fluency (or content
development) and with organization at the macro level or coherence (that is, paragraph
formation in written responses and different communicative ideas in spoken
interaction). For each task, the teacher can choose either holistic or task-specific
criteria. Holistic criteria can give feedback about the general performance and
areas to develop, whereas task-specific analytical criteria (refer to option 4 in
section 4.2.2) can be used to point out to the learners which areas they are good
at and which they need to work on. This way the feedback will be useful, as
it will tell them which areas they need to improve and practice.
Check Your Progress 3
Read the following descriptions and identify the type of feedback given and on
which aspect of language. Do you think the learners would have benefitted from
the feedback? Explain. You can work in pairs or small groups to do this activity.
A. Suneeta has taught English to grade ten learners for a year now and she has
given them short tests and essay type questions on the topics she had
covered throughout the course. This is in addition to the three term based
tests. For the short tests she has given feedback to the learners by telling
them who has got which items wrong.
B. In a course in writing held in year one in college, the learners have been
encouraged to take up writing on topics or areas of their interest. They are
also made to work in smaller groups to get ideas and write the texts
together. The teacher helps them work with the writing by giving them
comments on the style of writing and the organization of ideas. She does not
identify the errors. She also encourages them to write at least two drafts of each text.
C. A teacher has taught grammar in two sections in grade five. She has asked
the learners to make a chart showing usage of tense through pictures and
sentences in one section; in the other section she has conducted role-plays
based on different communicative situations (at home, in the park, at the
clubhouse and so on) to get a sense of the amount of learning the learners
have in this area of grammar. After the performance of each section is over,
she realizes that both are useful activities for her learners and decides to
swap each activity in the two sections. When the learners in both the
sections do the activities for the second time, she also asks the sections to
observe each other. The learners felt that each activity was very interesting
and they all participated with great enthusiasm. Also, by the end of the
activities they had built good rapport across the two sections.
4.5 UNIT BASED QUESTIONS
1) List the advantages and disadvantages of score, letter grade and criterion
based assessment, and state the situations in which each can be used.
2) Following are the scores obtained by four sections of grade VII in an
English test of 100 marks. Two teachers have taught these four sections.
They have been asked to prepare a report of learner performance across
the four sections. If you were one of the teachers, how would you write the
report? What points would you highlight? Prepare a report of 100 words.

| Sections | Listening (40) | Speaking (40) | Grammar & Vocabulary (20) | Total (100) |
| --- | --- | --- | --- | --- |
| VII A | 20 | 38 | 20 | 78 |
| VII B | 10 | 25 | 20 | 55 |
| VII C | 20 | 35 | 20 | 75 |
| VII D | 20 | 30 | 18 | 68 |

3) Prepare evaluation criteria or a checklist-based scale with which
you can evaluate the responses that learners give to the following writing task.
Write an email to the sales manager of Cambridge University Press,
Hyderabad branch, requesting a soft and a hard copy of their latest
catalogue. You also want to enquire whether you can make online
purchases from the CUP website using your Debit card issued by State
Bank of India, Tarnaka Branch. Your email should not exceed 150 words.
Special credits will be given for formatting and use of appropriate words
to ask for information. The address of the CUP stores at Hyderabad is:
Cambridge University Press, 1/12 Jubilee Hills, Hyderabad 500034.
4) An English teacher in Johnson Grammar School has been dealing with the use
of the simple present (e.g., I go to the park to play after school hours.) and
the present continuous tense (e.g., ‘Right now I am busy, I am making a call’
– cried out Martha to her mother.). But she is not too happy with the kind
of progress that the learners, who are in grade five, have made. So, she has
to give feedback to her learners in the next class.
List some ideas for feedback that the teacher can give to the learners to help them learn the area better.
4.6 LET US SUM UP
In this unit, we have presented the different ways in which estimates of learners’
language abilities may be formed. They range from purely static measurements,
represented through percent scores and letter grades on absolute and relative
scales, to richer descriptions of learner abilities presented through criterion
based assessment. We have discussed how to interpret learner performances, that is,
what scores or grades mean to learners, teachers and other stakeholders of
assessment. We have concluded with the role of feedback in language learning.
Through feedback teachers can overcome their overreliance on score based
measurement and create an environment for assessment for learning. They can
thus move away from assessment of learning, which is the focus of public and
other high-stakes examinations and is represented through score based
measurement in pencil and paper tests.
By reading through this unit and solving the questions given in the unit, language
teachers will be able to do the following:
• distinguish between score/letter grade and criterion based assessment;
• understand how to interpret the assessment results from scores, letter grades
and criterion based assessment;
• report individual assessment and group behavior;
• decide which forms of feedback to give to learners and how to give the feedback
in an objective and systematic manner to promote assessment for learning; and
• understand the limitations of measurement entailed in pencil and paper tests
and not rely too much on this test-type for classroom purposes.
4.7 FURTHER READINGS
Brown, H. D., & Abeywickrama, P. (2010). Language Assessment: Principles and
Classroom Practices (2nd ed.). Chapter Twelve. NY: Pearson and Longman.
Common European Framework of Reference. (2001).
https://www.coe.int/t/dg4/linguistic/Source/Framework_EN.pdf
Durairajan, G. (2015). Assessing Learners: A Pedagogic Resource. New Delhi: Cambridge University Press.
Lorna, E. (2013). Assessment as Learning: Using classroom assessment to maximize
learning (2nd ed.). UK: Sage Publications.
McMillan, J. H. (1997). Classroom Assessment: Principles and practice for effective
instruction. Boston: Allyn and Bacon.
4.8 ANSWERS
Check Your Progress 1

| | What is measured? | Which unit to be used? |
| --- | --- | --- |
| Context P | Interview based performance | criterion based; rater reliability |
| Context Q | MCQ items | score/letter grade; measurement |
| | Story writing | criterion based to give feedback |

Check Your Progress 2

| | Problem in assessment | Other principles | Solution |
| --- | --- | --- | --- |
| Case 1 | Raw scores and percent scores in two sections will not allow comparability | Inter rater reliability (is it maintained? – not clear) | Convert scores to percent scores, rank order and find group mean to compare |
| Case 2 | Free responses: 2 systems are used – not reliable | Inter rater reliability (across evaluators); washback | Use criterion based assessment; train all raters; give feedback based on abilities |

Check Your Progress 3

| | Type of feedback | Is it beneficial? |
| --- | --- | --- |
| A | Direct and individual feedback (all language aspects) | Not much; learners need to work out the errors. They need to get more focused practice. |
| B | Indirect feedback on writing: style, organization | Yes, feedback and redraft will improve writing over a period of time; writing is seen as a process not product. |
| C | Indirect feedback on grammar (tense usage) | Yes, both groups have benefitted from the interesting tasks and observing each other’s behavior; language use seen as a process not a product. |

Unit based questions
1)

| | Advantages | Disadvantages | Contexts to be used |
| --- | --- | --- | --- |
| Scores, Letter grades | Can be used with MCQ items and score large numbers without having the issue of rater reliability; group behavior can be tracked | Feedback cannot be provided as performance is quantified; aligning letter grade to learner capability is not easy | High-stakes exams; classroom assessments |
| Criterion based assessment | Maintains objectivity and rater reliability; useful to give feedback | Training is not provided everywhere; time consuming | Free-response type items |

2) Report on performance in an English test:
• All the four sections have fared well, as the total scores range from 55% to 78%.
• Grammar and Vocabulary have been well attempted by all sections (approx. 97%).
• Speaking performance is also high, but listening is moderate across
all the sections (approx. 50%, and lower in VII B).
Recommendation
(i) More practice in listening; (ii) section VII B needs help through more practice
as they have scored the lowest.
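The figures in the report for question 2 can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original unit: the scores and maximum marks are taken from the table in question 2, while the function names are illustrative assumptions. It computes each section's total out of 100 and the mean percent score for each component across the four sections.

```python
# Scores from the table in Unit Based Question 2.
# MAX_MARKS holds the maximum marks for each component.
MAX_MARKS = {"Listening": 40, "Speaking": 40, "Grammar & Vocabulary": 20}

SCORES = {
    "VII A": {"Listening": 20, "Speaking": 38, "Grammar & Vocabulary": 20},
    "VII B": {"Listening": 10, "Speaking": 25, "Grammar & Vocabulary": 20},
    "VII C": {"Listening": 20, "Speaking": 35, "Grammar & Vocabulary": 20},
    "VII D": {"Listening": 20, "Speaking": 30, "Grammar & Vocabulary": 18},
}


def section_totals(scores):
    """Return the total marks (out of 100) obtained by each section."""
    return {section: sum(parts.values()) for section, parts in scores.items()}


def component_mean_percent(scores, max_marks):
    """Return the mean percent score per component across all sections."""
    result = {}
    for component, maximum in max_marks.items():
        obtained = [parts[component] for parts in scores.values()]
        result[component] = 100 * sum(obtained) / (maximum * len(obtained))
    return result


totals = section_totals(SCORES)
means = component_mean_percent(SCORES, MAX_MARKS)
lowest = min(totals, key=totals.get)  # section with the lowest total

print(totals)  # {'VII A': 78, 'VII B': 55, 'VII C': 75, 'VII D': 68}
print(means)   # {'Listening': 43.75, 'Speaking': 80.0, 'Grammar & Vocabulary': 97.5}
print(lowest)  # VII B
```

Because the test is out of 100 marks, each section total is already a percent score. The exact listening mean is 43.75%, since three sections scored 20 out of 40 and VII B scored 10 out of 40, which supports the recommendation of more listening practice and extra help for VII B.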
3) You can prepare a content checklist and then use the assessment checklist to evaluate your learners’ performances.
Content checklist: greeted the manager; asked about the catalogue and online payment; formally ended the email
Assessment checklist

| Sub-parts | 5 | 4 | 3 | 2 | 1 |
| --- | --- | --- | --- | --- | --- |
| Content | | | | | |
| Organization | | | | | |
| Language structure | | | | | |
| Vocabulary | | | | | |

4) To give feedback, the teacher can prepare two sets of activities:
Activity one: ask learners to describe a picture or a video clip
(use of present continuous tense)
Activity two: ask the learners to describe the likes and dislikes of two of their friends
(use of simple present tense – habitual actions)
By doing these two activities she can draw her learners’ attention to the fact
that certain communicative contexts necessitate the use of each of these two
tenses. This will be indirect feedback, and practice of the language rules will help
her learners communicate better.