Testing
Reasons for testing students
If students arrive at a school and need to be put in a class at an appropriate level, they may do a placement test.
At various stages during a term or semester, we may give students progress tests.
At the end of a term, semester or year, we may want to give a final achievement test (sometimes called an exit test), which measures the students' abilities in all four skills as well as their knowledge of grammar and vocabulary.
All of these are 'one-off' events, taking place at the end of a period of time (except for placement tests). However, testing does not have to work like this; students can also be assessed continuously.
One form of continuous assessment is the language portfolio, where students collect
examples of their work over time, so that these pieces of work can all be taken into
account when an evaluation is made of their language progress and achievement.
Such portfolios (called dossiers in this case) are part of the CEF (Common European
Framework), which also asks language learners to complete language passports
(showing their language abilities in all the languages they speak) and language
biographies (describing their experiences and progress).
Continuous assessment can also involve keeping a record of who speaks in lessons and how often they do it, how compliant students are with homework tasks and how well they do them, and how well they interact with their classmates.
Good tests
Good tests have a positive rather than a negative effect on both students and teachers. A good test is valid: it does what it says it will, testing what it is supposed to test.
There is another kind of validity, too: when students and teachers see the test, they should think it looks like the real thing, that it has face validity. Students need to have confidence that the test will work (even if they are nervous about their own abilities).
A good test should also have marking reliability. Not only should it be fairly easy to mark, but anyone marking it should come up with the same result as anyone else; in other words, a test should be designed to minimise the effect of individual marking styles.
When designing a test, one of the things we have to take into account is its practicality: we need to work out how long it will take both to sit the test and to mark it.
We also have to think of the physical constraints of the test situation. Some speaking tests, especially for international exams, require not only an examiner but also an interlocutor (someone who participates in a conversation with the student).
The washback effect occurs when teachers see the form of the test their students are going to have to take and, as a result, start teaching for the test. Washback has a negative effect on teaching if the test fails to mirror our teaching, because then we will be tempted to make our teaching fit the test, rather than the other way round.
When we design our own progress and achievement tests, we need to try to ensure
that we are not asking students to do things which are completely different from
the activities they have taken part in during our lessons.
That would clearly be unfair.
We need to remember that tests have a powerful effect on student motivation. Students often work a lot harder than normal when there is a test or examination in sight.
They can be greatly encouraged by success in tests or, conversely, demotivated by doing badly. For this reason, we may want to consider the needs of all our students, not just the ones who are doing well.
This does not mean writing easy tests, but it does suggest that when writing
progress tests, especially, we do not want to design the test so that students fail
unnecessarily - and are consequently demotivated by the experience.
Test types
Discrete-item testing means only testing one thing at a time (e.g. testing a verb tense
or a word).
Integrative testing means asking students to use a variety of language and skills to
complete a task successfully.
A direct test item is one that asks students to do something with language (e.g. write a letter, read and reply to a newspaper article or take part in a conversation). Direct test items are almost always integrative, and they are concerned with activation. Indirect test items, by contrast, test the students' knowledge of language rather than getting them to use it; they reflect study rather than activation, and might focus on, say, word collocations or the correct use of modal verbs.
Indirect test items
1. Multiple choice
Sometimes students are instructed to choose the 'correct' answer, because only one answer is possible.
But sometimes, instead, they can be told to choose the ‘best’ answer (because,
although more than one answer is possible, one stands out as the most appropriate).
Multiple-choice questions are easy to mark: it is simply a matter of checking the correct letters for each question.
One problem with multiple-choice questions lies in the choice of distractors, that is, the three incorrect (or inappropriate) answers. There is a danger that we will either distract too many students (even those who should get the question right) or too few (in which case the question has not done its job of differentiating between students).
Multiple-choice questions can be used to test reading and listening comprehension (we can also use true/false questions for this: students circle 'T' or 'F' next to statements about material they have just read or listened to). Both question types are very attractive in terms of scorer reliability.
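One way to see why scorer reliability is so high is that marking such items is purely mechanical: the answer key fully determines the score, so any two markers must agree. A minimal sketch (the answer key and function name here are invented for illustration):

```python
# Hypothetical answer key: question number -> correct letter.
ANSWER_KEY = {1: "b", 2: "d", 3: "a", 4: "c"}

def mark_multiple_choice(responses: dict[int, str]) -> int:
    """Count the responses that match the answer key exactly."""
    return sum(
        1 for q, correct in ANSWER_KEY.items()
        if responses.get(q, "").strip().lower() == correct
    )

print(mark_multiple_choice({1: "b", 2: "a", 3: "a", 4: "c"}))  # -> 3 out of 4
```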
2. Fill-in and cloze
Fill-in testing involves the examinee writing a word in a gap in a sentence or paragraph. Fill-in items are fairly easy to write, though it is often difficult to leave a gap where only one answer is possible.
A variation on fill-ins and gap-fills is the cloze procedure, where gaps are put into a text at regular intervals (say, every sixth word). Because the gaps fall wherever the interval dictates, students are forced to produce a wide range of different words, based on everything from collocation to verb formation.
Most test designers use a form of modified cloze, trying to adhere to some kind
of random distribution (e.g. making every sixth word into a blank), but using
their common sense to ensure that students have a chance of filling in the gaps
successfully - and thus demonstrating their knowledge of English.
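Seen procedurally, a modified cloze is just a routine that blanks roughly every nth word while skipping gaps that would be trivial or impossible to fill. The sketch below is only illustrative: the function name, the interval and the never_blank list are assumptions standing in for the test designer's common sense.

```python
import re

def modified_cloze(text: str, interval: int = 6,
                   never_blank=("a", "an", "the")) -> tuple[str, list[str]]:
    """Blank roughly every `interval`-th word, skipping words on the
    never_blank list - a crude stand-in for the designer's judgment."""
    gapped, answers, count = [], [], 0
    for word in text.split():
        count += 1
        core = re.sub(r"[^\w]", "", word).lower()
        if count >= interval and core and core not in never_blank:
            answers.append(word)      # keep the removed word as the key
            gapped.append("______")
            count = 0                 # restart the interval after a gap
        else:
            gapped.append(word)
    return " ".join(gapped), answers

test_version, key = modified_cloze(
    "The quick brown fox jumps over the lazy dog and then runs away.")
print(test_version)  # The quick brown fox jumps ______ the lazy dog and then ______ away.
print(key)           # ['over', 'runs']
```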
3. Transformation
In transformation items, students are asked to change the form of words and phrases to show their knowledge of syntax and word grammar. If they have to rewrite a sentence containing lend so that it uses borrow instead, for example, they not only have to know the meaning of borrow and lend, but also how to use them in grammatical constructions.
There are other indirect techniques, too. We can ask students to put jumbled words in order to make correct sentences and questions, to identify and correct mistakes, or to match the beginnings and ends of sentences.
Direct test items
In direct test items, we ask students to use language to do something, instead of just testing their knowledge of how the language itself works. For example, we might ask them to write instructions for a simple task (such as using a vending machine or assembling a shelving system) or to give an oral mini-presentation.
To test reading and listening, we might ask students to choose the best summary of what they have heard or read, to put a set of pictures in order as they read or listen to a story, to complete a phone message form (for a listening task) or to fill out a summary form (for a reading task).
Direct tests of writing might involve getting students to write leaflets based on information supplied in an accompanying text, or asking them to write 'transactional letters' (that is, letters replying to an advertisement, to something they have read in the paper, etc.).
To test speaking, we can interview students, or we can put them in pairs and ask them to perform a number of tasks: they might discuss how to furnish a room, or talk about any other topic we select for them. We can also ask them to role-play certain situations.
Wherever possible, direct tests should have items which look like the kind of tasks students have been practising in their lessons. However, direct test items are much more difficult to mark than indirect items.
Marking tests
Marking is simple if the markers only have to tick boxes or individual words (though even here human error can often creep in). It is far more complex when we have to evaluate a more integrative piece of work.
One option is to give an overall score (say, A or B, or 65%), based on our experience of the level we are teaching and on our 'gut-instinct' reaction to what we read. This is the way many essays are marked in various branches of education, and sometimes such marking can be highly appropriate.
However, such marking runs the danger of marker subjectivity. One way of reducing this is to involve other people: when two or three people look at the same piece of work and, independently, give it a score, we can have more confidence in the evaluation of the writing than if just one person looks at it.
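One simple way to operationalise this double-marking idea is to average the independent scores but flag any script where the markers disagree by more than an agreed margin. The conventions below (scores out of 100, a tolerance of 10) are assumptions, not a prescribed procedure:

```python
def reconcile(scores: list[int], tolerance: int = 10) -> float | None:
    """Average independent markers' scores; return None to flag the
    script for discussion if markers disagree by more than `tolerance`."""
    if max(scores) - min(scores) > tolerance:
        return None  # markers should discuss and re-mark
    return sum(scores) / len(scores)

print(reconcile([65, 70]))  # -> 67.5
print(reconcile([50, 72]))  # -> None: too far apart to average blindly
```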
Another option is to use marking scales for a range of different items. If we are marking a student's oral presentation, for example, we might score separate categories such as grammar, pronunciation and fluency, each on a scale from 0 to 5.
This raises the problem of knowing exactly why we should give a student 2 rather than 3 for pronunciation. What exactly do students have to do to score 5 for grammar? What would make us give students 0 for fluency?
Subjectivity is still an issue here (though it is less problematic because we are forcing
ourselves to evaluate different aspects of the students’ performance). One way of
trying to make marking scales more objective is to write careful descriptions of what
the different scores for each category actually represent. Here, for example, is a scale
for assessing writing, which uses descriptions:
Scores run from 5 (Exemplary) through 4 (Strong), 3 (Satisfactory) and 2 (Developing) down to 1 (Weak). Within each category, the descriptors run from strongest to weakest:

Ideas/Content
- Original treatment of ideas, well developed from start to finish; focused topic with relevant, strong supporting detail.
- Clear, interesting ideas enhanced by appropriate details.
- Evident main idea with some supporting details; may have some irrelevant material or gaps in needed information.
- Some attempt at support, but the main topic may be too general or confused by irrelevant details.
- Writing lacks a central idea; development is minimal or nonexistent; the piece wanders.

Organisation
- Effectively organised in a logical and interesting way, with a creative and engaging introduction and conclusion.
- Structure moves the reader smoothly through the text; well organised, with an inviting introduction and a satisfying closure.
- Organisation is appropriate but conventional; there is an obvious attempt at an introduction and conclusion.
- A lack of structure makes the piece hard to follow; the lead and conclusion may be weak or nonexistent.

Voice
- Passionate, compelling, full of energy and commitment; shows emotion and generates an emotional response from the reader.
- Expressive, engaging, sincere tone with a good sense of audience; the writer behind the words comes through occasionally.
- Pleasant but not distinctive tone and persona; voice is appropriate to audience and purpose.

Word Choice
- Carefully chosen words convey strong, fresh, vivid images consistently throughout the piece.
- Word choice is functional and appropriate, with some attempt at description; may overuse adjectives and adverbs.
- Words may be correct but mundane; the writing uses patterns of conversation rather than book language and structure.

Sentence Fluency
- High degree of craftsmanship; control of rhythm and flow so the writing sounds almost musical when read aloud; variation in sentence length and forms adds interest and rhythm.
- The piece has an easy flow and rhythm, with a good variety of sentence lengths and structures.
- The writing shows some general sense of rhythm and flow, but many sentences follow a similar structure.

Conventions
- The writing contains few, if any, errors in conventions; the writer shows control over a wide range of conventions for this grade level.
- Generally the writing is free from errors, but there may be occasional errors in more complex words and sentence constructions.
- Occasional errors are noticeable but minor; the writer uses conventions with enough skill to make the paper easily readable.

A marking scale for writing
A framework like this makes it more likely that the students' writing will be marked fairly and objectively.
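Seen as data, a descriptor-based scale is simply a mapping from categories to scores on a shared 1-5 range, from which a total can be derived. The category names below follow the scale above; the function name and the idea of summing to a total out of 30 are illustrative assumptions:

```python
CATEGORIES = ["Ideas/Content", "Organisation", "Voice",
              "Word Choice", "Sentence Fluency", "Conventions"]

def total_score(marks: dict[str, int]) -> int:
    """Sum the 1-5 scores across all six categories (maximum 30)."""
    assert set(marks) == set(CATEGORIES), "one score per category"
    assert all(1 <= s <= 5 for s in marks.values()), "scores run from 1 to 5"
    return sum(marks.values())

essay = {"Ideas/Content": 4, "Organisation": 3, "Voice": 4,
         "Word Choice": 3, "Sentence Fluency": 3, "Conventions": 5}
print(total_score(essay))  # -> 22 out of a possible 30
```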
When marking tests - especially progress tests we design ourselves - we need to strike
a balance between totally subjective one-mark-only evaluation on the one hand, and
overcomplexity in marking-scale frameworks on the other.
Designing tests
Before designing a test, we need to think very carefully about how practical it will be in terms of time (including how long it will take us to mark it).
It is also important to work out what we want the test to achieve, especially since the students' results in a progress test will have an immediate effect on their motivation.
We then need to think about how difficult we want the test to be. Some examinations are designed so that only the best students will pass; progress tests should not work like that, however. Their purpose is only to see how well the students have learnt what they have been taught.
Next, it is helpful to make a list of the things we want to test. This list might include grammar items (e.g. the present continuous) or direct tasks (e.g. sending an email to arrange a meeting).
Once we have the list, we can decide how much importance to give to each item. For example, we might give a writing task double the marks of an equivalent indirect test item, to reflect our belief in the importance of direct test types.
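This weighting can be made explicit when totalling marks. In the sketch below, the items, raw marks and weights are invented for illustration; the direct writing task simply carries double weight, as suggested above:

```python
# Hypothetical test plan: each item has a maximum raw mark and a weight.
test_plan = [
    {"item": "multiple choice (grammar)", "max": 20, "weight": 1},
    {"item": "cloze passage",             "max": 20, "weight": 1},
    {"item": "transactional letter",      "max": 20, "weight": 2},
]

def weighted_total(raw_marks: list[int]) -> float:
    """Combine raw marks into a percentage using the plan's weights."""
    got = sum(m * p["weight"] for m, p in zip(raw_marks, test_plan))
    out_of = sum(p["max"] * p["weight"] for p in test_plan)
    return 100 * got / out_of

print(weighted_total([15, 12, 16]))  # the letter counts double -> 73.75%
```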
When we have decided what to include, we write the test.
However, it is important that we do not just hand it straight over to the students to take. It is much more sensible to show the test first to colleagues, who frequently notice things we had not thought of.
Finally, once we have given the test and marked it, we should see whether we need to make any changes before we use some or all of it again.
Conclusions
Good tests are both valid and reliable, and face validity ('looking good') is also important.
Test design may be influenced by physical constraints (e.g. time and money).
We should be aware of the washback effect, which can sometimes persuade teachers to work only on exam preparation with their students while ignoring general language development.
We distinguished between discrete test items (testing one thing at a time) and integrative test items (where students use a variety of language and skills), and between direct test items (where students are asked to do things with the language, e.g. writing a report) and indirect test items (where they are tested on their knowledge of the language, e.g. grammar tests).
When preparing tests, we need to decide what we want to test and how important each part of a test is in relation to the other parts.
Teachers should show their tests to colleagues and try them out before using them 'for real'.