Authenticity in Language Test Design
Written By:
Miriam A. Alkubaidi
2009
The goal Wiggins and McTighe are referring to is the imparting of knowledge to
language learners. Teachers may achieve this with confidence through accurate test
design and the implementation of authentic testing. Although authenticity has been
perceived from various perspectives, on the whole it has never been fully realised to
the high standards scholars propose. This paper nevertheless suggests a possible
definition of authenticity under which a language test may be seen as authentic, and
it is possible to design an authentic test by that definition. The paper defends its
definition with reference to the literature, and supports the view that a truly
authentic test is impossible to create.
Understandings of authenticity:
listening, for instance, numerous questions arise; do we merely read the passage to the
participants, or should the passage be extracted from a radio broadcast, for example?
Which is more a replication of a real-life situation? And is it, in fact, measuring what
the test is intended to measure (validity)?
Issues and problems revolving around authenticity and its relation to
construct validity:
There are different reasons why tests cannot be truly authentic. Throughout the
literature on authenticity it is suggested that an authentic test is one that
replicates real life. Davis et al. (1999, p. 13) confirm that authenticity can never
be completely achieved. This is highly probable for four reasons. To begin with, the
fact that a test is conducted under assessment conditions shatters the concept of
genuine real-life situations (Davis et al., 1999, p. 13). Treating authenticity as the
replication of real life is called the 'real-life' (RL) approach to authenticity
(Bachman, 1990, p. 301). Spolsky (1985, p. 31) confirms this notion and, as a result,
suggests that observation of authentic language behaviour, or 'authentic test
language' (Stevenson & Spolsky, cited in Shohamy & Reves, 1985, p. 54), produced
by participants may yield an authentic form of assessment. Spolsky adds that
participants undertaking tests are placed under scrutiny, arousing anxiety which in
turn affects the results of testing in either a positive or a negative way. This is a
logical conclusion since the 'communicative context' is an assessment context
(Stevenson, 1985, p. 44).
administered at a specific time and place, and with specific participants, contradicts
the notion of authenticity in testing whereby authenticity is perceived at times as a
reproduction of real-life situations. In addition, in real life variables change and
affect language use in various ways, whereas tests have controlled variables.
Finally, this approach is merely concerned with 'face validity' (Bachman, 1990, p.
315), and so neglects accurate assessment. For instance, when students are
debating a topic, the teacher is attempting to assess their speaking proficiency;
however, with the natural flow of language, how can criteria be designed to
accommodate efficient assessment? The fact remains that the assessment is murky
and therefore erroneous in itself.
Equally, Bachman (1990, p. 302) approaches authenticity from an
The authenticity of testing is embedded within the test design. Spence-Brown (2001,
p. 463) emphasises that authenticity lies in the implementation of the activity, not
the test design; Bachman (1990, p. 300), however, identifies authenticity in the
recreation of language use through testing. In other words, to achieve authenticity,
test design is an essential factor. To evaluate a test, its constructs need to be
verified against, and correlate with, the intended learning outcomes. All in all,
construct validity is operationalised through test items.
Furthermore, Weir (2005, p. 14) indicates that we need to define the construct being
measured explicitly and precisely before designing the test, so as to achieve
validity. Because a construct is a psychological trait which operates in the brain,
construct validity needs to be interpreted with great care (Brown, 1996,
pp. 239-240). Through accurately designing a test's constructs, authenticity may be
partially achieved, as constructs represent the purpose as well as the backbone of
its design. As initially stated in this paper, in order for authenticity in tests to
be achieved, the constructs must possess crystal clear objectives. It is important
that objectives be measured to obtain the level of proficiency. That is to say,
constructs should be quantified in measurable terms (Fulcher & Davidson, 2007, p. 7).
For instance, in order to assess speaking, the examiner must formulate a scale on
which an array of constructs may measure the learners' proficiency level. As a result,
there will be a direct correlation between the test's constructs and its design.
However, the theory should not dominate the test design but rather be shaped around
its rationale.
Another fundamental feature Messick (1995, pp. 745-749) mentions is the
'structural' aspect. In simpler words, to what extent does the scoring reflect the
task? There is a direct negative correlation between the scoring and the RL approach
introduced by Bachman. The approach in turn is difficult or impossible to evaluate
with precision. For this reason the third aspect of construct validity, the
'consequential', is also imprecise. Messick suggests that consequential validity
concerns the implications and outcomes of task scoring. That is to say, the
implications and uses of a test are linked to its validation. It is a 'progressive
matrix formation'. We can therefore conclude that such an approach is a weak one, as
it results in an array of contradictions.
Conclusion:
Shohamy, E., & Reves, T. (1985). Authentic language tests: where from and where
to? Language Testing, 2(1), 48.
Spence-Brown, R. (2001). The eye of the beholder: authenticity in an embedded
assessment task. Language Testing, 18(4), 463.
Spolsky, B. (1985). The limits of authenticity in language testing. Language Testing,
2(1), 31.
Stevenson, D. K. (1985). Authenticity, validity and a tea party. Language Testing,
2(1), 41.
Wall, D., Clapham, C., & Alderson, J. C. (1991). Validating tests in difficult
circumstances. In J. C. Alderson & B. North (Eds.), Language testing in the
1990s (pp. 209-225). London: Macmillan Publishers Limited.
Weir, C. J. (2005). Limitations of the Common European Framework for developing
comparable examinations and tests. Language Testing, 22(3), 281.
Wiggins, G. P., & McTighe, J. (2005). Understanding by design. Association for
Supervision & Curriculum Development.
Wu, W. M., & Stansfield, C. W. (2001). Towards authenticity of task in test
development. Language Testing, 18(2), 187.