Models of Speaking and the Assessment of Second Language Proficiency
Peter Skehan
Institute of Education University College London
All content following this page was uploaded by Peter Skehan on 10 June 2023.
Peter Skehan
This is an earlier version of the article which appeared as: Skehan, P. (2009c). Models of speaking and the
assessment of second language proficiency. In A. G. Benati (Ed.), Issues in Second Language Proficiency
(pp. 202-215). London: Continuum International Pub Group.
Within applied linguistics, the assessment of oral second language proficiency has been
what is involved in the speaking process. It is the purpose of this article to explore the
relevance for such proficiency assessment of theory in the area of first (and second)
that these two ancillary areas have much to contribute to how we design and evaluate oral
assessment procedures.
Models of Speaking
The dominant model of first language speaking is that of Levelt (1989, 1999; Kormos
2006). This model proposes that there are three general stages in speech production, as
outlined below:
Conceptualisation
• speaker selects relevant information in preparation for construction of an intended
utterance
Formulation
• includes the process of lexicalisation, where words that the speaker wants to say
are selected
• includes the process of syntactic planning where words are put together to form a
sentence
• includes the process of phonological encoding, where words are turned into
sounds
Articulation
The first stage, Conceptualisation, is concerned with developing and organising ideas to
be expressed, in a particular situation, and with a particular emphasis and stance. The
second stage, Formulation, involves lexical selections to match the preverbal message,
triggering lemma retrieval from the mental lexicon, and then syntax building from the
information retrieved through this retrieval. The third stage is concerned with
transforming this form of representation of ideas, lexis and syntax into actual speech.
Fairly obviously, if we are dealing with native language speech, certain assumptions are
made. Most central for present purposes is that the Conceptualiser delivers challenges to
the Formulator which the Formulator is able to meet, since the driving force for its
operation is the mental lexicon, which is extensive, well-organised, and contains rich
capacity to retrieve such rich information very quickly that enables speech production to
Conceptualiser can be working on one stage of operation, while the Formulator is dealing
When we turn to second language speaking, things are very different. The ‘driver’ for the
Levelt model is the mental lexicon, which is the repository of considerable information
about each lemma, information which underpins the production of native language
speech in real time. With second language speakers, this mental lexicon is:
• smaller, so that many lemmas required by the pre-verbal message will not be
available
• incomplete, so that where many lemmas are part of the mental lexicon, they are only
information etc.
• less organised, so that connections between lemmas do not prime one another, or
• less redundantly structured, in that collocational chunks are less available, so that
speech production has to be done more often ‘from first principles’ on the basis of
rule-generated language
The result of all these omissions is that during speech production, the pre-verbal message,
try to find alternative methods of expressing their meanings, or to find ways of using the
resources that they have sufficiently quickly so that ‘normal’ communication can
proceed. This is likely to present serious difficulties for the modular, parallel operation of
the Levelt model, so that some of the time, this ideal set of interlocking processes is
severely impaired.
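The contrast between first and second language lemma retrieval can be made concrete with a toy sketch. It is purely illustrative and is not part of the Levelt model or of any published implementation: the function name `formulate`, the `RetrievalResult` type, and the two miniature lexicons are hypothetical, and a real mental lexicon holds far richer information per lemma than a single word form.

```python
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    retrieved: list  # word forms found for the concepts in the message
    missing: list    # concepts for which no lemma was available

def formulate(preverbal_message, mental_lexicon):
    """Attempt lemma retrieval for each concept in a pre-verbal message."""
    retrieved, missing = [], []
    for concept in preverbal_message:
        lemma = mental_lexicon.get(concept)
        if lemma is None:
            # The speaker would now need an alternative way to express this meaning.
            missing.append(concept)
        else:
            retrieved.append(lemma)
    return RetrievalResult(retrieved, missing)

# Hypothetical lexicons: the second language lexicon is simply smaller.
L1_LEXICON = {"DOG": "dog", "CHASE": "chase", "POSTMAN": "postman"}
L2_LEXICON = {"DOG": "dog", "CHASE": "chase"}

message = ["DOG", "CHASE", "POSTMAN"]
l1_result = formulate(message, L1_LEXICON)
l2_result = formulate(message, L2_LEXICON)
print(l1_result.missing, l2_result.missing)
```

With the same pre-verbal message, the smaller lexicon leaves one concept unretrieved, which is the point at which, on the account above, smooth parallel operation would give way to effortful repair.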
Task Research
The preceding discussion has been general, and data-free. It has assumed that second
language speakers, of different proficiency levels, will encounter difficulties, and that it is
meaningful to express these difficulties in terms of the Levelt Model. What is necessary
this end, we will next review findings from the literature on task-based performance.
This literature provides a wide range of generalisations regarding the way task
characteristics and the conditions under which tasks are done might have an impact on
different aspects of second language performance. These findings can take us beyond the
Broadly, the task findings can be portrayed in terms of the information that tasks are
based on, the operations that tasks require to be performed on this information, and the
conditions under which tasks are done. These three headings organise the findings which
have emerged in the task literature. Regarding information, we can say that:
• the scale of the information (e.g. the number of elements, the number of participants)
• the type of information has a strong influence, with concrete information being easier
to handle than abstract information which in turn is easier to handle than dynamic
information (Brown et al. 1984), and also with there being a tendency for concrete
• greater familiarity with the information is associated with higher levels of accuracy and
• greater organisation and structure in the task information leads to performance which
is also more accurate and more fluent (Tavakoli and Skehan 2005)
There are interesting theoretical issues which relate to this range of findings. For now, we
will simply say that these are a set of empirically based generalisations which have
any case, we move on to consider the operations which are carried out on the information
in tasks. Here another set of generalisations can be offered, contrasting simple and
complex operations. Simple operations, e.g. listing, enumerating (Willis and Willis
2007), or retrieving (Foster and Skehan 1996) or simply describing are associated with
information, such as sequencing (Willis and Willis 2007) or reorganising (Foster and
(Robinson 2001, Robinson and Gilabert 2007) are associated with greater language
between the performance areas of complexity, on the one hand, and accuracy and
fluency, on the other. Complex operations seem to drive greater language complexity,
while simpler operations are more likely to be associated with higher fluency and
accuracy.
The final area to consider here is the conditions under which tasks are done. This has
provoked considerable research effort. Robinson (2001) reports higher levels of accuracy
(input presence). A whole series of studies has examined the effect of pre-task planning
on performance. The results suggest that such planning is consistently associated with
greater complexity and fluency, with these generating sizable effects, while accuracy
generally benefits too, but not to the same degree (Ellis, in press; Foster and Skehan
1999). Yuan and Ellis (2003) have reported that opportunity for on-line planning, i.e.
more relaxed time conditions during performance, is associated with greater accuracy.
Skehan and Foster (1997, submitted) have proposed that requiring second language
accuracy. Finally, Skehan (in press) proposes that tasks which contain unavoidable and
more difficult lexis are associated with lower levels of accuracy and complexity.
The generalisations about second language task-based performance provide the basis for
examining the relevance of the Levelt model in accounting for the second language case.
An initial assessment would suggest that it has to be applicable. Essentially, the model
proposes that thought precedes language, and that the functioning of the model is
essentially concerned with how Formulator operations, and associated resources, such as
the mental lexicon, can enable the thoughts embodied in the pre-verbal message to be
expressed through externalised language. One is drawn into saying that anything other
than this would not be satisfactory on general, logical grounds. Communication requires
ideas, and the ideas, if they are to be transmitted effectively, require language.
The difficulty, of course, is that first language speakers are able to function in this way
because they have mental lexicons which are extensive, elaborate, analysed, and
accessible. Pre-verbal message demands can therefore be met, and ongoing speaking in
real time is possible. In particular, the different modules within the Levelt model can
function in parallel, i.e. the Conceptualiser may be working on the next pre-verbal
message, while the Formulator is working on the current one, even while the Articulator
is giving voice to the previous. This is possible because the different modules can handle
the demands which are placed upon them in real-time, and smoothly, such that successive
Problems occur when a second language speaker attempts the same ideas-to-language
mapping, but is equipped with a much more limited mental lexicon. The result is, as
Kormos (2006) argues, that parallel processing below the level of consciousness is
replaced by serial, effortful and conscious attempts to deal with pre-verbal message
demands on the mental lexicon – demands which are either beyond it, or which require
and Kasper 1983). This in turn brings us back to the question of the applicability of the
model. What is argued here is that the model is still relevant (for the general reasons
proposed earlier), but that it is inappropriate to regard the model as an ‘all or none’ affair.
The generalisations from the previous section show that performance is systematically
influenced by a range of information, operation, and condition influences, and that these
argued that what we have learned from this research is the conditions which facilitate the
relevance of the Levelt model and those that do not. So the question needs to be recast so
that we explore not whether there is a simple positive or negative answer to applicability,
but rather the influences which make the model a more useful explanatory account and
those which suggest that its potential for second language speakers is limited.
So far, in reviewing empirical work, we have presented research in the terms which are
necessary to consider an alternative perspective, and to organise the findings which have
reviewed in a way more consistent with the model, and transparent in terms of its
[Table 2; surviving entries: monologic, dialogic, post-task condition]
Two introductory points are helpful here. First, the table is organised around the central
‘spine’ which shows the relevant stages of the Levelt model. There is a section which
focuses on the Conceptualiser, and then a separation between the two Formulator stages
of lemma retrieval (and access to the mental lexicon), and syntactic encoding (which is
driven by the information made available when lemmas are accessed). These three stages
organise the presentation of the information from the task-based performance literature.
Second, the outer columns are concerned with four general influences on performance,
two which indicate difficulty (complexifying, pressuring) and two which are more
expressed more demanding, elaborate, or extensive. These influences are mostly relevant
more difficult, but are not connected with differences in the complexity of the message.
In contrast, they are concerned with the time under which a task is done, or the amount or
nature or inflexibility of the material which is involved. Easing is, in a sense, the reverse
of Complexifying, and entails the ways in which the pre-verbal message can be arrived at
in a more direct manner. Finally, Focussing (which is not the reverse of Pressuring)
concerns the way in which performance conditions themselves introduce some level of
This framework is useful for re-presenting the results from the task literature in ways
which bring out more clearly how the findings we have can illustrate the differing
degrees of applicability of the Levelt model to the second language speaker case. We will
There are three clear influences on Complexifying. The first involves planning. At the
outset, it needs to be said that planning appears more than once in Table 2, reflecting the
different things that can happen during planning. The facet of planning focussed on
here is preparation which makes the ideas in a task more complex than they otherwise
would be. It is clear (Skehan and Foster 1999, Ortega 2005, Pang and Skehan 2008: ms.)
that this happens some of the time. Speakers use planning
time to explore the ideas in a task, and consequently approach the task as more
challenging than it otherwise would be. For example, in an Agony Aunt task, planning
may be used to generate more complex advice to the writer of a problem letter, or in a
A second influence here could be the more complex operations which a task inherently
provided, e.g. Foster and Skehan’s (1996) narrative task, where a story had to be invented
from a series of pictures, rather than a pre-existing storyline simply narrated. Or there
might be the need to integrate information to tell a story effectively, as in Tavakoli and
Skehan’s (2005) task where background and foreground elements had to be connected to
one another. There might also be a greater need to use reasoning, as in Michel et al.’s
(2007) task involving selection of a cellphone where there are many features which
The third kind of influence concerns the type of information which is involved. Brown et
al. (1984) showed that dynamic information, e.g. relating to a changing scene, is more
difficult to deal with than abstract information, which in turn is more difficult to deal with
than static information. It seems that dealing with such information types is more
demanding of memory resources, and where there are also more complex
operations involved, this adds to task complexity. It is assumed that all three influences
here concern the Conceptualiser stage of speech production, and the nature of the
We turn next to a series of influences which increase the pressure on the operation of the
Formulator stage. Regarding the lemma retrieval stage, the first influence is the
infrequency of the lexis which is involved in a task. It appears that when less frequent
lexis is required, this has damaging implications for the complexity and accuracy of the
language which is produced (Skehan, in press). This contrasts with native speaker
– less standard lemmas seem to drive more complex syntax in a harmonious way. So
tasks which push learners to need more difficult lexis seem to give them lemma retrieval
problems which spill over, because of their attentional demands, into other aspects of
performance. Lemmas are retrieved slowly and imperfectly, and the additional effort
required for this disrupts the parallel processing of the material for speaking.
Similarly, non-negotiable tasks (Skehan and Foster, 2007) also cause pressure. Native
speakers, when generating language, are able to draw upon a range of alternative choices
relatively effortlessly. So during speech production they can make a range of selections
as they are producing an utterance (Pawley and Syder 1983). Non-native speakers do not
have the luxury of such choices, and as a result are less able to adapt if a first lexical or
syntactic choice is unavailable in the mental lexicon. Where tasks are negotiable, such
speakers can adapt and revise the pre-verbal message so that it meshes more easily with
resources which are available. When tasks are non-negotiable, however, as with
narratives where the input is given, this is not possible, and disruption of performance
results.
These two pressure-inducing factors are concerned with the nature of the meanings that
are required for a task. The remaining pressuring influences are, in one way or another,
simply associated with time. Most obviously, requiring tasks to be done under timed
conditions is going to add to the speaker’s problems. Ellis (2005) has reviewed these
studies and shown that on-line planning is a possibility when time pressure conditions are
relaxed, with the result that greater accuracy is obtained. When there is greater time
pressure, in contrast, the result is that accuracy is lowered. But a related issue concerns
whether a task is monologic or dialogic. Monologic tasks are consistently more difficult
(with lower accuracy and, sometimes, lower complexity; Skehan and Foster 2007). What
seems to happen here is that the speaker, being responsible for keeping the discourse
going, has to plan, execute, monitor and continue speaking without any respite. The result
contrast, a more dialogic condition does enable one speaker to have a break while
interlocutors are speaking. As long as the other speakers’ contributions are processed
sufficiently, there may be spare capacity available while listening to enable a speaker to plan
interlocutors if it is appropriate to his or her own contribution. Monologic tasks do not
give any easy natural breaks. Dialogic tasks do and so reduce the pressure, overall
(Skehan, in press).
We turn next to factors which ease performance. Some of these can simply be dealt with
as the reverse of the effects we have already covered on complexifying or pressuring. For
example, regarding information, we can say that tasks based on concrete, static
information (Foster and Skehan 1996); tasks involving less information (Brown et al.
1984); and tasks which require simpler operations, such as listing (Willis and Willis
(accuracy, fluency) are likely to be raised. Similarly, task conditions which reduce
pressure, such as a dialogic task, also make the task easier, as more time is available to
plan on-line. But there are other influences which are not mirror images of what has been
said before. For example, planning figures here, but now in different ways. If planning is
directed to organising what will be said, actual performance is eased (Pang and Skehan
2008: ms.). The planning, effectively, handles the wider plan of what will be said, so that
during actual performance, major ideas do not have to be developed and the speaker can
focus on the surface of language. Similarly, if a task is structured (Tavakoli and Skehan
2005), speakers are more likely to be able to exploit the macrostructure of the task, and
not need to engage in deeper planning. They too can focus on the surface, and are able to
mobilise Formulator resources more effectively, and thereby achieve higher levels of
ideas, which complexifies, and organising, which eases, planning may be directed to
rehearsing language for actual performance (Ortega 2005). This too eases, but on the
assumption that what is rehearsed is both remembered and is actually useful during
performance. If these conditions are met, the result is that performance is eased.
So far, we have been looking at the interplay of ideas and their realisation through
terms of language complexity, accuracy, and fluency, and have tacitly assumed in
assumption has been that speakers have limited capacities, and that task difficulty, as well
Performance, in this view, is a reflex of other influences. But there are also studies which
suggest that second language speakers, with their limited attentional capacities, may
choose to prioritise particular performance areas. We have seen a hint of this in the way
planning time can be directed towards rehearsal, where speakers use the preparation time
in order to be ready with specific language, and as a result target accuracy, or sometimes,
particularly complex structures. But there are other ways in which the same selective
effect can occur, always with a focus on some aspect of form, either directed towards
accuracy or complexity. We have already seen how interactive tasks can help speakers
since they provide on-line planning opportunities (Foster and Skehan 1996; Skehan and
Foster 1997). But the effects of such tasks may be wider. Dialogic tasks make salient the
existence of an interlocutor, and it may be that speakers increase their focus on accuracy
selectively attending to accuracy and avoiding error (Pang and Skehan, 2008: ms.). In
this way, they prioritise attentional focus through awareness of their interlocutor’s
comprehension needs.
These claims, however, are based on interpretations of research studies with different
research foci. Two studies, though, specifically examine how speakers may have the
capacity to prioritise particular performance areas. Skehan and Foster (1997) showed that
accuracy during their task performance. Subsequently, Skehan and Foster (submitted
and language-focussed condition, and hypothesised that this would have a stronger effect
specifically on accuracy. This prediction was borne out, not only for a decision-making
task, as in Skehan and Foster (1997) but also for a narrative task. In addition, for the
decision-making task, complexity, too, was significantly raised. These results
suggest that effective task conditions can lead speakers to focus on particular aspects of
The analysis and review of research presented so far suggests that attentional limitations
are vital in understanding performance on second language tasks and that one can
operations upon information push speakers to express more complex ideas. This set
Easing, with factors such as the reverse of the last set and which therefore simply
reduce the work the Conceptualiser has to do, coupled with other factors, e.g.
speakers clear macrostructure for what they want to say, or provide more on-line
have less opportunity to regroup while speaking, and are deprived of on-line planning
form is injected into task performance so that attention is directed in particular ways,
These are a set of factors which influence what is going to be said and how it is going to
be said. What is central to this analysis is the balance between Conceptualiser pressure
(or lack of it) and Formulator pressure (or lack of it), and how these two sets of pressures
attentional system more things to do, depriving other areas of resources. Easing has the
reverse effect, where ideas are manipulable or packagable more easily, thus releasing
attention for use in other aspects of speech performance. Pressuring has the general
effect of depriving all other areas of performance of time that would be useful. Finally,
on elsewhere.
We now need to switch and try to consider how this analysis might have relevance for the
some of the basics of language testing – that testing concerns the ways we use the
provide learners with tasks to do, of increasing levels of difficulty, and then observe what
is the maximum level of difficulty that can be successfully handled. (This is like treating
only one dimension (as when the bar gets heavier in weightlifting). Sampling, in this
view, means assessing learners along this one-dimensional scale of difficulty. But it can
dimensional (Skehan 1984), with the result that sampling has to be directed at probing the
different dimensions that are important (whatever they may be) and then decisions have
to be made about how strengths and weaknesses in performance across these dimensions
The major insight from the analysis presented in this chapter is that for spoken language
performance we have to analyse test tasks first in terms of the influences covered in Table
2, and particularly in terms of the demands they make on Conceptualiser and Formulator
different from what influences the Formulator and so Conceptualiser difficulty is not the
same as Formulator difficulty. In a sense, it is what the Conceptualiser does that shapes
the overall difficulty of the ideas which are expressed in a task, and the influences on the
Formulator are then constrained by challenges set by the Conceptualiser, although this
sampling, and what the above analysis does is clarify the basis on which sampling needs
to take place. It is a truism of testing that one-item tests are non-functional, and so if we
apply this to the assessment of spoken language performance, this means that a series of
tasks will be necessary for any effective assessment to be made. The matrix in Table 2
helps clarify how a range of tasks and a range of performance conditions can be sampled
as the basis for language testing. Different Conceptualiser influences and different
Formulator influences need to be drawn on if any rounded estimate of ability for use is to
be provided.
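The sampling argument can be sketched as a simple coverage check over a set of test tasks. This is a minimal illustration under stated assumptions, not a procedure proposed in the chapter: the two-level coding of Conceptualiser demand and Formulator pressure, the function name `uncovered_cells`, and the task labels and their codings are all hypothetical.

```python
from itertools import product

# Assumed coding: each task is rated "low" or "high" on two separate dimensions.
LEVELS = ("low", "high")

def uncovered_cells(tasks):
    """Return the (conceptualiser, formulator) combinations that no task samples."""
    covered = {(task["conceptualiser"], task["formulator"]) for task in tasks}
    return [cell for cell in product(LEVELS, LEVELS) if cell not in covered]

# Hypothetical test battery; the codings are illustrative, not empirical findings.
test_battery = [
    {"name": "planned personal narrative", "conceptualiser": "low", "formulator": "low"},
    {"name": "timed picture narrative", "conceptualiser": "low", "formulator": "high"},
    {"name": "planned decision-making discussion", "conceptualiser": "high", "formulator": "low"},
]

gaps = uncovered_cells(test_battery)
print(gaps)
```

A battery that leaves a cell uncovered samples only part of the space, so any estimate of ability for use drawn from it would rest on an incomplete picture of how the speaker copes with combined Conceptualiser and Formulator demands.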
This approach interacts with how performance itself is measured (Pollitt 1990).
counting. There needs to be a more complex rating of performance, and this in turn
means that one has to decide what areas of performance should be represented in the
different rating scales which are used. We have seen that Conceptualiser work is reflected
in language complexity, and so this aspect of performance requires valid and reliable
rating in terms of what is often termed Range in language assessment, i.e. use of syntax
and vocabulary. But Formulator activity is associated with greater fluency and accuracy,
other areas where rating scales exist, with precisely these headings. In other words, a
complexity (range), accuracy, and fluency if we are to obtain any satisfactory overall
assessment of the quality of a second language speaker’s performance. Only with such
information will we be able to make effective prediction about how second language
References
Brown G., Anderson A., Shillcock R. and Yule G. (1984), Teaching Talk: Strategies for
Ellis R. (in press), ‘The Differential Effects of Three Types of Task Planning on the Fluency,
London: Longman
Foster P. and Skehan P. (1999), The effect of source of planning and focus on planning
Foster P. and Skehan P. (2009, submitted), The effects of post-task activities on the
Kormos J. (2006), Speech Production and Second Language Acquisition, Mahwah, N.J:
Lawrence Erlbaum
Levelt W.J. (1989), Speaking: From intention to articulation, Cambridge, MA: MIT Press
Levelt W. (1999), Language production: a blueprint for the speaker, In C. Brown and P.
University Press
Michel M.C., Kuiken F. and Vedder I. (2007), The interaction of task condition and task
Ortega L. (2005), What do learners plan? Learner-driven attention to form during pre-
Pang F. and Skehan P. (2008: ms.), Using a model of speaking to explore second
Pawley A. and Syder F.H. (1983), Two puzzles for linguistic theory: nativelike selection
and nativelike fluency, In Richards J.C. and Schmidt R. (Eds.), Language and
Pollitt A. (1990), Giving students a sporting chance: assessing by counting and judging,
In C. Alderson and B. North (Eds.), Language Testing in the 1990s, pp. 46-59,
London: Macmillan
Robinson P. (2001), Task complexity, cognitive resources, and syllabus design: a triadic
Robinson P. and Gilabert R. (2007), Task complexity, the Cognition Hypothesis, and
Skehan P. (1984), “Issues in the testing of English for Specific Purposes”, Language
University Press
Skehan P. and Foster P. (1997), The influence of planning and post-task activities on
1,3, pp 185-211
Skehan P. and Foster P. (1999), The influence of task structure and processing conditions
Skehan P. and Foster P. (2007), ‘Complexity, Accuracy, Fluency and Lexis in Task-based
A. and Kuiken F., Pierrard M. and Vedder I. (Eds.), Complexity, Accuracy, and
Brussels Press
Tavakoli P. and Skehan P. (2005), Planning, task structure, and performance testing, In
Willis D. and Willis J. (2007), Doing Task-based Teaching, Oxford: Oxford University
Press
Yuan F. and Ellis R. (2003), The effects of pre-task planning and on-line planning on