(Task-Based Language Teaching) Peter Skehan - Processing Perspectives On Task Performance (2014, John Benjamins Publishing Company) PDF
Editors
Martin Bygate, University of Lancaster
John M. Norris, University of Hawaii at Manoa
Kris Van den Branden, KU Leuven
Volume 5
Processing Perspectives on Task Performance
Edited by Peter Skehan
St. Mary’s University, Twickenham
chapter 1
The context for researching a processing perspective on task performance
Peter Skehan
chapter 2
On-line time pressure manipulations: L2 speaking performance under five types of planning and repetition conditions
Zhan Wang
chapter 3
Task readiness: Theoretical framework and empirical evidence from topic familiarity, strategic planning, and proficiency levels
Bui Hiu Yuet Gavin
chapter 4
Self-reported planning behaviour and second language performance in narrative retelling
Francine Pang & Peter Skehan
chapter 5
Get it right in the end: The effects of post-task transcribing on learners’ oral performance
Li Qian
chapter 6
Structure, lexis, and time perspective: Influences on task performance
Zhan Wang & Peter Skehan
chapter 7
Structure and processing condition in video-based narrative retelling
Peter Skehan & Sabrina Shum
chapter 8
Limited attentional capacity, second language performance, and task-based pedagogy
Peter Skehan
Author Biodata
Index
Series editors’ preface to Volume 5
It is our pleasure to introduce the fifth volume in this series, a collection edited by Peter
Skehan and entitled Investigating a Processing Perspective on Task Performance. This
book is in many ways a culmination of work initiated by Skehan some two decades ago,
as it builds upon the theoretical perspectives of his Tradeoff Hypothesis and extends
from the considerable associated research into task types, characteristics, and imple-
mentation conditions. Of primary interest in this volume is the relationship between
task design variables and their effect on how language learners produce speech for
communicative purposes. Tasks here are generally brief spoken narratives of the sort
that have grown in popularity as primary pedagogic tools of task-based instruction that
seeks to provide a focus-on-form and -meaning simultaneously. Beyond their apparent
face value as opportunities for practicing L2 speech and developing fluency, such tasks
offer the intriguing possibility of drawing learners’ attention to form-meaning con-
nections, initiating learner analysis and restructuring of their interlanguage, improv-
ing their control of the language, and ultimately pushing the development of language
knowledge and proficiency. The main goal here is for learners to be able to produce
complex, accurate, and fluent L2 speech, with tasks being employed to integrate the
various learning processes; and the key question then is “how?”
Beginning in the early 1990s, and presented first in an influential article “A frame-
work for the implementation of task-based instruction” followed by the highly cited
book A Cognitive Approach to Language Learning, Peter Skehan proposed that learner
performance on these kinds of tasks was determined in part by the fundamentally
limited cognitive resources that a person has available during speech. Telling a story,
describing a picture, explaining a process – these tasks consume the attention that
learners have at their disposal, and which, as a consequence, needs to be divided
between fluency, accuracy and complexity of their performance. In this respect, cer-
tain tasks have been claimed to make greater or lesser demands on cognition; in a
similar vein the conditions under which learners are asked to perform tasks may influ-
ence what they focus on in their production. For example, providing learners with the
opportunity to plan prior to telling a story may free up attentional resources, result-
ing in spoken narratives that are lexically more diverse, syntactically more complex,
grammatically more accurate, and so on. Building from these observations into peda-
gogic implications, Skehan advocated for cycles of tasks that were selected, designed,
and sequenced to intentionally shift learners’ attention between a focus on fluent and
efficient communication versus the opportunity to restructure and ‘push’ language
production at the cusp of interlanguage development.
Skehan’s groundbreaking ideas, along with the competing theoretical position of
Peter Robinson’s Cognition Hypothesis (see volume 2 in the TBLT:IRP series), inspired
a generation of task-based research into the effects of task features (types, conditions,
characteristics) on the cognitive complexity of task demands and the resulting influ-
ences on L2 performance. It is no exaggeration to suggest that this line of work has
led to the majority of published empirical research related to TBLT during the 1990s
and 2000s. In the current volume, Skehan and his students at the Chinese University,
Hong Kong, make two major contributions to this accumulating domain of research.
First, the empirical studies compiled here reflect a relatively comprehensive agenda of
research into key dimensions of the Tradeoff Hypothesis: Taken together, they illumi-
nate, strengthen, and extend patterns of findings previously attested in the literature.
In particular, they address the role of pre- and during-task planning, the effect of post-
task demands on during-task performance, and the relationship between greater or
lesser task structure and performance. Second, in his introductory and concluding
chapters, Skehan offers an extended and updated explanation of the theory underly-
ing his processing perspective on tasks, emphasizing the grounding of his ideas in
a Leveltian psycholinguistic model of speech production. He also provides a pains-
taking review and synthesis of findings from the studies in this volume in conjunc-
tion with previous research, thereby updating, clarifying, and expanding the Tradeoff
Hypothesis in critical ways.
The chapters collected in this volume, based on creative and insightful research
designs emanating from a core theoretical coherence, offer concrete findings that have
the potential to inform task design and implementation for language learning pur-
poses in important ways. To be certain, the volume is about specific kinds of tasks
(relatively brief speaking tasks) that are controlled and manipulated according to a
handful of theoretically motivated principles, and the claims forwarded here cannot
be automatically generalized to all ‘tasks’ that might shape what happens in task-based
instruction. In that regard, and in order to encourage future research that builds upon
the solid foundations laid in this volume, we would like to repeat here a suggestion that
we forwarded in our preface to volume 2 of this series:
It is, therefore, perhaps not overly ambitious to suggest that a next phase in this
particular research agenda would turn to the embedded investigation of cognitive task
complexity as one aspect of TBLT educational design and implementation. Certainly,
it will be only through such ecologically valid research that the ultimate contribution
of these important ideas – in interaction with the variety of other factors at play in
long-term and otherwise complex language teaching and learning – will be realized.
We look forward to featuring such work in a future volume of this series.
Along these lines, we hope very much to see work in the near term that takes important
theoretically motivated and empirically attested ideas, like those presented in the current
volume, and explores their implementation in task-based educational practice –
such is the potential and intent of TBLT as a truly researched pedagogy.
Producing a book is always nice, but this book is the source of particular pleasure.
In January 2004 I took up a post as Professor of Applied Linguistics in the English
Department at the Chinese University of Hong Kong, and spent six years there. Many
things characterised this period, but outstanding amongst these were the opportuni-
ties for doing empirical research. These came in two major guises – opportunities
to apply for research grants (both internally to the university, and more importantly,
externally to the Research Grants Council of Hong Kong), and the delight of supervis-
ing doctoral students. In the former category, I was lucky enough to obtain one inter-
nal grant, and two external grants, both targeting task-based performance, enabling
me to undertake a programme of research, rather than one individual study. In the
latter case, I was extremely fortunate to supervise a succession of talented doctoral
students who were a superb combination of being individually highly motivated, as
doctoral students must be, and also willing to work within a framework. The delight
was that they each pursued a personal research agenda, but that agenda fitted within
the wider programme which motivated my research.
The result of all this is the present volume, which is an edited collection, but with
a difference: it has a sustained focus on task-based performance, and this is
studied within certain parameters:
–– a view that attention is limited and that we must explore what the consequences
of this are
–– a concern to explore how task characteristics influence performance
–– an interest in the conditions under which tasks are done, with a certain amount
of focus on planning
The result is a collection which makes a cumulative contribution, rather than being
characterised by disparate chapters brought together through quality, but lacking a
shared focus. So, it is a source of pride to me that collectively we have been able to
produce such a volume, with the aim of contributing to the understanding of second
language spoken performance as well as the literature on second language tasks. In so
doing, we also hope to bring out the ‘researchability’ of the area, as well as its practical
relevance. A processing approach to second language performance has proved
fertile for us. We hope to convince other people of its utility.
There are many people to thank in relation to this volume. The Editors of the
series deserve considerable thanks, first for being encouraging about the idea of
the volume, and then for the considerable work each has put in with the chapters
of the volume, work which has strengthened the chapters considerably. I would
also like to thank other Ph.D. students from the Chinese University, Hong Kong,
who worked as research assistants on various of the specific research projects: Dai
Binbin, Cai Jing, and Ren Hongtao. I am also grateful in that regard to the English
Department at CUHK which funded this research assistance, especially Tracey
Liang, and also M.A. students who played a part, by examining and rating candi-
date tasks that we were considering.
I would also like to offer thanks to two individuals who have strongly influenced
the processing research reported here. First there is Willem Levelt, whose model of
first language speaking is the starting point for many of the studies which follow. The
structure and theoretical foundations that this model provides have been immensely
influential and had a strong guiding role. Second, there is Peter Robinson, whose
Cognition Hypothesis stands in clear opposition to my own Tradeoff Approach. I am
grateful to him for the clear opposing standpoint to my own and because we share the
frame of reference by which the different positions can be judged. The disagreement
has been amicable and stimulating, and I feel my work has benefited enormously as a
result.
Finally, to return to the institutional context in Hong Kong, I would like to thank
the Arts Faculty at the Chinese University for an internal grant that funded research
that underpins Chapter 4, and the Research Grants Council for two Earmarked Grants
which were the main basis for Chapters 4, 6, and 7, as well as, essentially, for much of the
material in Chapter 10. The RGC also deserves considerable thanks for the doctoral
studentships which supported all the remaining chapters.
chapter 1
The context for researching a processing perspective on task performance
Peter Skehan
St. Mary’s University, Twickenham
Introduction
In this chapter there is a need to set the scene, and explore why the contributing
authors were interested in tasks, task conditions, and task measurement, and why we
thought these were important enough to occupy us all for several years. To do that we
have to go back a little, since growth in interest in tasks has been a major development
in itself within applied linguistics. Communicative language teaching in the 1970s and
1980s represented a major shift in the goals and methods of language instruction rela-
tive to the structure-domination of the years before. An approach which gave much
greater importance to meaning and to language use as an important basis for develop-
ment brought about a shift in the entire profession, from the nature of coursebooks
and language teaching methodology, through assessment procedures, to the goals and
procedures of teacher education. All of these put the learner centre-stage, and saw
interaction as vital. One interpretation of a communicative approach has been to orga-
nise teaching around the use of language learning tasks, tasks which have meaning-
primacy, a focus on outcomes, and some connection with real-world language use.
In this respect, it has been fundamentally important that task-based approaches have
attracted considerable research and theoretical interest. Regarding the former I have
argued elsewhere (Skehan 2011) that the approach is distinguished most clearly by
the way claims are considered to be accountable to data so that tasks are judged not
simply by their appeal to the teacher, but also by their impact on performance, and,
ultimately, development. While there have been notable practical developments (Van
Den Branden 2006; Willis 1996) it is the research accountability that has been the
most distinctive contribution for many working in the task field, enabling a possible
move towards a researched pedagogy.
Tasks also have interesting theoretical linkages. Early approaches to researching
tasks were strongly influenced by the Interaction Hypothesis (Long 1996), and, mak-
ing the assumption that certain sorts of interactional processes are most propitious for
second language development, the initial priority was largely one of exploring how far
tasks might be designed and used to maximise opportunities for interactional features,
such as recasting and negotiation of meaning, which were deemed to be particularly
helpful in driving acquisition. It was assumed that the goal was to understand how
an interlanguage system develops most effectively, and good task design, empirically
grounded, was seen as the basis for this since it would generate more conversational
feedback – the oil that supported change and progress. Dangers were soon identified
with task-based approaches, and in this regard, the concept of ‘focus on form’ was devel-
oped (Long & Robinson 1998). This proposes that when communication proceeds,
there is a possibility that communicational needs will predominate, and that form will
lose focus (and with this chances of development and control being enhanced will be
compromised). In fact, however, research interpreted processes such as recasting and
negotiation of meaning as promoting a focus on form while communication itself was
still the primary goal of the encounter. In other words, communicational naturalness is
not compromised, but at the same time, conditions are being created which mean that
form is not forgotten and indeed is being nurtured through the targeted, personalised
feedback that becomes available (Doughty & Williams 1998).
Since the late 1980s a somewhat contrasting approach to researching tasks has
emerged. This strand is less concerned with interaction processes (although these still
figure; Robinson & Gilabert 2007) and is more concerned with task performance
and the processing influences upon it. This performance has been frequently con-
ceived in terms of complexity, accuracy, and fluency. I have argued that it is possible
to conceive of an acquisitional dynamic implicit in these performance areas (Skehan
2009a), such that:
The boldfaced terms are the three areas (complexity, accuracy, fluency) which are
more frequently measured in studies exploring task characteristics and task conditions
and have, recently, been the major focus for a book (Housen, Kuiken & Vedder 2012).
Making the assumption that these three dimensions (which have been shown to be
distinct from one another; Skehan & Foster 1997; Tavakoli & Skehan 2005) are a good
reflection of performance, the goal is to explore what makes it more likely that each
will increase, whether as a function of the task or of the way the task is done. However,
if it is the case that where attentional resources are limited, the natural priority, in a
communicative context, is to emphasise meaning, rather than form (VanPatten 1990,
1996), the danger would be that form can lose focus, and that advanced language, or
control over less advanced language, might be sacrificed to the primary goal of achiev-
ing fluency and meaning expression.
While this changed analysis of performance itself is important, what is of even
more significance is how influences upon that performance are researched, because
there has been a clear switch in general framework. This has involved a move towards
a more cognitive approach, one in which the functioning of attention and working
memory has become more central. While previous studies had considered factors
which were cognitive in nature (associated with the processes of successful negotia-
tion for meaning), these were not often interpreted within wider cognitive theory.
Researchers did not explore how a focus-on-form which arose during communication
(and which would be prominent in working memory) would then make the transi-
tion to long-term memory, or how many such exposures would be necessary to assist
this transition (but see Doughty (2003) for discussion of this issue). In this light, two
frameworks for studying tasks can be usefully contrasted. My own is to see limitations
in attention as fundamental to second language speech performance, an assumption
which leads to the need to explore what attentional and working memory demands
are made by a task, and the consequences this may have for different performance
dimensions. More demanding tasks are assumed to be likely to lead to prioritization
of fluency over accuracy and complexity – what has become known as the ‘tradeoff’
hypothesis. Consistent with this view is the suggestion that tasks based on familiar or
concrete information favour a concern for accuracy, as do tasks which contain a post-task
phase. Similarly, interactive tasks, or tasks requiring transformation or manipulation
of material, or tasks which have had pre-task planning might lead to greater
linguistic complexity. The fundamental principle here then is that demanding tasks
can create problems for a learner/second language speaker because of processing limitations,
and that one can explore methods of mitigating these difficulties, and even trying
to nurture improved performance in all dimensions, through effective use of task
choice and task conditions which overcome attentional limitations.
In contrast to this is the position advocated by Peter Robinson, in the form of the
Cognition Hypothesis (Robinson 2001, 2011). He proposes that attention is not con-
strained in the same way as the Tradeoff position argues, and that it can expand to deal
with the demands placed upon it, under certain conditions. Further he proposes that
task complexity (a construct he discusses at some length) is what drives performance,
and that greater task complexity simultaneously raises accuracy and complexity. The
Tradeoff position, in contrast, argues that the range of attentional demands involved
can cause tasks to become more difficult and that a very common manifestation of
this will be complexity and accuracy being in competition with one another for these
limited resources, with the result that capacity given to one of them will be at the
expense of the other. Apart from arguing that task complexity can increase accuracy
and complexity simultaneously, Robinson also proposes that fluency will be lowered
when there is greater task complexity, an influence on which Tradeoff is generally neu-
tral, since task complexity is seen as influencing what is said rather than how it is said.
Interestingly, therefore, the field has a constructive dispute regarding the impact on
different performance dimensions of making tasks more complex (Robinson 2011)
or difficult (Skehan 1998). Much research has been generated by researchers attempt-
ing to demonstrate that one position or the other is better at accounting for results –
something the present volume will also attempt to do!
It is clear that the Cognition Hypothesis is more ambitious than the Tradeoff posi-
tion. It is more wide-ranging and comprehensive in nature. It also has been the basis
for applications to pedagogy of a fairly extensive nature, especially as regards curricu-
lum or syllabus design (Robinson 2011). It even has the virtue of making what, for me,
are counter-intuitive predictions, predictions of the sort that are likely, if sustained, to
make unanticipated contributions to the field. In contrast, Robinson (2007), in com-
menting on the Tradeoff position, has suggested that it is vacuous in comparison to the
Cognition Hypothesis, arguing that it does not lead to predictions, so much as to post-
hoc rationalisations of results. Cognition, in principle, should not have this weakness,
since predictions should flow from it rather readily.
So it is useful at this point to clarify what the status of the Tradeoff Hypothesis is (in
an attempt to show that it is not vacuous), and the nature of the foundation it provides
for the chapters in this volume. The starting point here is that I do not think we are,
currently, able to put forward strong and wide-ranging models of task performance. It
seems to me that three points of reference are vital. First, we do need to use what we can
from neighbouring disciplines. In that respect, I have argued elsewhere (Skehan 2009a)
that a model of first language speaking, such as Levelt’s (1989, 1999; Kormos 2006) has
to be the starting point for a credible analysis of the psycholinguistic processes involved
in second language speaking. This model (Levelt 1989) is impressive, but it targets a
first language speaker equipped with a first language mental lexicon. It is obviously
not immediately transferable to the second language case, but it does contain struc-
tures and processes which are bound to relate to whatever is done in second language
speaking. So it becomes, for me, an inevitable starting point, even if it has been shown
to have certain limitations for this different context (De Bot 1992). Second, against the
background that this model provides for organising our thinking about second lan-
guage speaking, we then need to explore how attentional and working memory limita-
tions are important in accounting for the differences between first and second language
speaking: this is where the Tradeoff perspective comes in, because it assumes that the
existence of a far less impressive mental lexicon will have strong influences on how
second language speaking will proceed. Analysing second language speaking through
the Levelt model and its component stages allows us to make sense of where and why
problems might occur. The model can bring out the pressures which cause what ideally
should be a parallel process (Conceptualiser, Formulator and Articulator functioning
simultaneously and in modular, automatic fashion) to become serial and effortful. In
this way, we have a sounder basis for making predictions about performance (as some
of the later chapters will show). Without such underpinning psycholinguistic theory,
it is difficult to see how we can make much progress. That is, in order to hypothesise
the effects of task complexity or difficulty on performance, it is essential to understand
the kinds of psycholinguistic processes that underlie the production of a linear stream
of speech. Third, as a foundational target, we need to establish a wide range of gener-
alisations about patterns of performance and to gain understanding about important
variables, in other words to establish a database of findings, before we can go on to build
effective models for the second language case. Such models are desirable, obviously,
but it serves no-one’s interests if what is put forward is premature. The various research
studies in this volume are an attempt to gain understanding about major influences on
second language performance as a precursor to model building. Even so, on occasions
predictions are possible, even if they are quite limited in nature.
So, given the absence of a convincing model of second language speaking, one has
to have some framework for research to counter any tendency for individual studies to
be conducted in a piecemeal fashion, and to run the risk that they do not contribute to
any coherent picture. It is for that purpose that I have put forward a general framework
for the investigation of tasks, a framework which has the potential to organise findings
as they emerge. The outline of the framework is shown in Table 1:
    – information pressure
–– post-task
    – post-task activities
    – post-task exploitation
that application to real situations where tasks are used is facilitated. First of all, we have
tasks themselves, and here a distinction is made between task types and characteris-
tics and task difficulty. The second of these, task difficulty, where difficulty is seen as
inherent in the task, rather than learner-dependent, has, in my view, seen surprisingly
little progress. We are little closer now to having a scale of difficulty which could be
used, for example, to locate any task within a more extensive syllabus. There may be
features which can be argued to have simpler or more difficult values, such as number
of elements, or concrete versus abstract (and some of these figure in Robinson’s analy-
sis of task complexity). So given two tasks, difficulty ranking might well be possible.
But there is the problem that any task is likely to subsume a bundle of features, and
not all of these features are likely to be jointly simpler or jointly more complex, such
as a small number of concrete elements versus a larger number of abstract elements.
Further, tasks, given their nature, are likely to be strongly influenced by context, and so
what is difficult in one (pedagogic) context or for one particular learner may not be so
difficult in another, through, for example, cultural knowledge or age differences. Then
we have the difficulty that many (good) tasks are capable of multiple interpretations, so
that one person could interpret a given task to make it more difficult than another (this
connects with the wider issue of the predictability of tasks). Foster and Skehan (1996)
illustrated this with different participants making radically different interpretations of
the depth at which they should address judgements about custodial sentences for some
crimes they were given to adjudicate on. Overall then, although some progress has
been made on establishing relative task difficulty, a lot of issues remain to be unravelled
before we can reach a position to make any sort of reliable and valid judgement about
a particular task and the identification of factors that impinge upon this.
In contrast, the study of task characteristics has seen greater progress. This
focusses, in a more micro way, on the relationship between specific task characteristics
and particular performance features. Within the constraints of variation through con-
text differences and different task interpretations by different participants mentioned
above, which push for less predictability, the search here is for connections to perfor-
mance from analysable task features, such as type of information (concrete-abstract;
familiar-unfamiliar material) or organisation of this information (e.g. structured-
unstructured). There is a range of findings in this area already (Ellis 2003; Skehan
2001; Foster & Skehan 2012), and we hope to contribute more through this volume.
The Tradeoff approach, fundamentally, tries to uncover such generalisations regard-
ing the link between task characteristics (concreteness, familiarity) and performance
dimensions (complexity, accuracy, fluency), and use them to gain a better under-
standing of how the Levelt model can illuminate the case of second language learners.
A range of generalisations can then provide a more robust basis for understanding
second language performance, as is discussed in the final chapter of this volume, which
considers the impact of influences such as information familiarity and task structure on
different aspects of performance.
performance. But it is possible that, depending on the nature of the planning activities,
it could equally well be a ‘resource-directing’ variable (in Robinson’s terms) and linked
to particular aspects of performance (such as accuracy). Without relevant research, we
cannot know. Similarly, staying with the assumption in the Cognition Hypothesis that
time perspective is seen as a task complexifying variable, a framework such as that in
Table 1 allows us to explore visual support and input dominance (which both operate
in the Here-and-now condition) separately, and not assume that a construct such
as task complexity needs to be invoked. The framework also allows interactions to be
explored. It is a simple matter to hypothesise that different variables have a conjoint
effect (i.e. condition-seeking research; McLaughlin 1980), such that together they produce
an effect that is more than the sum of the parts, such as, for example, planning being
explored in relation to more complex tasks; or structured tasks being studied in relation
to There-and-then processing (see Wang & Skehan, this volume).
So, a framework such as the one shown has to be judged by its utility, by its capac-
ity to organise and generate interesting research results. The ultimate aim, though, has
to be model building, and a more theoretical account of performance (and possibly
development). In my view, the Cognition Hypothesis has rather jumped to this stage
before it has established an adequate empirical grounding. So I would prefer, as a theo-
retical position, something like a modified version of the Levelt model of first language
speaking, adapted for the second language case, and which takes account of the differ-
ences in these two cases, such as a less elaborate second language mental lexicon. This,
at least, gives us some theoretical credentials. But accountability to a range of findings
is fundamental. One has to remember the old saying “logic is the art of going wrong,
with confidence” to moderate our natural tendency to over-theorise. In any case, as we
shall see, the studies reported in this volume fit into the above framework quite nicely,
with several studies of planning, one of the post-task phase, and a number looking at
task characteristics, principally task structure. As we will see in the final chapter, the
framework enables us to consider just how ambitious models in this area can be. But
for that, we need to look at the individual studies in the volume.
There are six empirically based chapters in the book, all based on data collected during
the time I worked in the English Department at Chinese University. Three are based on
Hong Kong Research Grants Council funded studies, all of which were motivated by the
framework from the last section and by my interest in the comparison of Tradeoff and
Cognition accounts. The other three are based on individual Ph.D. dissertations that I
supervised; these came from students interested in similar issues, but working on
research problems that intrigued each of them.
Three of the chapters are concerned with planning, and we will start with these.
Chapter 2 is by Zhan Wang (Jan), and is based on her Ph.D. at Chinese University. Jan
started out interested in general planning, and so examined the literature for inter-
esting possible research questions. She became intrigued by Ellis’ notion of on-line
planning (Ellis 2005). Planning, in general, is benign in its influence, and is associ-
ated with raised task performance, in a fairly general way. But the generalisation that
emerges in the literature is that complexity and fluency are consistently increased,
with large effect sizes, whereas accuracy is not so dependably affected, and when
it is, the effect sizes are smaller. This set of findings has been around for some time
now, and Ellis’ aim was to shed light on why accuracy seems to be more difficult to
influence through planning. He proposes that one can distinguish between pre-task
(or strategic) planning, and on-line planning. Pre-task planning is done, obviously,
before the task itself, with time allocations for this varying, but typically lasting ten
minutes. This, Ellis proposes, is good for complexity and fluency. On-line planning,
he proposes, is the sort of planning which takes place ‘on the fly’ while speaking is
taking place and speakers reorganize their speech while continuing to talk. The Levelt
model fits well with these two types of planning. The model contains modular stages,
so that initial Conceptualisation is followed by different sub-stages within the
macrostage of Formulation, and then Articulation, the actual speech production stage,
completes the process. For any individual speaker all stages operate in parallel, since
different things are happening at the same time in each. Current Formulation, for
example, is the result of previous Conceptualisation, while current Conceptualisation
is yet to impact on the Formulator, but soon will. (See Wang, Chapter 2, this volume,
for a more extensive discussion.)
Ellis proposes that when processing conditions are demanding, the second lan-
guage speaker has difficulty in sustaining this parallel processing, and as a result,
accuracy suffers. In contrast, when processing conditions are less demanding, it is
possible for the speaker, even the second language speaker, to give attention simul-
taneously to Conceptualisation (i.e. preparing plans to be ready for the next stage),
while at the same time devoting enough attention to current Formulation, thereby
achieving greater levels of accuracy. Jan followed this reasoning, but was unhappy
with Ellis’ actual operationalisation of this distinction between planning types.
She felt there was scope to introduce modifications to distinguish between the two
planning types more clearly. So her basic aim was to introduce a methodologi-
cal improvement. But she also wanted to explore more about Levelt’s third gen-
eral stage, Articulation, and she linked this with the use of a Repetition condition.
She reported a set of results which bring out the usefulness of pre-task planning,
and also that of on-line planning, and most interesting of all, of their synergistic
effects in combination. Repetition, too, proved to be a rewarding variable to have
worked with.
Bui Hiu Yuet Gavin (Chapter 3) also looked at planning, but in a different though
similarly creative way. As we have just seen, the typical research design in the plan-
ning literature is to give second language speakers ten minutes to plan before they
do a speaking task. But planning, more broadly, is an aspect of preparedness (see the
earlier section), and this is much wider in scope. Ten minutes are given, after all, to
plan something you had no idea about moments before. However, preparedness can
take many other forms, such as having told the same story before (and see Jan’s use of a
repetition condition in her chapter). Or it could mean being more deeply familiar with
what is talked about and even talking about something which is important already
in one’s life. Bui takes just such an approach in his study. In a clever research design,
he compares the effects of conventional planning (the approach that is typical in the
literature) with the effect of speaking about something familiar. Not only does he pro-
vide interesting results on this issue, he also puts forward a model to cover the various
senses of planning hinted at earlier in this paragraph and the previous section (i.e.
including on-line planning). He prefers the term ‘readiness’ to capture this wider view
of planning. In this way, he contributes to enabling more sophisticated use of the con-
struct of planning in future research.
The third chapter on planning (Chapter 4 in this volume) is based on a study
conducted by Francine Pang and myself as part of a Hong Kong RGC grant. Once
again the starting point is that the planning literature is extensive, that there are now
‘attested’ findings, but that there is more work to be done, particularly at the explana-
tory level. Specifically, there is the issue that although we have a range of findings,
we have tended to ‘black box’ planning, and embed it within quantitative research
designs. Increasingly studies do interview participants after planning (Tavakoli &
Skehan 2005), but this tends to be only to ask if they thought the planning time was
worthwhile, and whether they thought they benefitted. What is missing is thorough
research on what participants say they do when they plan, although there is one major
exception to this – the work of Lourdes Ortega (1995, 1999, 2005). However, as Ortega
herself states, it is remarkable that so little qualitative research has been done to ‘peer
inside’ planning processes. The study reported here is an attempt to redress this state
of affairs, a little at least. We gave participants a narrative to tell, and then Francine
engaged them in retrospective interviews (RIs), probing what they had done during
the planning time that was available. She then developed a coding scheme to catego-
rise the various processes they reported engaging in, a coding scheme which started
from the transcribed interviews, but which, as it happens, can be related to elements
of Levelt’s model of speaking. Next we did something we have not found anywhere in
the literature. We looked at the association between the planning behaviours which
were reported and the actual performances which were produced, generally with the
intention of exploring whether some planning behaviours are more effective than oth-
ers. The chapter reports on how we made sense of the successful behaviours. But it also
gives an account of something which surprised us. It was just as important to discover
which reported planning behaviours were harmful to performance, or at least to cer-
tain aspects of it. So it isn’t simply what you do during planning that counts. It’s just as
important to know what you shouldn’t do.
There is one more study which focusses on a task condition: Chapter 5 by Li Qian
is on what happens after a task, rather than before it. Li Qian (Christina) had become
interested in research that I had done with Pauline Foster (Skehan & Foster 1997;
Foster & Skehan 2013) which had explored the impact on task performance of antici-
pating what activity will follow the task. We had shown, tentatively at first (Skehan &
Foster 1997) but more robustly later (Foster & Skehan 2013), that a post-task activity
can have focused effects on performance. Our initial hypothesis had been that post-
task effects are confined to raising accuracy, as was the case in Skehan and Foster
(1997). The second of our studies (Foster & Skehan 2013) had shown that anticipa-
tion of the need to transcribe some of one’s performance post-task did raise accuracy,
with narrative and decision-making tasks, but also raised complexity with the second
of these tasks. We had hypothesised that a post-task activity would raise pedagogic
targets and induce participants to try to avoid error. In the event, the post-task activity
seemed to impact upon form in general, and also led participants to use more complex
language on one of the two tasks. Christina was interested in this, and liked the way
we had shown that self-transcription had desirable effects. But she felt that post-task
transcription was something of a blunt instrument, as we had used it (though she was
too polite to put it like that), and that there was scope to explore if different opera-
tionalisations of post-task transcription might have different effects on performance.
She complexifies what is possible with a post-task manipulation, using individual vs.
group-based transcription, and transcription with or without rewriting of an ‘ideal’
version. The chapter reports on her study, and shows that the impact of post-task tran-
scription is indeed more complex than had been thought. It also enables another point
to be made, one that could be brought up elsewhere, too. All the studies reported in
this volume have L1 Mandarin or Cantonese speakers, and the data was collected in
Hong Kong, Macao, or in Guangzhou. There are obvious issues of generalisation here,
and one can ask whether the research is relevant for people outside this relatively nar-
row geographical context. Christina’s research is interesting, however, because she is
partially replicating what had been done elsewhere (with a range of different L1s in the
Foster and Skehan research). The fact that she produced results which compare quite
well with those from the earlier studies does give us confidence that generalisation is
indeed possible, and that the findings reported in this volume do not apply only to
particular L1s.
We turn next to studies which have explored task characteristics. First of these is
the chapter I wrote with Zhan Wang (Chapter 6, this volume), which is the product of
an RGC grant on which she was a researcher. In the introductory section, I mentioned
Measurement issues
We have now set the scene for the chapters which follow, and we have introduced each
chapter separately. But there is one additional issue which cuts across all the stud-
ies, and that is how performance was measured. All studies reported in this book use
quantitative data (even the qualitative study of Pang & Skehan). Just as all studies
were conducted within the research framework outlined earlier in this chapter, they
were all based on (roughly) the same approach to measurement, and so it is useful here
to outline this approach. This saves repetition within the individual chapters. It also
offers suggestions for methodological progress in the field, since a fairly comprehen-
sive set of measures is being proposed.
It is worth saying at the outset that there are many things which could be mea-
sured in task performance. However, since such measurements are time-consuming,
what is chosen is invariably chosen at the expense of something else. Interactional
moves (Long 1996), symmetry (Van Lier & Matsuo 2000), discourse markers, con-
versational feedback (Lyster & Ranta 1997) are all possibilities and have been used by
others. But in the present case, and continuing to follow a rather cognitive approach,
the focus will be on complexity, accuracy, fluency, and lexis (Skehan 2009a; Housen &
Kuiken 2009). These dimensions have been used in a huge number of studies now.
They were introduced earlier, with the point that studies have shown independence
between these areas – higher proficiency does not simply mean that people score gen-
erally higher in each dimension. In fact, they can often function very distinctly from
one another, and so it is interesting in itself to explore when they correlate with one
another and when they do not.
Given that these areas are the focus of measurement, one still needs to discuss the
issue of specific versus general measures. Some researchers (e.g. Crookes 1989; Ortega
2005; Robinson et al. 2009) prefer to use specific measures, of, for example, article usage,
or verb concord, or pronoun use. Such a preference may have strong supportive argu-
ments, with the idea that specific measures have greater construct validity, and that
provided one chooses appropriately, individual measures can be used to detect the influ-
ence of experimental conditions. For example, specific measures can reflect the nature
of the task chosen. The use of pronouns, for example, might be justified in this way
with a narrative retelling where clarity of reference is particularly important. The cen-
tral problem with this approach, in my view, is not at all theoretical, but only practical.
When one is working with second language spoken performance, one has to work with
relatively brief performances, often less than five minutes in length. These, not infre-
quently, generate below two hundred words. The difficulty, therefore, is that there may
not be enough tokens to work with. If one wants, for example, to work with pronouns,
it is quite possible that in two hundred words or so, there may not be enough examples
of pronouns (or appropriate pronoun contexts) to give the sensitivity that is required to
detect differences between experimental groups. This, obviously, is a great limitation. It
is for this reason that we have, in this volume, preferred to work with generalised indi-
ces. These do not focus on particular areas, but use measures which draw on as much
of the sample as possible in order to have a sufficiently rich sampling of data. One loses
in precision of hypothesis-making, but one gains in detecting influence. (Logically, one
might use a two-step strategy here – the first phase would try to detect influences which
are strongest, through general measures, and the second phase, in follow-up research,
could use this information with specific measures which would then be more likely to
be sufficiently sensitive).
Next, we need to explore how exactly we used these general measures of complex-
ity, accuracy, fluency and lexis. Before that, though, some general discussion of analytic
procedures is necessary. The CHILDES system (MacWhinney 2000) exists to facilitate
data transcription (through the CHAT set of conventions), with subsequent analysis
of data possible through the associated CLAN suite of programs. The approach was
developed for first language acquisition data but is increasingly used for second lan-
guage spoken performance also (Marsden et al. 2003). It can be very useful, and so
the data I have been associated with for some time has been transcribed in the CHAT
format. But the programs in CLAN, excellent though they are, do not really provide
clear indices relevant to the measurement profiles we want to work with. Accordingly,
in all the research which follows, data is coded in a modified CHAT format, in that
additional timing information is provided, capturing the beginning and end points of
each AS-unit, and then an extra line is added to the coding for each AS unit (Foster
et al. 2000), and this line contains codes relevant for the calculation of a range of com-
plexity, accuracy, fluency and lexis measures. These will be described more fully below.
The issue, then, is how to analyse the additional fourth line. This is done, for all the
chapters in this volume, through the use of TaskProfile, a computer program written in
Delphi, which outputs a wide range of measures. The disadvantage of this is that tran-
scription and coding have to follow a set of conventions exactly. The advantage is that
once the time-consuming transcription and coding are done, the program generates
results virtually instantly. In addition, there is the advantage that if additional ideas
about performance emerge, new measures can be developed and incorporated into the
program fairly easily, and then new results obtained. Accordingly, the next section will
outline the measures that are available. Obviously the focus will be on the measures
which have actually been used in the studies that follow, but brief mention may also
be made about other measures which are output by TaskProfile, since, methodologi-
cally, they could have value for other researchers. The following sections will explore
measures of complexity, accuracy, fluency, and lexis.
The ‘standard’ measure of complexity which has been used in many task-based
studies is that of degree of subordination. No-one believes that subordination, in itself,
is complexity, but it is taken as a good, general-purpose surrogate measure. CHAT
codings are in terms of AS units, a measure Foster et al. (2000) argue is more appro-
priate for spoken language than the T-unit, and then TaskProfile computes the ratio of
total clauses, that is, main clauses plus other clauses, finite and non-finite, to AS units.
The minimum value here would be 1, with no subordination at all, so that the number
of total (matrix) clauses and the number of AS units are identical. Typically, for
second language speakers, one gets values above 1.2, generally but not always below 2,
with group means often in the range 1.4 to 1.6. This index has been used widely and
has been shown to be sensitive to experimental differences in a consistent way. How-
ever, it is far from the only method of assessing complexity. Measures of range of struc-
tural use have also been used (Foster & Skehan 1996), but their use is not widespread.
More recently, Norris and Ortega (2009) have proposed that the subordination mea-
sure is not so effective at higher levels of proficiency. Indeed, studies which have native
speaker baseline data provide some supportive evidence for this, since native and non-
native speaker groups often do not differ very much on subordination scores (Skehan
2009b). Norris and Ortega (2009) propose instead that measures based on the number
of words in clauses capture a different dimension of complexity, and that this is more
sensitive to differences at higher proficiency levels. Accordingly, TaskProfile computes
the scores for number of words in AS units, in matrix clauses, in finite subordinate
clauses, and in non-finite subordinate clauses. Several of the chapters draw on this
possibility as appropriate and in so doing allow an investigation, in passing, of the
construct validity of the two types of complexity measures.
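The two families of complexity index described above can be illustrated with a minimal sketch. This is not TaskProfile itself (which is written in Delphi and reads coded CHAT transcripts); the data structures, the clause labels, and the function names below are all hypothetical, chosen only to make the arithmetic concrete. Words per AS unit are approximated here as the sum of words in the unit's clauses, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    kind: str   # hypothetical labels: "matrix", "finite_sub", "nonfinite_sub"
    words: int  # number of words in the clause


@dataclass
class ASUnit:
    clauses: List[Clause]


def subordination_ratio(units: List[ASUnit]) -> float:
    """Total clauses (main plus subordinate) divided by number of AS units."""
    total_clauses = sum(len(u.clauses) for u in units)
    return total_clauses / len(units)


def mean_words_per_clause(units: List[ASUnit], kind: str) -> float:
    """Average length, in words, of clauses of one kind."""
    lengths = [c.words for u in units for c in u.clauses if c.kind == kind]
    return sum(lengths) / len(lengths) if lengths else 0.0


def mean_words_per_as_unit(units: List[ASUnit]) -> float:
    """Average words per AS unit (assumed: sum of the unit's clause lengths)."""
    return sum(c.words for u in units for c in u.clauses) / len(units)


# Invented sample: four AS units, two of which contain a subordinate clause.
sample = [
    ASUnit([Clause("matrix", 6)]),
    ASUnit([Clause("matrix", 5), Clause("finite_sub", 7)]),
    ASUnit([Clause("matrix", 4), Clause("nonfinite_sub", 3)]),
    ASUnit([Clause("matrix", 8)]),
]

print(subordination_ratio(sample))  # 6 clauses / 4 AS units = 1.5
```

With no subordination at all, every AS unit holds exactly one clause and the ratio falls to its minimum of 1.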
The next performance area to consider is that of accuracy. Here the ‘standard’
index is to compute the proportion of all clauses that are error-free, so that clearly we
are dealing here with values between the minimum of 0 and a maximum of 1. This
index too has been very serviceable, and has been used in very many studies. On the
basis of this work, one could claim that the index is a good way of detecting a differ-
ence if there is one. But although it may be the most widely used method of measur-
ing accuracy in task-based performance, it is not the only option. Mehnert (1998)
proposed that the measure is influenced by the clause structure of the language being
used, and that for L2 German, a more appropriate measure is errors per 100 words.
She proposes that this is a better general measure, since more or less ‘clausality’ in a
language does not affect it, and so it has greater crosslinguistic comparative utility.
Skehan and Foster (2005) introduced yet another method of measuring accuracy.
They were worried that an error-free clauses measure was vulnerable to distortion
in cases where a speaker used a large number of very short clauses in their speech,
with these clauses being more likely to be correct. The resulting score, they proposed,
might be inflated, and not constitute a ‘true’ index of accuracy. So they proposed
building in to a measure of accuracy a safeguard through using clause length. Their
proposal is that one ranks all the clauses that have been produced for number of
words. Hence, one would bring together all two word clauses, all three word clauses,
and so on, up to whatever length of clause is produced. Then they propose calculating
the proportion of clauses that are correct for each length. Finally, one takes this infor-
mation, establishes a criterion, and then establishes the cutoff point that a particular
participant has reached. So, if someone produced the following data: 2-word, 100%
correct; 3-word, 100% correct, 4-word, 90% correct, 5-word, 80% correct, 6-word,
70% correct, 7-word, 60% correct, 8-word, 50% correct, 9-word, 30% correct, and if
one had set the criterion of 70% correctness, then this speaker would be awarded a
score of 6, since 6 words is the highest length that is being produced at the requisite
accuracy.
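The three accuracy indices just described can be sketched as follows. This is only an illustration under stated assumptions: each clause is represented as a hypothetical (word count, error count) pair, the function names are invented, and the cutoff function applies the criterion strictly, stopping at the first clause length that falls below it.

```python
from collections import defaultdict


def error_free_proportion(clauses):
    """Proportion of clauses containing no errors (between 0 and 1)."""
    return sum(1 for _, errors in clauses if errors == 0) / len(clauses)


def errors_per_100_words(clauses):
    """Errors standardised per 100 words, the measure Mehnert (1998) prefers."""
    total_words = sum(words for words, _ in clauses)
    total_errors = sum(errors for _, errors in clauses)
    return 100 * total_errors / total_words


def proportion_correct_by_length(clauses):
    """Group clauses by length and give the proportion correct at each length."""
    totals, correct = defaultdict(int), defaultdict(int)
    for words, errors in clauses:
        totals[words] += 1
        if errors == 0:
            correct[words] += 1
    return {n: correct[n] / totals[n] for n in sorted(totals)}


def length_cutoff(prop_by_length, criterion=0.7):
    """Longest clause length reached before accuracy first drops below criterion."""
    score = 0
    for length in sorted(prop_by_length):
        if prop_by_length[length] < criterion:
            break
        score = length
    return score


# The worked example from the text, entered directly as proportions correct.
worked_example = {2: 1.0, 3: 1.0, 4: 0.9, 5: 0.8, 6: 0.7, 7: 0.6, 8: 0.5, 9: 0.3}
print(length_cutoff(worked_example, criterion=0.7))  # 6
```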
Of course, not all speakers are considerate enough to produce scores which
lend themselves to scoring so neatly. Skehan and Foster (2005) therefore propose deci-
sion rules to handle more difficult cases. If a speaker, for example, fails a criterion at
a particular level, but then produces the next two clause lengths at criterion or better,
they are excused the ‘blip’ of the one level where they failed. But this only applies if
there is one blip. If they fail at more than one level consecutively, the score is given as
the length of clause at the last level that reached criterion. This additional rule handles
difficult cases very well, and so this becomes a reliable measure to use. It is important
to say that this illustrates the advantages of using a computer program to score coded
spoken language performance. As far as TaskProfile is concerned, there is negligible
additional effort in scoring accuracy in this way. The decision as to the score to be
assigned is made manually, that is, by the researcher, but this is based on the data which
is laid out by the program in tabular form, and so is the work of an instant.
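Although the decision is made by the researcher, the decision rule itself can be written down explicitly, which may help to show how it behaves in awkward cases. The sketch below is one reading of the rule, not the authors' procedure: it assumes that only clause lengths actually produced are considered, that a failed level is excused only when the next two produced lengths both reach criterion, and that at most one such 'blip' is excused per speaker. The function name is hypothetical.

```python
def length_cutoff_with_blip(prop_by_length, criterion=0.7):
    """Clause-length accuracy score, excusing a single isolated failure.

    prop_by_length maps clause length (in words) to proportion correct.
    """
    lengths = sorted(prop_by_length)
    score = 0
    blip_used = False
    i = 0
    while i < len(lengths):
        if prop_by_length[lengths[i]] >= criterion:
            score = lengths[i]  # this level reaches criterion
            i += 1
            continue
        # This level fails: it is excused only if the next two levels recover.
        next_two = lengths[i + 1:i + 3]
        recovers = len(next_two) == 2 and all(
            prop_by_length[n] >= criterion for n in next_two
        )
        if blip_used or not recovers:
            break  # score stays at the last level that reached criterion
        blip_used = True
        i += 1
    return score


# Invented data: one blip at 4 words, followed by recovery at 5 and 6 words.
one_blip = {2: 1.0, 3: 1.0, 4: 0.6, 5: 0.8, 6: 0.75, 7: 0.5, 8: 0.4}
print(length_cutoff_with_blip(one_blip))  # 6

# Invented data: two consecutive failures, so nothing is excused.
two_fails = {2: 1.0, 3: 0.5, 4: 0.5, 5: 0.9, 6: 0.9}
print(length_cutoff_with_blip(two_fails))  # 2
```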
[Figure: fluency measures, arranged as a tree]
Flow
  Pausing
    Silent: clause boundary; mid-clause
    Non-silent: filled (e.g. ‘um’, ‘ah’); pseudo-filled (e.g. ‘like’, ‘actually’)
  Repair: reformulation; replacement; repetition; false starts
Speed: pruned speech rate; unpruned speech rate
There are a number of distinctions within this category. First there is the distinction
between silent and non-silent pauses. Regarding silent pauses, one first has to deal with
the problem of how long (or short) a pause needs to be to qualify as a pause or not. In
the present research, 0.4 seconds is used as the cutoff point. Other values have been
proposed to distinguish between simply taking breath and actually pausing, some as
low as 0.28 seconds. We use 0.4 in the present case as a sort of compromise measure,
brief enough to capture very small interruptions to the speech stream, but long enough
to make manual coding feasible. Within silent pauses (termed ‘breakdown fluency’,
Skehan 2009b) one can examine separately pauses which occur at clause boundaries
and those which occur mid-clause. Segalowitz (2010), in a major review of fluency, sug-
gests that it is particularly useful to look at measures which distinguish most effectively
between native and non-native speakers. In that respect, since native-speaker speech
contains quite a bit of clause boundary pausing, but much less mid-clause pausing, a
measure of such mid-clause pausing might be particularly effective (Skehan 2009b).
In addition, one could compute derivative measures, such as the ratio of boundary to
mid-clause pauses as another way to detect level of disruption to speech. One could
also explore the average length of pause, either at clause boundaries or mid-clause.
However, one can also regard certain verbal behaviours as constituting filled
pauses, and these come in two main flavours. First we have ‘classic’ filled pauses, such
as ‘um’, ‘ah’, etc, where some interjection is placed into the speech stream to ‘buy time’
as it were. There is no meaning, and possibly little difference between this and an
unfilled pause (save that it may be more effective at keeping the floor). But one does
need to consider such interjections as pauses. In addition, one can argue that certain
forms of actual speech are more properly regarded as pauses, in that they contribute
no meaning to the ongoing discourse, and serve mainly to ease the pressure of time,
and perhaps keep the floor also. Forms such as ‘you know’ and ‘like’ can be analysed in
this way, as perhaps can a word such as ‘actually’, although in coding data it is important
to distinguish between these words used meaningfully and their use as pseudo-filled
pauses. ‘Actually’, for example, is not always empty in meaning. Since all these pausing
measures, unfilled as well as filled, are affected by how much is said, and especially how
many words are used, they are standardised per 100 words.
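The silent-pause measures discussed above can be sketched in a few lines. The representation of a pause as a hypothetical (duration in seconds, position) pair, with position either "boundary" or "mid", is an assumption made for illustration, as are the function names and the sample data; only the 0.4 second cutoff and the standardisation per 100 words come from the text.

```python
PAUSE_THRESHOLD = 0.4  # seconds; the cutoff used in the present research


def per_100_words(count, word_count):
    """Standardise a raw count per 100 words of speech."""
    return 100 * count / word_count


def mean(values):
    """Arithmetic mean, or None when there is nothing to average."""
    return sum(values) / len(values) if values else None


def pause_measures(pauses, word_count, threshold=PAUSE_THRESHOLD):
    """Summarise silent pauses at clause boundaries and mid-clause."""
    boundary = [d for d, pos in pauses if d >= threshold and pos == "boundary"]
    mid = [d for d, pos in pauses if d >= threshold and pos == "mid"]
    return {
        "boundary_per_100": per_100_words(len(boundary), word_count),
        "mid_per_100": per_100_words(len(mid), word_count),
        "boundary_to_mid_ratio": len(boundary) / len(mid) if mid else None,
        "mean_boundary_length": mean(boundary),
        "mean_mid_length": mean(mid),
    }


# Invented sample: six silences in a 200-word performance; the 0.25 s silence
# falls below the cutoff and so is not counted as a pause.
sample_pauses = [
    (0.5, "boundary"), (0.25, "mid"), (1.0, "mid"),
    (1.5, "boundary"), (1.0, "boundary"), (0.5, "mid"),
]
measures = pause_measures(sample_pauses, word_count=200)
print(measures["mid_per_100"])  # 1.0
```

Filled and pseudo-filled pauses could be standardised with the same `per_100_words` helper once they have been coded.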
Besides indices of a breakdown in the flow of speech, there are also occasions
where the speaker attempts to make changes to what is being said, rather than simply
having problems saying it. These attempts, which have been termed ‘repair fluency’ (or
more properly ‘repair dysfluency’), can be realized in many different ways, as reported
in the second language speaking literature. Reformulations are occasions where the
speaker changes what has been said by modifying the syntax or morphology, either
by changing something or by inserting or deleting something. In contrast, Replace-
ment, which also consists of change, is focused on lexical elements, so that syntax and
morphology remain unchanged, but something is done about the actual words which
are used. Repetition is self-evident: words or sequences of words are simply repeated,
without any intervening material. False Starts are occasions where something is aban-
doned, and some new form of expression is used. Of course, as with number of pauses,
there is the issue with all these measures that they occur more when speakers say
more. Accordingly, they are standardised per 100 words of discourse.
In contrast to measures of flow, one can also look at the speed with which lan-
guage is produced. Logically, one can separate flow and speed, and imagine some-
one who paused a lot, but who, when they were speaking, spoke fast, and the reverse,
someone who speaks slowly but without interruption to the flow. Hence, it is useful
to have measures of each, distinct from one another (DeJong et al. 2012). Typically
measures of speech rate, expressed as words or syllables per minute, are either pruned
or unpruned. In the latter case, the raw number of words is used, including repetitions,
reformulations and so on. In the former case, the additional material is removed, and
the measure is of meaningful, contributing words or syllables per minute.
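The difference between the two speech rate measures is easily shown. In the sketch below each token is a hypothetical (word, is_repair_material) pair, where the flag marks repetitions, reformulations and similar material to be pruned; this coding, the function names and the sample utterance are invented for illustration, and rate is expressed in words (not syllables) per minute.

```python
def words_per_minute(word_count, duration_seconds):
    """Speech rate in words per minute."""
    return 60 * word_count / duration_seconds


def pruned_and_unpruned_rate(tokens, duration_seconds):
    """Return (pruned, unpruned) speech rate for a list of coded tokens."""
    unpruned_count = len(tokens)
    pruned_count = sum(1 for _, is_repair in tokens if not is_repair)
    return (
        words_per_minute(pruned_count, duration_seconds),
        words_per_minute(unpruned_count, duration_seconds),
    )


# Invented utterance lasting 3 seconds: "the the boy went goes to school",
# with the repeated "the" and the reformulated "went" marked for pruning.
utterance = [
    ("the", True), ("the", False), ("boy", False), ("went", True),
    ("goes", False), ("to", False), ("school", False),
]
pruned, unpruned = pruned_and_unpruned_rate(utterance, duration_seconds=3)
print(pruned, unpruned)  # 100.0 140.0
```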
In addition to the measures we have now covered, there are two more to be con-
sidered. They are both, effectively, composite measures of aspects of dysfluency. Pho-
nation time simply captures the proportion of the time speech is taking place, and
subtracts from total time the time spent pausing, so it reflects not simply number
of pauses but also length of pausing. Length of Run is a measure of the average span
in speech without any sort of interruption, whether a pause or a repair, and has been
a spoken text, and could be taken as an indicator of the propositional density of text.
(But see Skehan 2009b, for additional discussion of such measures.)
The final aspect of lexis to be considered is lexical sophistication. In ideal form,
this reflects the extent to which the speaker draws upon difficult words in what they
say. In practice, defining difficulty is not so easy, and the typical approach which is
taken is to use frequency (or rather low frequency) as a surrogate for difficulty. So,
the task becomes one of finding a measure to capture the extent to which less fre-
quent words are used by a speaker doing a task. Laufer and Nation’s (1999) Lexical
Frequency Profile is one means of doing this, but when the present research program
started, before my move to Hong Kong, there was no means of using this based on
any spoken language corpus. Accordingly, I adapted a computer program written by
Paul Meara, Plex (Meara & Bell 2001). This program divides a text up into ten word
chunks, and then calculates how many words in each ten-word chunk are of a lower
frequency. It then uses a Poisson Distribution (developed to model low frequency
events) to estimate a parameter, Lambda, which captures the extent to which the text
draws upon lower frequency words. This approach has been shown to be effective
with quite short texts (Bell 2003), and so it is useful for texts such as we use in the
task-based field. I adapted Meara’s original program in three ways. First, I was able
to use a different frequency corpus as the basis for the decision making with any
particular word. I based my version of his program on the spoken component of the
British National Corpus. Second, I built a BNC-based reference dictionary for the
program which was lemmatised, and so the program outputs both lemmatised and
non-lemmatised values of Lambda for any particular text (although typically it is
the lemmatised value which is used). Finally, the program allows the user to specify
the cutoff frequency to separate low and high frequency words. In all uses of the
program in chapters in this volume, this was set as 150 occurrences per million run-
ning words. Further details on this statistic, as well as a discussion of other aspects
of lexical measurement in second language spoken language are provided in Skehan
(2009b).
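The logic of the Lambda statistic can be conveyed with a simplified sketch. It is not the adapted Plex program: it assumes a hypothetical dictionary mapping each word to its frequency per million, treats any word missing from that dictionary as low frequency, drops an incomplete final chunk, omits lemmatisation, and estimates Lambda as the mean number of low-frequency words per ten-word chunk (the maximum-likelihood estimate of a Poisson rate), which may differ from the fitting procedure the program actually uses. The frequencies in the example are invented and are not BNC values.

```python
def lambda_estimate(words, freq_per_million, cutoff=150, chunk_size=10):
    """Mean count of low-frequency words per complete chunk of the text."""
    chunks = [
        words[i:i + chunk_size]
        for i in range(0, len(words) - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        raise ValueError("text is shorter than one chunk")
    counts = [
        # A word counts as low frequency if it falls below the cutoff;
        # words absent from the dictionary are assumed to be low frequency.
        sum(1 for w in chunk if freq_per_million.get(w.lower(), 0.0) < cutoff)
        for chunk in chunks
    ]
    return sum(counts) / len(counts)


text = (
    "the man walked into a derelict house and saw an "
    "enormous spider on the wall so he ran away quickly"
).split()

# Invented frequencies per million words, for illustration only.
common = {"the", "man", "walked", "into", "a", "house", "and", "saw", "an",
          "on", "wall", "so", "he", "ran", "away", "quickly"}
frequencies = {w: 1000.0 for w in common}
frequencies.update({"derelict": 2.0, "enormous": 40.0, "spider": 10.0})

print(lambda_estimate(text, frequencies))  # (1 + 2) / 2 = 1.5
```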
As a result, we have three measuring procedures, lexical diversity, lexical density,
and lexical sophistication (with these terms taken from Read 2000). The first question
to consider is what each of them measures, and second, how they interrelate. One can
propose the following:
–– Lexical diversity captures the extent to which a speaker draws upon a wide range of
words in what they say, compared to a speaker who recycles a smaller set of words.
The measure is neutral as to whether high or low frequency words are used. What
is important is how words relate to other words within that text. For this reason
lexical diversity is referred to as a ‘text-internal’ measure (Daller et al. 2003).
–– Lexical density reflects the penetration within a text of content words, as opposed
to reliance on structure words. It is thought to reflect the density of propositions
in the text, and is also considered to be likely to be different in spoken and written
language.
–– Lexical sophistication is an index of the speakers’ capacity or preference for using
less frequent words, which presupposes knowledge of such words (implying a
larger second language lexicon) as well as a capacity to mobilise them on-line
(lexical accessibility).
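To make the contrast between the first two of these constructs concrete, the following sketch computes one simple index of each. It should be read with caution: the raw type-token ratio is used only as the simplest text-internal stand-in for lexical diversity (it is known to fall as texts get longer, so corrected indices are often preferred), and the short list of structure words is a hypothetical placeholder rather than a principled inventory.

```python
# Hypothetical, deliberately small list of structure (function) words.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "but", "or", "of", "to", "in", "on", "at",
    "he", "she", "it", "they", "is", "was", "were",
}


def type_token_ratio(words):
    """Distinct word forms divided by total words (text-internal diversity)."""
    tokens = [w.lower() for w in words]
    return len(set(tokens)) / len(tokens)


def lexical_density(words, function_words=FUNCTION_WORDS):
    """Proportion of tokens that are content words rather than structure words."""
    tokens = [w.lower() for w in words]
    content = [w for w in tokens if w not in function_words]
    return len(content) / len(tokens)


sample_text = "the boy saw the dog and the dog saw the boy".split()
print(type_token_ratio(sample_text))  # 5 types / 11 tokens
print(lexical_density(sample_text))   # 6 content words / 11 tokens
```

Neither index consults word frequency, which is what separates them from a measure of lexical sophistication.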
In a sense, these characterisations are tantalising. They hint at differences, but there seems
to be a considerable degree of overlap as well. Wouldn’t high lexical diversity tend to go
with lexical sophistication, for example? However, Skehan (2009b) demonstrated quite
a bit of independence between measures in each of these areas. The truth is that at the
moment we are equipped with measures but are not sure exactly what they are getting at.
Typically, in second language studies (Skehan 2009b), measures of lexical sophistication
are more likely than the other measures to show differences between groups or
conditions. But there is a case for including all of them: lexical diversity, because it has been the
measure of choice more often than any other; lexical density, because of Halliday’s (1975)
theoretical justification of this construct; and lexical sophistication, not only because
it may be the best bet for detecting differences, but also because size of mental lexicon
may be an issue with second language learners, and so such a measure may reveal how
different tasks and different task conditions enable second language speakers to draw on
that lexicon more or less effectively. In any case, we will see in some of the chapters in
this volume that characterising second language task performance without incorporat-
ing measures of lexical involvement is a hazardous undertaking.
In all, we can conclude that many measures are available. It is to be hoped, as the
reader goes through the chapters in this volume, that this range of measurement pos-
sibilities is itself put to the test, and we may gain some insights as to which measures
are most effective in such contexts.
References
Bell, H. (2003). Using frequency lists to assess L2 texts. Unpublished Ph.D. thesis, University of Swansea.
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition,
11, 367–383.
Daller, H., Van Hout, R., & Treffers-Daller, J. (2003). Lexical richness in the spontaneous speech of
bilinguals. Applied Linguistics, 24, 197–222.
De Bot, K. (1992). A bilingual production model: Levelt's “Speaking” model adapted. Applied
Linguistics, 13, 1–24.
De Jong, N., Steinel, M.P., Florijn, A., Schoonen, R., & Hulstijn, J. (2012). The effect of task complexity
on functional adequacy, fluency and lexical diversity in speaking performances of native and
non-native speakers. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 perfor-
mance and proficiency: Complexity, accuracy, and fluency in SLA (pp. 121–142). Amsterdam:
John Benjamins.
Doughty, C. (2003). Instructed SLA: Constraints, compensation, enhancement. In C. Doughty &
M.H. Long (Eds.), The handbook of second language acquisition (pp. 256–310). Oxford: Blackwell.
Doughty, C., & Williams, J. (1998). Pedagogical choices in focus on form. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 197–262). Cambridge: CUP.
Ellis, R. (2003). Task-based language learning and teaching. Oxford: OUP.
Ellis, R. (2005). Planning and task-based performance: Theory and research. In R. Ellis (Ed.), Plan-
ning and task performance in a second language (pp. 3–34). Amsterdam: John Benjamins.
Foster, P. (2001). Lexical measures in task-based performance. Paper presented at the AAAL Confer-
ence, Vancouver, Canada.
Foster, P. & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18, 3, 299–324.
Foster, P., & Skehan, P. (2012). Complexity, accuracy, fluency and lexis in task-based performance: a
synthesis of the Ealing research. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2
performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 199–220). Amster-
dam: John Benjamins.
Foster, P., & Skehan, P. (2013). Anticipating a post-task activity: The effects on accuracy, complexity
and fluency of L2 language performance. Canadian Modern Language Review 69, 3, 249–273.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21, 354–375.
Foster, P., & Wigglesworth, G. (2010). Towards a new measure of accuracy in task-based second
language performance. English Department, St.Mary’s University, Twickenham.
Halliday, M.A.K. (1975). Spoken and written language. Geelong: Deakin University Press.
Housen, A., & Kuiken, F. (2009). Complexity, accuracy, and fluency in second language acquisition.
Applied Linguistics, 30, 461–473.
Housen, A., Kuiken, F., & Vedder, I. (2012). Dimensions of L2 performance and proficiency: Complex-
ity, accuracy, and fluency in SLA. Amsterdam: John Benjamins.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Laufer, B., & Nation, P. (1999). A vocabulary-size test of controlled productive ability. Language
Testing, 16, 33–51.
Levelt, W.J. (1989). Speaking: From intention to articulation. Cambridge: CUP.
Levelt, W.J. (1999). Language production: a blueprint of the speaker. In C. Brown & P. Hagoort (Eds.),
Neurocognition of Language (pp. 83–122). Oxford: OUP.
Long, M.H. (1996). The role of the linguistic environment in second language acquisition. In W.
Ritchie & T. Bhatia (Eds.), Handbook of Research on Second Language Acquisition, (pp. 413–468).
New York, NY: Academic Press.
Lynch, T. (2001). Seeing what they meant: transcribing as a route to noticing. English Language
Teaching Journal, 55, 124–132.
Lynch, T. (2007). Learning from the transcript of an oral communication task. English Language
Teaching Journal, 61, 311–319.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake: Negotiation of form in com-
municative classrooms. Studies in Second Language Acquisition, 19, 37–66.
Long, M., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom SLA (pp. 15–41). Cambridge: CUP.
MacWhinney, B. (2000). The CHILDES Project: Tools for analysing talk, Volume 1: Transcription for-
mat and programs (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews
using a new measure of lexical diversity. Language Testing, 19, 85–104.
Marsden, E., Myles, F., Rule, S., & Mitchell, R. (2003). Using CHILDES tools for researching second
language acquisition. In S. Sarangi & T. van Leeuwen (Eds.), Applied Linguistics and Communi-
ties of Practice (Vol. 18, pp. 98–113). London: BAAL/Continuum.
McLaughlin, B. (1980). Theory and research in second language learning: An emerging paradigm.
Language Learning, 30, 331–350.
Meara, P., & Bell, H. (2001). P_Lex: A simple and effective way of describing the lexical characteris-
tics of short L2 texts. Prospect, 16(3), 5–19.
Mehnert, U. (1998). The effects of different lengths of time for planning on second language perfor-
mance. Studies in Second Language Acquisition, 20, 83–108.
Norris, J., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA:
The case of complexity. Applied Linguistics, 30, 555–578.
Ortega, L. (1995). The effect of planning on L2 Spanish narratives. Research Note 15, Honolulu, HI:
University of Hawai’i Second Language Teaching and Curriculum Center.
Ortega, L. (1999). Planning and focus-on-form in L2 oral performance. Studies in Second Language
Acquisition, 21, 109–148.
Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam:
John Benjamins.
Read, J. (2000). Assessing vocabulary. Cambridge: CUP.
Richards, B. & Malvern, D. (2007). Validity and threats to the validity of vocabulary measurement.
In H. Daller, J. Milton, J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge
(pp. 79–92). Cambridge: CUP.
Robinson, P. (2001). Task complexity, task difficulty, and task production: Exploring interactions in
a componential framework. Applied Linguistics, 22, 27–57.
Robinson, P. (2007). Rethinking-for-speaking and L2 task demands: The Cognition Hypothesis, task
classification, and sequencing. Paper presented at the 2nd International Conference on Task-
based Language Teaching, University of Hawaii, September 2007.
Robinson, P., & Gilabert, R. (2007). Task complexity, the Cognition Hypothesis, and second language
learning and performance. International Review of Applied Linguistics, 45, 161–176.
Robinson, P., Cadierno, T., & Shirai, Y. (2009). Time and motion: Measuring the effects of the
conceptual demands of tasks on second language speech production. Applied Linguistics, 30,
533–544.
Robinson, P. (2011). Second language task complexity, the Cognition Hypothesis, language learn-
ing, and performance. In P. Robinson (Ed.), Second language task complexity: Researching
the Cognition Hypothesis of language learning and performance (pp. 3–38). Amsterdam: John
Benjamins.
Samuda, V. (2001). Guiding relationships between form and meaning during task performance: the
role of the teacher. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks:
Second language learning, teaching, and testing. London: Longman.
Segalowitz, N. (2010). Cognitive bases of second language fluency. London: Routledge.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: OUP.
Skehan, P. (2001). Tasks and language performance. In M. Bygate, P. Skehan, & M. Swain (Eds.),
Researching pedagogic tasks: Second language learning, teaching, and testing (pp. 167–185).
London: Longman.
Skehan, P. (2009a). Modelling second language performance: Integrating complexity, accuracy, flu-
ency and lexis. Applied Linguistics, 30, 510–532.
Skehan, P. (2009b). Models of speaking and the assessment of second language proficiency. In
A. Benati (Ed.), Issues in second language proficiency (pp. 202–215). London: Continuum.
Skehan, P. (2011). Researching tasks: Performance, assessment, pedagogy. Shanghai: Shanghai Foreign
Language Education Press.
Skehan, P. (2013). Nurturing noticing. In J.M. Bergsleithner, S.N. Frota, & J.K. Yoshioka (Eds.),
Noticing and second language acquisition: Studies in honor of Richard Schmidt. (pp. 169–180).
Honolulu, HI: National Foreign Language Resource Center.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1, 185–211.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions on narrative
retellings. Language Learning, 49, 93–120.
Skehan, P., & Foster, P. (2005). Strategic and on-line planning: The influence of surprise information
and task time on second language performance. In R. Ellis (Ed.), Planning and task performance
in a second language (pp. 193–216). Amsterdam: John Benjamins.
Tavakoli, P., & Skehan, P. (2005). Planning, task structure, and performance testing. In R. Ellis
(Ed.), Planning and task performance in a second language (pp. 239–273). Amsterdam: John
Benjamins.
Towell, R., Hawkins, R., & Bazergui, N. (1996). The development of fluency in advanced learners of
French. Applied Linguistics, 17, 84–115.
Van Lier, L. & Matsuo, N. (2000). Varieties of conversational experience: Looking for learning oppor-
tunities. Applied Language Learning, 11, 265–288.
Van Patten, B. (1990). Attending to content and form in the input: an experiment in consciousness.
Studies in Second Language Acquisition, 12, 287–301.
Van Patten, B. (1996). Input processing and grammar instruction in second language acquisition.
Norwood, NJ: Ablex.
Van den Branden, K. (2006). Task-based language education: From theory to practice. Cambridge:
CUP.
Willis, J. (1996). A framework for task-based learning. London: Longman.
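chapter 2
On-line time pressure manipulations: L2 speaking performance under
five types of planning and repetition conditions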
Zhan Wang
University of Pittsburgh
Introduction
Processes of L1 speaking
Speaking in one’s native language is an easy, fluent, and automatic process. According
to Levelt’s model of first language (L1) speech production (Levelt 1989, 1993, 1999),
information processing in L1 speaking contains the stages of conceptualization,
formulation, and articulation as well as a speech comprehension system by which
the outputs of each stage can be monitored: the pre-verbal message, the inner speech
plan, and the overt speech plan. L1 speaking to a large extent is an incremental,
parallel, and automatic process (Kormos 2006, pp. 7–8).
To say that L1 speaking is incremental means that speech information needs to flow
from the Conceptualizer through the Formulator to the Articulator for speech to be
produced. Through
these processors, information is delivered from a larger unit in the hierarchy first to
intermediate, and then to even smaller, subunits (Lashley 1951; Meyer & Gordon
1985), which is the opposite process to word recognition and reading (i.e. decoding
the printed material to meaning and from smaller units such as phonemes, syllables,
and words, to larger units such as the meaning of sentences and discourse). For exam-
ple, when a large unit of speech intention has been formed, the brain searches for a
mapping between the pre-verbal message and specific target lexical nodes based on
the relevant semantic and syntactic information in the lexicon – the process of lexi-
cal selection (Levelt 2001). Then the activated lexical nodes (e.g. play-verb) get mor-
phologically encoded (e.g. played-past tense verb) and are passed to the phonological
encoder for the activation of phonological and phonetic encoding (e.g. /pleid/) (Levelt,
Roelofs & Meyer 1999; Schriefers, Meyer & Levelt 1990). Finally, through the motor
action from the Articulator, the small units of each phonetic word are articulated –
what we perceive as speech.
L1 speech production is also a parallel and highly automatic process, two impor-
tant aspects of fluent L1 speaking. In Levelt’s model, for example, as soon as the Con-
ceptualizer passes information to the Formulator, the Conceptualizer starts to work
on the next piece of information regardless of the fact that the last piece of informa-
tion is still being processed by the Formulator (Kempen & Hoenkamp 1987; cited
in Kormos 2006, p. 8). This efficient parallel processing is based on the claim that
conceptualization in speech production is likely to be the only controlled process that
requires conscious awareness. Both Formulator and Articulator to a great extent work
automatically without much conscious awareness1 (Levelt 1989, p. 20). Researchers
may not have consensus on exactly how morphological transformations are computed
when retrieving linguistic forms. For example, researchers who argue for statisti-
cal learning rules (Seidenberg 1997) posited that low frequency forms are less easily
retrieved from mental lexicon or mental grammar. That is why sometimes speakers
hesitate about the precise phrasing to be used. However, as to the high frequency L1
suppose our technology can distinguish brain activity from the three processors (i.e.
Conceptualizer, Formulator, and Articulator), at Tn, Information Chunk n (ICn) is
processed through the Conceptualizer first. When ICn completes processing at the
Conceptualizer stage, it enters into the Formulator for morpho-phonological process-
ing at Tn+1. This demonstrates the feature of incremental processing. If we only observe
a time point at Tn+2, we find that none of the processors is at rest. Conceptualizer,
Formulator, and Articulator are working simultaneously in parallel. This remarkable
capacity for parallel processing is attributed to the largely automatic nature of
processing at the Formulator and the Articulator phases. These incremental, parallel,
and automatic features make L1 speaking continuous, fast, and efficient (Levelt 1989,
1993, 1999; Kormos 2006).
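The timing argument above can be sketched as a three-stage pipeline. This is an illustration rather than a model of speech production: it assumes that every information chunk spends exactly one time step in each processor, and the function name and data layout are hypothetical.

```python
STAGES = ("Conceptualizer", "Formulator", "Articulator")


def pipeline_schedule(n_chunks):
    """Map each time step to {stage: chunk index} for a three-stage pipeline.

    Simplifying assumption: each stage needs exactly one time step per chunk.
    """
    schedule = {}
    for t in range(n_chunks + len(STAGES) - 1):
        busy = {}
        for offset, stage in enumerate(STAGES):
            # The chunk now at this stage entered the Conceptualizer
            # `offset` time steps earlier.
            chunk = t - offset
            if 0 <= chunk < n_chunks:
                busy[stage] = chunk
        schedule[t] = busy
    return schedule


schedule = pipeline_schedule(4)
for t, busy in schedule.items():
    print(t, busy)
```

Each chunk passes through the stages in a fixed order (incremental processing), while from the third time step onwards all three processors are occupied with different chunks at once (parallel processing). If any stage needed controlled attention and therefore more than one step, the chunks behind it would queue up, which is the situation the next section describes for L2 speakers.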
1. resource deficits,
2. processing time pressure,
3. perceived deficiencies in their own language output, and
4. perceived deficiencies in decoding the interlocutor’s message.
It seems that processing time pressure is one of the bottlenecks in L2 speaking given
that our working memory capacity is regarded as being limited (Baddeley 1986, 2003).
The bottleneck is likely to impede language learning and development. Table 2
illustrates why L2 speakers experience real-time pressure during speaking. The
phases that differ from L1 processing and make L2 speaking slower and less
efficient are highlighted.
degree have time pressure at conceptualization too). Beginning L2 learners have more
time pressure than their L1 peers in the compilation of articulatory motor programs
too. However, for intermediate and advanced L2 learners, since they have acquired
the basic acoustic templates of the target language phonetics, the articulation of the
encoded phonetic plans is likely to implicate automatic motor programming, which
requires little conscious awareness.2
Researchers often regard conceptualization as being the least different between
L1 and L2 speakers. However, for processing L2 forms either learned (but not proce-
duralized) or missing in the L2 mental lexicon, L2 speakers are generally slower than
their native peers. First of all, Kroll and Stewart (1994) have argued in their Revised
Hierarchical Model (RHM) that there is no direct link between concepts and the L2
initially, but a link begins to be established gradually when L2 proficiency increases.
They also argue that for low proficiency L2 learners, the link between concept and
its L2 representation is often assisted by their native language. This implies that in
contrast to the direct link between meaning and L1 representations, a longer time is
required to access and retrieve target L2 lemmas from the mental lexicon, which by
itself makes processing less efficient in the L2 than in the L1. Therefore, L2 speakers are
likely to need more time to retrieve L2 words in real time speech, even for items that
have been learned.
Second, due to resource deficits in the L2 (Dörnyei & Scott 1997; Kormos 2006),
ideas which have been conceived so that they can be expressed in real-time speech
sometimes encounter gaps in the L2 mental lexicon. As a result, L2 speakers generally
are more tentative in the conceptualization of ideas. They have developed strategies
to find the best match between the ideas they are able to express and the ideas they
must compromise on for lack of sufficient L2 resources at hand – the
issues of ‘cognitive comparison’ and ‘selective attention’ (Doughty 2001). Therefore,
conceptualization for L2 speakers is a bi-directional3 process involving revision and
re-conceptualization (i.e. finding alternative ways to express thoughts) in order to
match speech with the L2 resources available.
At the stage of formulation, L1 speakers rely on automatic processing in most
instances to encode morphological and phonological information, which makes
speech production easy and fast once ideas are conceptualized. However, with late L2
2. Similar to L1 speech processing, though motor programming of articulation to a large extent
is an automatic process, it still involves conscious awareness around the time course of the motor
action.
3. L1 speaking at the stratum of conceptualization may have a bi-directional connection
between concepts and lemmas too, as argued by Roelofs’s WEAVER network (Levelt et al. 1999;
Roelofs 1997) – this is not due to the lack of linguistic resources but due to the change of speech
plan.
No. | Condition | Procedure | Location# | Hypotheses^ | Related studies
1 | Control | watch+tell | – | – | –
2 | Watched | watch, watch+tell | Conceptualizer | C, F | Skehan & Foster 1999
3 | On-line Planning | watch+tell (slowed video) | Formulator | A, F (–) | –
4 | Watched On-line Planning | watch, watch+tell (slowed video) | Conceptualizer + Formulator (on-line) | C, A, F (–) | Ahmadian & Tavakoli 2011; Ellis 1987; Hulstijn & Hulstijn 1984; Yuan & Ellis 2003
5 | Watched Strategic Planning | watch, planning, watch+tell | Conceptualizer + Formulator (strategic problem-solving) | C, A, F | Guará-Tavares 2008; Mehnert 1998; Ortega 1999; Skehan & Foster 1997; Tajima 2003
6 | Repetition | watch+tell, watch+tell | Conceptualizer + Formulator + Articulator | C, A, F | Bygate 1996, 1999, 2001; De Jong & Perfetti 2011; Gass et al. 1999; Lynch & McLean 2000, 2001
# Location: The speech production stage(s) that the time pressure reduction intervention targets
^ Hypotheses:
C: increasing complexity
F: increasing fluency
F (–): decreasing fluency
A: increasing accuracy
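For readers who wish to check analyses against the design, the conditions and their hypothesised effects can be held in a small data structure. The sketch below is one possible encoding and not part of the study's materials: the class and field names are hypothetical, and each hypothesis is coded as +1 (increase) or -1 (decrease) relative to the control condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    number: int
    name: str
    stages: tuple      # speech production stages targeted by the intervention
    hypotheses: dict   # "C", "A" or "F" -> +1 (increase) or -1 (decrease)


CONDITIONS = [
    Condition(1, "Control", (), {}),
    Condition(2, "Watched", ("Conceptualizer",), {"C": 1, "F": 1}),
    Condition(3, "On-line Planning", ("Formulator",), {"A": 1, "F": -1}),
    Condition(4, "Watched On-line Planning",
              ("Conceptualizer", "Formulator"), {"C": 1, "A": 1, "F": -1}),
    Condition(5, "Watched Strategic Planning",
              ("Conceptualizer", "Formulator"), {"C": 1, "A": 1, "F": 1}),
    Condition(6, "Repetition",
              ("Conceptualizer", "Formulator", "Articulator"),
              {"C": 1, "A": 1, "F": 1}),
]


def predicted(dimension, direction):
    """Names of conditions hypothesised to move `dimension` in `direction`."""
    return [c.name for c in CONDITIONS
            if c.hypotheses.get(dimension) == direction]


print(predicted("F", -1))
print(predicted("A", 1))
```

Laid out this way, it is easy to see that only the two conditions using the slowed video are expected to lower fluency, and that accuracy gains are predicted for every condition that relieves pressure on the Formulator.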
for speakers either to take the ‘perspectives’ of the speaker’s or the presumed
listener’s attitude, or to ‘preview’, that is, consider the background and foreground of
what is happening and what is about to happen – these are two important linguistic
resources that lead to ‘framing’ a narration toward discourse coherence (Bygate &
Samuda 2005: p. 48). Having the least possible planning time (before and within the
task), this condition is hypothesised as being the hardest among all the conditions –
involving what Ellis (2005) described as ‘pressured online planning’.
The ‘Watched’ condition (Condition 2) allows speakers to watch the video before
narrating the story. ‘Watching’ is a type of ‘pre-task activity’ (Skehan & Foster 1999),
which provides exposure to the task material before the real task performance.
This manipulation is intended to reduce time pressure at speech conceptualization
by first exposing speakers to the story content. In this way, a greater proportion of
attention becomes available to attend to formulation (and to ‘focus on form’) while
the speaker is performing the task, and his/her speech performance can therefore be
enhanced. Few studies have explored the effect of pre-watching. Skehan and Foster
(1999), as an exception, compared the effects of having and not having a pre-watching
opportunity. However, contrary to its theoretical assumptions, their study did not find
significant results on the CAF measures. A similar line of research that involves easing
speech conceptualization may be strategic planning. For example, Pang and Skehan
(this volume) in a qualitative study found that during strategic planning, learners
reported that they selectively prepared for speech content or language forms. The lit-
erature on strategic planning (see Ellis 2003, 2008, 2009 for reviews) generally shows
that having the chance to plan before speaking helps raise speech fluency and syntac-
tic complexity4 (Crookes 1989; Foster 1996; Foster & Skehan 1996, 1999; Skehan &
Foster 2005; Tavakoli & Skehan 2005; Wendel 1997; Yuan & Ellis 2003). Therefore, this
study hypothesizes that watching has a similar effect to strategic planning and can help
learners produce more complex and more fluent speech.
The ‘Online Planning’ condition (Condition 3) resembles the control condition in
offering no pre-watching opportunity, but differs from it in providing extra time
for speakers to conduct on-line planning. The video
used for narration is of the same content for all the conditions, except that the video
for the two online planning conditions (i.e. ‘Online Planning’ and ‘Watched Online
Planning’) was edited to be played slowly so that extra time was created for conduct-
ing on-line planning (see ‘Methodology’). By comparing Condition 3 with the control
condition, it is hypothesised that the extra time for conducting online planning can
help learners attend to morpho-syntactic formulation and so increase speech accuracy,
but that speech fluency, especially speech rate (Yuan & Ellis 2003), will suffer
from the detrimental effect of on-line planning. This condition leads to the question:
because the ‘Online Planning’ condition does not allow pre-watching, when provided
with extra time for on-line planning, can learners successfully take the opportunity to
focus on form? That is, can they successfully divide their attention between linguis-
tic formulation and story conceptualization simultaneously when both pressures are
high? A possible answer can be offered by comparing the result of this condition with
that of the ‘Watched Online Planning’ Condition (Condition 4).
The ‘Watched Online Planning’ condition (Condition 4) provides the opportunity
for pre-watching as well as the extra time for speakers to conduct on-line planning while
narrating. It is likely that having the chance to watch a video before speaking and then
benefitting from additional time when speaking (so that on-line planning is facilitated)
will reduce time pressure at both the conceptualization and the formulation phases so
that learners can speak with higher accuracy and complexity (Yuan & Ellis 2003). This
4. There have also been studies which have found positive effects of pre-task planning on
accuracy, but these studies were either in a language testing context (Tavakoli & Skehan 2005;
Wigglesworth 1997) or provided planning time as long as 10 minutes and allowed note-taking
(Guará-Tavares 2008; Ortega 1999; Skehan & Foster 1997; Tajima 2003), conditions which are
very different from the ‘watching’ condition in this study.
prediction is based on the on-line planning literature. Most of the studies in the litera-
ture used narrative tasks in which the content of the narration was already known to the
speakers (Ahmadian & Tavakoli 2011; Ellis 1987; Yuan & Ellis 2003), which is similar
to ‘having watched’ the video in this task condition. These studies found that providing
on-line planning to speakers who already knew the story content had positive
effects on speech accuracy and complexity. Therefore, it is hypothesized that ‘Watched
Online Planning’ can help increase complexity and accuracy in learners’ speech per-
formance. Meanwhile, due to the time used for careful formulation, it will also have a
detrimental effect on speech fluency, especially on speech rate.
The ‘Watched Strategic Planning’ condition (Condition 5) provides an opportu-
nity for pre-watching as well as extra time (3 minutes) for speakers to conduct strategic
planning before speaking. It is a reinforcement of the ‘Watched’ condition (Condition
2). Through the provision of the pre-watching opportunity, conceptualization pres-
sure is likely to be reduced. Then having extra time before narrating, learners can use
the strategic planning time either to prepare for expressing the content of the story or
to search for solutions to certain anticipated linguistic problems (see Pang & Skehan
this volume). This condition is different from the two on-line planning conditions, in
which the extra time was inserted into the video so that speakers are more likely to
use the time resource ‘on-line’ to deal with immediate linguistic problems. However,
if the speakers know that the retrieval of the unfamiliar lexis may take a longer time
than the real time speaking situation could tolerate, they are likely to stop searching
for it on-line and switch to their familiar ways of expression – that might be a
constraint of the two online planning conditions. In contrast, strategic planning affords
speakers time to search for unfamiliar lexis and expression before speaking. There-
fore, in this condition, having both a pre-watching opportunity and strategic planning
time, ‘Watched Strategic Planning’ could have a similar effect to those task conditions
involving adequate pre-task planning time such as 10 minutes (Guará-Tavares 2008;
Mehnert 1998; Ortega 1999; Skehan & Foster 1997; Tajima 2003). Based on the results
of these studies, it is hypothesized that ‘Watched Strategic Planning’ will result in
enhanced speech performance in complexity, accuracy, and fluency.
The ‘Repetition’ condition (Condition 6) uses immediate task repetition (i.e. a
kind of rehearsal) as an intervention to investigate time pressure reduction in the
complete process of speech production (i.e. conceptualization, formulation, and
articulation). Speakers were not told that they would carry out the same task again
until they had finished speaking on the task for the first time (so they themselves did
not think of their first encounter with the task as a rehearsal). Researchers have stud-
ied task repetition in various forms, but most of the studies have examined task rep-
etition after several days’ or weeks’ interval. For example, Bygate (1996) found fluency
and accuracy effects in a repetition task after a 3 day interval. Bygate (2001) found
increased speech fluency and complexity in a repetition task after 10 weeks. Gass,
Mackey, Álvarez-Torres and Fernández-García (1999) found that the group which
repeated the same task after a 2–3 day interval on two occasions outperformed the
non-repetition group regarding a general proficiency rating, partial accuracy of the
Spanish structure ‘to be’, morphosyntax, lexical density (i.e. type token ratio) and lexi-
cal sophistication (i.e. the number of difficult words used). The study by Lynch and
McLean (2000), however, involves a speaking condition similar to immediate task
repetition. They found that repeatedly making a presentation of a poster to differ-
ent interlocutors six times resulted in improved accuracy and fluency. More recently,
De Jong and Perfetti (2011), using a 4-3-2 repetition task (i.e. repeating a topic for
4, 3, and 2 minutes) as the training method, found that the group which repeated
the same topic three times had significantly higher fluency in a post-test one week
later than the group that spoke on three different topics each time. Although these
studies involve various forms of task repetition, generally they lead to the claim that
task repetition is likely to be a robust condition to enhance L2 speakers’ speaking flu-
ency, complexity and accuracy. The reason, as Bygate and Samuda (2005) explained,
is that during the re-run of the task, speakers are likely to build on the knowledge and
performance of the first enactment so that both speaking processing and language
product can be impacted by task repetition (p. 45). Therefore it is hypothesized that
immediate task repetition has the advantage of time pressure reduction in conceptu-
alization, formulation, and articulation, and L2 speakers’ speech complexity, accu-
racy, and fluency will be increased.
To summarize, the five experimental planning conditions proposed in this study are
connected with stages in speaking production and speaking performance – their com-
parisons with the non-intervention control condition may reveal the specific impact of
each time pressure reduction on L2 speaking performance. They may also reveal the
underlying mechanisms of L2 speech production with reference to processing stages.
Research Questions
1. Does the ‘Watched’ condition (Condition 2), which targets time pressure reduction
at the Conceptualizer stage, result in significantly more complex and more fluent
speech in comparison to the control condition (Condition 1)?
2. Does the ‘On-line Planning’ condition (Condition 3), which targets time pressure
reduction at the Formulator stage, result in significantly more accurate but less
fluent speech in comparison to the control condition (Condition 1)?
3. Does the ‘Watched On-line Planning’ condition (Condition 4), which targets
time pressure reduction at both the Conceptualizer and the Formulator (through
on-line planning) stages, result in significantly more complex and more accurate
but less fluent speech in comparison to the control condition (Condition 1)?
4. Does the ‘Watched Strategic Planning’ condition (Condition 5), which targets
time pressure reduction at both the Conceptualizer and the Formulator (through
strategic planning) stages, result in significantly more complex, more accurate,
and more fluent speech in comparison to the control condition (Condition 1)?
5. Does the ‘Repetition’ condition (Condition 6), which targets time pressure reduc-
tion at the complete process of speech production (i.e. conceptualization, formu-
lation, and articulation), result in significantly more complex, more accurate, and
more fluent speech in comparison to the control condition (Condition 1)?
Method
Participants
The 77 participants (50 females, 27 males) in this study were undergraduates in differ-
ent majors at a university in Hong Kong, aged from 18 to 22. They were native Chinese
speakers who learned English as a second language. None of them had overseas expe-
rience of more than 3 months. They were recruited on a voluntary basis and a time
compensation fee was provided after they completed the tasks. Data were collected
through one-to-one meetings with participants. Once a student arrived for the data
collection, a pre-test was administered (see below) and according to the pre-test score,
the participants were assigned to one of the task conditions. The researcher made the
grouping decisions with the aim of balancing pre-test score (primarily), gender, year
of study, and major of study across groups. The participants’ self-reported English
proficiency ranged from TOEFL 540 to 630 and IELTS 6 to 7.5 (with the speaking
subscore ranging from 5.5 to 7.5).
Material
Drawing on Skehan and Foster (1999, 2005), we used two videos from the Mr. Bean
series to elicit speaking performance (the content of the two video stories is presented
in Appendix A). Using two videos can avoid some task-irrelevant variables, such as
learners being unfamiliar with the narrative task. The presentation sequence of the two
videos was counter-balanced, and the study reports the mean scores of the two video
performances as the results. The same videos were used in all conditions. The only dif-
ference across the conditions was that there were two playing rates: a slow speed for
conditions involving on-line planning, and a normal speed for all other conditions.
Details about task duration are given in Table 4. The Mr. Bean videos are
appropriate for narrative retelling because each episode is short, largely mimed, easy
to comprehend, and appealing (Skehan & Foster 1999). The videos were piloted for their
comprehensibility and cultural understanding.
No.  Condition                   Pre-watching  On-line planning  Strategic planning  Task duration (min)
1    Control                     –             –                 –                   5
2    Watched                     √             –                 –                   10
3    On-line Planning#           –             √                 –                   8
4    Watched On-line Planning#   √             √                 –                   13
5    Watched Strategic Planning  √             –                 √                   13
6    Repetition                  Repetition    –                 –                   10
# Condition 3 ‘On-line Planning’ and Condition 4 ‘Watched On-line Planning’ used a slowed version of the
video that stretched a normal 5-minute video to 8 minutes, implicitly allowing 3 extra minutes of on-line
planning time.
and proficiency level to the main study participants. The three pilot versions were: 50%
of the normal speed (making a normal 5-minute video 10 minutes long), 60% of the
normal speed (8 minutes), and 75% of the normal speed (6.5 minutes). These slowed
versions were made with the Adobe Premiere© video editing software, which slowed
playback uniformly throughout the video (no pauses were inserted manually).
Based on the pilot participants’ feedback, 60% of the normal speed was selected for
the experimental on-line planning video because the slowing was not noticeable
enough for the manipulation to be recognized. The 50% version, in contrast, was
recognized as ‘artificial’ (even though it would have allowed more on-line planning
time). The 60% version made a normal 5-minute video 8 minutes long. In other words,
an extra 3 minutes was implicitly available for on-line planning while the video
was running.
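The relation between playback speed and the on-line planning time it frees up can be checked with simple arithmetic. The following is a minimal sketch, not part of the study's procedure: the function name `slowed_duration` is hypothetical, and the only figures taken from the chapter are the 5-minute normal video and the three piloted speeds. The exact quotients for 60% and 75% (roughly 8.3 and 6.7 minutes) differ slightly from the reported 8 and 6.5 minutes, which are presumably rounded or reflect the length of the edited clips.

```python
def slowed_duration(normal_minutes: float, speed_fraction: float) -> float:
    """Playback time in minutes when a video runs at a fraction of normal speed."""
    if not 0 < speed_fraction <= 1:
        raise ValueError("speed_fraction must be in the interval (0, 1]")
    return normal_minutes / speed_fraction


NORMAL_MINUTES = 5.0  # length of the normal-speed video, as reported in the chapter

for speed in (0.50, 0.60, 0.75):  # the three piloted playback speeds
    total = slowed_duration(NORMAL_MINUTES, speed)
    extra = total - NORMAL_MINUTES  # time implicitly available for on-line planning
    print(f"{speed:.0%} speed: {total:.2f} min total, {extra:.2f} min extra")
```

The extra time is simply the slowed duration minus the normal duration, so slower playback buys more planning time at the cost of a more noticeable manipulation.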
. Following Skehan (2009a), we divided fluency into three components: speed, breakdown, and
repair.
. Following Yuan and Ellis (2003) we used ‘pruned’ words – the words that were repeated, refor-
mulated and reduced were excluded from the calculation.
. Here we follow the CLAN manual in calling D ‘lexical diversity’ (MacWhinney 2000).
. The type token ratio is the total number of different words divided by total number of words.
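As an illustration of the type-token ratio defined in the note above, a minimal sketch follows. It is not the VOCD procedure used for D in this study; the function name and the sample sentence are hypothetical, and treating upper- and lower-case forms as the same word is an assumption of this sketch.

```python
def type_token_ratio(words):
    """Number of different words (types) divided by total number of words (tokens)."""
    tokens = [w.lower() for w in words]  # assumption: case-insensitive word matching
    if not tokens:
        raise ValueError("at least one word is required")
    return len(set(tokens)) / len(tokens)


# Hypothetical pruned sample: 10 tokens, 8 types ('the' occurs three times)
sample = "the man opened the door and the dog ran out".split()
print(type_token_ratio(sample))
```

Because the ratio tends to fall as a text gets longer, it is sensitive to sample length, which is one reason a measure such as D may be preferred for comparing performances of different lengths.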
Coding
The 90 speech samples collected (see Table 6 for the sample size of each condition)
were transcribed and coded following the CHAT format (MacWhinney 2000) and
Taskprofile conventions (Skehan, Chapter 1, this volume). The basic unit of
segmentation for analysis was the AS-unit (Foster, Tonkyn & Wigglesworth 2000).
Measures of language complexity (e.g. subordination) and language accuracy (e.g.
error-free clauses) were computed by Taskprofile, except for lexical diversity, which
was computed with the ‘VOCD’ command in the CLAN software (MacWhinney 2000;
Malvern & Richards 2002; Richards & Malvern 1998).
Analysis
In view of the large number of dependent variables (i.e. speech performance
measures), five MANOVAs were conducted, each comparing one experimental condition
(i.e. ‘Watched’, ‘On-line Planning’, ‘Watched On-line Planning’, ‘Watched Strategic
Planning’, or ‘Repetition’) with the control condition. Statistical significance
was assessed against a two-tailed a priori alpha level of .05 for all the measures.
In each MANOVA, the dependent variables were the 10 performance measures listed in
Table 5, and the independent variable was speaking condition, with two levels: the
experimental condition in question and the control condition.
Before performing the MANOVAs, multivariate normality was examined. The
distribution of every dependent variable was checked, and all variables that
deviated from normality (p < .001) were log-transformed to approximate a normal
distribution. Standardized skewness and kurtosis were required to fall within the
range of (–2, 2). Cohen’s d (Cohen 1988, 1992, 1994) was used in this study as the
effect size measure. Cohen’s d is based on the standardized mean difference of a
contrast (e.g. the difference between the mean scores of a control condition and
an experimental condition), which is easy to comprehend and consistent with Norris
and Ortega’s (2000) meta-analysis of L2 instruction.
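To make the standardized mean difference concrete, a minimal sketch is given below. It assumes the pooled-standard-deviation form of Cohen's d for two independent groups; the chapter does not state which variant was computed, and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(experimental, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(experimental), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    # Pooled variance weights each group's sample variance by its degrees of freedom
    pooled_var = ((n1 - 1) * variance(experimental)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_var)


# Hypothetical scores on one performance measure (not data from this study)
experimental_scores = [4.0, 5.0, 6.0]
control_scores = [2.0, 3.0, 4.0]
print(cohens_d(experimental_scores, control_scores))
```

A positive value indicates that the experimental group scored higher than the control group, expressed in standard deviation units, which is what allows measures on different scales to be compared.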
In a study such as this, containing 5 MANOVAs, each with 2 speaking conditions as
the levels of the independent variable and 10 performance measures as the
dependent variables, a mini meta-analysis based on effect size comparisons is well
suited to comparing the effect of each experimental condition. Confidence intervals,
as mentioned by Norris and Ortega (2000), gauge the statistical trustworthiness of
observed effects (Rosenthal 1991). Therefore, 95% confidence intervals (CI95) around
each mean effect size were computed. A 95% confidence interval can be interpreted as
a range that, over repeated sampling, would contain the population effect 95% of
the time.
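One common way to obtain such an interval for a single standardized mean difference is sketched below. It assumes a widely used large-sample approximation to the standard error of d and a normal critical value of 1.96; the chapter does not report how its intervals were computed, and the function name and input values are hypothetical.

```python
from math import sqrt


def d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate confidence interval for a standardized mean difference.

    Uses the large-sample standard error
    sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))),
    which is an assumption of this sketch, not the chapter's stated method.
    """
    if n1 < 1 or n2 < 1:
        raise ValueError("group sizes must be positive")
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se


# Hypothetical effect size and group sizes (not values from this study)
lower, upper = d_confidence_interval(0.8, 15, 15)
print(f"d = 0.80, CI95 = [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero suggests that the direction of the effect is trustworthy, while the width of the interval shows how precisely its magnitude has been estimated.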
Results
Pre-test
Two participants who scored 49 and 50 (out of 50) were excluded from the data analy-
sis since their scores were too close to the ceiling. Another participant who dropped
out after completing the first speaking task was excluded too. Table 6 below presents
the means and standard deviations (SD) for the pre-test. An ANOVA showed that
there was no significant difference among the groups in pre-test scores at the
.05 level. The ‘Control’/‘Repetition’ group represents a within-subjects design, in which
the same group of participants produced speech samples twice, once for the baseline
performance as ‘Control’ samples, and the other as ‘Repetition’ samples.
Speaking conditions
The results of descriptive and inferential statistics are given in Table 7. Since tests
of statistical significance can be greatly affected by sample size, a very weak effect
can be ‘statistically significant’ when the sample is large, while a strong effect can
fail to attain ‘significance’ when the sample is small (Cortina & Nouri 2000; Hunter &
Schmidt 1990; Meehl 1990). Effect size, in contrast, is a standardized index of the
magnitude of an effect, and makes it possible to compare the effects of different
variables within a given study or to compare the effects of the same variable across
different studies (Cortina & Nouri 2000: p. 8). It is also unbiased with regard to the
scale of the measurement the researcher used as well as the standard errors of the
dependent variables. Therefore, the effect size of each contrast, whether significant
or not, is given in Table 8. The results in Table 8 serve as a meta-analysis which
synthesizes the effects of all 5 types of planning conditions relative to the control
condition.
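The dependence of statistical significance on sample size can be illustrated with a minimal sketch. It assumes two independent groups of equal size and uses a normal approximation to the test statistic rather than the MANOVAs run in this study; the function name and all values are hypothetical. The approximation is somewhat liberal for small groups, so exact tests would give larger p-values there.

```python
from math import erfc, sqrt


def approx_two_tailed_p(d, n_per_group):
    """Normal-approximation p-value for a standardized mean difference d
    between two independent groups of equal size (an assumption of this sketch)."""
    z = d * sqrt(n_per_group / 2)   # test statistic implied by d when n1 == n2
    return erfc(abs(z) / sqrt(2))   # two-tailed area under the standard normal


# Hypothetical cases: the same weak effect in small and large samples,
# and a strong effect in a very small sample
for d, n in ((0.3, 10), (0.3, 200), (1.0, 5)):
    p = approx_two_tailed_p(d, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"d = {d:.1f}, n = {n} per group: p = {p:.4f} ({verdict})")
```

Under these assumptions the weak effect reaches significance only in the large sample, while the strong effect does not reach it in the very small one, which is why the effect sizes are reported alongside the significance tests.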
From Table 8 we can see that first, ‘Watched’ and ‘Watched Strategic Planning’ have
similar effects on speaking performance. Both of them significantly increased speech
fluency (measured by Speech_Rate) and complexity (measured by Total_Words, ML_
AS, and Subordination), but did not affect speech accuracy measured by the rate of
error-free clauses (EF_Clause_Rate) or lexical diversity. It is worth noting here that the
‘Watched’ condition was designed to reduce time pressure at the stage of conceptual-
ization while ‘Watched Strategic Planning’ was targeted at both conceptualization and
and the number of error free clauses (measured by EF_Clause) that were produced.
However, these two effects were likely to be the result of the extended speaking time,
so that speech quantity was increased.
Third, unlike the ‘On-line Planning’ condition, ‘Watched On-line Planning’
significantly enhanced speech complexity (measured by Total_Words, ML_AS,
and Subordination) and accuracy (measured by EF_Clause and EF_Clause_Rate).
In addition, ‘Watched On-line Planning’ was the only condition that significantly
increased the number of reformulations, an indicator of lower fluency and more
self-repair, which provides evidence for learners’ greater engagement in on-line
planning and monitoring compared to the control condition. It seems that although
both ‘On-line Planning’ and ‘Watched On-line Planning’ are supposed to influence
on-line linguistic formulation, only the ‘Watched On-line Planning’ condition
enhanced performance, leading to more complex and more accurate speech. We will
discuss in detail later why these two conditions produced such contrasting results.
Fourth, the ‘Repetition’ condition, which was supposed to relieve time pressure at
all three speaking stages, generated significant improvement in speech fluency,
complexity, and accuracy. The results therefore lend support to the previous hypotheses.
Moreover, the magnitude of effects for the ‘Repetition’ condition is the largest among
all the conditions which were used.9
Last but not least, none of the experimental conditions had a significant effect
on lexical diversity (measured by D), which is consistent with the general findings in
L2 planning studies (e.g. Ortega 1999; Yuan & Ellis 2003). Ortega used the type-token
ratio as a measure and did not find a significant planning effect on lexical variety.
Similarly, Yuan and Ellis (2003) did not find an effect of strategic planning or on-line
planning on lexical variety relative to No Planning. It seems that learners’ lexical
performance is task dependent. As Skehan (2009b) and Skehan and Foster (2008) have
found, different tasks (Personal, Narrative and Decision-making tasks) resulted in sig-
nificantly different lexical sophistication10 scores (Read, 2000) measured by Lambda.11
Tavakoli and Foster (2008) also demonstrated that a task with storyline background
produced higher lexical diversity D than a task without background in unstructured
tasks (but not in structured tasks) for EFL learners. They argued that it may be the
number of events in tasks that determines the extent of lexical variety that L2 speakers
use – more events, more diversity of vocabulary (p. 462).
9. ‘Repetition’ is the only condition that used a within-subject design. Such repeated
measures may slightly increase the effect sizes (Cortina & Nouri 2000).
10. This lexical measure computes how many difficult words are used in a text, with
difficulty defined by lower frequency in a frequency list derived from corpus analysis.
11. This uses a Poisson distribution to identify events which have low frequency levels,
following Meara and Bell (2001) and Bell (2003) (see Skehan 2009a, and Chapter 1, this
volume, for more details).
Table 9 further summarizes the above results. Four patterns can be identified from
the table (and in relation to Table 3):
1. All the experimental conditions that contained ‘pre-watching’ had a positive effect
on speech complexity.
2. All the ‘pre-watching’ conditions that did not involve ‘on-line planning’ improved
speech fluency as well – indicating that reducing time pressure at conceptualiza-
tion had positive effects on speech complexity and fluency.
3. The experimental conditions that had time pressure reduction at both conceptu-
alization and on-line formulation (i.e. ‘Watched On-line Planning’) had a positive
effect on speech accuracy.
4. The only condition that had positive effects on all the three components of speak-
ing performance (i.e. fluency, complexity and accuracy) is the ‘Repetition’ condi-
tion, which targets time pressure reduction at conceptualization, formulation, and
articulation.
Table 9. Comparing each experimental condition with the control condition
Experimental conditions Complexity Accuracy Fluency Lexis
Discussion
This study started with a discussion of the stages of L2 speech production. It then
explored methods of achieving time pressure reduction for L2 speakers, drawing upon
Ellis’ task-based planning framework. A study was designed to explore the compara-
tive effectiveness of several different research conditions. The results obtained from
the study demonstrated that different interventions had different effects on speech
performance. In this section, we mainly discuss the following three issues:
1. Why did time pressure reduction at the conceptualization stage result in higher
language complexity and fluency (as in the ‘Watched’ and ‘Watched Strategic’
conditions)?
2. Why did time pressure reduction solely at the on-line formulation stage fail to
improve the quality of speech (as in the ‘On-line Planning’ condition) whereas
intervention targeting both the formulation and conceptualization stages resulted
in higher speech complexity and accuracy (as in the ‘Watched On-line Planning’
condition)?
3. Why did the conditions involving monitoring (as in ‘Watched On-line Planning’
that involves on-line monitoring and ‘Repetition’ that involves speech perception
monitoring) result in higher speech accuracy?
Skehan et al. (2012) have argued that ‘working with ideas’, ‘rehearsing’, and ‘monitor-
ing’ are the three processes that influence how L2 speech is produced and what the
nature of the speech performance will be. This analysis may provide answers to the
above questions. Based on the results of this study, and trying to connect pedagogical
interventions with the speech production processes, I adopt their generalization and
revise the three processes into ‘content conceptualization’, ‘linguistic formulation’, and
‘speech monitoring’ (See Table 10).
Content conceptualization + +
Linguistic formulation (+) –
Speech monitoring +
+ refers to a positive effect; – refers to a negative effect
(+) refers to a conditional positive effect (e.g. the intervention targeting linguistic
formulation may have a positive effect if the conceptualization pressure has been dealt with).
the planning effects on speech accuracy in the literature ‘reflect whether learners were
able to, or chose to engage in monitoring while they performed the task’ (p. 131).
Based on the results of speech accuracy in Bui (this volume), Li (this volume), and
this study, I also argue that monitoring is the key to accuracy. Skehan et al. (2012)
have proposed that allocating attentional resources to monitor ‘what is being said
just before it is said’ induces selective attention towards accuracy. ‘Watched On-line
Planning’ freed up attentional resources for conceptualization and also released time
for on-line planning so as to commit more attention to the monitoring of linguistic
structures that are being produced. In Bui’s study, pre-task planning did not help L2
learners improve accuracy whereas ‘familiarity with the speaking topic’ (operational-
ized as the task topic being related to learners’ major of study) did. The advantage of
‘familiarity’ probably allowed less pressure on conceptualization and easier retrieval of
topic-related lexis so as to have more attentional resources to monitor the speech – as
a result, accuracy was increased. Similarly, Li found that having post-task activities
(operationalized as transcribing their own speech products) induced L2 speakers to
produce more accurate language. As Skehan et al. (2012) proposed, learners who knew
they would transcribe their own errors seemed to (selectively) direct their attention to
accuracy during speech.
Following this analysis, it is claimed that the accuracy benefit demonstrated in this
study is not solely because of the interventions at both conceptualization and formu-
lation stages (as in ‘Watched On-line Planning’), but more precisely it is because the
reduction in time pressure at the two stages provides greater opportunities to direct
attentional resources to the monitoring of speech production so that speech accuracy
was raised. The evidence of increased self-repair (measured by reformulations) in the
‘Watched On-line Planning’ condition is essentially evidence for more engaged speech
monitoring.
Two other studies which also provide supportive evidence for this claim are
worth mentioning. Hulstijn and Hulstijn (1984) hypothesized that two conditions in
their study would favour accuracy: (a) instructions to focus-on-form (in contrast to
focus-on-meaning), and (b) slowed-down processing (in contrast to normal process-
ing speed). The results showed that only the focus-on-form condition helped produce
higher grammatical accuracy, while slowed-down processing did not. Interestingly, the
focus-on-form condition led to more time being taken in processing the speech, indi-
cating that focus-on-form might involve a higher level of speech monitoring whereas
slowing-down speech to enable on-line planning might not necessarily. Mochizuki
and Ortega (2008) found similar results in terms of an instructional intervention tar-
geting the monitoring of grammatical structures. They investigated three speaking
conditions in a narrative task with an auditory stimulus: no planning, unguided plan-
ning, and guided planning (operationalized as providing printed instructions on how
to make relative clauses). The results showed that while guided planners had a lower
speech rate than unguided planners and no planners, they had higher accuracy than
unguided planners in the relative clauses that they produced. All the studies discussed
here are consistent with the generalisation that allowing unlimited time for processing
or ‘Watched On-line Planning’ does not necessarily enhance accuracy, but directing
learners’ attention to speech monitoring often does. This claim can then explain why
mixed results have been found in the literature regarding whether different types of
planning benefit speech accuracy (Ellis 2009).
Finally, an effect for speech monitoring was also found in the ‘Repetition’
condition. As proposed by Bygate and Samuda (2005), it is likely that speaking for
the first time has already involved the speech comprehension system so that speech
monitoring can operate at three places: the pre-verbal message, the inner speech plan
and overt speech plan (Levelt 1993). In other words, when L2 speakers are actually
speaking for the first time, they become the first (more than anyone else) to parse their
speech. As a result, the degree of monitoring in speech production in the ‘Repetition’
condition is higher than the other conditions that do not involve overt articulation.
Repeating the same task would induce more attention for monitoring, from which the
whole speech performance would benefit. This accuracy benefit found from immedi-
ate task repetition in this study is consistent with the findings in Bygate (1996), and
Gass et al. (1999).
To sum up, based on the results of this study, an instructional model of L2 speech
intervention was proposed (in Table 10). It suggests that different instructions will
enhance different aspects of language. First, L2 complexity could be enhanced by
interventions focusing on conceptualization. Second, speech fluency can be improved
by the manipulation of either conceptualization or formulation: easing
conceptualization or enabling rehearsal at the formulation stage could increase
fluency, but expanding on-line formulation could decrease fluency to some extent.
Third, interventions focusing on speech monitoring are the key to improving
language accuracy.
Conclusion
Researchers are interested in models of second language speaking because such mod-
els can uncover the subtle underlying mechanisms of speech production. Second
language pedagogy needs instructional models because they help teachers teach in
a systematic and effective way. The overarching goal of this study is to connect L2
speech production theories and instruction in order to develop an evidence-based
instructional model for L2 teaching. I have tried to accomplish this goal by establish-
ing a relation between the points of interventions with different processes of speech
production, on the one hand, and with performances, on the other. The five different
types of planning and repetition conditions (‘Watched’, ‘On-line Planning’,
‘Watched On-line Planning’, ‘Watched Strategic Planning’, and ‘Repetition’)
represent time pressure reduction at certain production stages when compared with
the control condition.
The results showed that ‘Watched’ and ‘Watched Strategic Planning’ both helped
produce more complex and more fluent language; ‘On-line Planning’ did not help
complexity and accuracy, whereas ‘Watched On-line Planning’ did help produce more
complex and more accurate language but with a trade-off of having more reformu-
lations in speech; repetition is a robust condition that promoted speech complexity,
accuracy, and fluency with large effect sizes.
These results lead to the question as to why ‘On-line Planning’, which is supposed
to direct attention to on-line formulation, did not help language accuracy but ‘Watched
On-line Planning’ did; and further, why there are often mixed results with accuracy in
task-based planning research. What has been proposed here is that L2 speakers are not
likely to attend to form unless conceptualization pressure has been relieved – a
‘meaning priority’ principle in L2 speaking. Because of this, ‘Watched On-line Planning’
has been described as conceptualized on-line planning. It is argued that conceptualized
on-line planning offers a greater chance, in terms of time and attentional resources, to
focus on speech monitoring. The evidence from this study as well as a range of other
empirical studies is consistent with this claim, such as Bui (this volume), Hulstijn
and Hulstijn (1984), Li (this volume), Mochizuki and Ortega (2008), and Yuan and Ellis
(2003). These studies, together with the findings from the ‘Watched On-line Planning’
and ‘Repetition’ conditions of this study, all report an accuracy effect for second
language speaking. It is concluded that speech monitoring is the key to L2 speech accuracy.
Author note
The author would like to express gratitude to Peter Skehan, Martin Bygate, John Norris,
Kris Van den Branden, and Rod Ellis for their helpful comments on earlier versions
of this article.
References
Ahmadian, M.J., & Tavakoli, M. (2011). The effects of simultaneous use of careful online planning
and task repetition on accuracy, fluency, and complexity of EFL learners’ oral production. Lan-
guage Teaching Research, 15, 35–59.
Anderson, J.R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J.R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Lawrence
Erlbaum Associates.
Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated
theory of the mind. Psychological Review, 111(4), 1036–1060.
Foster, P., Tonkyn, A., & Wigglesworth, J. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21(3), 354–375.
Foster, P., & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18, 299–324.
Foster, P., & Skehan, P. (1999). The influence of source of planning and focus of planning on
task-based performance. Language Teaching Research, 3, 185–214.
Gass, S., Mackey, A., Álvarez-Torres, M.J., & Fernández-García, M. (1999). The effects of task repeti-
tion on linguistic output. Language Learning, 49, 549–581.
Gilabert, R. (2007). Effects of manipulating task complexity on self-repairs during L2 oral produc-
tion. IRAL, 45, 215–240.
Guará-Tavares, M.G. (2008). Pre-task planning, working memory capacity and L2 speech performance.
Unpublished Ph.D. thesis. Universidade Federal de Santa Catarina, Brazil.
Hinkel, E. (2004). TOEFL test strategies with practice tests (3rd ed.). Hauppauge, NY: Barron’s.
Housen, A., & Kuiken, F. (2009). Complexity, accuracy, and fluency in second language acquisition.
Applied Linguistics, 30(4), 461–473.
Hulstijn, J.H., & Hulstijn, W. (1984). Grammatical errors as a function of processing constraints and
explicit knowledge. Language Learning, 34, 23–43.
Hunter, J.E., & Schmidt, F.L. (1990). Methods of meta-analysis: Correcting error and bias in research
findings. Newbury Park, CA: Sage.
Kawauchi, C. (2005). The effects of strategic planning on the oral narratives of learners with low and
high intermediate L2 proficiency. In R. Ellis (Ed.), Planning and task performance in a second
language (pp. 37–76). Amsterdam: John Benjamins.
Kello, C.T., & Plaut, D. (2003). Strategic control over rate of processing in word reading: A
computational investigation. Journal of Memory and Language, 48, 207–232.
Kello, C.T. (2004). Control over the time course of cognition in the tempo-naming task. Journal of
Experimental Psychology: Human Perception and Performance, 30(5), 942–955.
Kello, C.T., & Plaut, D.C. (2000). Strategic control in word reading: Evidence from speeded respond-
ing in the tempo-naming task. Journal of Experimental Psychology: Learning, Memory, & Cogni-
tion, 26, 719–750.
Kello, C., Plaut, D., & MacWhinney, B. (2000). The task-dependence of staged versus cascaded pro-
cessing: An empirical and computational study of Stroop interference in speech production.
Journal of Experimental Psychology: General, 129(3), 340–360.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formula-
tion. Cognitive Science, 11, 201–258.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Kroll, J.F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence
for asymmetric connections between bilingual memory representations. Journal of Memory and
Language, 33, 149–174.
Lashley, K.S. (1951). The problem of serial order in behavior. In L.A. Jeffress (Ed.), Cerebral mecha-
nisms in behavior (pp. 112–146). New York, NY: Wiley.
Levelt, W.J.M., Roelofs, A., & Meyer, A.S. (1999). A theory of lexical access in speech production.
Behavioral and Brain Sciences, 22, 1–75.
Levelt, W.J.M. (1983). Monitoring and self-repair in speech. Cognition, 33, 41–103.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: The MIT Press.
Levelt, W.J.M. (1993). Language use in normal speakers and its disorders. In G. Blanken,
J. Dittmann, H. Grimm, J.C. Marshall, & C.-W. Wallesch (Eds.), Linguistic disorders and
pathologies (pp. 1–15). Berlin: De Gruyter.
Levelt, W.J.M. (1999). Producing spoken language: A blueprint of the speaker. In C. Brown &
P. Hagoort (Eds.), Neurocognition of language (pp. 83–122). Oxford: OUP.
Levelt, W.J.M. (2001). Spoken word production: A theory of lexical access. Proceedings of the National
Academy of Sciences USA, 98(23), 13464–13471.
Long, M., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom SLA (pp. 15–41). Cambridge: CUP.
Lynch, T., & Maclean, J. (2000). Exploring the benefits of task repetition and recycling for classroom
language learning. Language Teaching Research, 4, 221–50.
Lynch, T., & Maclean, J. (2001). Effects of immediate task repetition on learners’ performance. In
M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks, second language learning,
teaching and testing (pp. 99–118). Harlow: Longman.
MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk (3rd ed.). Mahwah, NJ: Law-
rence Erlbaum Associates.
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews
using a new measure of lexical diversity. Language Testing, 19, 85–104.
Meara, P., & Bell, H. (2001). P_Lex: A simple and effective way of describing the lexical characteris-
tics of short L2 texts. Prospect, 16(3), 5–19.
Meehl, P.E. (1990). Why summaries of research on psychological theories are often uninterpretable.
Psychological Reports, 66, 195–244.
Mehnert, U. (1998). The effects of different lengths of time for planning on second language perfor-
mance. Studies in Second Language Acquisition, 20, 83–108.
Meyer, D.E., & Gordon, P.C. (1985). Speech production: Motor programming of phonetic features.
Journal of Memory and Language, 24, 3–26.
Mochizuki, N., & Ortega, L. (2008). Balancing communication and grammar in beginning level
foreign language classrooms: A study of guided planning and relativization. Language Teaching
Research, 12, 11–37.
Norris, J.M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in SLA: The case
of complexity. Applied Linguistics, 30(4), 555–578.
Norris, J.M., & Ortega, L. (2000). Effectiveness of L2 Instruction: A research synthesis and quantita-
tive meta-analysis. Language Learning, 50(3), 417–528.
Ortega, L. (1999). Planning and focus on form in L2 oral performance. Studies in Second Language
Acquisition, 21, 109–148.
Read, J. (2000). Assessing vocabulary. Cambridge: CUP.
Richards, B.J., & Malvern, D.D. (1998). A new research tool: Mathematical modelling in the
measurement of vocabulary diversity (Award reference no. R000221995). Final Report to the Economic
and Social Research Council, Swindon, UK.
Robinson, P. (1995). Task complexity and second language narrative discourse. Language Learning,
45, 99–140.
Robinson, P. (2011). Task-based language learning: A review of issues. Language Learning,
61(Suppl. 1), 1–36.
Roelofs, A. (1997). The WEAVER model of word-form encoding in speech production. Cognition,
64, 249–284.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
Samuda, V., & Bygate, M. (2008). Tasks in second language learning. Basingstoke: Palgrave.
Sangarun, J. (2005). The effects of focusing on meaning and form in strategic planning. In R. Ellis (Ed.),
Planning and task performance in a second language (pp. 111–141). Amsterdam: John Benjamins.
Sawaki, Y., Stricker, L.J., & Oranje, A.H. (2009). Factor structure of the TOEFL internet-based test.
Language Testing, 26(1), 5–30.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11,
129–158.
Schneider, W., & Chein, J.M. (2003). Controlled & automatic processing: Behavior, theory, and bio-
logical mechanisms. Cognitive Science, 27, 525–559.
Schriefers, H., Meyer, A.S., & Levelt, W.J.M. (1990). Exploring the time course of lexical access in
language production: Picture-word interference studies. Journal of Memory and Language, 29,
86–102.
Seidenberg, M.S. (1997). Language acquisition and use: Learning and applying probabilistic con-
straints. Science, 275, 1599–1603.
Shiffrin, R.M., & Schneider, W. (1977). Controlled and automatic human information processing,
II: Perceptual learning, automatic attending and a general theory. Psychological Review, 84(2),
127–190.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: OUP.
Skehan, P. (2009a). Modelling second language performance: Integrating complexity, accuracy,
fluency and lexis. Applied Linguistics, 30(4), 510–532.
Skehan, P. (2009b). Lexical performance by native and non-native speakers on language-learning
tasks. In H. Daller, D. Malvern, P. Meara, J. Milton, B. Richards, & J. Treffers-Daller (Eds.),
Vocabulary studies in first and second language acquisition: The interface between theory and
application (pp. 107–124). London: Palgrave Macmillan.
Skehan, P., & Foster, P. (1997). The influence of planning and post-task activities on accuracy and
complexity in task-based learning. Language Teaching Research, 1(3), 16–33.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions on narrative
retellings. Language Learning, 49, 93–120.
Skehan, P., & Foster, P. (2005). Strategic and online planning: The influence of surprise information
and task time on second language performance. In R. Ellis (Ed.), Planning and task performance
in a second language (pp. 193–218). Amsterdam: John Benjamins.
Skehan, P., & Foster, P. (2008). Complexity, accuracy, fluency and lexis in task-based performance:
A meta-analysis of the Ealing research. In S. Van Daele, A. Housen, F. Kuiken, M. Pierrard, &
I. Vedder, (Eds.), Complexity, accuracy and fluency in second language use, learning and teaching
(pp. 263–284). Brussels: Royal Flemish Academy of Belgium for Sciences and Arts.
Skehan, P., Bei, X., Li, Q., & Wang, Z. (2012). The task is not enough: Processing approaches to task-
based performance. Language Teaching Research, 16(3), 170–187.
Squire, L.R. (1987). Memory and brain. Oxford: OUP.
Tajima, M. (2003). The effects of planning on oral performance of Japanese as a foreign language.
Unpublished Ph.D. thesis, Purdue University.
Tavakoli, P., & Foster, P. (2008). Task design and second language performance: The effect of narra-
tive type on learner output. Language Learning, 58, 439–473.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–276). Amsterdam:
John Benjamins.
Ullman, M.T. (2001a). The declarative/procedural model of lexicon and grammar. Journal of Psycho-
linguistic Research, 30(1), 37–69.
Ullman, M.T. (2001b). The neural basis of lexicon and grammar in first and second language: The
declarative/procedural model. Bilingualism: Language and Cognition, 4(1), 105–122.
Task readiness
Theoretical framework and empirical evidence
from topic familiarity, strategic planning,
and proficiency levels*
* I would like to thank the Editor of this volume, Peter Skehan, who was also my Ph.D. supervisor,
for his guidance through this research. Thanks also go to Martin Bygate for his valuable comments
on previous drafts of this chapter.
Bui Hiu Yuet Gavin
Introduction
One distinctive feature of second language (L2) speaking is that many learners put
a lot of effort into speaking but still fail to reach native-like proficiency. A tension
between the meaning to express and the appropriate forms to use becomes a major
challenge to L2 learners. While communicative language ability in general involves
the ability to express ideational, interpersonal and discoursal meanings through the
use of formal linguistic resources, L2 development in particular further requires help-
ing learners to achieve the capacity to use resources already available to them (Bygate
2001). There exists a gap between L2 learners’ potential competence and their actual
performance. Such a phenomenon may be attributed to L2 learners’ underdeveloped
proficiency, but on top of this, they also fall prey to their processing capacity limita-
tions (Baddeley 1997; Skehan 1998). Therefore, there is a need to explore pedagogical
tasks and task conditions which go beyond cultivating underlying structural abilities
and which also increase learners’ readiness to handle various communicative needs
(Samuda 2001).
Much current research done to this end has focussed on planning (Ellis 2005,
2009), operationalized in a variety of forms such as pre-task planning and within-task
planning, as a means of maximizing learners’ readiness for tasks (see Wang, Chapter 2,
this volume). Different types of planning have been shown to promote learner per-
formance in different areas, that is, fluency, complexity, and accuracy. What seems
unfortunate is that, after all these studies, we still lack a comprehensive account of the
interrelationship between various types of planning from a more wide-ranging perspective.
The term “planning” per se, if it is to be used as an umbrella term for concepts
such as rehearsal, strategic planning, and online planning (Ellis 2005), appears to fall
short of capturing the generic features which these concepts share, and is somewhat too limited in its
scope as a means for preparing students to handle tasks more effectively. Based on an
empirical study, this chapter proposes “task readiness” as an alternative theoretical
framework to “planning” in order that task research can be better contextualized and
the different types of planning can be more clearly inter-connected to each other in
research as well as practice.
Ellis (2005) distinguished between two types of planning: (1) pre-task planning which
can be further divided into rehearsal and strategic planning, and (2) within-task
planning that subsumes both pressured and unpressured situations. Rehearsal, simply
put, is to allow learners to practise a task before its actual performance, as exemplified
by Bei (2013). Rehearsal usually involves explicit signalling to the learner that the
previous performance may serve as preparation for the next. This makes an interesting
contrast with task repetition (Bygate 2001) in which students receive no briefing about
future performance, thus their drawing on the prior knowledge of the same task for
the following tasks becomes implicit planning (see Table 1 below for the compari-
son). In general both rehearsal and task repetition show very strong positive effects
on fluency, complexity, and/or accuracy (also see Wang 2009; Chapter 2, this volume).
Strategic planning is the most widely studied form of planning in the literature. It is
generally operationalized as offering planning time (Crookes 1989; Foster & Skehan
1996; Skehan & Foster 2005) prior to a task. Strategic planning is in most cases found
to push learners towards more fluent speech which involves more complex clausal
structure, whereas its effects on accuracy are not consistent in the literature (Ellis 2009;
Skehan, Bei, Li & Wang 2012). One possible reason for the mixed results with accuracy
is that studies have differed as to whether the task conditions allowed time for or
encouraged careful on-line planning (i.e. formulation and monitoring of speech plans
during performance) (Yuan & Ellis 2003). Within-task planning, or on-line planning,
is assumed to occur when sufficient time is available for planning during speaking. An
example of unpressured within-task planning is to have learners describe an edited
video which is played at a lower speed (Wang, Chapter 2, this volume). In contrast,
pressured within-task planning does not allow any leeway in gaining extra time for
planning while speaking. Speakers have to engage in real time planning for the ongo-
ing communicative task.
Ellis (2009) slightly revised this system of categorization and discussed three
types of planning: rehearsal, pre-task planning and within-task planning. Even so,
these two categorizations (2005 & 2009) are essentially the same, dwelling on task-
external manipulations of the degree of preparedness for a task. Rehearsal and pre-task
(strategic) planning without doubt prepare learners prior to a task, but within-task
planning can also be viewed as being something that can be increased or decreased so
as to vary the readiness for performance in a series of consecutive segments of strate-
gic planning, carried out ad hoc during a task.
If we adopt a broader perspective on this issue, the notion of planning as prepara-
tion or readiness in order to increase one’s capacity to do a task should extend its hori-
zons beyond these task-external means outlined in Ellis (2005, 2009). Prior knowledge
about, and hence familiarity with, the content of a task or the schemata of a task –
the knowledge or preparedness a speaker brings to any task whether or not pre- or
within-task planning time is provided – should also be incorporated into this broader
sense of planning. This study will provide evidence that familiarity with a certain topic
facilitates learner performance in a variety of ways similar to other types of planning,
albeit different in some other areas as well. Therefore topic familiarity is, one could
The last case of task-internal readiness, task repetition, makes an intriguing com-
parison with the first type of task-external readiness, namely rehearsal, with the major
difference lying in whether one knows if s/he is going to do the task again. Task repeti-
tion involves no briefing about future performance, so the previous task constitutes an
implicit planning opportunity which brings potential topic familiarity and task familiarity
to the next round of performance. In contrast, rehearsal, as a form of task-external
readiness, offers a possibility, known to the learner, of preparing by practising the
task prior to the actual performance. Rehearsal thus becomes explicit planning which
also characterizes the other two kinds of task-external readiness: strategic planning
and within-task planning.
The major difference between task-internal and task-external readiness is the
degree of naturalness, or rather the degree of ad hoc manipulation, of the task prepa-
ration. Task-internal readiness, especially topic familiarity and schematic familiarity,
could be thought of as a more inherent and natural type of readiness, albeit perhaps
not so much a conscious process. At the same time, task-external readiness has a more
artificial element in that learners have imposed upon them extra manipulations to a
task. A question then arises from this comparison: which has a stronger influence for
the improvement of task performance? The literature on task research has little to offer
in this regard, so we will turn next to other areas for relevant insights.
Evidence for the influence of topic familiarity exists mainly in studies of speech
comprehension. The effects of prior knowledge, or schemas, in Piagetian terms, provide
good explanations for speech comprehension from a top-down perspective. In reading
comprehension research, being familiar with a certain content area has been gener-
ally found to be facilitative (Barry & Lazarte 1995; Bügel & Buunk 1996; Chang 2006;
Chen & Donin 1997; Johnson 1982; Lee 1986; Shimoda 1993). More recently, Lee (2007)
and Leeser (2007) also found that familiar texts greatly aided the comprehension
of reading materials, and led to better content recall, among L2 English learners
and L2 Spanish learners respectively. In listening comprehension, the mechanism of
topic familiarity bears some resemblance to that in reading, but the time constraints of
listening in real time impose additional difficulty on listeners. The time allowed in lis-
tening for the construction process (Kintsch 1988, 1998) before an appropriate schema
can be activated is much shorter, so while L2 readers have the opportunity of going back
to the textual data when first-inferencing fails, L2 listeners might encounter trouble at
this stage, before any helpful schema is able to take effect. In general, schemata might
be more important in L2 listening than L2 reading in that unlike readers who might,
given less time pressure, be able to rely more on bottom-up linguistic cues for meaning
construction, listeners probably have no such resource and a schema is crucial for pre-
diction and inference in a top-down manner. Not surprisingly, familiarity with content
knowledge in general aids learners in understanding audio input (Chiang & Dunkel
1992; Long 1990; Markham & Latham 1987; Schmidt-Rinehart 1994).
Not much research on topic familiarity has been conducted to investigate speech
production, and the existing literature mostly concerns L1 speaking. Prior knowledge
about a certain subject was found to raise temporal fluency (Good & Butterworth 1980),
reduce repeats and restarts but increase fillers (Bortfeld et al. 2001), but there are also
studies like Merlo and Mansur (2004) reporting unaffected fluency with a more famil-
iar topic. They instead found more propositions in the more familiar task, indicating
that the speech on a more familiar topic contains a higher density of information.
In addition, topic familiarity does not seem to help improve structural complexity
or coherence in narrative discourse (Banks 2004). These studies appear to show that
topic familiarity is more concerned with meaning expression (fluency and informa-
tion load) but less with structural ability (complexity and coherence) in first language
speaking.
Research on the influence of topic familiarity in L2 oral production is scarcer.
Familiarity with a topic seemed to enhance performance regarding fluency (words per
error-free T-unit and words per minute) but not with accuracy (error rate per T-unit)
in a monologic task (Chang 1999, with only 6 participants though). Familiarity with
the structure of a story, that is, a clearer schema in going to a restaurant versus a less
predictable storyline in playing golf, led to greater fluency (Skehan & Foster 1999).
Having a schema of a familiar area (a University map) also promoted fluency, but the
unfamiliar task (with an unfamiliar street map) generated higher lexical complexity
(Robinson 2001). More familiar tasks in Bei (2011) induced more formal language
features in discourse with a higher density of nouns and noun-associated word classes
such as articles and adjectives.
A few research gaps can be identified in the literature. It appears firstly from a
macro perspective that past research has generated quite extensive coverage on the
effects of task-external readiness, and even the interplay between its three types of
planning, namely rehearsal, strategic planning, and online planning. In contrast,
task-internal readiness has been much less touched upon, let alone the relationship
between task-external and task-internal readiness. The present study would argue that
task-internal readiness stands out as an inherent characteristic of task and could ren-
der more natural assistance to learner performance than task-external readiness does.
Secondly, at a micro level, topic familiarity has been shown quite unequivocally to help
L2 participants with greater fluency, but its influence on other performance areas like
complexity and accuracy is much less researched. A deeper consideration of this might
alert us to the possibility of its impact on test fairness as well, which warrants closer
scrutiny. At the same time and with no less significance, the extension of planning
to task-readiness may provide implications of findings for the Processing Approach
(Skehan 1998) versus Cognition Hypothesis (Robinson 2001) debate, which could
help to shed light on tasks and task behaviour from a wider perspective.
In pursuit of these goals, the current study employs a combination of three factors,
namely topic familiarity (task-internal), strategic planning (task-external), and profi-
ciency (learner characteristic) in a 2 × 2 × 2 split-plot design. The following section
details the implementation of the study.
Methods
Participants
The participants in this study were eighty university students aged between eighteen
and twenty-four in Hong Kong. They were selected from 102 volunteers based on their
proficiency levels (see 3.3 “Proficiency criteria” below) and their academic major (see
below). They were all native Cantonese speakers with reasonable L2 English proficiency.
They had received twelve to sixteen years of education in English as their second
language at the time of the study. Among them, forty students were medicine majors
and another forty were computer science students (see “3.4 Study Design” below for
the rationale).
Speaking tasks
Participants were given the following general scenario:
You are a specialist in the field giving a presentation to a group of university students
who are neither medicine nor computer majors but are interested in the topics.
Each participant was invited to talk about the following two descriptive topics.
Topic 1: Please describe in detail the general process of the infection of virus in a
human body, the possible consequences, and the general procedure for dealing with a
virus-infected person.
Topic 2: Please describe in detail the general process of the infection of virus in
a computer, the possible consequences, and the general procedure for dealing with a
virus-infected computer.
Proficiency criteria
The 80 participants were equally divided into a high and an intermediate group
according to their proficiency levels. The grouping criteria included a combination of
their previous Use of English (UE) examination results in their Hong Kong Advanced-Level
(HKALE) public exams and a pre-test (a C-test adapted from Dörnyei & Katona
1992) administered immediately before the tasks. According to the entrance requirements
of the participants’ university, their UE results were approximately pitched
at Band 6–8 of the IELTS system. Therefore they were termed intermediate to low
advanced L2 learners.
Planning       Proficiency     Familiar topic   Unfamiliar topic
Planners       High            20               20
               Intermediate    20               20
Non-planners   High            20               20
               Intermediate    20               20
When the medicine majors talk about a natural virus, the topic is regarded as
“familiar” to them, while the computer virus becomes the “unfamiliar”. The opposite is
then true for the computer majors. After doing the two tasks, participants were asked
to rate their familiarity with the topics. A medicine major who indicated an equal or
higher degree of familiarity with the computer virus topic than their own natural virus
topic would be excluded from the study. The same familiarity screening procedure
applied to the computer majors as well.
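The familiarity coding and the screening rule described above can be sketched as follows. This is a minimal illustration only: the labels, the function names, and the numeric rating scale are assumptions, since the chapter does not specify how the ratings were recorded.

```python
# Hypothetical labels: the chapter does not name its variables or its rating scale.
OWN_TOPIC = {
    "medicine": "natural virus",
    "computer science": "computer virus",
}


def familiarity(major: str, topic: str) -> str:
    """Code a task as familiar when the topic belongs to the speaker's own field."""
    return "familiar" if OWN_TOPIC[major] == topic else "unfamiliar"


def passes_screening(major: str, ratings: dict) -> bool:
    """Keep a participant only if the own-field topic is rated strictly more familiar.

    An equal or higher rating for the other topic leads to exclusion.
    """
    own = OWN_TOPIC[major]
    other = next(topic for topic in ratings if topic != own)
    return ratings[own] > ratings[other]


# Illustrative ratings on an assumed numeric scale (higher = more familiar).
print(familiarity("medicine", "computer virus"))
print(passes_screening("medicine", {"natural virus": 5, "computer virus": 3}))
```

Because familiarity is defined relative to each participant's major, every participant contributes one familiar and one unfamiliar performance, which is what makes familiarity the within-subject factor in the design.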
Dimension | Measure | Definition | Source
Accuracy | Error-free ratio | Ratio of error-free clauses to all clauses. | Foster & Skehan (1996)
Accuracy | Errors per 100 words | Number of errors per 100 pruned words. | Mehnert (1998)
Accuracy | 70% accuracy clause length¹ | Length of clause at which 70% of all clauses are correct. E.g., if 70% of all 5-word clauses and only 60% of all 6-word clauses are error-free, a score of 5 is awarded. | Skehan & Foster (2005)
Complexity | Clauses per AS unit | Ratio of subordinate clauses per AS unit. | Foster, Tonkyn, & Wigglesworth (2000)
Complexity | Words per AS unit | Average number of words in all AS units. | Ortega, L., Iwashita, N., Norris, J., & Rabie, S. (in prep)
Complexity | Words per clause | Average number of words in all clauses. |
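The three accuracy measures defined above can be sketched in a few lines. This is an illustrative sketch rather than the scoring procedure actually used in the study. It assumes that each clause has already been coded as a pair of pruned word count and error count, and that the accuracy clause length is found by scanning attested clause lengths in ascending order and stopping at the first length that falls below the threshold. All function names are hypothetical.

```python
from collections import defaultdict


def error_free_ratio(clauses):
    """Share of clauses with no errors; clauses is a list of (words, errors) pairs."""
    return sum(1 for _, errors in clauses if errors == 0) / len(clauses)


def errors_per_100_words(clauses):
    """Total errors per 100 words, assuming the word counts are already pruned."""
    total_words = sum(words for words, _ in clauses)
    total_errors = sum(errors for _, errors in clauses)
    return 100 * total_errors / total_words


def accuracy_clause_length(clauses, threshold=0.70):
    """Longest clause length reached before accuracy first drops below threshold.

    Assumption: lengths are scanned in ascending order and unattested lengths
    are skipped; the source does not spell out these details.
    """
    by_length = defaultdict(list)
    for words, errors in clauses:
        by_length[words].append(errors == 0)
    score = 0
    for length in sorted(by_length):
        flags = by_length[length]
        if sum(flags) / len(flags) < threshold:
            break
        score = length
    return score


# Mirrors the example in the table: 70% of 5-word clauses and 60% of 6-word
# clauses are error-free, so the score is 5.
sample = [(5, 0)] * 7 + [(5, 1)] * 3 + [(6, 0)] * 6 + [(6, 1)] * 4
print(accuracy_clause_length(sample))
```

Passing `threshold=0.50` or `threshold=0.60` gives the alternative thresholds, so the same function covers all three cut-off values.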
Data analysis
Speaking performance from each task was recorded and transcribed, largely following
the CHILDES format, before being analyzed with the above measures. The results
will be presented in the next section as both descriptive (including means and stan-
dard deviations) and inferential statistics (including significance levels and effect sizes
in Cohen’s d). Following Thalheimer and Cook (2002), Cohen’s d was calculated to
indicate the size of an experimental effect. Other statistical results were obtained from
repeated measures analyses of variance (ANOVA) in SPSS 19, which is deemed appro-
priate in dealing with a split-plot design like the current study. Results are organised
in terms of the sets of dependent measures, fluency, accuracy and then complexity.
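As a rough illustration of the effect size reported throughout the Results, the sketch below computes Cohen's d from two group means and standard deviations using the textbook pooled-standard-deviation form, and labels it against the conventional benchmarks of .20, .50 and .80. This is an assumption-laden sketch: the exact variant of the formula used in the study is not given in the text, so it should not be expected to reproduce the tabled values. The function names and the "below small" label are hypothetical.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)


def benchmark(d):
    """Label an effect size against the conventional .20 / .50 / .80 benchmarks."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


# Hypothetical values, not taken from the study.
d = cohens_d(10.0, 2.0, 20, 9.0, 2.0, 20)
print(d, benchmark(d))
```

For the within-subject factor (topic familiarity), a repeated-measures variant of d may be more appropriate than this independent-groups form.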
Results
Fluency
The first result concerns the length of each performance, as measured by the number
of words, under various conditions. Participants produced longer accounts on
the more familiar topics (360.36 raw words and 300.84 pruned words as compared to
284.05 raw words and 229.61 pruned words with the unfamiliar topics, Cohen’s d = .53
for raw words, and Cohen’s d = .61 for pruned words, p = .000 for both). The opportunity
to have planning time seems to be a less powerful means of pushing learners to say
more, as a significant effect is reached only with total pruned words (297.8 and 233.60
in the familiar and unfamiliar tasks, p = .007, Cohen’s d = .44), which is an indication
that participants reduced repair features such as hesitation, repetition, interjections
and fillers (e.g. err, hmm) after strategic planning. A comparison of effect sizes further
supports the argument that familiarity with a certain topic has a greater impact on the
number of words used than does planning. Proficiency, interestingly, does not have
any effect on the number of words.
1. Following Skehan and Foster (2005), for example, if 50% of all 5-word sentences but fewer than
50% of all 6-word sentences are correct, then with 50% as the threshold, the accuracy score is 5 in
that L2 speech. This study calculated 50%, 60%, and 70% as the thresholds, but only the 70% value is
reported in this study because it was found that 70% appeared to be a better threshold in differentiating
accuracy performance among learners of higher proficiency, such as those in the present study.
Factor analyses in the task literature (e.g. Mehnert 1998; Skehan & Foster 1999;
Tavakoli & Skehan 2005) have generally confirmed two types of fluency: breakdown
fluency and repair fluency. Breakdown fluency concerns temporal aspects of speech
and is usually measured through speech rate and pausing. In contrast, repair fluency
is associated with modifying language. It is operationalized as false starts, reformulation,
replacement, and repetition. Table 4 and Table 6 report on these two categories of
fluency variable respectively. The tables report on the effects of the three independent
variables, namely topic familiarity, strategic planning, and proficiency, on different
dependent variables such as the speech rate and phonation time in Table 4. In addition
to the means and standard deviations, the significance levels (p values) and the effect
sizes (Cohen’s d) are also given.
Table 4 shows that topic familiarity displays an overall effect on most (6 out
of 8) breakdown fluency measures. Being familiar with a certain subject matter
enables the participants to speak at a faster speech rate, with a longer stretch of
words before encountering any pauses, repairs or fillers (mean length of run).
Familiarity with a topic also helps to reduce the number as well as the average
length of pauses, and the total amount of silence in the middle of a clause. In addi-
tion, topic familiarity is able to shorten the total silence time between two clauses
(clause-end silence). However, the number and the length of pauses at the end of
clauses seem unaffected by topic familiarity. A notable point revealed in Table 4 is
the consistently small effect sizes in all measures contrasted with the wider range of
significance values. None of the effect sizes (Cohen’s d) reaches the medium level,²
which is an indication that while topic familiarity leads to higher fluency, the extent
of the influence is limited. It is intriguing that the largest effect sizes concern mid-
clause difficulties – familiarity seems particularly supportive in this respect.
2. According to Cohen (1992), a Cohen’s d of .20 is small, .50 is medium, and .80
is large.
Table 4. The effects of topic familiarity, strategic planning and proficiency on breakdown fluency
Measure | Topic familiarity: Familiar | Unfamiliar | p | d | Planning: Planned | Unplanned | p | d | Proficiency: High | Intermediate | p | d
Speech rate | 96.30 (23.02) | 90.47 (26.33) | .000 | .26 | 102.52 (22.45) | 84.24 (23.51) | .000 | .58 | 96.79 (25.26) | 89.97 (23.90) | .177 ns | .28
Mean length of run | 5.26 (1.57) | 4.99 (1.76) | .016 | .17 | 5.48 (1.85) | 4.77 (1.38) | .046 | .32 | 5.28 (1.51) | 4.98 (1.80) | .397 ns | .02
Mid-clause pause No. | 9.73 (6.20) | 12.13 (7.33) | .000 | .38 | 8.61 (5.27) | 13.24 (7.31) | .001 | .54 | 10.17 (5.96) | 11.69 (7.47) | .265 ns | .22
Clause-end pauses No. | 6.53 (2.09) | 6.82 (2.52) | .562 ns | .13 | 6.32 (2.22) | 7.05 (2.34) | .087 ns | .32 | 6.15 (1.97) | 7.22 (2.48) | .023 | .48
Mid-clause silence total | 8.51 (7.86) | 12.35 (13.41) | .000 | .38 | 6.46 (4.83) | 14.39 (13.09) | .000 | .61 | 9.35 (9.25) | 11.51 (11.77) | .299 ns | .20
Clause-end silence total | 5.73 (2.91) | 6.41 (4.04) | .026 | .19 | 5.10 (2.80) | 7.08 (3.83) | .001 | .59 | 5.63 (3.12) | 6.54 (3.79) | .118 ns | .26
Mid-clause pause length | .79 (.31) | .87 (.35) | .019 | .28 | .70 (.33) | .96 (.40) | .000 | .71 | .80 (.29) | .86 (.36) | .288 ns | .18
Clause-end pause length | 1.71 (.61) | 1.78 (.81) | .437 ns | .10 | 1.53 (.42) | 1.96 (.86) | .001 | .64 | 1.77 (.68) | 1.72 (.74) | .584 ns | .07
Notes: 1. Standard deviation in (). 2. All pause numbers and silence measures are standardized by calculating their occurrence per 100 words.
3. d = Cohen’s d, which is a measure of effect size.
The effects of planning are quite similar to those of topic familiarity, except that
planning achieves a significant impact on more measures (7 out of 8) with a larger
magnitude of the effects (i.e. bigger effect sizes). The opportunity to plan prior to speaking
raises the speech rate and the mean length of run. Planning reduces the number of
pauses in the middle of a clause, though not the number of pauses at the end of clauses.
It also reduces the total amount of silence and the average pause length, both in
the middle of a clause and at clause-end positions.
Rather counter-intuitively, proficiency appears irrelevant to all but one measure,
the number of clause-end pauses. The intermediate proficiency participants produced
more pauses at the end of a clause than their high proficiency counterparts. This occur-
rence is intriguing in that the number of pauses at clause boundaries is one of few
measures that neither topic familiarity nor planning exerts any influence on, whereas
proficiency happens to fill this vacancy, with a medium Cohen’s d value (d = .48) indi-
cating a considerable effect. But other than that, the effect of proficiency seems to have
been overridden by topic familiarity and strategic planning.
In addition to the above main effects, there are also familiarity-by-planning
interaction effects in four breakdown fluency measures (see Table 5 below), that is,
speech rate (p = .001), number of mid-clause pauses (p = .005), mid-clause silence total
(p = .004) and clause-end silence total (p = .026). These four interaction effects consistently
point to a general trend that participants were able to reach a similar fluency
level after strategic planning, regardless of the familiarity level of the topics. That is, the
significant difference in breakdown fluency between familiar and unfamiliar topics is
reduced to almost non-existence when pre-task planning time is allowed. The results
suggest that, although planning helps to improve fluency performance in both familiar
and unfamiliar tasks, the unfamiliar tasks have benefited much more.
Regarding the different repair fluency measures, Table 6 shows that topic
familiarity helps to significantly reduce the number of repetitions, but only with
a small effect size (d = .31). Though not reaching significance, the means of the
other three variables are in the predicted direction. In comparison, ten-minute
pre-task planning has a significant effect more generally, with fewer false starts,
reformulations, and repetitions, but it also induced more replacements. Similar
to the findings with breakdown fluency, the effect sizes produced by planning for
repair fluency range from medium to large, much bigger than those for familiarity.
As with the results for breakdown fluency, proficiency seems to exert little effect
on repair fluency.
Table 6. The effects of topic familiarity, strategic planning, and proficiency on repair fluency
Measure | Topic familiarity: Familiar | Unfamiliar | p | d | Planning: Planned | Unplanned | p | d | Proficiency: High | Intermediate | p | d
False starts | 1.38 (1.28) | 1.63 (1.35) | .125 ns | .19 | .85 (.80) | 2.15 (1.40) | .000 | 1.02 | 1.50 (1.29) | 1.50 (1.35) | .999 ns | 0
Reformulations | 1.39 (1.00) | 1.62 (1.25) | .088 ns | .20 | 1.16 (.84) | 1.84 (1.26) | .001 | .53 | 1.60 (1.28) | 1.40 (.95) | .321 ns | .18
Replacements | .95 (.79) | 1.15 (.97) | .077 ns | .23 | 1.26 (.90) | .84 (.81) | .008 | .43 | 1.08 (.85) | 1.02 (.92) | .705 ns | .07
Repetitions | 3.94 (2.69) | 4.72 (3.36) | .001 | .31 | 3.16 (2.03) | 5.40 (3.44) | .000 | .60 | 4.29 (3.32) | 4.27 (2.74) | .979 ns | .01
Notes: 1. Standard deviation in (). 2. All repair counts are standardized by calculating their
occurrence per 100 words. 3. d = Cohen’s d, which is a measure of effect size.
Accuracy
As shown in Table 7, topic familiarity appears to push participants to achieve a higher
ratio of error-free clauses, and thus fewer errors per 100 words, but only with small
effect sizes. Being familiar with a topic, however, does not help learners to produce
longer clauses where at least 70% of these clauses are correct (70% accuracy clause
length, p > .05). Strategic planning has even less of an impact here, showing no effect
on any of the measures. Proficiency, however, does show some significant effects. The
more advanced participants performed with longer 70% accuracy clauses, in addition
to having a significantly higher error-free ratio as well, and also a smaller number
of total errors per 100 words, when compared with their intermediate counterparts.
Proficiency is a strong driving force for accuracy as evidenced by the medium to large
effect sizes.
Table 7. The effects of topic familiarity, strategic planning, and proficiency on accuracy
Measure | Topic familiarity: Familiar | Unfamiliar | p | d | Planning: Planned | Unplanned | p | d | Proficiency: High | Intermediate | p | d
Error-free clause ratio | .544 (.13) | .517 (.14) | .020 | .22 | .537 (.14) | .524 (.13) | .618 ns | .10 | .586 (.13) | .475 (.11) | .000 | .69
70% accuracy clause length | 3.73 (2.27) | 3.68 (2.17) | .850 ns | .02 | 3.79 (2.63) | 3.61 (1.96) | .656 ns | .08 | 4.4 (2.49) | 3.0 (1.66) | .001 | .57
Errors per 100 words | 6.86 (2.46) | 7.71 (2.61) | .000 | .38 | 6.92 (2.59) | 7.64 (2.45) | .121 ns | .29 | 6.16 (2.25) | 8.41 (2.30) | .000 | .77
Notes: 1. Standard deviation in (). 2. d = Cohen’s d, which is a measure of effect size.
Complexity
Table 8 gives the findings for the three different complexity measures. Topic familiarity
seems irrelevant to any of the complexity measures, but two measures, namely ‘clauses
per AS unit’ and ‘words per AS unit’, are significantly influenced by planning. In these
cases planners outperformed non-planners, with small and large effect sizes respectively.
Participants of higher proficiency also spoke with significantly longer AS units than
those of lower proficiency. Though only approaching significance (p = .067), the p value
in ‘clauses per AS unit’ shows a similar trend in that the advanced learners are probably
able to produce a higher subordination ratio than the intermediate ones. In comparison
to the effect size for proficiency, planning appears to be a stronger variable in promoting
complexity. A bit unexpectedly, the newly developed measure of ‘words per clause’ does
not seem to be sensitive to the influence of familiarity, planning, or proficiency.
Table 8. The effects of topic familiarity, strategic planning, and proficiency on complexity
Measure                Topic familiarity              Planning                       Proficiency
Clauses per AS unit    1.74 (.32)    1.73 (.35)       1.81 (.31)    1.67 (.35)       1.79 (.34)    1.68 (.32)
                       p = .747 ns, d = .03           p = .018, d = .39              p = .067 ns, d = .33
Words per AS unit      12.93 (2.69)  12.43 (3.36)     13.96 (2.2)   11.39 (3.36)     13.49 (2.70)  11.85 (3.15)
                       p = .089 ns, d = .16           p = .000, d = .81              p = .002, d = .52
Words per clause       7.11 (.77)    6.97 (.85)       7.13 (.62)    6.95 (.75)       7.15 (.81)    6.92 (.82)
                       p = .195 ns, d = .17           p = .22 ns, d = .26            p = .112 ns, d = .28
Notes: 1. Standard deviation in ( ). 2. d = Cohen’s d which is a measure of effect size.
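The three complexity measures are simple ratios over the same segmentation, which the sketch below tries to make concrete. The input format (each AS unit represented as a list of clause lengths in words) and the numbers are hypothetical. Whether the study pooled counts in this way or averaged per-speaker ratios is not stated here, so the pooling is an assumption.

```python
# Hypothetical segmentation: each AS unit is a list of clause lengths in words.
# Invented values for illustration only.
as_units = [[7, 6], [8], [5, 7, 6]]

n_units = len(as_units)
n_clauses = sum(len(unit) for unit in as_units)
n_words = sum(sum(unit) for unit in as_units)

clauses_per_as_unit = n_clauses / n_units  # subordination measure
words_per_as_unit = n_words / n_units      # AS unit length
words_per_clause = n_words / n_clauses     # clause length

print(clauses_per_as_unit, words_per_as_unit, words_per_clause)
```

When the ratios are computed from pooled counts like this, words per AS unit equals clauses per AS unit multiplied by words per clause. AS unit length can therefore rise through more subordination alone, with clause length unchanged.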
Two interaction effects are also found between proficiency and planning, for the
measures of ‘clauses per AS unit’ (p = .002) and ‘words per AS unit’ (p = .026). Table 9
suggests two noteworthy points. Firstly, though the more advanced participants always
scored higher than the intermediate ones in complexity, the gap between them is significantly
narrowed after planning. Secondly, while planning raises the length of AS
units for all, it appears to help participants of lower proficiency more than those of higher proficiency.
Table 9. Interaction effects in “clauses per AS unit” and “words per AS unit”
Measure                Planning     Proficiency                   Significance
                                    High          Intermediate
Clauses per AS unit    Unplanned    1.82 (.40)    1.52 (.20)      p = .002
                       Planned      1.77 (.27)    1.85 (.34)
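The interaction pattern in Table 9 can be read off as a difference between planning gains. The sketch below does only that arithmetic on the four cell means for clauses per AS unit. The variable names are illustrative, and no significance test is attempted.

```python
# Mean clauses per AS unit from Table 9, keyed by (planning, proficiency).
means = {
    ("Unplanned", "High"): 1.82,
    ("Unplanned", "Intermediate"): 1.52,
    ("Planned", "High"): 1.77,
    ("Planned", "Intermediate"): 1.85,
}

# Gain from planning within each proficiency level.
planning_gain = {
    level: means[("Planned", level)] - means[("Unplanned", level)]
    for level in ("High", "Intermediate")
}

# How much more the intermediate group gains than the high group.
interaction_contrast = planning_gain["Intermediate"] - planning_gain["High"]

for level, gain in planning_gain.items():
    print(f"{level}: {gain:+.2f}")
print(f"contrast: {interaction_contrast:+.2f}")
```

On these means the intermediate group gains about .33 clauses per AS unit from planning while the high group changes by about -.05, so planning reverses a gap of .30 in favour of the high group.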
Discussion
The Results section has provided a detailed description of the three major perfor-
mance areas (fluency, complexity, and accuracy). This section will further synthesize
the results in terms of topic familiarity, strategic planning and L2 proficiency so that
the effects of these three independent variables can be explored more directly.
Topic familiarity
A recapitulation of the results shows that topic familiarity enables learners to produce
longer speech with greater fluency, fewer breakdowns, and slightly higher accuracy
and repair fluency. It was less effective, however, with syntactic
complexity. Several aspects derived from these results have theoretical significance.
First, topic familiarity seems to affect both the Conceptualization and the Formula-
tion stages in Levelt’s (1989) speaking model. The Conceptualizer is responsible for
drawing information from memory and forming a pre-verbal message as input for the
Formulator. It seems to take less time to access more familiar information due to an
immediacy effect, since speakers are more primed in the relevant knowledge domain.
As a Conceptualizer effect, too, speakers have a more ready-made schematic struc-
ture at their disposal which could be accessed on a macro basis. The faster-accessible
message plus an existing framework into which the message can be structured helps
ease the workload at the Conceptualization stage. The longer account produced on the
familiar topics indicates that more familiar information can be retrieved from long
term memory in any given time period.
Topic familiarity also appears to exert an influence on Levelt’s (1989) Formulation
stage. The Formulator receives the pre-verbal message from the Conceptualizer, then
draws on lemmas and lexemes from the mental lexicon and assembles them into a
linguistic plan waiting to be articulated at the next stage. In this process, lexis can be
retrieved not only at higher speed, as evidenced by the fewer breakdowns (Table 4),
but also in a larger quantity (more total words and more varied words; Bui, in preparation).
In addition, the greater mean length of run and fewer mid-clause pauses on
the familiar topics all suggest that the more familiar topics facilitate the use of bigger
chunks in which more lexical items are packed into an uninterrupted stream, which
is an indication that topic familiarity helps learners with lexicalized language. This
would not only explain the superior temporal aspects of speaking, but also the slightly
but significantly higher accuracy results: if some expressions are memorized as
wholes, the computational workload, and thus the probability of error, is reduced. To sum up,
learners are able to more efficiently access the exemplar-based system with faster word
searches, and reduce the analytic computation in their rule-based system (Skehan
1998) with more efficient on-line assembly of utterances when they are in possession
of relevant prior knowledge.
An additional explanation that might not be as general as those discussed above
is also relevant here. The medium of instruction for all the participants in their major
courses in both academic groups is primarily English, and all the textbooks and lecture
notes are in English. According to the encoding specificity principle (Tulving &
Thomson 1973), knowledge is accessed more quickly when it is retrieved in the language
in which it was stored in long term memory. Therefore, participants in the unfamiliar condition might have
to go through one more step at the Formulation stage, that of transforming their gen-
eral knowledge about the unfamiliar topic from Chinese into English. This could then
hamper their performance in terms of fluency and lexis. Such an observation might
have some implications for content-based language teaching in that if a certain domain
knowledge is learnt in one’s L2, it appears that future retrieval of the knowledge and
production in the L2 will be enhanced at least as far as fluency and lexis are concerned.
Another perspective on these findings is to make the point that the unplanned
condition effectively triggered more pressured communication – there was only scope
for on-line planning (Ellis 2005). Limited processing capacity (Skehan 1998) creates
difficulties for L2 speakers whose target language system is not yet automatized enough
for efficient parallel processing. Consequently, more attentional resources allocated
to the Conceptualization stage means there may be difficulties at the later Formula-
tion and Articulation stages. Learners had to slow down their speech rate and pause
more often with a shorter average speaking time in order to cope with the unfamiliar
topics. This result for fluency is largely consistent with some studies in L1 (e.g. Good &
Butterworth 1980; Bortfeld et al. 2001) and L2 (Chang 1999; Skehan & Foster 1999;
Robinson 2001) research. However, the discovery from the present research is that
pre-task planning is able to attenuate the difference between unfamiliar and familiar
topics in many of the fluency measures.
The second point to consider concerns form-meaning connections in relation
to topic familiarity. The primary concern in a speaking task is obviously to get the
message across. Meaning expression is more likely to be attended to than the other
aspects of speaking. However, it appears that the familiar topics also raise accuracy,
enabling meaning and form to be handled at the same time. In addition to the theory
of better access to the exemplar system and chunking, two more possibilities from a
processing perspective are worth noting. First of all, the attentional resources released
from the Conceptualization and the Formulation stages can help learners with self-
monitoring. With the more familiar topics, speakers may shift their attentional focus
partly from ‘what to say’ to ‘how to say’ and even ‘how to say well’, whereas they will
have to struggle with the content to express in the unfamiliar situations, which results
in greater working memory load for monitoring and correction (see also Bygate &
Samuda 2005). Secondly, on-line planning studies (e.g. Yuan & Ellis 2003) provide
evidence that unpressured within-task planning can contribute to more accurate per-
formance. As a task-internal readiness construct, topic familiarity appears to achieve
a similar effect because it prepares learners not only prior to the task, but through
the whole process of speaking. This resemblance of on-line readiness to unpressured
on-line planning may partly explain the higher accuracy scores in the familiar tasks.
At the same time, the small effect sizes may also be justified simply because task-
internal readiness is still time-pressured when compared to the unpressured task-
external on-line planning.
Strategic planning
Before going on to the discussion on strategic planning, a recapitulation of its gen-
eral effects is helpful. Briefly, strategic planning greatly helps learners to improve their
fluency (with both breakdowns and repairs) and syntactic complexity, though it was
not so effective in raising accuracy. The results seem generally consistent with the
bulk of the literature in planning, but the comparison and contrast between strategic
planning and topic familiarity would add more insight into the story. We can start
with fluency. First of all, planning works in a very similar way to topic familiarity in
terms of fluency, especially with breakdown fluency. Such a pattern is reflected in the
measures on which both planning and familiarity have similar effects. Secondly, plan-
ning should at the same time be distinguished from topic familiarity in terms of the
strength and the range of their influence on fluency performance. Strategic planning is
likely a more powerful pedagogical means that constitutes a potentially higher level of
task-readiness than familiarity as a task-internal readiness, although this also depends
on how thoroughly the planners are able to anticipate the detail of the task, how much
they can cover during planning, and also how much they can retain and recall dur-
ing task performance. The effectiveness of such pre-task planning is supported by the
effect sizes that planning produces with fluency measures. To account for these two
observations, one could argue that the ten-minute planning time allows learners to
formulate a conceptual plan for the relevant message to convey (Ellis 2005; Mehnert
1998), which greatly reduces the need for online macro-structure planning. Instead,
L2 learners can allocate scarce attentional resources for the Formulator, thus speaking
with fewer pauses and at a faster rate. While speaking on a familiar topic without plan-
ning is still a pressured process, planned speech is much less so, which may explain
why planning is able to cut down on the frequency of repairs but topic familiarity is
much weaker in this regard.
What further distinguishes topic familiarity from planning is that planning pushes
learners to higher structural complexity. The lexicalized language or chunks that are
more readily and speedily accessible due to topic familiarity would likely involve less
complex syntactic processing partly due to being lexicalised, as well as because limited
working memory capacity does not allow overly long utterances to be processed and
passed on for long-term memory storage. Therefore, a reasonable assumption here
would be that the prefabricated expressions which are available in long term memory
are usually relatively short expressions. A comparison with topic familiarity shows that
strategic planning helps learners not only to access formulaic language (Foster 2001)³
and hence achieve higher fluency, but also assemble the pre-fabricated chunks into
longer psychological units of planning (AS units), as shown in higher scores in the
two complexity measures (“words per AS unit” and “clauses per AS unit”). In addition,
strategic planning encourages learners to stretch their speech content, which results
in more adventurous attempts to produce more elaborated language. This result
is consistent with most studies, which find that planning drives learners to take risks with
more elaborated language. To some extent, this study, combined with Foster (2001),
helps to better explain why task-external readiness can, but task-internal readiness
cannot, promote greater complexity.
Rather disappointingly, strategic planning does not seem to affect the ‘words per
clause’ measure of clause length even though Ortega, Iwashita, Norris, and Rabie (in
preparation) argue that it is a better measure for more advanced learners. Bei (2010)
conducted two factor analyses (one for a familiar task and the other for an unfamiliar
task) that included most of the available task performance measures, and both
confirmed that ‘words per clause’ appears to be very closely connected to the F-score
(Heylighen & Dewaele 1999), an index of formality, and less closely but significantly
with lexical sophistication. The F-score measures the extent to which nouns and
noun-associated word classes such as articles and adjectives are employed in speech,
while lexical sophistication is a yardstick for the frequency with which rare words are used.
Taking these findings together, Bei (2010) argued that “words per AS unit”, together with the F-score and
lexical sophistication, belongs to a new construct “noun phrase complexity” which
should be treated as distinct from the syntactic or lexical complexity identified in
the literature. The relationship between strategic planning and noun phrase complexity
warrants further study.
3. Foster (2001) found that, given planning time, native speakers tend to use less formulaic language
and be more creative, whereas non-native speakers will use more formulaic language after
planning.
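The F-score referred to above can be illustrated briefly. Heylighen and Dewaele define it from the percentage frequencies of word classes: noun, adjective, preposition and article frequencies are added, pronoun, verb, adverb and interjection frequencies are subtracted, and the result is rescaled to lie between 0 and 100. The sketch below applies that formula to hypothetical part-of-speech counts. The tag names and the counts are invented, and nothing is assumed about how Bei (2010) tagged the data.

```python
# Word classes that raise and lower the F-score in Heylighen & Dewaele's formula.
FORMAL = ("noun", "adjective", "preposition", "article")
CONTEXTUAL = ("pronoun", "verb", "adverb", "interjection")


def f_score(counts, total_words):
    """F = (formal % - contextual % + 100) / 2, with % taken over all words."""
    def pct(tag):
        return 100 * counts.get(tag, 0) / total_words

    formal = sum(pct(tag) for tag in FORMAL)
    contextual = sum(pct(tag) for tag in CONTEXTUAL)
    return (formal - contextual + 100) / 2


# Hypothetical counts for a 100-word sample; the remaining 10 words
# (e.g. conjunctions) belong to classes that do not enter the formula.
sample_counts = {
    "noun": 20, "adjective": 5, "preposition": 10, "article": 10,
    "pronoun": 15, "verb": 20, "adverb": 10, "interjection": 0,
}
sample_f = f_score(sample_counts, total_words=100)
print(sample_f)
```

A sample in which the two groups of word classes balance, as here, sits at the midpoint of 50. Scores above 50 indicate a more noun-based style.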
What remains opaque is the relationship between planning and accuracy. The
previous literature has been unclear in this respect (Ellis 2009), and the present
study did not find a significant accuracy effect from planning (but see discussion in
Pang & Skehan (this volume) and Wang (this volume)). A thorny question emerges
naturally at this point: if as mentioned above, planning enables L2 learners to bet-
ter access their lexicalized language (formulaic chunks) as topic familiarity does, why
can topic familiarity raise accuracy but planning cannot? Possibly the puzzle can be
disentangled with the following three arguments. Firstly, planning drives learners to
embark on more complex language and in the process more pre-fabricated expres-
sions need to be assembled into an AS unit. The more syntactic work there is, the more
errors there might be (Crookes 1989), especially when strategic planning is largely
concept-oriented with little attention to grammar. Secondly, from a limited processing
capacity point of view (Skehan 1998), there is likely to be a trade-off between accu-
racy and complexity (Skehan & Foster 1997). Learners’ L2 systems are, by and large,
controlled but not automatized, and so attentional resources allocated to the over-
whelming workload when complexity is prioritised mean a reduction of attentional
focus on accuracy. Thirdly, it is possible that pre-task planning cannot affect on-line
monitoring (Skehan 2009), as what learners bring to the task from strategic planning
would mostly focus on getting the message across, whereas topic familiarity as a form of
task-internal readiness prepares learners anytime they speak, acting as both pre-task
and on-line readiness, and reduces the on-line processing workload to enable more
within-task monitoring.
Proficiency
The previous task-based literature has not seen proficiency as an area of primary and systematic
concern. The few exceptions (e.g. Kawauchi 2005; Ortega 2005; Wigglesworth
1997), however, suggest that task performance, as influenced by strategic planning,
differs according to learners’ proficiency levels. The present study re-examines the
effects of planning at different proficiency levels, whilst adding to it a new dimension
of planning: topic familiarity. As mentioned in Section 3.3, measured through HKALE
results and the IELTS system, the current participants are at relatively high proficiency
levels. In contrast, participants in past studies were mostly at lower proficiency levels.
The following discussion will take this caveat into consideration.
In terms of the main effects, proficiency shows consistently strong effects on all
accuracy measures and some effects on complexity (p = .002, d = .52 for “words per AS
unit”; p = .067 for the conventional “clauses per AS unit”), with performances of learn-
ers at the higher proficiency level being more accurate and more syntactically com-
plex. More advanced learners were also able to reduce the number of pauses between
clauses. However, proficiency seems to be, at least in this context, largely irrelevant to
fluency (either breakdown fluency or repair fluency), and even noun phrase complex-
ity (Bei 2011). An emerging pattern from these results is that proficiency tends to have
much greater influence on syntactic than semantic aspects of performance.
Learners of higher proficiency consistently made fewer errors in performance
than their lower proficiency counterparts did, regardless of familiarity or planning
time. Furthermore, the “70% accuracy clause length” measure indicates that the lower
error rate obtained by higher proficiency students was not achieved by the avoidance
strategy with which one might make fewer errors by resorting to shorter and simpler
utterances. Higher proficiency participants in fact spoke with longer error-free clauses
than the lower proficiency participants did. All this suggests that accuracy in perfor-
mance is basically a by-product of one’s underlying linguistic competence. The lack of
interaction effects between proficiency and the other two independent variables (plan-
ning and familiarity) further supports this claim, and might partly explain why accu-
racy in performance was less sensitive to task manipulations like strategic planning.
It could be argued that better performance in accuracy originates from two sources: a
well-developed linguistic system and a good ability to monitor speaking (see Li (this
volume)). A more advanced linguistic system plays the main role in error-free utterances,
and it has almost become a cliché to say that the actual ‘performance’ is a reflection
of implicit ‘competence’. A more fully-fledged underlying system is usually a more
automatized one, which frees up more attentional resources for monitoring errors.
All this contributes to the significantly and consistently better accuracy performance
among the higher level learners in all three accuracy measures. The medium to large
effect sizes (Cohen’s d values ranging from .57 to .77) suggest that the difference in
accuracy between the two proficiency levels is substantial.
Only one out of the three complexity measures, namely ‘words per AS unit’, was
significantly affected by proficiency. However, the effects of proficiency nearly reached
significance in the conventional ‘clauses per AS unit’ measure (p = .067). These results
suggest that proficiency does show its influence on syntactic complexity, though its
effects are not as big as those for accuracy. Compared to strategic planning, proficiency
is much less a driving force for higher complexity; compared to topic familiarity, pro-
ficiency is a much more important indicator for higher accuracy. Therefore, we might
postulate that L2 learners tend to opt for a conservative stance in speaking and try to
avoid mistakes. Planning time encourages them to be more willing to take risks and
use more elaborated language. Higher proficiency itself can liberate L2 learners from
their timidity only to a limited extent.
Regarding fluency, proficiency only has an effect on the “number of end-of-clause
pauses” (p = .027). Higher proficiency learners do not pause as frequently as their inter-
mediate counterparts do between clauses, but there is no difference between the two
proficiency levels in terms of the frequency of mid-clause pauses. Mid-clause pauses
have been shown to be a trait of L2 speaking (Skehan 2009), so both high and inter-
mediate proficiency learners in this study remained by and large L2 speakers whose
oral performance was not very native-like, as far as fluency is concerned. However, the
higher proficiency level did appear to reduce the hesitations between clauses. This was
probably because a more automatized linguistic system can assemble information in a
more coherent manner, making it less likely that the utterances will be fragmented or
loosely connected to each other.
When it comes to fluency and “nouny” language use (or rather, noun phrase complexity),
however, higher proficiency seems of no great relevance in most cases. Past
research has shown that fluency and complexity are more easily affected by task-external
influences (e.g. planning and task repetition), but these were the two areas
in this study on which proficiency had no effect or only a weak effect.
Taken together, the possibility emerges that learner proficiency and task conditions
could stand in competition. That is, if a certain performance area, such as accuracy, is
merely a reflection of learners’ underlying competence, it is more likely to be resistant
to task conditions or task characteristics, such as planning. On the other hand, areas
less closely connected to proficiency (e.g. fluency in this study) are more prone to task
manipulations. There also appears to be a trade-off between task conditions and pro-
ficiency levels. Such a claim needs to be verified in future studies as it might suggest
limits as to how far task manipulations can go from short-lived performance enhance-
ment to genuine competence improvement in an L2. If the much researched area of
task-external readiness has little impact on L2 proficiency, it would then be time for us
to turn to new areas. A few studies (Skehan & Foster 1997; Li, Chapter 5, this volume)
have begun to show promise in exerting a significant influence on accuracy performance
by employing post-task activities. There is room for more research in how different
effects of task manipulation could be integrated.
Two interaction effects in the literature between proficiency and planning are
noteworthy, both concerning complexity. Wigglesworth (1997) found that the oppor-
tunity to plan allowed learners of higher proficiency, but not those at the lower level,
to produce more complex language. Similarly, Kawauchi’s (2005) high proficiency par-
ticipants benefited most in the case of complexity (and fluency), with the lower profi-
ciency participants gaining less (but they gained the most in accuracy, with the most
advanced learners benefiting the least). In contrast (at first sight), a general pattern
from the interactions between planning and proficiency in both ‘clauses per AS
unit’ and ‘words per AS unit’ in this study is that the intermediate learners were much
better than their high proficiency counterparts in making the most out of planning
time to achieve higher complexity. For the AS length measure, the difference between
high and intermediate participants was narrowed to virtual non-existence after plan-
ning. More significantly, in terms of the conventional ‘clauses per AS unit’ measure,
the intermediate proficiency planners even slightly surpassed the high proficiency
planners, though the high proficiency non-planners were much better than the inter-
mediate proficiency non-planners. As for accuracy, though Kawauchi (2005) found
that learners at a lower level gained the most in accuracy after planning, Wigglesworth
(1997) and Ortega (1999) claimed that planning helped learners at an advanced level
to achieve better accuracy in performance. The evidence in the present study does not
support either side in this disagreement. No matter whether given planning time or
not, the higher proficiency learners were always better than the intermediate ones (cf.
the main effects of proficiency above).
Some inconsistency between the present study and the literature in terms of the
effects of planning on complexity and accuracy at different proficiency levels may
be attributed to the operationalization of the independent variable ‘proficiency’
per se. As mentioned in Section 3.3, and the beginning of this Section (5.3),
the “intermediate” participants in this study were already quite proficient speakers
of English given the high entry requirement of their university. If the participants
of intermediate proficiency in this study are at a level similar to the ‘high’ participants
in Kawauchi (2005) and Wigglesworth (1997) (and if the ‘high’ here is equal to
the ‘advanced’ in Kawauchi), then, instead of contradicting, this study could in fact
support Kawauchi’s results for complexity. That said, such a claim remains speculative
until a commonly accepted way of equating different proficiency measures is
available.
Table 10. Effect sizes produced by topic familiarity and strategic planning
Topic familiarity Strategic planning
Implications
The pedagogical implications regarding task-external readiness (e.g. strategic planning)
have been researched in many studies (see Ellis 2005, 2009, for a detailed discussion),
but the benefit of using task-internal readiness has rarely been touched upon in the literature.
Evidence from this study, however, showed that task-internal readiness should
not be ignored in language education, for a number of reasons.
First of all, as noted above, previous research has shown that receptive language
use, namely reading comprehension (Shimoda 1993; Chang 2006; Lee 2007; Leeser
2007; Barry & Lazarte 1995; Bügel & Buunk 1996; Chen & Donin 1997; Johnson
1982; Lee 1986) and listening comprehension (Markham & Latham 1987; Long 1990;
Chiang & Dunkel 1992; Schmidt-Rinehart 1994; Leeser 2004), are greatly influenced
by background knowledge. The present study further provides evidence for the effects
of familiarity in L2 speech production, as productive language use. Familiarity may
therefore become an inevitable issue in test fairness. It is quite possible that a test-taker performs
well not because s/he is in fact more proficient but simply because s/he is more
familiar with the topic. Matches and mismatches between test content and learner
background have to be taken into serious consideration in either language compre-
hension or production tests.
Second, one of the important issues in task-based language instruction is to
encourage learners to participate actively in various task activities. This study shows
that providing learners with more familiar topics will reduce learner anxiety and elevate
their willingness to communicate, as evidenced by the significantly longer accounts
they give with familiar topics. On the one hand, longer performance produced by an
L2 learner is an indication of his/her willingness and readiness to communicate. On
the other hand, this certainly helps to enhance learner confidence, which may work
especially for low to intermediate learners.
Third, strategic planning was shown to help learners produce more fluent and
more complex language. Accordingly, it would appear to be a good idea to allow
learners some time prior to any actual performance. Planning encourages learners to
embark on more elaborated language, attempting more complex structures through
which they could experiment with newly acquired linguistic knowledge. Planning also
serves to narrow the gap between high and low proficiency, and between familiar and
unfamiliar tasks, in terms of fluency and complexity. In classrooms, then, teachers may
take advantage of planning when learners are facing adverse situations (such as low
proficiency and unfamiliar topics).
Fourth, the results suggest that learners should be provided with familiar topics
in tasks if accuracy is the primary concern. Given the way familiarity seems to
function as mini-online-planning, it appears to help learners by providing them more
resources to attend to form. As mentioned above, this may increase their confidence
and reduce feelings of frustration.
Fifth, this study may also have implications for task sequencing. We have seen the
separate benefits for pedagogy from each individual variable, but it is far more impor-
tant to examine how these different influences are organized to form a coherent and
organic whole. It is certainly too early to make any claims on the “whole picture” based
on the three variables in this study alone. Nonetheless, this study indicates that at the
pre-task stage planning is a useful tool, whilst at the during-task stages familiarity may
help. Then, beginners should receive the most familiar topics and planning time in
order that they could be fully supported in tasks. As their language ability develops, at
some points and to some extent it may be possible for either familiarity or planning to
be reduced so that they would face greater (but appropriate) challenges and be moti-
vated to proceed further.
Last but not least, the present study supports content-based instruction (Mohan
1986) in language teaching. Topic familiarity proved to be a positive influence on flu-
ency and accuracy, with indications that it helped to push learners to a more integra-
tive approach to language learning. Compared to ‘pure’ or intensive language teaching,
language seems more effectively taught when the domain knowledge (not linguistic
knowledge) is imparted to learners in their L2, leading to a genuine need to solve real
world problems, and that domain knowledge then serves as a continual reference point
for the growing language curriculum. In a language classroom where general knowl-
edge is not the focus, language can still be taught using tasks involving connections to
real life so that tasks become the medium between classroom and the real world.
Conclusion
However, planning produces bigger effect sizes than topic familiarity with fluency.
Planning is also able to greatly reduce the gap between familiar and unfamiliar
topics in fluency. This leads us to the conclusion that task-external readiness is
in general more powerful than task-internal readiness in improving meaning-
oriented performance.
3. Planning raises syntactic complexity, while topic familiarity increases accuracy. It
would then appear that task-internal readiness encourages learners to adopt a conservative
stance (thus higher accuracy), but task-external readiness pushes learners to
take risks (hence higher complexity). Interestingly, higher proficiency produces
much higher accuracy and moderately higher complexity, confirming a close rela-
tion between syntactic performance and linguistic competence.
4. With the above points taken together, an intriguing pattern emerges – task
influence and proficiency influence do not always complement each other. The
proficiency-oriented variables (e.g. accuracy) are affected more by proficiency
levels and less by task manipulations, whereas task-oriented variables (e.g. fluency)
function in just the opposite way. There are also intermediate variables, such as
complexity.
This study was conducted in the context of TBLT research and established very close
connections to prior studies, thus enabling cross-study comparisons. It is then
hoped that this research will be a link between the literatures on planning and the
future studies on the extended concept of planning, that is task-readiness, to explore
task-based language learning from an even wider perspective.
References
Baddeley, A. (1997). Human memory: Theory and practice. New York, NY: Psychology Press.
Barry, S., & Lazarte, A. (1995). Embedded clause effects on recall: Does high prior knowledge of con-
tent domain overcome syntactic complexity in students of Spanish? Modern Language Journal,
79, 491–504.
Banks, J. (2004). The impact of event familiarity on the complexity and coherence of children narra-
tives of positive events. Unpublished MSc thesis. North Carolina State University.
Bei, X.G. (2010). Exploring task-internal and task-external readiness: The effects of topic familiarity
and strategic planning in topic-based task performance at different proficiency levels. Unpublished
Ph.D. thesis. The Chinese University of Hong Kong.
Bei, X.G. (2011). Formality in second language discourse: Measurement and performance. Interdis-
ciplinary Humanities, 28(1), 22–31.
Bei, X.G. (2013). Effects of immediate repetition in L2 speaking task: A focused study. English Lan-
guage Teaching, 6(1), 11–19.
Bui, H.Y.G. (In review). L2 fluency as influenced by content familiarity and planning performance
and methodology. Submitted to Language Teaching Research.
Bui, H.Y.G. (In preparation). Lexical diversity, lexical sophistication and lexical density in L2 speak-
ing tasks. Unpublished manuscript.
Bortfeld, H., Leon, S.D., Bloom, J.E., Schober, M.F., & Brennan, S.E. (2001). Disfluency rates in
conversation: Effects of age, relationship, topic, role, and gender. Language and Speech, 44(2),
123–147.
Bügel, K., & Buunk, B. (1996). Sex differences in foreign language text comprehension: The role of
interests and prior knowledge. Modern Language Journal, 80, 15–31.
Bygate, M. (1996). Effects of task repetition: Appraising the developing language of learners. In
J. Willis & D. Willis (Eds.), Challenge and change in language teaching (pp. 136–146). Oxford:
Heinemann.
Bygate, M. (2001). Effects of task repetition on the structure and control of oral language. In
M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning,
teaching and testing (pp. 23–48). Harlow: Longman.
Bygate, M., & Samuda, V. (2005). Integrative planning through the use of task repetition. In R. Ellis
(Ed.), Planning and task performance in a second language (pp. 37–74). Amsterdam: John
Benjamins.
Chang, C. (2006). Effects of topic familiarity and linguistic difficulty on the reading strategies and
mental representations of non-native readers of Chinese. Journal of Language and Learning, 4,
172–198.
Chang, Y.F. (1999). Discourse topics and interlanguage variation. In P. Robinson (Ed.), Represen-
tation and process: Proceedings of the 3rd Pacific Second Language Research Forum (Vol. 1,
pp. 235–241). Tokyo: PacSLRF.
Chen, Q., & Donin, J. (1997). Discourse processing of first and second language biology texts:
Effects of language proficiency and domain-specific knowledge. Modern Language Journal, 81,
209–227.
Chiang, C.S., & Dunkel, P. (1992). The effect of speech modification, prior knowledge, and listening
proficiency on EFL lecture learning. TESOL Quarterly, 26, 345–373.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition,
11, 367–383.
Dörnyei, Z., & Katona, L. (1992). Validation of the C-test amongst Hungarian EFL learners.
Language Testing, 9, 187–206.
Ellis, R. (2005). Planning and task-based performance: Theory and research. In R. Ellis (Ed.), Plan-
ning and task performance in a second language (pp. 3–36). Amsterdam: John Benjamins.
Ellis, R. (2009). The differential effects of three types of task planning on the fluency, complexity, and
accuracy in L2 oral production. Applied Linguistics, 30(4), 474–509.
Ellis, R., & Yuan, F.Y. (2005). The effects of careful within-task planning on oral and written task per-
formance. In R. Ellis (Ed.), Planning and task performance in a second language (pp. 167–192).
Amsterdam: John Benjamins.
Foster, P. (2001). Rules and routines: a consideration of their role in task-based language production
of native and non-native speakers. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching
pedagogic tasks: Second language learning, teaching and testing (pp. 75–97). Harlow: Longman.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language perfor-
mance. Studies in Second Language Acquisition, 18, 299–323.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21(3), 354–375.
Gardner, R.C. (2001). Psychological statistics using SPSS for Windows. Upper Saddle River, NJ:
Prentice-Hall.
Good, D.A., & Butterworth, B. (1980). Hesitancy as a conversational resource: some methodologi-
cal implications. In H. Dechert & M. Raupach (Eds.), Temporal variables in speech production
(pp. 145–152). The Hague: Mouton.
Heylighen, F., & Dewaele, J. (1999). Formality of language: Definition, measurement and behavioral
determinants. Internal report, Center “Leo Apostel”, Free University of Brussels.
Johnson, P. (1982). Effects on reading comprehension of language complexity and cultural back-
ground of text. TESOL Quarterly, 16, 169–181.
Kawauchi, C. (2005). The effects of strategic planning on the oral narratives of learners with low
and high intermediate proficiency. In R. Ellis (Ed.), Planning and task performance in a second
language (pp. 143–164). Amsterdam: John Benjamins.
Kintsch, W. (1988). The role of knowledge in discourse comprehension: A construction-integration
model. Psychological Review, 95, 163–182.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge: CUP.
Lee, J.F. (1986). Background knowledge and L2 reading. Modern Language Journal, 70, 350–354.
Lee, S.K. (2007). Effects of textual enhancement and topic familiarity on Korean EFL students’ reading
comprehension and learning of passive form. Language Learning, 57, 87–118.
Leeser, M.J. (2004). The effects of topic familiarity, mode, and pausing on second language learners’
comprehension and focus on form. Studies in Second Language Acquisition, 26, 587–615.
Leeser, M.J. (2007). Learner-based factors in L2 reading comprehension and processing grammatical
form: Topic familiarity and working memory. Language Learning, 57(2), 229–270.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: The MIT Press.
Long, D.R. (1990). What you don’t know can’t help you: An exploratory study of background knowl-
edge and second language listening comprehension. Studies in Second Language Acquisition,
12, 65–80.
Markham, P., & Latham, M. (1987). The influence of religious-specific background knowledge on the
listening comprehension of adult second-language students. Language Learning, 37, 157–170.
Mehnert, U. (1998). The effects of different lengths of time for planning on second language perfor-
mance. Studies in Second Language Acquisition, 20, 83–108.
Merlo, S., & Mansur, L.L. (2004). Descriptive discourse: Topic familiarity and disfluencies. Journal of
Communication Disorders, 37, 489–503.
Mohan, B.A. (1986). Language and Content. Cambridge, MA: Addison-Wesley.
Ortega, L. (1999). Planning and focus on form in L2 oral performance. Studies in Second Language
Acquisition, 21, 109–148.
Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam:
John Benjamins.
Ortega, L., Iwashita, N., Norris, J., & Rabie, S. (in preparation). A multi-language comparison of syntactic
complexity measures and their relationships to foreign language proficiency. Manuscript
in preparation.
Robinson, P. (2001). Task complexity, task difficulty and task production: exploring interactions in a
componential framework. Applied Linguistics, 22, 27–57.
Samuda, V. (2001). Guiding relationships between form and meaning during task performance: The
role of the teacher. In M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks:
Second language learning, teaching and testing (pp. 119–140). Harlow: Longman.
Schmidt-Rinehart, B.C. (1994). The effects of topic familiarity on second language listening compre-
hension. The Modern Language Journal, 78, 179–189.
Shimoda, T.A. (1993). The effects of interesting examples and topic familiarity on text comprehen-
sion, attention, and reading speed. Journal of Experimental Education, 61, 93–103.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: OUP.
Skehan, P. (2009). Lexical performance by native and non-native speakers on language learning tasks.
In B. Richards, H.M. Daller, D. Malvern, P. Meara, J. Milton, J. Treffers-Daller (Eds.), Vocabu-
lary studies in first and second language acquisition: the interface between theory and application
(pp. 107–124). Basingstoke: Palgrave Macmillan.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1(3), 185–211.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions in narrative
retellings. Language Learning, 49(1), 93–120.
Skehan, P., & Foster, P. (2005). Strategic and online planning: the influence of surprise information
and task time on second language performance. In R. Ellis (Ed.), Planning and task performance
in a second language (pp. 193–216). Amsterdam: John Benjamins.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–273). Amsterdam:
John Benjamins.
Thalheimer, W., & Cook, S. (2002). How to calculate effect sizes from published research articles: A sim-
plified methodology. Retrieved April 18, 2009 from http://work-learning.com/effect_sizes.htm.
Tulving, E., & Thomson, D.M. (1973). Encoding specificity and retrieval processes in episodic memory.
Psychological Review, 80(5), 352–373.
Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and on-line planning on fluency, complexity
and accuracy in L2 monologic oral production. Applied Linguistics, 24(1), 1–27.
Wang, Z. (2009). Modeling L2 Speech Production and Performance: Evidence from Five Types of Plan-
ning and Two Task Structures. Unpublished Ph.D. thesis. The Chinese University of Hong Kong.
Wigglesworth, G. (1997). An investigation of planning time and proficiency level on oral test dis-
course. Language Testing, 14(1), 85–106.
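chapter 4
Self-reported planning behaviour and second language performance in narrative retelling
Francine Pang & Peter Skehan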
The second language planning literature has been mainly quantitative in nature,
with very few qualitative investigations of planning (but see Ortega 2005). This
chapter tries to redress that imbalance and reports on a study of what second
language learners say they do when they plan. Participants were from a university
in Macao, and completed a narrative task, followed by retrospective interviews. The
interview data was coded, and a coding scheme emerged from this work which had
some affinity to the Levelt (1989) model of speaking. As a result, this may be of use
in other contexts. In addition, relationships between the self-reported planning
behaviours and actual performance on the task were explored. This suggested
some generalizations as to what planning behaviours are associated with higher
performance, and, interestingly, which are associated with lower performance. The
former tend to implicate the Conceptualiser stage of speech production and are
specific and limited in range, whereas the latter are frequently concerned with
over-ambition during the planning stage, a concern for form, and participants
attempting to do too much.
Introduction
the time perspective under which tasks are completed (Robinson 2011); the time when
planning occurs, with a contrast between pre-task or strategic planning and on-line
planning, which takes place when time is available while speaking is under way
(Ellis 2005b) (see other chapters in this volume). As a result, we are now in a much better
position to predict what the impact will be on performance of manipulating different
task features and different task conditions. Indeed, an index of the success of these
developments is that we have competing accounts of what produces the different effects
which have been reported. Whereas Skehan (2009a) argues that limitations in atten-
tion frequently lead to trade-offs between different areas, and that careful task choice
and use can mitigate the impact of these trade-offs, Robinson (2001, 2011), through the
Cognition Hypothesis, analyses attention differently, as less limited, and proposes that
task difficulty can push for different aspects of performance (complexity, accuracy) to
be simultaneously raised.
The nature of performance itself has been the focus for much research, reflecting
a wider interest in second language performance dimensions. Following Crookes
(1989), task researchers have usually focused on the dimensions of complexity, accuracy,
and fluency. More recently, measures of lexis have also started to become more
common in task research (Skehan 2009b). Essentially, these four aspects of perfor-
mance are regarded as distinct, so that one or two might be raised, while the others are
not. This connects with one of the most interesting debates on the effects of planning:
whether it impacts on all areas of performance, or only on complexity and fluency.
Some researchers, including Crookes (1989), and Ortega (2005), argue that planning
raises these two areas only, whereas others, notably Foster and Skehan (1996), argue
that planning can also have an impact on accuracy.
Some experimental studies have tried to address some of these disagreements (see,
for example, Wang (Chapter 2, this volume) and Bui (Chapter 3, this volume)), as have
studies by Michel et al. (2007) and Revesz (2009). Their hope has been to design studies
in such a way as to tease out differences and predictions from theoretical positions
which will resolve disagreements. Such research has been useful, but so far, not defini-
tive. In reflecting on this it is important to point out that the vast majority of stud-
ies on planning have been quantitative in nature. Hypotheses have been framed, data
collected, coded, and scored, and quantitative techniques have been used to explore
the match or mismatch between hypotheses and results. Next, interpretations of the
experimental conditions and their effects in relation to the results have been put for-
ward, hence the claims about Cognition, Tradeoff, accuracy versus complexity effects,
and so on. What has been strikingly absent is any tradition of qualitative research
used to gain insights into what happens during the period when planning is under-
taken. Researchers seem to have been keener to try to infer the mental processes of
participants from quantitative patterns, rather than to discover what the participants
themselves have to say.
Self-reported planning behaviour and second language performance in narrative retelling 97
The major exception to this comes from two studies by Ortega (1995, 1999), and
the broader account she published of this work (Ortega 2005), where she herself won-
ders why there had been so little qualitative research in this area. Several years later, it
is even more striking that this state of affairs has not really changed, particularly given
that qualitative studies have the potential to unlock a great deal in relation to debates
within the literature. They can provide insights into the aims learners have during
the planning period, the processes they try to engage in, and their satisfaction subse-
quently with the activities they have engaged in. The present research tries to redress
this situation and report qualitative data on planners’ activities, an endeavour which,
if nothing else, will add to Ortega’s initial work.
There were, however, some reports of planning not being helpful. One reason proposed
was that the narrative tasks were rather easy, and did not need much planning.
There were also reports of lack of transfer from planning to actual performance. Many
of these points are prescient with respect to the results to be reported here.
Ortega (2005) makes two more general points which have importance. First, she
draws attention to individual differences. She contrasts communicatively-oriented
participants and those more focused on form, and argues that participants brought
this predisposition to the way planning time is exploited. Second, she draws attention
to issues deriving from the presence of a listener. Participants were put in pairs, with
one as speaker and the other as listener. Both participants were ‘valued’ in the encoun-
ter, but in fact, it was only the data from the speakers which was used. However, the
speakers were clearly aware of the listener as an important factor in the encounter.
Ortega (2005) reports this had a major impact on performance, with speakers report-
ing thinking of how to use planning time to make content more accessible, of how
they could retrieve easier vocabulary, and how they could avoid more advanced gram-
mar. There was also a reluctance to self-correct and even a willingness to slow down
delivery in order to make comprehension easier for the listener.
rather than speaking strategies, and so they may not be totally appropriate. In par-
ticular they are not based on any theory of speaking, either first or second language
(Levelt 1999; Kormos 2006). There would seem to be scope, therefore, to use an emergent
category approach from the beginning, but to have in mind while categories emerge
the possibility of relating them to a model of speaking, such as Levelt’s, which has
been applied to the second language case (DeBot 1992; Kormos 2006; Skehan 2009a).
In this way, from retrospective interviews, one could look for comments which might
relate to the different stages in the Levelt model (Conceptualisation, Formulation,
and Articulation), as well as associated processes such as monitoring. At the same
time, one could also explore whether and how far Levelt’s proposals on the function-
ing of a mental lexicon have relevance for the reports which are gathered.
A third motivation for the present research is that there is no research we are aware
of that explores the relation between participants’ reports on what they did during
planning and their subsequent performance. Ortega’s work gives us excellent insights
into what participants say they do during planning, and also what they say about how
effective they think their planning was, but we have no information about whether
the reports on planning are associated with success or failure. This is important to the
extent that the planning literature is centrally concerned with trying to establish link-
ages between planning and performance, with performance generally conceived of in
terms of complexity, accuracy, fluency and lexis. Indeed, the debates within the litera-
ture are less about the desirability of planning (because in the main, researchers assume
its value), but rather how different interpretations of planning, perhaps with different
tasks, can impact upon different aspects of performance. So, to gather retrospective data
on planning, and to link this to performance, would relate to wider debates within the
planning literature. For instance, such data might shed light on which planning behav-
iours are most effective, in such a way that one could make relevant suggestions for ped-
agogic intervention, and even offer suggestions for how more effective planning could
be trained. This, too, would go beyond planning as being a rather monolithic and crude
pedagogic option, and suggest how it could be fine-tuned and targeted more effectively.
Method
Research Questions
This background generates two fairly straightforward research questions. It is inap-
propriate, given the exploratory, qualitative nature of the proposed research, to put
forward research hypotheses and predictions. Instead the two general questions are:
Research methodology
The researcher met each dyad of participants in an individual room. She briefed the
two participants that they would have ten minutes to plan a picture story individually
and would take turns to tell the story in English for about 2 minutes to each other
(see the instructions in Appendix 1). The listener could not see the pictures and was
told to ask at least one question after the story was told. All storytellings were audio-
recorded. When participants finished telling the story, the researcher interviewed each
student individually about what they did during the 10-minute planning time. The
retrospective interview questions, trialed in the pilot study mentioned earlier, were
intended to collect planning activities in a range of areas including words, grammar,
ideas, story structure. The interview was conducted in Mandarin or Cantonese Chinese
and was audio recorded.
Tasks
Selecting interesting and appropriate story pictures for this study was a challenging
process. Given the success of the Shaun the Sheep video narrative retellings (Wang &
Skehan, this volume), it was decided to adapt the video stories as the basis for mak-
ing a cartoon series. A wide range of Shaun the Sheep episodes were initially identi-
fied. These were then scrutinised by the researchers and reduced to seven. The seven
selected episodes were converted into picture story series, and these were each evalu-
ated by around 10 MA TESOL students. Based on the ratings of the story series given
by the graduate students in terms of clarity, humor, and depth, four picture series were
selected to be used in the pilot study. The two aims of the pilot study were to trial
the retrospective interview questions, and to select two story pictures for the present
study. Fourteen students from the same university in Macao participated in the pilot
study. Two picture series were finally selected based on the following criteria: amount
of useful retrospection about planning, variety of self-reported planning behaviour,
and ratings by the participants (clarity, humor, depth, and difficulty in completion).
The pilot also enabled a decision to be made about the length of planning time to be
used. The two selected story pictures adopted in this study each had 19 pictures in total
and were printed in color for use in the research sessions.
Procedures
The Retrospective Interviews (RIs) of the 48 participants were audio-recorded. All
the 48 digitally recorded RIs were transferred to computer as MP3 files. They were
transcribed (and translated at the same time) using Soundscriber, to facilitate control
over the sound file during transcription. The prompts for these interviews are shown
in Appendix 2.
The data handling for the narrative performances followed the procedures used
in most of the chapters in this volume, and therefore does not need to be discussed
in great detail. A broad transcription was made of each sound file. Transcriptions
were segmented into AS units, and copied to contain two identical lines for each AS
unit. The first line was then coded following CHAT conventions (MacWhinney 2000),
and represented as the CHAT tier. The second line was coded following TaskProfile
conventions, containing clausal segmentation, error coding, measurement of pauses
longer than 0.40 seconds, and coding for a range of repair types (reformulation, rep-
etition, etc.). In addition, the length of time taken, at millisecond level, for each AS
turn was recorded as a third line. These codings enabled TaskProfile to generate all the
measures required in the present study (for a fuller discussion of performance cod-
ings, see Chapter 1).
102 Francine Pang & Peter Skehan
Performance measures
Five performance measures were used in this study. Complexity was measured using
the subordination measure described in Chapter 1: The total number of clauses was
divided by the total number of AS units, generating an index with a minimum value
of 1 (no AS unit contained anything other than a matrix clause, with no subordi-
nation), with most values falling between 1 and 2. Accuracy was measured as the
proportion of clauses that were error-free, and in this case, although the data was
coded for gravity of error, the measure which was actually used, on the basis of effec-
tive discrimination, was that based on all errors (rather than only serious errors, as
in Skehan and Shum (this volume)). Fluency was measured by two indices, both of
pausing: pauses per 100 words at AS boundary points, and pauses, again per 100
words, which occurred mid-clause. Skehan (2009b) has shown that pauses at these
two locations need to be considered separately. Finally, lexical sophistication was
measured as the Lambda score, the index which captured the extent to which the
contribution of the speaker contained less frequent words (see Chapter 1 for a more
detailed account). These five measures constitute a basic set of those described more
fully in Chapter 1.
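The arithmetic behind the subordination, accuracy, and pausing measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and field names are hypothetical, the sample counts are invented, and the code does not reproduce how TaskProfile itself computes these indices. Lambda is omitted because its calculation is described in Chapter 1 rather than here.

```python
from dataclasses import dataclass


@dataclass
class CodedNarrative:
    """Counts taken from one coded transcript (all field names are hypothetical)."""
    as_units: int
    clauses: int
    error_free_clauses: int
    words: int
    pauses_at_as_boundary: int
    pauses_mid_clause: int


def subordination(n: CodedNarrative) -> float:
    # Complexity: clauses per AS unit; 1.0 means no subordination at all.
    return n.clauses / n.as_units


def error_free_ratio(n: CodedNarrative) -> float:
    # Accuracy: proportion of clauses containing no error of any gravity.
    return n.error_free_clauses / n.clauses


def per_100_words(count: int, words: int) -> float:
    # Fluency: standardise a raw pause count to a rate per 100 words.
    return 100 * count / words


# Invented counts for a single narrative, used purely for illustration.
sample = CodedNarrative(as_units=20, clauses=30, error_free_clauses=15,
                        words=250, pauses_at_as_boundary=10, pauses_mid_clause=5)

print(subordination(sample))                                       # 1.5
print(error_free_ratio(sample))                                    # 0.5
print(per_100_words(sample.pauses_at_as_boundary, sample.words))   # 4.0
print(per_100_words(sample.pauses_mid_clause, sample.words))       # 2.0
```

Keeping the two pause locations as separate rates, rather than summing them, follows the point made above that AS-boundary and mid-clause pauses need to be considered separately.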
The provisional Coding Scheme consisted of little more than a list of codes with
a provisional description of each of them. Although the codes and the categories
were preliminary, they created an initial framework for coding the data in this study
by taking advantage of the similar features of the participants, tertiary Chinese non-
English major students in the Pilot Study and in this study. The 48 RIs were coded
using the method of constant comparison (Strauss & Corbin 1990). That is, the
researcher attempted to closely examine and re-examine each RI Unit, compare for
similarities and differences, and ask questions about the planning activities reflected
in the retrospective interviews. In addition, every effort was made to ensure the
coding system was sensitive to the variety of planning activities the ESL speakers
might use. The starting point for this was to detect all the planning activities
reported by each participant, whether reported directly or inferred from the
interview.
Having constructed the provisional Coding Scheme and having coded the 12
RIs in the previous pilot study, the first author became more sensitive throughout the
investigation to the possibility of any emergent planning activities. The view is that
the more exhaustive the analyses, the more realistic they would be with respect to the
planning activities used by both groups of speakers, the high intermediate and the low
intermediate groups. Therefore, in constructing the Coding Scheme, no quantitative
criterion was used; that is, regardless of the frequency with which a planning activ-
ity occurred, it was incorporated into the Coding Scheme. For example, the follow-
ing three planning activities are included in the Coding Scheme although they only
occurred once: (A) 2. Macro Planning Plan sequence: Look at pictures one by one, then
describe; (D) 6. Lexical choice: Advanced words; (E) 8. Lexical: Planned but not used/
correctly used; (see Appendix 3).
This first pass through the 48 RIs to develop coding categories took significantly
longer than was anticipated. This was partly because of the coding itself (which
would have taken much less time if a fixed, established coding scheme had been used)
and partly because the final Coding Scheme went through several rounds of revision.
Examining, re-examining, comparing, and asking questions continually challenged both
the interpretation of the RI Units and the method of coding them. In other words,
matching each RI Unit to a code with sufficient subtlety, and coding all RI Units
consistently, created considerable complexity.
The first coding of all the 48 RIs resulted in the identification of 42 planning activi-
ties (i.e. 42 codes). This Coding Scheme included: the codes of the planning activities,
a description of each code, and an appropriate example, demonstrating the planning
activity. As a brief example of the scheme in operation, the code “Connect the pictures
to develop the story plot” (described as “Try to connect or structure the pictures to develop
the story plot”) is the result of conceptualizing an RI Unit “I emphasized on the plots.
I considered how to connect the pictures to make the story better.”
The next stage was to explore whether any order could be brought to the codes.
It proved possible to organise the entire set of 42 planning activities (i.e. the codes)
into five groups reflecting their functions. These were (A) Macro planning; (B) Micro
planning; (C) Lexical and grammar planning; (D) Metacognitive planning, and (E)
Post-task perception and evaluation. The last category will not be pursued further in
this chapter, for reasons of space. This first Coding Scheme was not intended to be a
complete representation of all possible planning activities. Nevertheless, it does rep-
resent an exhaustive list of the planning activities the participants used to prepare for
telling a story from a series of pictures.
Results
                                                 Low proficiency       High proficiency      Entire group
Number of mentions                               0   1   2   3   4     0   1   2   3   4     0   1   2   3   4
Macro Planning
Plan sequence: Scan, then describe               0  24                 0  24                 0  48
Plan sequence: Look at each picture, describe   23   1                24   0                47   1
Micro Planning
Understand pictures in detail                    5  10   7   2   0    10   6   4   3   1    15  16  11   5   1
Plan general things                             16   8                12  12                28  20
Plan small details                               8  14   2             2  12  10            10  26  12
Plan how to tell the story                       8  11   4   1        12   9   3   0        20  20   7   1
Plan how to express oneself better              13   7   4            13   7   4            26  14   8
Think of ideas beyond the pictures              17   7                10  12   2            27  19   2
Organise the ideas developed from pictures      17   6   1            14   8   1            31  14   2   1
Connect the pictures to develop the story       14   9   1             6  12   4   2        20  21   5   2
Metacognitive Planning
Rehearse: General                                9  15                16   8                25  23
Rehearse: To be accurate                        18   6                17   6   1            35  12   1
Rehearse: To be fluent                          15   9                13  11                28  20
Rehearse: To be logical or clear                24   0                19   4   1            43   4   1
Rehearse: To help to memorise                   24   0                22   1   1            46   1   1
Take notes to help plan                         16   7   0   1        15   7   2   0        31  14   2   1
Take notes: No need                             11  13                13  11                24  24
Memorisation                                    24   0                22   2                46   2
Try to be aware of the listener                  3  18   2   0   1     6  11   3   4         9  29   5   4   1
the lower proficiency group and the higher proficiency group, as well as the entire
group. This shows the number of times a given coding was made. So, for the first
line (Macro Planning: plan sequence; scan then describe), in the low proficiency group,
no-one coded zero (i.e. no one failed to report this planning behaviour), so that the
table shows that all twenty-four lower proficiency participants reported using this
behaviour. This contrasts with the first of the micro planning behaviours: Understand
pictures in detail. Here, five participants (still with the low proficiency group) did not
report using this behaviour, ten participants reported it once, seven reported it twice,
two reported it three times, and no-one in this group reported using it four or more
times (although in the high proficiency group there was one participant who reported
this behaviour four times). These raw values have been reported, rather than averages
or percentages, because they communicate the situation more effectively.
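To make the reading of the table concrete, the following Python sketch shows how per-participant mention counts could be turned into one table row. The function name and the capping of the final column at four are assumptions made for illustration (the text refers to "four or more times"). The input lists are rebuilt from the table's own cell values for Understand pictures in detail, not from the underlying interview data.

```python
from collections import Counter

MAX_MENTIONS = 4  # last table column; assumed here to absorb "four or more"


def mention_distribution(mentions_per_participant):
    """Count how many participants reported a behaviour 0, 1, 2, 3 or 4(+) times."""
    tally = Counter(min(m, MAX_MENTIONS) for m in mentions_per_participant)
    return [tally[k] for k in range(MAX_MENTIONS + 1)]


# "Understand pictures in detail", rebuilt from the table's cell values:
# one list entry per participant, giving that participant's number of mentions.
low = [0] * 5 + [1] * 10 + [2] * 7 + [3] * 2
high = [0] * 10 + [1] * 6 + [2] * 4 + [3] * 3 + [4] * 1

low_row = mention_distribution(low)
high_row = mention_distribution(high)
total_row = [a + b for a, b in zip(low_row, high_row)]

print(low_row)    # [5, 10, 7, 2, 0]
print(high_row)   # [10, 6, 4, 3, 1]
print(total_row)  # [15, 16, 11, 5, 1]
```

Each row sums to the group size (24, 24, and 48), which is a quick consistency check on any row of the table.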
The table gives a general picture of the frequency with which different codings
were found for the participants. Not a great deal will be said about these figures at
this stage, since more discussion will be provided when we link coded behaviours to
actual performance. Even so, a number of points are worth making. First, there is the
issue of discrimination and whether there is variation between participants in the
use of a particular code. If the goal were simply to characterise self-reported plan-
ning behaviours, this would not be important. They are what they are, and it is the
patterns amongst them which would be of interest. But if we are to link such behav-
iours to performance, it is helpful if there is dispersion. On these grounds, then,
both of the macro behaviours (the two Plan sequence behaviours), as well as several
of the lexical codings (e.g. Advanced words) and one or two of the metacognitive
behaviours, are problematic, since effectively only one value is reported, with this
often being zero or one, indicating that the behaviour is simply not reported by the
vast majority of the participants. A second point which stands out is the connection
with proficiency level. Several of the codes show interestingly different patterns for
the low and high proficiency students. For example, the micro planning behaviour
Connect the pictures to develop the story sees fourteen of the low proficiency partici-
pants code ‘0’, while only six of the high proficiency participants code in this way.
One wonders, therefore, if this behaviour is facilitated by students’ higher levels of
proficiency.
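The point about dispersion can be made concrete with a small calculation. The sketch below is illustrative only and is not part of the study's procedure: the function name `modal_share` is an assumption introduced here, and the two distributions are those reported later in this chapter for Accurate words (43 - 5) and Connect the pictures (19 - 21 - 5 - 2). A share close to 1 means that nearly all participants fall into a single frequency category, which leaves little variation to link to performance.

```python
def modal_share(counts):
    """Share of participants who fall in the most common frequency category.

    `counts` lists how many participants reported a behaviour 0, 1, 2, ...
    times. The function name is hypothetical, chosen for this sketch.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("distribution is empty")
    return max(counts) / total


# Distributions as reported in the chapter's tables.
accurate_words = [43, 5]
connect_pictures = [19, 21, 5, 2]

print(round(modal_share(accurate_words), 2))    # most participants in one category
print(round(modal_share(connect_pictures), 2))  # participants spread across categories
```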
Quantitative data
Table 3 shows the performance results for the participants. The data is shown for one
measure of accuracy (error free clauses), one of complexity (subordination), two paus-
ing measures (number of pauses at AS boundaries per 100 words, number of pauses
mid-clause, per 100 words), length of run, and Lambda, so covering accuracy, com-
plexity, fluency, and lexis. The measures are given for the entire group, and then also
for low and high proficiency participants. In this case, significance values are also
given, based on between subjects t-tests. The Length of Run measure is only given to
provide a limited degree of comparison with Ortega (2005) who used a very similar
index. For all values in this table the N is 48 overall, with 24 low proficiency partici-
pants and 24 high proficiency participants.
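For readers who want to see what a between-subjects comparison of this kind involves, the following is a minimal sketch. It assumes the pooled-variance (Student) form of the t-test, which the chapter does not specify, and it uses hypothetical toy scores rather than data from the study. The function name `pooled_t` is likewise an assumption.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Between-subjects t statistic with pooled variance.

    Returns the t value and its degrees of freedom. The pooled-variance
    form is an assumption; the chapter only says 't-tests'.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, df


# Hypothetical toy accuracy scores, NOT data from the study.
low = [0.30, 0.35, 0.40, 0.45]
high = [0.40, 0.45, 0.50, 0.55]

t, df = pooled_t(low, high)
print(round(t, 3), df)
```

With 24 participants in each group, as in this study, the degrees of freedom for such a test would be 46.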
We will first consider the general values in this table. There are now a wide range
of studies (including in this volume) which use the set of measures shown in the table,
and so one can relate the values shown here to the wider literature. In this respect, the
values for accuracy are relatively low (there may be issues here specific to Chinese L1
EFL learners, but the values are somewhat lower than those reported by research
conducted in a U.K. context). The values for fluency are fairly normal for the sorts of
narrative retellings involved here, and are low, if anything, relative to other research
studies. The remaining values, though, are in fact higher than is normally reported.
The subordination values for tasks of this sort are usually a little lower, possibly more
in the 1.4 to 1.5 range for planned performances. Similarly, the Lambda values, as an
index of lexical sophistication, are definitely higher than is typical, probably around
0.2 to 0.5 above what one normally finds (Skehan 2009b). Finally, the length of run
values are also clearly higher than the norm, since one often sees values here of 3 or 4
rather than the 5.42 and 6.12 shown for the two groups. Indeed, these values approach
those found with native speakers. If one makes a rather tentative comparison with
Ortega’s (2005) ‘words within one intonation contour’ index, one would estimate that
the participants in this study are between her two groups, and perhaps closer to her
advanced group. In all, one could characterise these performances as lacking in accu-
racy, but lexically rich and complex.
Turning to the comparisons of the low and high groups, it is surprising how few sig-
nificant differences are reported. In fact, there are only two, for accuracy and for length
of run, although each of these comparisons reaches quite a high level of significance.
So the advanced group is more accurate (consistent with Bui (this volume)), and they
produce ‘smoother’ language, producing units of speech which are longer. In contrast,
with fluency and with lexical and structural complexity, there are no significant dif-
ferences. It is noteworthy that in all these cases, the high proficiency group produces
arithmetic means which indicate a higher level of performance, but none of these com-
parisons actually reach significance. But one important point does need to be made here
regarding the accuracy contrast. In the next section we are going to turn to the associa-
tions between self-reported planning behaviours and actual performance. For structural
complexity, for lexical sophistication and for fluency, we shall treat the two proficiency
groups as effectively one group, given the lack of significant differences found. But the
two groups have to be considered differently for accuracy. As we will see, this poses a
problem for interpretation. If certain coded behaviours are associated with higher levels
of performance and also discriminate between the two proficiency groups, we cannot
tell whether these behaviours are ‘enabled’ by the higher proficiency, or whether they
contribute to that higher level of performance. We will return to this problem below.
Table 4. Frequency and Error-Free clause mean scores: Lexical Code 1 – General retrieval
Code    Number of reports    Mean Error free clauses score
0       19                   .37
1       24                   .41
2       5                    .45
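Table 4 can be read as a grouped summary: participants are grouped by how often they reported the behaviour, and a mean accuracy score is given for each group. The sketch below works only from the three rows of the table. It recomputes the overall mean implied by those rows and checks whether the group means rise with reporting frequency. The variable and function names are assumptions made for illustration.

```python
# Rows of Table 4: (times reported, number of participants, mean error-free clauses score)
TABLE_4 = [(0, 19, 0.37), (1, 24, 0.41), (2, 5, 0.45)]


def total_participants(rows):
    """Number of participants across all frequency categories."""
    return sum(n for _, n, _ in rows)


def weighted_mean(rows):
    """Overall mean score implied by the group sizes and group means."""
    return sum(n * mean for _, n, mean in rows) / total_participants(rows)


def rises_with_frequency(rows):
    """True if each group mean is higher than that of the previous category."""
    means = [mean for _, _, mean in sorted(rows)]
    return all(a < b for a, b in zip(means, means[1:]))


print(total_participants(TABLE_4))
print(round(weighted_mean(TABLE_4), 3))
print(rises_with_frequency(TABLE_4))
```

A rising pattern of this kind is what is meant here by a positive association between a code and a performance measure. The small size of the highest category (five participants) is a reason to treat such patterns cautiously.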
for the different performance areas, all codes which seemed to have some association
with performance were brought together (whereas codes which generated no such
relationship were not included). Table 5 presents the findings for the Error-free clauses
measure, both positive and negative. The first two columns give information about the
coding involved, and fuller information is provided in Appendix 3. The third column
shows whether there is a linkage to proficiency, and the fourth column shows the dis-
tribution of coding frequency (0, 1, 2, etc.) and also the mean error-free clauses score
associated with each frequency.
Code                            Description                                 Prof. link  Distribution and mean score
Positive
LE1: Lexical:                   Try to remember the words used in           Yes         Distrib: 19 - 24 - 5
General retrieval               the task                                                EFC: .37/.41/.47
LE8: Lexical choice:            Try to use accurate words                   Yes         Distrib: 43 - 5
Accurate words                                                                          EFC: .40/.46
Meta2: Comment on notes:        The notes taken are helpful to              No          Distrib: 35 - 12 - 1
Structure clear story           structure a clear story                                 EFC: .38/.47/.54
Meta3: Rehearse: Fluent         Rehearse to be fluent                       No          Distrib: 28 - 20
                                                                                        EFC: .36/.47
Meta4: Rehearse:                Rehearse to check whether what is           Yes         Distrib: 43 - 4 - 1
Logical or clear                planned is logical or clear                             EFC: .39/.55/.50
Micro 8: Connect the pictures   Try to connect or structure the             Yes         Distrib: 19 - 21 - 5 - 2
to develop the story plot       pictures to develop the story plot                      EFC: .35/.43/.43/.56
Negative
Le 10: Lexical compensation:    Use a few words or another way to           No          Distrib: 30 - 13 - 4
Circumlocution                  replace a word which cannot be recalled                 EFC: .43/.38/.27
Mic5: Plan how to express       Plan how to describe the story more         No          Distrib: 26 - 14 - 8
better                          vivid, interesting, or clearer etc.                     EFC: .43/.38/.37
Only the relations with error-free clauses have been shown here, and the other per-
formance areas are covered in the tables which follow. It may be, therefore, that codes
appear in more than one table, if a particular code has a relationship with more than
one performance area. Intriguingly, one or two codes have positive associations with
one performance area and negative associations with another. For example, LE1, Lexi-
cal General Retrieval, the first coding shown in Table 5, and with a positive relationship,
also has a negative relation with mid-clause pausing (negative, that is, with greater flu-
ency, in that LE1 is associated with more mid-clause pausing). In any case, and setting
aside proficiency linkages for the moment, it appears that higher accuracy scores are
obtained when more focused behaviour is involved, whether this is rehearsing specific
words or rehearsing in a more targeted way. Rehearsing (note-taking) linked to structure
(Meta2) is beneficial, consistent with focused behaviour within planning being effec-
tive. In contrast, lower accuracy is associated with planning leading to problems, either
through lexical difficulties or with over-extension in what is attempted. It is interesting
that these suggest the potential for planning to be two-edged: It can lead the speaker
into trouble, as well as be helpful, depending on how the planning time is used.
Table 6 below shows the comparable information for the Subordination measure.
Once again the linkage with proficiency level is shown, but it should be borne in mind
that there was no significant contrast here between the two proficiency levels.
An obvious first remark here has to be that the balance of positive and negative
associations is somewhat reversed relative to that for accuracy. There, positive
associations predominated. Here we find quite a few more negative connections than
positive. Regarding the positive influences, it does come across that ideas lead the way
when subordination increases, and grammar takes a back seat. It also appears that
organising, and going from big to small, is a good strategy to raise subordination. The
literature on tasks suggests that task structure is generally a good thing, and that earlier
findings that accuracy was advanced by structure have now been extended to suggest
that complexity is also affected. The results in this section are that if learners can use
planning time to bring their own structure to the narrative retelling, this seems to
raise the subordination measure. The negative influences do seem to have a common
quality here. It is over-ambition. This could be over-ambition in lexical choices, or
over-ambition generally in what can be done during planning time. This seems to be
associated with performance being damaged to some degree, particularly with regard
to complexity. In addition, it seems as if those who focus on grammar do so at the
expense of complexity, at least as indexed by subordination.
Next, we turn to the two measures of fluency, AS pauses and mid-clause pauses.
They are treated separately, mainly because while there is a certain amount of overlap
in the associations found, mostly the two fluency measures are influenced by different
things, and in one case a code is negatively associated with one, and positively associ-
ated with the other, as we shall see. The information for AS-pauses is given in Table 7.
A slight difference in this compared to the other tables is that where there is a relevant
mid-clause pausing association, that too is shown. The same thing is done in Table 8,
since AS pause information is shown in a table focusing on mid-clause pauses. In this
way, the two tables provide separate information on these two aspects of dysfluency,
but also facilitate interpretations of shared influences, such as Lexis 14, in the positive
section of Table 7, or discrepant influences, such as Lexis 8, in the Negative section of
the same table. Note also that greater fluency is indexed by lower pausing scores.
Table 6. Codes associated with higher and lower subordination scores
Code                            Description                                 Prof. link  Distribution and mean score
Positive
Negative
Le2: Lexical choice:            Choose appropriate words for telling        No          Distrib: 40 - 6 - 2
Appropriate words               the story                                               Negative, Subord: 1.68/1.58/1.56
Le4: Lexical choice:            Choose to use some connective words         Yes         Distrib: 29 - 18 - 1
Connective words                                                                        Negative, Subord: 1.71/1.58/1.37
Le7: Lexical choice:            Try to use various words to talk about      No          Distrib: 44 - 4
Various words                   the same thing                                          Negative, Subord: 1.67/1.46
Le8: Lexical choice:            Try to use accurate words                   Yes         Distrib: 43 - 5
Accurate words                                                                          Subord: 1.67/1.58
Le9: Lexical compensation:      Use a word similar in meaning to            No          Distrib: 42 - 6
Approximating                   replace the word which can’t be recalled                Subord: 1.67/1.53
Le 12: Grammar: General use     Think of general use of grammar             No          Distrib: 37 - 10
                                                                                        Subord: 1.68/1.62
Le 13: Grammar: Tense           Think of the use of correct tense           No          Distrib: 32 - 16
                                                                                        Subord: 1.7/1.57
Mi2: Plan general things        Plan general things, such as describing     Yes         Distrib: 28 - 20
                                the general plot of the story                           Subord: 1.69/1.61
Mi5: Plan how to express        Plan how to describe the story more         No          Distrib: 26 - 14 - 8
better                          vivid, interesting, or clearer etc.                     Subord: 1.69/1.63/1.59
Mi8: Connect the pictures       Try to connect or structure the             Yes         Distrib: 19 - 21 - 5 - 2
to develop the story plot       pictures to develop the story plot                      Subord: 1.71/1.64/1.56/1.5
Table 7. Codes associated with differences in AS clause boundary pausing per 100 words
Code                            Description                                 Prof. link  Distribution and mean score
Positive
Negative
Le2: Lexical choice:            Choose appropriate words for telling        No          Distrib: 40 - 6 - 2
Appropriate words               the story                                               AS pausing: 3.34/3.54/3.72
Le4: Lexical choice:            Choose to use some connective words         Yes         Distrib: 29 - 18
Connective words                                                                        AS pausing: 3.15/3.65
                                                                                        Negative, Mid-pause: 3.54/3.79
Le8: Lexical choice:            Try to use accurate words                   Yes         Distrib: 43 - 5
Accurate words                                                                          AS pausing: 3.35/3.69
                                                                                        Positive, Mid-pause: 3.8/2.26
Le12: Grammar: General use      Think of general use of grammar             No          Distrib: 37 - 10
                                                                                        AS pausing: 3.18/4.0
Le13: Grammar: Tense            Think of the use of the correct tense       No          Distrib: 32 - 16
                                                                                        AS pausing: 2.98/4.19
                                                                                        Negative, Mid-pause: 3.18/4.56
Mi6: Think of ideas beyond      Think of more ideas beyond the pictures     Yes         Distrib: 27 - 19 - 2
pictures                                                                                AS pausing: 3.32/3.42/3.93
                                                                                        Positive, Mid-pause: 3.82/3.49/2.6
Mi8: Connect the pictures       Try to connect or structure the             Yes         Distrib: 19 - 21 - 5 - 2
to develop the story            pictures to develop the story plot                      AS pausing: 3.17/3.48/3.58/4.08
                                                                                        Negative, Mid-pause: 3.37/3.99/3.93/2.36
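Because greater fluency is indexed by lower pausing scores, the Positive and Negative labels in Table 7 run in the opposite direction to the raw numbers. One way to make that reading explicit is sketched below. It compares participants who never reported a behaviour with the weighted mean of those who reported it at least once. This comparison is an assumption adopted for illustration, not the authors' stated procedure, and the function names are hypothetical. The values are taken from the Mi8 and Le8 rows of Table 7.

```python
def reporters_mean(counts, means):
    """Weighted mean score of participants who reported the behaviour at least once."""
    n_reporters = sum(counts[1:])
    if n_reporters == 0:
        raise ValueError("no participant reported the behaviour")
    return sum(c * m for c, m in zip(counts[1:], means[1:])) / n_reporters


def fluency_direction(counts, pause_means):
    """Label the association with fluency, given pausing scores.

    Lower pausing means greater fluency, so fewer pauses among reporters
    than among non-reporters counts as a 'positive' association.
    """
    baseline = pause_means[0]
    reported = reporters_mean(counts, pause_means)
    if reported < baseline:
        return "positive"
    if reported > baseline:
        return "negative"
    return "none"


# Values from Table 7 (mid-clause pausing for Mi8 and Le8).
print(fluency_direction([19, 21, 5, 2], [3.37, 3.99, 3.93, 2.36]))
print(fluency_direction([43, 5], [3.8, 2.26]))
```

Weighting by group size matters for a row such as Mi8, where the lowest pausing score belongs to a category containing only two participants.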
being general with ideas. Negatively, there is the behaviour of avoiding lexical dif-
ficulty, of being (too) unambitious, and of relatively unfocussed rehearsal. There may
be connections here with other areas – level of achievable ambition is relevant, as is
the issue of generality-specificity, although here the influence is positive for generality
for ideas, and negative for rehearsal. The most we can draw from this is that there does
seem to be a specific attraction or rejection of words for different participants.
Discussion
In this discussion section we will first explore convergence and contrast with Ortega
(2005). Then we will examine successes and failures within the present coding scheme
in terms of its broader categories, notably the macro, and lexico-grammar categories,
and then move to relate this data to dimensions of performance (complexity, accuracy,
fluency, lexis) as well as the Levelt model. We will finish by briefly discussing debates
in the planning literature, as illuminated by the present study, and then make some
suggestions for training and pedagogic applications.
The coding scheme developed in the present research has proved usable and illu-
minating. It contains the broad categories of Macro Cognitive Codes, Micro Cognitive
Codes, Lexical and Grammar Codes, and Metacognitive Codes. This ‘packaging’ may
contain a number of similar elements to those in Ortega (2005), but their arrangement
is sometimes different. The major change is the introduction of macro and micro codes.
Some of the detailed codes here relate to Ortega’s metacognitive strategies, since they
concern organisational planning and various forms of attentional focus (though they
do not emphasise monitoring and evaluation as much as she does in relation to the
metacognitive codes). Also, the present scheme has a number of lexical and grammar
planning codes, which in principle connect with cognitive strategies. However, here
the overlap with Ortega’s scheme is not extensive. In fact, our system treats rehearsal
as part of metacognitive planning (on the basis that there is self-awareness involved),
whereas Ortega includes it as a form of cognitive strategy. Our rehearsal codes are also
more explicitly linked with aspects of performance, which can be closely associated
with different stages of Levelt’s model.
We now shift focus a little and consider how the codes are associated with differ-
ences in performance. Here we have some interesting contrasts. The two Macro Codes
(Table 2) were of little help in the analysis, in that they totally failed to discriminate
between participants. (As a result, there was no scope for any association with
performance to emerge.) In contrast, the Micro planning codes were far more
successful: only one (Micro1) had no connection with performance. The other seven
discriminated, and also had associations with different performance areas, with a ten-
dency to link with complexity (Table 6) and pausing, both AS pausing (Table 7) and
mid-clause pausing (Table 8). Turning to the Lexical and Grammatical codes, their
contribution was more mixed, with two lexical codes yielding nothing. The remainder,
though, did make some contribution to the associations with performance. The lexical
codes had associations with all performance areas, both negative as well as positive,
although fluency (Tables 7 and 8) was the commonest area to crop up. Interestingly,
the three grammar codes, which all revealed associations with aspects of performance,
were consistent: they showed no relation to accuracy (Table 5), and showed negative
relationships with subordination (Table 6) and pausing (Tables 7 and 8)! Finally, the
metacognitive codes were a mixed bag. Around half either failed to discriminate or
had no connection with performance. The focus here is on rehearsal, and while this
was important for accuracy (the most consistent sub-area to have this connection:
Table 5), other metacognitive codes, such as involving notes, or memorisation, or
being aware of the listener, had little to provoke discussion. Clearly these results sug-
gest that some areas of reported planning behaviour are more rewarding than others.
We turn next to exploring in greater depth why codes were associated with perfor-
mance success. We will propose five wider principles which seem to subsume the more
detailed codes which have been discussed so far. The principles are:
Interestingly, four of the five principles cut across the performance areas, and so they
are proposed as general principles for good use of planning time, though they manifest
themselves in different performance areas in different ways. Only the fifth is directly
connected with any particular performance area. In fact, the first two ‘code bundles’
are associated with widespread improvements in performance, whereas as we go on,
the focus for improvement is narrowed somewhat.
and do not plan their way into trouble (plan general, plan how to be better, and to use
more appropriate words), do better. Most interesting of all, this strategy for using plan-
ning time seems to raise accuracy, complexity and both aspects of fluency.
accuracy or error. In contrast, the first two, implying a concern for grammar, are asso-
ciated negatively with subordination and fluency, and the third, a negative portrayal of
grammar, is positively associated with subordination and fluency. So, the reverse effect
is found to what might be expected here, and it suggests that focusing on grammar
confers only disadvantages, and no advantages.
Reflecting on these five ‘principles’ for effective use of planning time, the most
notable reaction one is likely to have is how negative they are, and that they seem to
convey that we have learned more about how to use planning badly than how to use
planning well. There are suggestions of some principles associated with a good use
of planning, such as to impose structure, to be led perhaps by ideas and not form,
and to plan small and detailed. But in general, it is what not to do that emerges more
strikingly.
Indeed, the preponderance of the codes where positive relationships are involved
are concerned with complexity and fluency. This is consistent with the generally
accepted finding in the literature (e.g. Ortega 2005) that planning has a beneficial effect
on complexity and fluency. So it seems as if our results are consistent with the broad
quantitative results that have been published (although of course, there are exceptions,
such as Ellis (2009) showing that accuracy effects are not at all uncommon). In other
words, the range of influences we have just covered are consistent with the findings
from the literature, and provide, therefore, some detailed support in terms of what
second language speakers say they do when they plan.
To put this another way, the reported behaviours which are associated with raised
accuracy and with raised lexical sophistication are not so numerous, which is some-
thing of a disappointment. There are some, however, such as Lexical Choice, Accu-
rate Words, as well as General Lexical Retrieval, which make intuitive sense, as well
as the Micro code of Connect or Structure the pictures, implying the value of giving a
structure to the story for accuracy. But Rehearsal, Fluency, and Rehearsal to be logical
or clear are less obviously relevant. So what underlies the association between self-
reported planning behaviours and greater accuracy remains a puzzle. In addition, the
Accurate Words coding, while good for accuracy, seems bad for Complexity and Flu-
ency, suggesting some degree of trade-off in performance. Accurate Words is, though,
the main positive influence on lexical sophistication, while the other influences in this
performance area are the more generalised negative influences of focusing on simple
words, and engaging in general rehearsal. We are left, in other words, with few clear
and convincing influences on either accuracy or lexical sophistication, which may
have some relevance for the less consistent findings in this area. Behaviours generally
concerned with rehearsal of specific things seem good, but the avoidance of harmful
influences (in general) seems just as important.
The other issue arising from the literature which is worth commenting on is the
respective claims of the Trade-off approaches and the Cognition Hypothesis. Basically,
the present study has little direct to say on this. But perhaps two points are worth mak-
ing. First, there is often competition between different performance areas when one
looks at the influence of individual coded behaviours, and so raised performance in
one area is often associated with lower performance in another. In general, it looks as
if complexity and fluency often go together, and do not compete. In contrast, accuracy
does seem to compete with other performance dimensions. The Cognition Hypoth-
esis proposes that under certain circumstances complexity and accuracy can both be
raised. When one looks at the way influences on these two areas which arise out of
planning seem rarely to go together, one can conclude that the present study does not
seem particularly supportive of the Cognition Hypothesis. The second point relates to
the way task complexity is seen as the driver, in the Cognition Hypothesis, for raising
accuracy and complexity jointly. One can see from the present study the existence of
influences which cause ideas to predominate and therefore push the speaker to make
the task more complex. On these occasions, however, there is no sign that there is a
beneficial influence on accuracy as well.
Some of these issues are taken up in the final chapter of this volume. Two other
chapters (Bui (this volume) and Wang (this volume)) explore other facets of planning
quantitatively. In the final chapter these different findings will be brought together
to offer a more general account of the role of planning in second language task
performance.
Conclusion
In general, the planning research supports the idea that using planning in pedagogic
contexts where a communicative approach is prevalent is a good thing (Ellis 2005a).
The literature suggests that it is rare to find studies which suggest that planning is
harmful. There are studies which point to its limitations, but it is clear that using
planning is far better on most occasions than not using it. But at the same time, we
have not made that much progress over the last twenty-five years in making sug-
gestions as to when planning is most effective, what alternatives are available to the
planner, and ultimately, whether effectiveness in planning can be trained. The pres-
ent study provides no clear answers to any of these issues, but it is suggestive. The
differences in reported planning behaviours certainly suggest that not all planners
do the same things, and that some behaviours are likely to be more effective than
others. The discussion above suggested that planning choices can be most effectively
regarded as a set of operating principles (build your own structure, avoid trouble,
don’t over-extend, handle trouble, plan small, don’t be too focused on grammar). It
appears, then, that it would be worth exploring whether these principles are general;
whether speakers who do not at first think of using them could be trained to use
them (and whether this would then be more effective); and perhaps most ambi-
tiously of all, whether speakers could be induced to use planning behaviours likely
to target particular performance dimensions. The research possibilities here are con-
siderable, and they have the important quality that they would seem likely to have
considerable pedagogic payoff.
References
Bei, X. (2010). The effects of topic familiarity and strategic planning in topic-based task performance at
different proficiency levels. Unpublished Ph.D. thesis. Chinese University of Hong Kong.
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition,
11, 367–383.
De Bot, K. (1992). A bilingual production model: Levelt’s “Speaking” model adapted. Applied Lin-
guistics, 13, 1–24.
Ellis, R. (1987). Interlanguage variability in narrative discourse: Style shifting in the use of the past
tense. Studies in Second Language Acquisition, 9, 1–20.
Ellis, R. (Ed.). (2005a). Planning and task performance in a second language. Amsterdam: John
Benjamins.
Ellis, R. (2005b). Planning and task-based performance: Theory and research. In R. Ellis (Ed.), Plan-
ning and task performance in a second language (pp. 3–36). Amsterdam: John Benjamins.
Ellis, R. (2009). The differential effects of three types of task planning on the fluency, complexity, and
accuracy in L2 oral production. Applied Linguistics, 30(4), 474–509.
Foster, P., & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18(3), 299–324.
Hill, L.A. (1961). Picture composition book. London: Longman.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: The MIT Press.
Levelt, W.J.M. (1999). Language production: A blueprint of the speaker. In C. Brown & P. Hagoort (Eds.),
Neurocognition of language (pp. 83–122). Oxford: Oxford University Press.
MacWhinney, B. (2000). The CHILDES Project: Tools for analysing talk, Volume 1: Transcription
format and programs (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Mehnert, U. (1998). The effects of different lengths of time for planning on second language dis-
course. Studies in Second Language Acquisition, 20, 52–83.
Michel, M.C., Kuiken, F., & Vedder, I. (2007). Effects of task complexity and task condition on Dutch
L2. International Review of Applied Linguistics, 45(3), 241–259.
O’Malley, J.M., & Chamot, A.U. (1990). Learning strategies in second language acquisition. Cambridge:
CUP.
Ortega, L. (1995). The effect of planning on L2 Spanish narratives. Research Note 15. Honolulu, HI:
University of Hawai’i Second Language Teaching and Curriculum Center.
Ortega, L. (1999). Planning and focus on form in L2 oral performance. Studies in Second Language
Acquisition, 21, 109–148.
Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam:
John Benjamins.
Oxford, R. (1990). Language learning strategies: What every teacher should know. Rowley, MA:
Newbury House.
Pang, F., & Skehan, P. (2006). ‘What do learners do when they plan: a qualitative study’. Paper presented
at St. Mary’s University College, Twickenham.
Revesz, A. (2009). Task complexity, focus on form, and second language development. Studies in
Second Language Acquisition, 31(3), 437–470.
Robinson, P. (2001). Task complexity, task difficulty, and task production: Exploring interactions in a
componential framework. Applied Linguistics, 22, 27–57.
Robinson, P. (2011). Second language task complexity, the Cognition Hypothesis, language learning,
and performance. In P. Robinson (Ed.), Second language task complexity: Researching the
Cognition Hypothesis of language learning and performance (pp. 3–38). Amsterdam: John Benjamins.
Skehan, P. (2009a). Modeling second language performance: Integrating complexity, accuracy, flu-
ency and lexis. Applied Linguistics, 30(4), 510–532.
Skehan, P. (2009b). Lexical performance by native and non-native speakers on language-learning
tasks. In B. Richards, H. Daller, D. Malvern, & P. Meara (Eds.), Vocabulary studies in first
and second language acquisition: The interface between theory and application (pp. 107–124).
London: Palgrave Macmillan.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1(3), 185–211.
Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and tech-
niques. Newbury Park, CA: Sage.
Tavakoli, P., & Foster, P. (2008). Task design and second language performance: The effect of narra-
tive type on learner output. Language Learning, 58(2), 439–473.
Tavakoli, P., & Skehan, P. (2005). Planning, task structure, and performance testing. In R. Ellis (Ed.),
Planning and task performance in a second language (pp. 239–276). Amsterdam: John Benjamins.
Wang, Z. (2009). Modelling speech production and performance: Evidence from five types of planning
and two task structures. Unpublished Ph.D. thesis. Chinese University of Hong Kong.
Wigglesworth, G. (1997). An investigation of planning time and proficiency level on oral test
discourse. Language Testing, 14(1), 85–106.
Appendix 1
Appendix 2
General Prompts

Planning (Self report)
    What did you do? How did you plan? Tell me what you did during the time you’re planning?

Then (Notes)
    Here are your notes. Do they help you remember more about what you did when you planned?
    Did making notes help in the actual task although you’re not allowed to look at them?

General reactions to self report (more focus Qs)
    Do you think the planning time was useful? Why or why not?
    Do you think the planning was useful for the general way or for small things? Why?
    What effect did the planning have on the way you did the task?

Emphasis (Self report)
    What did you emphasise?

More specific questions on emphasis. (If nothing said on each of the expected area, ask one by one.)
    1. WORDS
       – To remember words that might be useful?
       – To use words that might be diverse, accurate?
    2. GRAMMAR
       – To remember grammar?
       – To avoid mistakes (correctness)?
    3. IDEAS
       – To think of ideas?
       – To organize ideas?
    4. STRUCTURE OF THE STORY (Picture Story Telling Task)
       – To structure the flow of pictures
       – To use language to show how the pictures are connected and developed
    5. REHEARSE
       – To be accurate
       – To be fluent
    6. To THINK HOW TO SAY things to make it easier and clearer for your partner?
Appendix 3
1. Plan sequence: Scan pictures, then describe
   Scan all the pictures, understand what the story is about, then think of how to describe the story
   “I first went through the pictures. After I got a general idea, I began to think how to tell
   the story in English.”
2. Plan sequence: Look at picture one by one, then describe
   Look at the pictures one by one and at the same time think about how to describe each picture
   “I looked at the pictures one by one, and think about how to tell each picture at the
   same time.”

1. Understand the pictures in details
   Try to understand the pictures in details and know what the story is about
   “I first read the instruction, but I had no idea when I saw so many sheep in the
   grids. I went through the pictures first and then looked at the details.”
2. Plan general things
   Plan general things, such as describing the general plot of the story
   “I planned general things. I needed to have a concept first, and the details could be
   added while I spoke.”
3. Plan small details
   Plan small details, such as detailed description
   “I think it was useful for the subtle things. Maybe I could tell the story in a general
   way right away but I might make it a mess.”
4. Plan how to tell the story
   Plan how to tell the story in general
   “During the ten minutes, I was thinking how to tell the story.”
5. Plan how to express better
   Plan how to describe the story more vivid, interesting, or clearer etc.
   “Based on the pictures, I considered how to describe them in a better way and
   I organized my words.”
6. Think of ideas beyond pictures
   Think of more ideas beyond the pictures
   “With some time to plan, I could add my ideas if the links were strange or if
   I couldn’t understand the pictures.”
7. Organize the ideas developed from the pictures
   Try to organize the ideas developed from different pictures to have a clear story plot
   “Yes, I read it. Then, I turned to the details and figured out how to organize them.
   I sorted out how many times the sheep was weighted and how many different sports
   it did, and I also concluded that at last it succeeded to be the same size as the other
   sheep.”
8. Connect the pictures to develop the story plot
   Try to connect or structure the pictures to develop the story plot
   “I emphasized on the plots. I considered how to connect the pictures to make the
   story better.”
2. Lexical choice: Appropriate words | Choose appropriate words for telling the story | I wanted Yuni to know that the tree is very important. I would think about the language of every scene and the appropriate words to the scenes.
3. Lexical choice: Simple words | Choose to use simple words | I considered using simpler words which could be understood most easily.
4. Lexical choice: Connective words | Choose to use some connective words | Secondly, I added some connective words between sentences to connect all the pictures together.
5. Lexical choice: Personification words | Choose to use some personification words | and added some personification words.
6. Lexical choice: Advanced words | Try to use some advanced words | … and also use some relatively advanced words.
7. Lexical choice: Various words | Try to use various words to talk about the same thing | A little. For example, for the cake… I thought of two words, tempt and attract.
8. Lexical choice: Accurate words | Try to use accurate words | I tried to use some accurate words to describe the pictures.
9. Lexical compensation: Approximating | Use a word similar in meaning to replace the word which can’t be recalled | If I had any words that I didn’t know how to express, I tried to replace them with other words.
10. Lexical compensation: Circumlocution | Use a few words or another way to replace a word which cannot be recalled | Secondly, for those words I didn’t know or forget how to say, I need time to find other ways to explain them.
11. Lexical: No/Little concern | No or very little concern about vocabulary. | No, not particularly think of the use of words.
12. Grammar: General use | Think of general use of grammar | Yes, I considered the use of grammar as grammar plays an important role in story telling.
13. Grammar: Tense | Think of the use of correct tense | I feel it’s necessary to consider it. I thought if I should use present, past or continuous tense.
14. Grammar: No/Little concern | No or very little concern about grammar use | For me, sometimes I didn’t pay much attention on grammar. Instead, I rely more on my sense of the language and didn’t think much about the correctness.
1. Rehearse: General | Rehearse what is planned to say (and hopefully remember better) | I did not rehearse word by word. I just considered the sequences of what I would say.
2. Rehearse: Accurate | Rehearse to be accurate | At the same time, I think the rehearsal could help me to be more accurate.
3. Rehearse: Fluent | Rehearse to be fluent | I rehearsed as I want to tell a more fluent story.
4. Rehearse: Logical or clear | Rehearse to check whether what is planned is logical or clear | I think the rehearsal could help to check the logic of the story, and to see whether my story is in accordance with the pictures as well as our common way of thinking.
5. Rehearse: Memorization | Rehearse to help memorizing what is planned | The rehearsal could also help me to memorize some of the things I’ve planned in order to deliver my story more smoothly.
6. Take notes | Take notes to plan the task (although cannot see it when doing the task) | I made some point-form notes on the draft paper. I usually make point-form. I noted down the flow.
7. Take notes: No need | There is no need to take notes | I didn’t since the story is short; if it’s a long one, I would take notes.
8. Memorization | Try to memorize what is planned before doing the task | Finally, I went through my notes, referring to the pictures; and memorize the points planned.
9. Aware of listener’s understanding | Try to make the listener understand the story, such as using simple grammar or words | Because the task has no words but just pictures, the planning time could help me think about how to say things to make my partner more easily understand.
8. Lexical: Planned but not used/correctly used | During planning, lexical is considered, but it is not used or not correctly used when doing the task | Though I did think about the word, when I was telling the story, I forgot these things and just said whatever in my mind.
9. Grammar: Planned but not used/correctly used | During planning, grammar is considered, but it is not used or not correctly used when doing the task | Yes, I’ve thought about grammar, but I didn’t pay much attention on it when actually doing the task.
chapter 5
Get it right in the end: The effects of post-task transcribing on learners’ oral performance
Li Qian
Guangdong University of Foreign Studies, China
Given the small body of existing research concerning focus on form at the post-task
stage in task-based language teaching, the present study uses a post-task transcribing
condition as a focus on form activity and explores the effects of transcribing under
various conditions. Eighty participants, divided into four experimental groups and
one control group, completed four tasks at one-week intervals.
Each experimental group was assigned a different post-task activity;
the control group had no post-task activity. Task performance was
measured in terms of complexity, accuracy and lexical performance. The findings
are multifaceted. First of all, the adoption of post-task transcribing, in general, was
found to benefit different formal aspects of task performance. In the second
place, pair-based transcribing led to more syntactically complex language, whereas
the individual-based transcribing at the post-task stage led to an improvement in
lexical sophistication. Thirdly, further revision after transcribing had mixed effects
on accuracy and complexity. The findings are discussed in light of the concepts of
noticing and attention, interaction theory and other related SLA theories. Based on
the theoretical discussion, pedagogical implications are proposed.
Introduction
In second language pedagogy, one of the major issues is what Stern (1983) called
the ‘code-communication dilemma’ (Ellis 2008). This is reflected in the dichotomy
between “instructed” and “naturalistic” L2 learning. There are advocates of gram-
mar teaching who view grammar instruction as the foundation for second language
acquisition (Bialystok 1990; McLaughlin 1990; Rutherford 1988). By contrast, there
are also advocates of the “zero option” (Ellis 2008) which proposes abandoning for-
mal instruction and allowing learners to construct their interlanguage naturally
“through communication” rather than “for communication” (Krashen 1982, 1985;
Prabhu 1987).
In the past two decades or so, researchers have come to realize that adopting a one-
sided approach, either communicative-based or grammar-based, leads us nowhere.
Pedagogical interventions need to be interwoven into primarily communicative activi-
ties so as to overcome the limitations of both traditional grammar instruction and
communicative language teaching (Doughty & Williams 1998c). Among the variety
of proposals regarding the incorporation of formal instruction in communicative set-
tings (see the review in Norris & Ortega 2000), Focus on Form (FonF) has received
increasing interest in the last two decades. In his seminal work, Long (1991) initially
introduced the notion as follows: ‘focus on form…overtly draws students’ attention to
linguistic elements as they arise incidentally in lessons whose overriding focus is on
meaning or communication’ (Long 1991, pp. 45–46). Building on this theoretical notion,
Long and Robinson (1998) later proposed a more pedagogically applicable definition:
‘Focus on form consists of an occasional shift in attention to linguistic code features –
by the teacher and/or one or more students – triggered by perceived problems with
comprehension or production’. (Long & Robinson 1998, p. 23)
Literature review
In contrast, there is, rather surprisingly, a paucity of research with regard to the
effects of focus on form at the post-task stage. One of the reasons may be that in a
research context, some researchers have viewed task-based instruction as consisting
only of pre-task activity and task performance (or even task performance only in some
cases). Once the task is accomplished, the job is done. However, that should not neces-
sarily be the case in L2 language pedagogy. Willis (1996) pointed out that:
‘task-based learning is not just about getting learners to do one task and then another.
If that were the case, learners would probably become quite expert at doing tasks and
resourceful with their language, but they would almost certainly gain fluency at the
expense of accuracy. ….to promote constant learning and improvement, we should
see it (to do a task) as just one component in a larger framework’. (p. 40)
their attention may not necessarily be directed to form to achieve higher levels of
accuracy. In brief, ‘public performance’ does not represent the whole picture of what
post-task activities are about.
In a subsequent study, Foster and Skehan (2013) used an alternative post-task
activity – “transcribing” one’s own performance, in order to deal with the restrictions
of public performance. Transcribing is an individual activity in which every partici-
pant re-examines their own task performance. When pushed to transcribe their own
performance, all learners are forced to pay attention to the formal aspects of their own
language during the task (Foster & Skehan 2013). The researchers divided the par-
ticipants into experimental and control groups. The experimental group transcribed
extracts of their own task performance as a post-task activity, while the control group
did no post-task activity. Both groups performed two types of tasks: narrative and
decision-making, in two counterbalanced task cycles. The results showed that (1) fore-
knowledge of transcribing as a post-task activity had a significant accuracy effect
on task performance in both narrative and decision-making tasks, (2) significantly
greater complexity was found for the experimental group in the decision-making task,
(3) with regard to fluency, length of run (i.e. the number of words uttered before any
breakdown or repair occurs) was significantly greater for the post-task condition in
the decision-making task. As compared with Skehan and Foster (1997), the results of
this study are clearly more positive and supportive not only for the accuracy effect,
but also with regard to the effects of a post-task transcribing phase on complexity and
one measure of fluency. As far as task type is concerned, the findings suggested that
the decision-making task was more sensitive to post-task effects, not only in terms
of accuracy, but also of complexity. In contrast, narrative performance appears to be
more difficult to influence. These results may push researchers to pay more attention
to differences in task type and how these interact with other task conditions (Skehan &
Foster 2001; Skehan 2007).
Skehan and Foster’s post-task research is pioneering in the task-based research
field and shows some preliminary results which could inspire further studies. In addi-
tion, the post-task phase has also attracted some attention in the literature on second
language pedagogy. From a teacher’s perspective, Lynch (2001, 2007) investigated the
process of transcribing as a post-task activity in an L2 classroom setting. Based on
observation of classroom learning, two limitations were identified by Lynch (2001).
First, learners received too much feedback on learner activity, and had too little time
for reflection on the forms of the language. They kept on doing activities with few
opportunities to reflect on their language gains and deficiencies. Second, given that
only high-proficiency learners were aware of the changes in their language during the
activity (Lynch & Maclean 2001), devising ways to help learners analyse their per-
formance is necessary. In such a context, Lynch employed transcribing as a reflective
noticing activity for classroom learning. In contrast to employing transcribing as an
individual task, as in Foster and Skehan’s (2013) study, Lynch (2001, 2007) designed
the activity to be conducted in pairs as a form of collaborative transcribing. In Lynch’s
(2001) study, learners were asked to transcribe together, then discussed and revised
the transcripts, and submitted the revised transcripts to the teacher for further correc-
tions and reformulation. The analyses of the process and product of these cycles sug-
gested that collaborative transcribing and revising can encourage learners to focus on
form in a relatively natural way. Furthermore, the teacher can play an important role in
this post-task intervention, especially with regard to the improvement of vocabulary
use (Lynch 2001).
In a subsequent study, Lynch (2007) compared the effects of two different tran-
scribing groups – student-initiated transcribing (students in pairs transcribed their
own performance and then revised the transcripts) and teacher-initiated transcribing
(the teacher transcribed problematic extracts of learners’ performance, and the tran-
scripts with errors were given back to the pairs for their revision). The analyses of the
subsequent performance showed that both procedures are manageable under normal
classroom conditions, and suggested that the student-initiated transcribing was more
effective in helping the learners to maintain higher accuracy levels in the highlighted
forms which were revised by students themselves.
The previous studies, either from a researcher’s or from a teacher’s perspective,
show encouraging results with regard to the effect of post-task activities. Comparing
the findings for the two post-task activities (redoing the task publicly or transcribing task
performance) in Skehan and Foster’s two studies (1997, 2013), we find that transcribing
may be adopted as a more feasible, manageable and effective post-task activity in com-
municative language classrooms. Similarly, Willis and Willis (2007) also suggest that
transcribing as an effective post-task pedagogical choice is appealing to both teach-
ers and learners. Some teachers have already begun to employ this activity in their
classrooms and found positive effects of transcribing on students’ oral performance
(Clennell 1999; Mennim 2003, 2011; Stillwell et al. 2010).
In view of the relative infancy of the research base on post-task activities, it is
not surprising that the available studies have some serious limitations. For instance,
in Lynch’s two studies and some L2 classroom-based studies (Mennim 2003, 2011;
Stillwell et al. 2010), the sample sizes were not large enough for statistical analyses,
and so no statistical results were reported, which causes problems in generalizing
to other classrooms. In Foster and Skehan (2013), post-task transcription was per-
formed outside the actual classroom. So, some unexpected intervening variables may
have been at play with transcribing taking place beyond the supervision of the teach-
ers. In addition, in the previous studies (Clennell 1999; Mennim 2003, 2011; Stillwell
et al. 2010; Foster & Skehan 2013), participants were engaged in the activity of post-
task transcribing only twice, which may not be sufficient to demonstrate strong treat-
ment effects.
Given these limitations with the previous studies, the present study explores the effects
of post-task transcribing under various conditions. First of all, transcribing is carried
out either individually or in pairs. In Foster and Skehan (2013), transcribing was per-
formed individually. That is, each participant was asked to transcribe their task per-
formance (including interactive task performance) by him/herself. In contrast, Lynch
(2001, 2007) adopted a pair work style which he called “collaborative transcribing”. The
present study aims to explore which of these two conditions (individual versus collab-
orative work in pairs) is more effective in terms of enhancing language performance.
Secondly, some of the experimental groups will take part in a revision condition.
In Lynch (2001, 2007), when they were involved in revising and reformulating, learn-
ers showed a clear performance improvement. Given the small sample size in Lynch’s
study (N = 16), no generalizations could be made concerning the effects of revision.
Participants in the current study (N = 80) will be engaged either in reflective self revi-
sion, interactive peer revision or no revision conditions to disentangle the distinctive
impacts of these alternative procedures.
In particular, the goals of the current study are (i) to investigate whether transcrib-
ing as a post-task activity has an effect on formal aspects of task performance; (ii) to
compare the effects of different transcribing conditions (individual work versus pair
work, revision versus no revision) on language performance.
In addition, as compared with previous research, participants in the present study
are engaged for a longer period, that is, four task sessions with a one week interval
between each session. Adopting multiple task sessions is expected to make the mea-
surement of the treatment effect more revealing in exploring learners’ performance
changes over time.
The research questions which guided this study are the following:
Research methodology
Participants
Eighty participants were included in this study. All of them were second-year university
students from a university in South China, aged between nineteen and twenty-one.
They were non-English majors, among whom forty-one were female and thirty-nine
were male. They had been studying English for 7–10 years, with low intermediate to
intermediate English proficiency. The participants were randomly divided into four
experimental groups and one control group, constrained only by the time slots in which
they were available to attend the study.
Prior to the experiment, to explore the comparability of the five groups regard-
ing English proficiency, the participants were administered a cloze test. In language
testing, cloze tests have frequently been adopted as a valid and reliable instrument to
assess overall language proficiency (Brown & Rodgers 2002). In this study, the cloze
test was composed of three cloze passages, which were adapted from the nation-wide
standardized China College English Test Band 4 (CET-4) Database for non-English
majors at low intermediate to intermediate English proficiency level.
A one-way ANOVA showed no significant difference among the five groups on the
cloze test results, F(4, 78) = .628, p = .679. On this basis, the groups were found to be
comparable with regard to English proficiency.
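The comparability check described above rests on the one-way ANOVA F statistic. The following is a minimal sketch of that computation, not the chapter's own analysis (which was run in SPSS): the function name `one_way_anova` is hypothetical, and the scores are invented for illustration and are not the study's cloze data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three small groups (NOT the study's data).
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A small F relative to its degrees of freedom, as reported for the cloze test, indicates that the differences between group means are small compared with the variation within groups.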
Experimental tasks
The present study used six tasks in total. There were three narratives (one for practice
and two for actual data collection), and three interactive decision-making tasks (again,
one for practice and two for data collection). The four treatment tasks were carefully
arranged in order to tease out the intervening influence of task type and task order. The
two sub-groups under each treatment group were assigned the same tasks in reverse
order to counterbalance any intervening effect of task sequence. In addition, the dif-
ferent topics were arranged in a balanced order. Table 1 below shows the arrangement
of task type and task topics.
In the narrative tasks, the participants described stories from cartoon episodes
of Tom and Jerry. They watched the cartoons, and then retold the story. In the inter-
active decision-making tasks, the participants, in dyads, acted as the editors of the
problem column of a magazine (cf. Skehan & Foster’s (1997) Agony Aunt task). In
each task, they discussed the problem in a letter written to the magazine and agreed
upon the best advice for the writer. Each letter described a certain tricky personal
situation that did not have a simple or obvious solution (Skehan & Foster 2001).
(See Appendix for task instructions and problem letters for the interactive decision-
making tasks).
Procedures
The participants were seen five times at one-week intervals, the first time for orienta-
tion and the other four times for main study data collection. Prior to the data collection,
an orientation session was given to ensure that the participants were well informed
regarding the task procedure and the basic transcribing skills that were required (for
the experimental groups). In addition, task practice for both task types was expected
to reduce the participants’ performance anxiety (Bygate 1996, 2000, 2001).
In the task session, each participant was provided with a tape-cassette recorder.
For the narrative task, a cartoon episode from Tom and Jerry was played to the participants.
After two minutes of planning, every participant was asked to describe the episode
to the recorder as if they were telling the story to someone else who had not watched
the cartoon. The recordings were then used for post-task transcribing. With the inter-
active decision-making tasks, participants, working in dyads, were given the problem
letter and then discussed with each other to agree what good advice could be given.
The pairs were the same in each session. Their discussion performances were recorded
for transcribing as well.
At the post-task stage, participants transcribed part of their recordings from the
tapes with paper and pencil. As for the length of the transcribed performance, in a
narrative task, the participants who worked individually transcribed a 3-minute per-
formance starting from a certain story point. For the pair-work groups, this was more
complicated. Each member of the dyad contributed a 1.5-minute performance in a
continuous storyline. For example, starting from a story point (The door is too small
for the puppy to come in…), participant A’s 1.5-minute performance was transcribed,
and the transcription ended with another story point (Tom was angry…) from which
participant B’s 1.5-minute performance started. In this way, the two members of each dyad
transcribed different parts of the story, so as to avoid any competition
concerning the quality of the oral performance between the two members.
Several story turning points in the cartoons were provided for free selection by each
pair. In the decision-making tasks, considering that between-interlocutors’ pauses
tended to occur more frequently than in the narrative task, a five-minute conversation
between a dyad was assigned to be transcribed either individually or in pairs. Following
previous research (Lynch 2001, 2007), which indicates that five minutes are both necessary and
sufficient for learners to transcribe one minute of their own speech, in the present study
fifteen and twenty minutes were allocated for the transcription of the narrative and
interactive performance respectively.
The four experimental groups were assigned to carry out post-task transcribing
activities in different conditions as follows (see the research design in Table 2):
The transcripts were then collected by the researcher for further qualitative study.
At the end of the last task cycle, an interview was carried out to gather feedback
on the study, specifically participants’ reflections on their task performance and on the
post-task activities.
Post-task condition | Sub-group | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4
P1 | Sub-group 1 (n = 8) | Na | Ia | Ib | Nb
P1 | Sub-group 2 (n = 8) | Ib | Nb | Na | Ia
P2 | Sub-group 1 (n = 8) | Na | Ia | Ib | Nb
P2 | Sub-group 2 (n = 8) | Ib | Nb | Na | Ia
P3 | Sub-group 1 (n = 8) | Na | Ia | Ib | Nb
P3 | Sub-group 2 (n = 8) | Ib | Nb | Na | Ia
P4 | Sub-group 1 (n = 8) | Na | Ia | Ib | Nb
P4 | Sub-group 2 (n = 8) | Ib | Nb | Na | Ia
None (control) | Sub-group 1 (n = 8) | Na | Ia | Ib | Nb
None (control) | Sub-group 2 (n = 8) | Ib | Nb | Na | Ia
Note: 1. Na: narrative task puppy tale; Nb: narrative task baby butch;
Ia: interactive decision-making task cyber love; Ib: decision-making task unbalanced status
2. P1: post-task individual transcribing; P2: post-task individual transcribing & revising;
P3: post-task pair transcribing; P4: post-task pair-transcribing & revising.
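The task orders in Table 2 can be written down as data and checked mechanically. The sketch below is illustrative only: the two orders are read off the table, while the function name `check_counterbalance` and the two properties it tests (both sub-groups cover the same four tasks, and they never meet the same task type in the same cycle) are assumptions about what counterbalancing is meant to achieve here, not a procedure described in the chapter.

```python
# Task order per sub-group across cycles 1 to 4, read off Table 2.
# N = narrative, I = interactive decision-making; a/b = the two topics per task type.
ORDERS = {
    "sub-group 1": ["Na", "Ia", "Ib", "Nb"],
    "sub-group 2": ["Ib", "Nb", "Na", "Ia"],
}


def check_counterbalance(orders):
    """Return True if both orders cover the same tasks once each and differ in task type at every cycle."""
    first, second = orders["sub-group 1"], orders["sub-group 2"]
    # Same set of tasks, each performed exactly once.
    same_tasks = sorted(first) == sorted(second) and len(set(first)) == len(first)
    # In every cycle one sub-group narrates while the other does the interactive task.
    types_differ = all(a[0] != b[0] for a, b in zip(first, second))
    return same_tasks and types_differ


print(check_counterbalance(ORDERS))
```

Under these assumptions, any effect of task type is spread evenly over the four cycles, so session order and task type are not confounded within a treatment group.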
Research design
In this large-scale study, a 4 × 2 × 2 research design was employed (see Table 2).
The first independent variable, the post-task transcribing condition, is a between-subject
factor comprising four experimental levels, (1) individual transcribing only, (2) individual
transcribing and revising, (3) pair transcribing only, and (4) pair transcribing and pair
revising, together with a control group. The second independent variable, task type, is a
within-subject factor with two levels: the narrative task and the decision-making
task. The third independent variable, task session, is a within-subject factor with
two levels: the first and the second sessions for either narrative or decision-making
tasks. The dependent variable is the oral performance which was measured in terms
of complexity, accuracy, and lexical performance. The major focus in this chapter
is the post-task transcription condition, but with some consideration of task dif-
ferences (narrative vs. interactive). Space does not enable extensive coverage of the
factors of time or task-session.
Data analysis
Data were analysed using SPSS 15.0. To address Research Question One concerning
the effects of post-task transcribing, a Multivariate Analysis of Variance (MANOVA)
was performed. For Research Questions 2 and 3 concerning the effects of pair/indi-
vidual transcribing and the effects of further revision, two-way MANOVAs were per-
formed to consider the two independent variables simultaneously. All the MANOVAs
were followed by post-hoc comparisons of all the examined conditions to identify
which groups were significantly different from the others. Furthermore, effect size
(Cohen’s d; see Note 1) calculations were conducted to demonstrate the magnitude of any
significant effect.
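As an illustration of the effect size calculation, the sketch below computes Cohen's d from summary statistics and labels it with the cut-offs given in the chapter's note on Cohen's d. The pooled-standard-deviation formula and the function names `cohens_d` and `label_effect` are assumptions; the chapter does not state which variant was used, so the values produced here need not match those reported in the Results. The input figures are taken from Table 4.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def label_effect(d):
    """Label |d| using the chapter's cut-offs: below 0.2 small, 0.2 to 0.8 medium, above 0.8 large."""
    size = abs(d)
    if size < 0.2:
        return "small"
    if size <= 0.8:
        return "medium"
    return "large"


# Ratio of error-free clauses in the second narrative task (Table 4):
# individual transcribing (M = .52, SD = .13) vs control (M = .38, SD = .08), n = 16 each.
d = cohens_d(0.52, 0.13, 16, 0.38, 0.08, 16)
print(round(d, 2), label_effect(d))
```

Because d expresses the mean difference in pooled standard deviation units, it allows the accuracy, complexity and lexical measures, which are on different scales, to be compared directly.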
Results
Note 1. (1) When d is smaller than 0.2, it can be regarded as a small effect; (2) when d is between 0.2
and 0.8, a medium effect; and (3) when d is larger than 0.8, a large effect (Cohen 1988).
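Table 4. Descriptive statistics for the two task performances of each task type (mean, with SD in parentheses)

Accuracy
Group | Task type | Ratio of error-free clauses, Task 1 | Ratio of error-free clauses, Task 2 | Errors/100 words, Task 1 | Errors/100 words, Task 2
Individual transcribing (n = 16) | N | .39 (.12) | .52 (.13) | 12.46 (5.88) | 9.61 (3.94)
Individual transcribing (n = 16) | I | .60 (.11) | .70 (.07) | 8.73 (3.02) | 7.51 (2.70)
Individual transcribing & revising (n = 16) | N | .37 (.13) | .54 (.11) | 13.36 (5.60) | 8.87 (3.26)
Individual transcribing & revising (n = 16) | I | .63 (.12) | .78 (.05) | 7.37 (2.70) | 6.03 (1.91)
Pair transcribing (n = 16) | N | .34 (.11) | .51 (.10) | 14.71 (4.34) | 10.35 (3.47)
Pair transcribing (n = 16) | I | .55 (.12) | .71 (.10) | 9.62 (4.42) | 5.50 (2.72)
Pair transcribing & revising (n = 16) | N | .32 (.11) | .51 (.13) | 12.63 (5.10) | 8.85 (3.18)
Pair transcribing & revising (n = 16) | I | .56 (.08) | .77 (.05) | 9.17 (3.47) | 6.21 (2.72)
Control (n = 16) | N | .35 (.11) | .38 (.08) | 13.14 (5.20) | 12.06 (5.49)
Control (n = 16) | I | .55 (.06) | .56 (.06) | 10.30 (2.86) | 9.84 (3.14)

Complexity
Group | Task type | Clause ratio, Task 1 | Clause ratio, Task 2 | Mean length of AS unit, Task 1 | Mean length of AS unit, Task 2
Individual transcribing (n = 16) | N | 1.25 (.12) | 1.38 (.13) | 7.84 (.86) | 9.20 (1.25)
Individual transcribing (n = 16) | I | 1.77 (.14) | 1.88 (.09) | 5.52 (1.72) | 6.22 (1.77)
Individual transcribing & revising (n = 16) | N | 1.33 (.12) | 1.42 (.15) | 8.49 (1.75) | 9.44 (1.33)
Individual transcribing & revising (n = 16) | I | 1.70 (.25) | 1.76 (.15) | 6.07 (1.61) | 6.87 (2.02)
Pair transcribing (n = 16) | N | 1.29 (.16) | 1.53 (.13) | 7.95 (1.34) | 8.79 (1.29)
Pair transcribing (n = 16) | I | 1.66 (.13) | 1.90 (.14) | 6.58 (2.02) | 9.71 (1.78)
Pair transcribing & revising (n = 16) | N | 1.30 (.12) | 1.47 (.14) | 7.79 (1.06) | 8.16 (1.30)
Pair transcribing & revising (n = 16) | I | 1.63 (.10) | 1.86 (.15) | 6.16 (1.12) | 9.47 (2.03)
Control (n = 16) | N | 1.32 (.14) | 1.35 (.12) | 7.60 (1.31) | 7.80 (1.32)
Control (n = 16) | I | 1.67 (.14) | 1.67 (.14) | 5.70 (.52) | 5.78 (.60)

Lexical performance
Group | Task type | Lexical diversity, Task 1 | Lexical diversity, Task 2 | Lexical sophistication, Task 1 | Lexical sophistication, Task 2
Individual transcribing (n = 16) | N | 30.75 (5.33) | 36.78 (6.28) | 1.99 (.37) | 2.12 (.33)
Individual transcribing (n = 16) | I | 57.83 (12.82) | 57.70 (11.46) | 1.37 (.25) | 1.64 (.25)
Individual transcribing & revising (n = 16) | N | 34.65 (13.69) | 33.28 (5.97) | 1.89 (.30) | 2.43 (.11)
Individual transcribing & revising (n = 16) | I | 54.66 (12.36) | 54.39 (11.53) | 1.49 (.22) | 1.72 (.14)
Pair transcribing (n = 16) | N | 29.54 (6.96) | 34.43 (5.49) | 1.86 (.39) | 2.17 (.28)
Pair transcribing (n = 16) | I | 56.80 (8.25) | 57.93 (8.20) | 1.35 (.21) | 1.54 (.13)
Pair transcribing & revising (n = 16) | N | 31.44 (5.63) | 35.78 (5.09) | 1.89 (.41) | 2.11 (.30)
Pair transcribing & revising (n = 16) | I | 53.35 (10.96) | 57.46 (9.97) | 1.36 (.17) | 1.45 (.15)
Control (n = 16) | N | 29.55 (6.58) | 37.78 (7.69) | 1.93 (.56) | 2.04 (.48)
Control (n = 16) | I | 50.75 (9.28) | 60.96 (11.13) | 1.24 (.25) | 1.33 (.23)

(N: narrative; I: interactive)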
Cohen’s d = 1.16; interactive task performance: p = .000, Cohen’s d = 2.12). There was
no significant difference between the experimental groups and the control group as
far as lexical diversity was concerned.
Table 5 provides summary information on the MANOVA results for Research
Question 1. The results concerning the performance on two different task types are
presented separately. Under each task type, probability values of the multiple indexes
of complexity, accuracy and lexical performance are provided, together with the effect
sizes.
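Task type | Dimension | Measure | F | Sig. | Effect size
Narrative task | Complexity | clauses per AS unit | 3.43 | .007* | 1.43, large
Narrative task | Complexity | words per AS unit | 2.25 | .056 | 0.57, medium
Narrative task | Accuracy | ratio of error free clauses | 4.39 | .001* | 1.72, large
Narrative task | Accuracy | error per 100 words | 1.45 | .214 | 0.46, medium
Narrative task | Lexical performance | lexical diversity | 1.13 | .350 | 0.4, medium
Narrative task | Lexical performance | lexical sophistication | 3.11 | .010* | 1.16, large
Interactive task | Complexity | clauses per AS unit | 4.02 | .003* | 1.88, large
Interactive task | Complexity | words per AS unit | 2.83 | .000* | 3.06, large
Interactive task | Accuracy | ratio of error free clauses | 26.63 | .000* | 3.74, large
Interactive task | Accuracy | error per 100 words | 7.50 | .000* | 1.81, large
Interactive task | Lexical performance | lexical diversity | .68 | .643 | 0.31, medium
Interactive task | Lexical performance | lexical sophistication | 8.30 | .000* | 2.12, large
(*p < .01)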
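The complexity and accuracy indexes named in the table are ratios over hand-coded counts. The sketch below shows one plausible way to derive them. The class `CodedSample`, the function `measures` and the sample counts are hypothetical, and the formulas are simply the conventional reading of each measure's name, since this section does not restate the coding scheme. Lexical diversity and lexical sophistication are left out because their computation is not described here.

```python
from dataclasses import dataclass


@dataclass
class CodedSample:
    """Hand-coded counts for one speech sample (hypothetical structure)."""
    words: int
    as_units: int
    clauses: int
    error_free_clauses: int
    errors: int


def measures(s: CodedSample) -> dict:
    """Derive the four ratio measures from the coded counts."""
    return {
        "clauses_per_as_unit": s.clauses / s.as_units,        # complexity: subordination
        "words_per_as_unit": s.words / s.as_units,            # complexity: mean length of AS unit
        "error_free_clause_ratio": s.error_free_clauses / s.clauses,  # accuracy
        "errors_per_100_words": 100 * s.errors / s.words,     # accuracy
    }


# Invented counts for illustration only.
sample = CodedSample(words=200, as_units=25, clauses=40, error_free_clauses=20, errors=24)
for name, value in measures(sample).items():
    print(f"{name}: {value:.2f}")
```

Segmenting speech into AS units and clauses and identifying errors remain manual coding decisions; the code only covers the arithmetic that follows them.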
inferred that the pair transcribing groups produced more but shorter clauses or clause
elements than the individual groups, whereas the latter adopted more words but sim-
pler syntax (with less subordination).
In the second interactive task, the pair groups used more words per AS-unit (9.71,
9.47) than the individual groups (6.22, 6.87) (p = .001, Cohen’s d = 1.76), without any
significant differences noted for the ratio of clauses per AS unit (1.88, 1.76 for the indi-
vidual groups; 1.90, 1.86 for the pair groups). Table 6 shows the results for both task
types in terms of the complexity measures.
With both task types, there were no significant effects of individual/pair transcrib-
ing for accuracy and lexical performance.
Table 6. Significant effects of individual/pair conditions in both task types: Complexity
Groups Dependent variables F Sig. Effect size
Interactive Task revision> no revision accuracy: error-free ratio 7.16 .001* 1.09, large
no revision> revision complexity: clause ratio 3.56 .014 0.64, medium
(*p < .01)
The MANOVA results showed that in the second narrative task, the involvement
of revision brought about no significant differences in terms of the various aspects of
task performance. Nor was any significant difference found with regard to lexical performance
in either task type. In brief, the involvement of revision led to a more accurate, but less
complex interactive task performance, although there was no such effect of revision on
narrative task performance.
In addition to the above main effects of both post-task conditions, there is an
interaction effect between the individual/pair condition and the revision/no-revision
condition, although this is only on the measure of lexical sophistication (p = .007,
Cohen’s d = 0.7). Specifically, in the narrative task, the revision condition pushed
the individual transcribing groups to produce significantly more low-frequency
words than the pair groups. On the other hand, the no-revision condition was associ-
ated with the pair transcribing groups using more low-frequency words than their
counterparts.
Discussion
This discussion has four major sections. First, I will discuss the general
effects of post-task transcribing, followed by a discussion of the differential effects of
transcribing on the various aspects of performance. This is followed by a section on the
role of interaction, while the last section discusses the importance of revision as a step
which follows transcription itself.
To account for the effects of post-task transcribing, two issues may be relevant: the
foreknowledge of post-task transcribing, and the operationalization of transcribing.
Regarding the first, prior to task performance, participants were informed that they
would transcribe their performance recordings afterwards. It appeared that the
foreknowledge of transcribing played a role in directing participants’ attention to
formal aspects of performance because this may remind them that task performance
is not an end in itself, but instead is connected with wider pedagogic concerns
(Foster & Skehan 2013). This may have emphasized the importance of the quality of
task performance. Participants, therefore, were cautious during performance to keep
a balance between fluent communication and language accuracy. Accordingly, they
appeared not only to pay attention to meaning transmission for task accomplishment,
but also to allocate some attention to the formal aspects of performance for a
more satisfactory transcribed performance. Even so, it was not clear whether they
shifted their attention unconsciously or intentionally.
The operationalization of post-task transcribing may have had some
effects on language development as well, although these suggestions are rather speculative.
One of the most evident advantages of post-task transcribing is that it affords
participants opportunities to attend to formal aspects of language performance.
During task performance, most of the learners’ attentional focus was probably on
communication and meaning to get the task done. In contrast, at the post-task stage,
it was likely that more attention could be released to consider the formal aspects of
task performance, because meaning would not compete for major attention any longer
(Foster & Skehan 2013; Lynch 2001, 2007). Under these conditions, noticing, which is
a prerequisite for language change and acquisition (Schmidt 2001), may occur more
easily and naturally at the post-task stage.
However, if attention for formal aspects of language is available, this only offers
the possibility for noticing to take place, but does not guarantee its occurrence. In the
present study the performance transcripts pushed participants to notice, remember,
and reproduce the processed language forms (Lynch 2007). On the one hand, tran-
scripts transformed the oral task performance into a written form which may have
further prompted the learners to attend to the formal aspects of their performance
(Doughty & Williams 1998a). On the other hand, the transcription reactivated the task
performance and, in this way, may have led to deeper processing, such as cognitive
comparison. As Doughty (2001) says:
‘If the verbatim format of recent speech remains activated in memory and available for
use in subsequent utterance formulation, this can be taken to be an important cognitive
underpinning for facilitating the opportunity to make cognitive comparisons’ (p. 253).
Cognitive comparisons may be made between the transcripts and the target language
(i.e. noticing the gap) or between the missing forms in the transcripts and the existing
counterparts in the target language (i.e. noticing the hole), both of which functioned
effectively for the improvement of formal aspects in the current research.
So far, the discussion has concerned the general effect of post-task transcribing, regard-
less of the differences between various transcribing conditions. However, in view of
the pedagogical applications of this study, post-task transcribing was operationalised
in various ways. The following sections present a discussion on these different opera-
tionalisations in greater detail.
could not fully understand what their partner exactly said in the recordings or why the
partner said what they said. Requests for confirmation and the corresponding expla-
nations occurred in the interaction. An exchange that follows a confirmation move
“forces the learner to clarify and organize their own knowledge and thus enhances
their own understanding” (Storch 2007, p. 155). Such exchanges between the partners
provided an advantage which was missing when participants transcribed individually.
Joint responsibility over the creation of the transcripts means that students may be
more receptive to peer suggestions and feedback comments (Storch 2007).
lexical complexity in this study. While the effect of pair work has been established in
previous research (see a review in Storch 2007), research work with an emphasis on the
benefits of individual condition in L2 learning may be a fruitful field as well.
Pedagogical implications
The present study has interesting implications for second language instruction. First
and foremost, teachers in task-based settings are recommended to include post-task
activities in their teaching practice. The present study focussed on post-task transcrip-
tion and showed a striking effect for improvements in formal aspects of language. The
procedure is perfectly feasible in regular classrooms in that only recorders and pens
are needed for transcribing and the average time for transcribing a 1-min extract is
around five minutes, both of which are manageable in L2 classrooms. In addition,
other types of post-task activities can also be examined in further research so as to
provide more focus-on-form options for pedagogical application.
In the second place, the findings highlighted the need to monitor the variations
in post-task transcribing carefully. Not all the transcribing conditions were beneficial
for overall language improvement. For example, only the pair transcribing condition
was favorable for syntactic complexity improvement. L2 learners at different proficiency
levels may be primarily concerned with different aspects of language performance.
Teachers should, therefore, carefully design transcribing conditions to allow students
with different needs and at different stages of IL development to focus on different
aspects of task performance achievement.
Thirdly, teachers need to understand the factors that impact in contrasting ways
on different performance aspects. For instance, the effect of revision is complex. It is
generally accepted among teachers that the involvement of revision is helpful for L2
learners (Willis & Willis 2007). However, the results of this study reveal that revision
in a general sense facilitates improvement in accuracy, but may hinder the use of com-
plex language. Thus, we should be cautious when we adopt further revision in post-
task transcribing unless raising accuracy is the current pedagogic goal. One strategy
that could be employed is for a teacher to make it clear to the students that the focus
of revision is on both error-correction and structural improvement prior to revision.
This may help learners direct their attention to both aspects. This might reduce the
potentially negative effect of revision on complexity to a certain extent.
Finally, it should be acknowledged that all the above pedagogical recommenda-
tions, based as they are on just one study, cannot really be warranted unless further
replication studies are carried out. It should be noted as well that transcribing, when
adopted as a type of post-task activity, may be beneficial to induce learners to focus
on form, but might not necessarily bring about an immediate improvement in L2
performance.
Conclusion
This research has been explorative in terms of both theoretical and pedagogical issues.
The findings have underscored the necessity for task-based research and pedagogy to
give as much weight to focus on form at the post-task stage as at the pre- and during-task stages. As
Skehan (2007) noted, a task-based approach has much to offer form-focused instruc-
tion in a variety of ways. Focus on form at the post-task stage is a promising area which
is worthy of future exploration.
References
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews
using a new measure of lexical diversity. Language Testing, 19, 85–104.
McLaughlin, B. (1990). Restructuring. Applied Linguistics, 11, 113–128.
Mennim, P. (2003). Rehearsed oral L2 output and reactive focus on form. ELT Journal, 57, 130–138.
Mennim, P. (2011). Learner negotiation of L2 form in transcription exercises. ELT Journal, 66(1),
52–61.
Norris, J., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative
meta-analysis. Language Learning, 50, 417–528.
Pica, T. (2002). Subject-matter content: How does it assist the interactional and linguistic needs of
classroom language learners? Modern Language Journal, 86, 1–19.
Prabhu, N.S. (1987). Second language pedagogy. Oxford: OUP.
Read, J. (2000). Assessing vocabulary. Cambridge: CUP.
Rutherford, W. (1988). Second language grammar: Teaching and learning. London: Longman.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp.
3–28). Cambridge: Cambridge University Press.
Sheppard, K. (1992). Two feedback types: Do they make a difference? RELC Journal, 23, 103–110.
Skehan, P. (1996). A framework for the implementation of task-based instruction. Applied Linguistics,
17, 38–62.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: OUP.
Skehan, P. (2007). Task research and language teaching: Reciprocal relationships. In S. Fotos &
H. Nassaji (Eds.), Form-focused instruction and teacher education: Studies in honor of Rod Ellis
(pp. 55–69). Oxford: OUP.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1, 185–211.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second lan-
guage instruction (pp. 183–205). Cambridge: CUP.
Spada, N. (1997). Form-focused instruction and second language acquisition: A review of classroom
and laboratory research. Language Teaching, 30, 73–87.
Stern, H.H. (1983). Fundamental concepts of language teaching. Oxford: Oxford University Press.
Stillwell, C., Curabba, B., Alexander, K., Kidd, A., Kim, E., Stone, P., & Wyle, C. (2010). Students
transcribing tasks: Noticing fluency, accuracy and complexity. ELT Journal, 64, 445–455.
Storch, N. (2007). Investigating the merits of pair work on a text editing task in ESL classes. Language
Teaching Research, 11, 143–159.
Swain, M. (1993). The output hypothesis: Just speaking and writing aren’t enough. The Canadian
Modern Language Review, 50, 158–164.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer
(Eds.), Principles and practice in applied linguistics: Studies in honor of H.G. Widdowson
(pp. 125–144). Oxford: OUP.
Swain, M. (2005). The output hypothesis: theory and research. In E. Hinkel (Ed.), Handbook on
research in second language teaching and learning (pp. 471–483). Mahwah, NJ: Lawrence
Erlbaum Associates.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step
towards second language learning. Applied Linguistics, 16, 371–391.
Swain, M., & Lapkin, S. (1998). Interaction and second language learning: Two adolescent French
immersion students working together. The Modern Language Journal, 82, 320–337.
Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of
Second Language Writing, 16, 255–272.
Willis, D., & Willis, J. (2007). Doing task-based teaching. Oxford: OUP.
Willis, J. (1996). A framework for task-based learning. Harlow: Addison Wesley Longman.
Wolfe-Quintero, K., Inagaki, S., & Kim, H.Y. (1998). Second language development in writing:
Measures of fluency, accuracy and complexity (Technical Report #17). Honolulu, HI: University
of Hawaii, Second Language Teaching and Curriculum Center.
Appendix
The Cognition and Tradeoff Hypotheses account for task performance in different
ways. The former sees task complexity as the driver for higher accuracy and structural
complexity whereas the latter, within the constraints of limited attentional capacities,
sees performance as being accounted for through the interaction of influences
from task characteristics and task conditions. This chapter reports on a study
which contrasts these two accounts, manipulating task structure (as an influence
on primarily accuracy, but secondarily complexity), vocabulary difficulty (as a
disruptor of smooth processing during performance), and time perspective (as a
method of operationalising task complexity). The results do simultaneously produce
raised accuracy and complexity, but this is best accounted for through the separate
contribution of task structure and a there-and-then perspective (analysed differently
to that within the Cognition Hypothesis), rather than through greater task complexity.
Vocabulary difficulty did not have the predicted impact. The results are discussed in
terms of the Tradeoff and Cognition Hypotheses.
Introduction
The last twenty-five years have seen great practical interest in task-based approaches to
instruction (Ellis 2003; Van Den Branden 2006; Van den Branden et al. 2009). At the
same time, there has also been a parallel focus on research into task-based performance
(Skehan & Foster 2008), in an attempt to develop a research-based view on language
instruction. Indeed, research in this area has the dual attractions of connecting with
interesting theoretical issues in the domain of second language acquisition (see, e.g.
Housen & Kuiken 2009) and practical concerns within the field of pedagogy (Skehan
2011). The available research allows for a range of generalisations, which add to our
understanding of how differences in task features and task conditions can have system-
atic influences on performance, and so, perhaps, pedagogy. For example, tasks based
on familiar information or more concrete information are easier (Brown et al. 1984)
and also associated with greater accuracy (Foster & Skehan 1996). Similarly, tasks which
require information manipulation (i.e. those requiring the creation of a storyline to link a
series of pictures) and integration (i.e. those requiring the integration of foreground and
background information in a narrative task) are more difficult, but are associated with
greater language complexity (Skehan & Foster 1997; Tavakoli & Skehan 2005). These
findings are the result of research studies which have focussed on task types or character-
istics (Skehan & Foster 2008), but there has also been research on the conditions under
which tasks are performed. Planning, for example, has been shown to be beneficial to
performance, whether strategic (that is, pre-task) or on-line (that is, conducted while the
task is running), with the former consistently producing more complex and more fluent
performance (Foster & Skehan 1996; Ortega 1999), and the latter more consistently
producing greater accuracy (Yuan & Ellis 2003; Ellis 2009). Other task conditions, such
as repetition (Bygate 2001; Wang this volume), or post-task activities (Skehan & Foster
1997; Li this volume) have also proved to be beneficial.
Even though the research base on task performance has grown steadily, there is
still a heated debate on the theoretical underpinnings that can explain the results we
see. At present, there are two competing theoretical models aiming to account for the
impact of task type and task conditions on performance, the Trade-off Hypothesis
(Skehan 1998, 2009a) and the Cognition Hypothesis (Robinson 2001a, b, 2012). The
starting point for the Trade-off Hypothesis is the assumption that there are attentional
limitations on performance, associated with limited working memory size, and that
pressure on such limited resources will have implications for what a second language
speaker can produce. Task research has frequently portrayed performance in terms
of language complexity, accuracy, and fluency (and more recently lexis also, Skehan
2009b). So, applying attentional limitations to this view of performance, prioritising
one area of performance, complexity, say, might have significant effects on performance
in other areas. In fact, the Trade-off Hypothesis has proposed that there is a
particular tension between complexity and accuracy, such that it is difficult to produce
high levels of performance in both of these areas simultaneously. In contrast, high
accuracy and high fluency, or high complexity and high fluency are in conflict to a far
lesser extent (Skehan & Foster 1997).
This, though, is a rather basic account of the Trade-off Hypothesis. The fundamen-
tal assumption is that if tasks become more difficult, the significance of attentional and
memory limitations becomes greater. But this does not mean that the effects of Trade-
off are unavoidable. Indeed, Trade-off is a fundamental constraint, and then a major
contribution of task research is to explore how task characteristics and task conditions
can mitigate its effects. In other words, influences which singly might elevate perfor-
mance in a single direction may, when operative together, raise performance in more
than one area. Indeed, the purpose of much Trade-off research is to see to what extent
such limitations can be overcome. For example, Tavakoli and Skehan (2005), and
Foster and Tavakoli (2009), each using picture story narrative retelling, have shown
how task structure, which ordinarily promotes accuracy and fluency, and information
integration, which normally promotes complexity, can have a conjoined positive influ-
ence and raise accuracy and complexity together. This particular interactive influence
may be difficult to achieve ordinarily, but the above studies show that it is possible.
The fundamental assumption is that attentional limitations have to be assumed, but
this can be taken as the necessary starting point to explore how pedagogic goals, even
if not easy to achieve, can be reached within such constraints.
A final aspect of the Trade-off Hypothesis is its dependence on the Levelt model
of first language speaking (Levelt 1989; Kormos 2006). This model proposes (amongst
other things) that there are three broad stages in speaking: Conceptualisation, Formu-
lation, and Articulation. The first of these is concerned with the ideas to be expressed,
with an evaluation of the context of speaking, and with decisions about the stance
towards what is being said. The stage ‘outputs’ the pre-verbal message. Formulation
then takes the pre-verbal message, accesses the mental lexicon, and engages in clothing
the propositions to be expressed with language, first through lexis, and then through
syntactic encoding. Skehan (2009a) takes this first language model and applies it to
the second language case, particularly in relation to task-based performance. He discusses
four types of influence on performance: complexifying, pressuring (both of which
make demands on the processing system), and easing and focussing (which reduce
processing pressure, or direct it). He then relates findings from the task performance
literature to these four influences, and links them to stages within the Levelt model. In
this way, an attempt is being made to construct a theoretical base for the claims which
are made, and also to link task performance to the psycholinguistic processes second
language speakers engage in when speaking. Such a theoretical base is also important
in combating the claim (Robinson 2007) that the Trade-off Hypothesis is vacuous, and
only makes post-hoc claims rather than predicting where trade-off effects will occur.
The contrasting approach is represented by Robinson’s Cognition Hypoth-
esis (Robinson 2001a; Robinson & Gilabert 2007). This hypothesis explicitly rejects
notions of attentional limitations, and proposes that tasks should be designed and
sequenced on the basis of gradual increases in cognitive complexity. A triadic compo-
nential framework is proposed, embracing task features, which impact on task com-
plexity, task conditions, which draw on interactive factors, comprising participation
and participant variables; and learner factors, both affective and ability, which are
important for task difficulty (being defined in reference to the language learner rather
than task features). For our present purposes task complexity factors are most relevant;
the other two areas will not be pursued here. As for task complexity, a distinction is
made between resource-directing factors and resource-dispersing factors. The former
category contains sub-headings such as number of elements in the task, time perspec-
tive (Here-and-now vs. There-and-then), and reasoning demands. It is assumed that
more elements, there-and-then tasks, and more reasoning demands individually (and
presumably collectively) make a task more complex, and as a result raise the level of
accuracy and complexity. In other words, task complexity pushes up performance in
each of these general areas, which is in direct contrast with the default position of
the Trade-off Hypothesis. Indeed, Robinson (2005) makes the claim that increasing
complexity in each of these areas pushes speakers to produce particular aspects of
language, such as the use of articles, hence the resource-directing label. In contrast,
resource-dispersing influences, while increasing task complexity, do “not direct learn-
ers to any particular aspects of language code which can be used to meet the additional
task demands” (Robinson 2005: p. 7). This leads Robinson to note that resource-directing
factors will affect fluency negatively but accuracy and complexity positively.
In contrast, increasing task complexity through resource-dispersing factors will influ-
ence fluency, and accuracy and complexity, negatively (e.g. through lack of provision
of planning time, or through multiple tasks, or through the need to use unfamiliar
information).
Broadly there are two types of critique that one can raise against the Cognition
Hypothesis. First, there is the issue of evidence. The hypothesis has been around for
some time now, and its proponents have published a wide range of research studies.
However, on the whole, this research has not been outstandingly supportive of the
predictions, especially regarding the above-mentioned joint influence on accuracy and
complexity. Frequently one aspect of performance is raised, but not the other (see
Iwashita et al. 2001; Michel et al. 2007; Kuiken & Vedder 2007, 2008; Rahimpour 1997;
Robinson 1995; Gilabert 2007 regarding the positive effect of more complex tasks on
language accuracy; and Foster & Skehan 1999; Foster & Tavakoli 2009; Robinson 2007;
Tavakoli & Foster 2008, 2011 on language complexity); it is rare to see the prediction
of joint raising fulfilled. Occasionally this is the case (as in Ishikawa 2006), but these
results were obtained with written performance, which renders comparison with the
spoken language studies difficult. There are a few studies (e.g. Foster & Skehan 1999;
Tavakoli & Skehan 2005; Foster & Skehan 2013) where both accuracy and complexity
are higher. In view of these studies, Skehan (2009a) argues that it is insufficient, given
the Cognition Hypothesis, to demonstrate that accuracy and complexity are raised –
one must also demonstrate that, at the individual level, the two variables are correlated.
Otherwise there is the possibility that some learners may raise complexity, some may
raise accuracy, and the outcome will be significant gains for each, but this, then, would
not apply at the individual level. In the above three ‘favourable’ studies, for example,
the correlations between accuracy and complexity were not significant, which makes it
hard to defend that they provide any support for the Cognition Hypothesis.
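The point about individual-level correlation can be illustrated with a small Python sketch. It uses invented gain scores (not data from any of the studies cited) in which half of the learners raise accuracy and the other half raise complexity. Both group means rise, yet the two gains are negatively correlated across individuals. The function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical gains (complex task minus simple task) for six learners.
# Learners 1-3 gain mainly in accuracy, learners 4-6 mainly in complexity.
accuracy_gain = [0.10, 0.12, 0.11, 0.00, 0.01, -0.01]
complexity_gain = [0.00, -0.02, 0.01, 0.30, 0.28, 0.32]

print(mean(accuracy_gain) > 0, mean(complexity_gain) > 0)  # both means rise
print(pearson_r(accuracy_gain, complexity_gain) < 0)       # but r is negative
```

With data of this shape, a group-level analysis would report gains on both measures even though no individual learner raised both together, which is the situation the argument above warns against.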
But another type of critique can be raised with regard to the extent to which each
of the components of task complexity actually gives rise to higher complexity levels.
The fundamental claim is that resource-directing influences, through greater cognitive
complexity, push learners to higher accuracy and structural complexity levels. Take
Here-and-now vs. There-and-then, for instance. The interpretation favoured by the
Cognition Hypothesis is that the second condition is more complex (hence, the dif-
ferences in performance that are predicted). But one can ask why this would be the
case, and demand greater clarity on the exact meaning of cognitive complexity. Speak-
ing about Here-and-now is certainly easier with respect to the availability of infor-
mation to be communicated. But it is also less negotiable, and has more prominent
input which needs to be attended to unavoidably, factors which add to complexity.
Speaking about There-and-then is certainly more difficult regarding the lack of input
to be easily referred to, and it also makes memory demands. But it is potentially less
input-dominant and provides greater scope for negotiation, and the shaping of contri-
butions on the part of the speaker, since the stimulus material, the input provided, can
be responded to selectively, sometimes ignored, and more easily repackaged to make
it easier to handle. With regard to resource-dispersing influences the assumption is
made that the different categories of influence are unproblematic and work in predict-
able ways. This, however, is doubtful. For instance, planning, which is interpreted in
the Cognition Hypothesis as resource-dispersing, is claimed to produce lower perfor-
mance (if planning time is not available). But the research on planning raises argu-
ments against this interpretation. The literature suggests that planning does not have
equal effects on all aspects of performance, leading to stronger effects on complexity
and fluency, and smaller and less dependable effects on accuracy. There is also the
issue of how strategic and on-line planning interact (see Wang, this volume). In addi-
tion, evidence from qualitative studies of planning (Ortega 2005; Pang & Skehan this
volume) shows that planning consists of many processes and many goals. These include
planning-for-task-interpretation (leading to complexity?), planning-for-organisation
(leading to accuracy?), and planning-as-rehearsal (again leading to accuracy but also
fluency?). Planning cannot be treated as monolithic: its effects are subtler. So, all in all,
the categories which make up the Cognition Hypothesis are not without problems,
and require clearer construct definition.
These considerations motivated the authors to design a study which systematically
investigated typical Trade-off variables (structure and lexical demands) and a typi-
cal Cognition variable (time perspective) to explore the various questions raised by
these two models. Before we describe the study itself, we need to review the evidence
relating to these specific variables a bit further. As a starting point, task structure was
inferred to be important by Skehan and Foster (1997), but this was only post-hoc from
the results of Foster and Skehan (1996) and Skehan and Foster (1997). Skehan and
Foster designed a subsequent study to explore this influence (Skehan & Foster 1999),
broadly confirming the original results. Tavakoli and Skehan (2005), and Tavakoli and
Foster (2008) published subsequent research consistent with the claim that tasks that
contain clear macrostructure (and so ease Levelt’s Formulator stage), favour accuracy,
and fluency. However, the later studies have complexified the picture somewhat, since
structure has emerged as a generally favourable influence on performance, and in
some cases was even associated with greater complexity (Tavakoli & Skehan 2005). It
was therefore decided to use Structure as an independent variable in the present study.
The broad motivation is that, following the Trade-off Hypothesis, researchers try to
account for raised performance not through any need to posit greater task complexity,
but instead through the use of specific targeted variables such as structure which have
been shown to enhance performance in particular areas.
As for the time perspective (Here-and-now versus There-and-then), this has
featured in a number of Cognition Hypothesis studies. It has, however, mainly been
studied through map tasks. The Here-and-Now condition is implemented with a par-
ticipant describing a route on a map that is available and visible. The There-and-then
condition requires the participant to describe the route without the map being visible.
Predictions are made that, since the non-present condition is more cognitively com-
plex, it will push speakers to greater accuracy and complexity. This prediction is rarely
fulfilled, and the most typical result is that accuracy is raised, while complexity is not
(Gilabert 2007; Rahimpour 1997; Robinson 1995), a result which is inconsistent with
the Cognition Hypothesis.
As a matter of fact, the strength of the claims about time perspective would be
much greater if alternative operationalisations (other than through map tasks) of the
same construct were used, potentially generating consistent results. In the present study
we decided to create a condition that differed from the above-mentioned map task in
two ways. First, we used a video-based presentation. We felt a video, with actual char-
acters, would constitute a more involving challenge for our participants. In addition,
with a video narrative, engagement might potentially be richer, since causal links could
be worth commenting on, as might the motives of the characters in the video. But the
essential contrast – Here-and-now vs. There-and-then – would be maintained. In this
way, the Here-and-now condition is very clear and makes demands only on working
memory within processing. But the flow of input is considerable, which puts the speak-
er’s ability to plan and organise under considerable pressure. In contrast, the There-
and-then condition does not have the same pressure of heavy input. On the other hand,
memory demands are much greater and no new stimulus material is involved, or can
be referred to. The speaker can, though, ‘shape’ the narrative in whatever way is desired,
and so plan what is going to be said. Further details will follow in the Materials section.
Secondly, following up on previous studies (Skehan 2009b), we included lexical
demands as an additional variable, since this appears to have an impact on perfor-
mance across a range of task-based studies. Two aspects of this work are relevant.
First, Skehan (2009b), following Meara and Bell (2001), discusses the use of a statistic,
Lambda, which measures the extent to which, in the small texts typical of second
language task work, speakers use less frequent lexis. The procedure divides a text up
into ten-word chunks and then establishes how many words, in each ten word chunk,
are outside a certain frequency range. A statistic is then calculated, Lambda, which
uses a Poisson distribution (appropriate for infrequently occurring events) to capture
the extent to which less frequent words ‘penetrate’ the text. Second, Skehan (2009b)
discusses the extent to which the use of less frequent lexis has an impact on other
aspects of performance. He argues that greater use of less frequent lexis, prompted
by the demands of a particular task, is associated with lower complexity and accuracy
scores. He suggests that the need for second language speakers to access less frequent
language from a less well developed mental lexicon leads to disruption at the Formula-
tor stage in speech production. He offers the conclusion that tasks which do require
less frequent lexis can therefore have a damaging effect on task performance generally.
But this conclusion is based on post-hoc analyses of a range of studies. It was not the
result of research design. For that reason, the variable of lexical demands is included
in the present study in a more systematic way to explore its effects.
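The chunk-and-count procedure behind Lambda, as described above, can be illustrated with a minimal sketch. It is not the PLex program itself. The frequent-word list is a hypothetical placeholder, and estimating Lambda as the mean count per chunk (the maximum-likelihood Poisson rate) is an assumption made here for illustration; the text does not say how PLex fits the value.

```python
import math

def infrequent_counts(words, frequent_words, chunk_size=10):
    """For each complete chunk of chunk_size words, count the words
    that fall outside the (hypothetical) frequent-word list."""
    starts = range(0, len(words) - chunk_size + 1, chunk_size)
    return [
        sum(1 for w in words[i:i + chunk_size] if w.lower() not in frequent_words)
        for i in starts
    ]

def estimate_lambda(counts):
    """Assumed estimator: the Poisson maximum-likelihood rate,
    i.e. the mean number of infrequent words per chunk."""
    return sum(counts) / len(counts)

def poisson_pmf(k, lam):
    """Probability of exactly k infrequent words in a chunk under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Illustrative 20-word text and an invented frequent-word list.
TEXT = ("the sheep ran to the barn and the farmer chased "
        "it with a rusty pitchfork while the dog barked loudly").split()
FREQUENT = {"the", "sheep", "ran", "to", "barn", "and", "farmer",
            "it", "with", "a", "while", "dog"}

counts = infrequent_counts(TEXT, FREQUENT)
lam = estimate_lambda(counts)
print(counts, lam, round(poisson_pmf(0, lam), 3))
```

A higher value indicates that less frequent words "penetrate" more of the ten-word chunks; comparing observed chunk counts with `poisson_pmf` gives a rough check on how well the Poisson assumption fits a given text.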
This background leads to four general research questions, and their associated
hypotheses. The research questions are:
Research Question 1: What will be the effect of time perspective (Here-and-now versus
There-and-then) on complexity and accuracy?
Research Question 2: How will task structure affect accuracy and fluency?
Research Question 3: How will the use of less frequent lexis affect accuracy and
complexity?
Research Question 4: How will time perspective, task structure, and frequency of lexis
interact?
The specific hypotheses are as follows (and see also Table 1):
Hypothesis One: There-and-then tasks will raise complexity, but not accuracy. This
follows from the analysis presented earlier regarding the differences between the
There-and-then and Here-and-now conditions, and their implications for the psy-
cholinguistics of processing. Essentially, the There-and-then condition, since it allows
‘repackaging’ of content, will give speakers more opportunity to express ideas more
densely and to bring out connections between events and to clarify the motives of
the participants in the narratives. Interestingly here, the Cognition Hypothesis should
predict that both complexity and accuracy will be raised, even though Cognition
Hypothesis motivated research has tended to report an accuracy effect only. In fact, it
is further assumed that the There-and-then condition has no influence on accuracy,
since it is proposed here that There-and-then is not a more complex condition, merely
a different condition, characterised by different processing demands.
Hypothesis Two: Task structure will raise accuracy and fluency in performance. This
hypothesis follows from the review of previous studies which have used this variable
to explore task-based performance. The hypothesis is exploratory regarding complex-
ity, since previous results have been inconsistent. Tentatively, we can hypothesize that
there will be an increase in this area.
Hypothesis Three: Tasks which lead to the use of less frequent lexis will show lower
scores for accuracy and complexity. The motivation here derives from Skehan (2009b),
which argues (on the basis of a post-hoc interpretation of findings) that such tasks lead
to lowered performance in these areas. The present study is more systematic with the
use of tasks intended to provoke different frequencies of lexical items.
Hypothesis Four: The variable of time perspective will show interactions with task
structure and lexical frequency, and produce stronger effects, specifically more com-
plex and accurate performance for structured tasks performed under the There-
and-then condition and tasks with easy lexis performed under the There-and-then
condition as well. It is argued that the There-and-then condition, despite posing
different memory demands, contains less demanding processing conditions, so that
there is more likelihood that additional variables such as structure and lexis can have
an impact. The impact of these variables is predicted to be more muted under the
Here-and-now condition because of the pressure of input that is involved. There is
also the point here that more structured tasks, because of their greater structure, will
enable memory pressures to be lessened because of the more organised nature of the
narratives to be told.
[Table 1. Summary of hypotheses; surviving column headings: C, A, F, L. Table body not recovered.]
The various hypotheses, summarized in Table 1, and now looked at more gener-
ally, propose that different individual variables, motivated by psycholinguistic pro-
cessing concerns, will account for the results that will be obtained. Main effects are
important, but so are interactive, or conjoint effects: the one prediction from Hypoth-
esis Four, that there will be an effect on both complexity and accuracy, is derived from
such interactive effects on performance. In other words, it is not proposed that it is
necessary to hypothesise greater task complexity to achieve these results (as the Cog-
nition Hypothesis, in contrast, would claim); instead they are claimed to be the result
of the interplay of processing factors.
Methods
This study examines task complexity along the three dimensions reviewed above: time
perspective (Here-and-now versus There-and-then), lexical difficulty (easy versus
difficult vocabulary), and task structure (structured versus unstructured task). To
allow comparisons within and across these factors, it uses a 2 × 2 × 2 research
design, with lexical difficulty and task structure as within-subject factors and
time perspective as a between-subject factor.
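The layout of this mixed design can be sketched as follows. The code is only an illustration of how the eight cells divide between the between-subject and within-subject factors; the variable and function names are hypothetical and not taken from the study.

```python
from itertools import product

TIME_PERSPECTIVE = ("Here-and-now", "There-and-then")  # between-subject factor
LEXICAL_DIFFICULTY = ("easy", "difficult")             # within-subject factor
TASK_STRUCTURE = ("structured", "unstructured")        # within-subject factor

def design_cells():
    """All cells of the 2 x 2 x 2 design."""
    return list(product(TIME_PERSPECTIVE, LEXICAL_DIFFICULTY, TASK_STRUCTURE))

def tasks_for(group):
    """The within-subject cells completed by one participant in a given group."""
    return [(lex, struct) for tp, lex, struct in design_cells() if tp == group]

for group in TIME_PERSPECTIVE:
    print(group, tasks_for(group))
```

Each participant belongs to one time-perspective group only but performs all four combinations of lexical difficulty and task structure, so every participant contributes four narratives.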
Participants
Participants were 72 Chinese L1 (Mandarin) speakers who were learning English
as a second language, with slightly more female than male participants. They were
recruited from a major university in Hong Kong. Most (71%) participants were year 1
students; 14% were year 2, 5% were year 3, and 3% were year 4 students. They
participated voluntarily and were compensated for their time. Student consent
forms were collected during the study.
the There-and-Then group had a mean score of 37.72 (.69) in the pre-test. A between-
subjects t-test showed no significant difference between the groups.
Identifying the vocabulary demands of the tasks was more complex. Two proce-
dures were followed, and the tasks which were used were the ones that survived these
procedures. First, groups of experienced teachers viewed candidate videos and rated
them for perceived vocabulary demands. On this basis two groups of five videos each
were identified which differed clearly in rated vocabulary demands. At a second stage,
these videos were described by several people, native as well as non-native speakers,
and these performances were assessed by means of a version of Meara’s PLex computer
program (Meara & Bell 2001). The program outputs a statistic for each performance,
Lambda, which captures the use in the performance of less frequent vocabulary (see
Skehan 2009b for discussion). The videos which were chosen were identified through
their contrasting Lambda means, which differed markedly: Tooth Fairy and Fetching
provoked use of more frequent vocabulary (and therefore avoidance of less frequent
vocabulary), and Bathtime and Off the Baa! were associated with higher Lambda
scores, and a reliance on less frequent vocabulary.
Procedure
One of the researchers collected the data in a language lab during a one-on-one meet-
ing with each of the participants. Participants were asked to take an English profi-
ciency pre-test first. Next, a training session was conducted to ensure that participants
understood what they had to do, and were familiar with the Shaun the Sheep series in
general terms and the characters it contains. There was instruction and explanation,
with examples (see Appendix 1). In addition there was a brief oral trial run to make
sure that participants were familiar with speaking in these circumstances. Then partic-
ipants in each group performed narratives of the four Shaun the Sheep videos as a main
session. The four videos were selected according to the combination of task structure
and vocabulary difficulty. Table 2 presents the operationalization of the main session.
To avoid practice effects, the administration order of these four videos was arranged
according to a Latin Square design which was pre-programmed into the computer.
During the main session, both the Here-and-now and There-and-then groups narrated
stories facing the computer screen to make the two performance conditions as similar
as possible. Participants were also told that they were narrating to someone who had
not watched the movie so as to create imagined listeners for the task. Finally, the par-
ticipants filled out a questionnaire and had a short interview with the researcher. The
whole data collection process took approximately 1.5 hours.
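The counterbalancing of video order can be illustrated with a small sketch. The chapter does not say which Latin Square was programmed into the computer, so the cyclic square below is an assumption, and the video labels are hypothetical placeholders rather than the actual titles.

```python
VIDEOS = ["video_1", "video_2", "video_3", "video_4"]  # placeholder labels

def cyclic_latin_square(items):
    """Build an n x n square in which each row rotates the previous row by one."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

def order_for(participant_index, items=VIDEOS):
    """Assign a participant to a row of the square by cycling through the rows."""
    square = cyclic_latin_square(items)
    return square[participant_index % len(items)]

def is_latin_square(square):
    """Check that every item occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

for participant in range(4):
    print(participant, order_for(participant))
```

With this arrangement each video appears equally often in each serial position across every block of four participants, which is the property that guards against order and practice effects. A cyclic square does not balance which video immediately precedes which; a balanced (Williams) square would be needed for that.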
Measures
The combination of CLAN and TaskProfile software meant that a wide range of mea-
sures were used in this study. These are shown in Table 3.
. The authors would like to thank Cai Jing, Gavin Bui, Christina Li, and Ren Hongtao for this
work.
Results
This section presents the results of the study. After some preliminary discussion of
the measures to be included, descriptive information will be presented first, followed
by the MANOVA results and then the subsequent univariate analyses.
The data collected in this study are part of an ongoing exploratory attempt
to establish the structure of second language performance, and to research which
measures best capture the different dimensions which have been identified. To that
end, the data were subjected to a series of factor analyses, for the four different
tasks which were used. In all analyses, clear factors emerged for accuracy and
complexity. The measures which showed highest typical loadings were the index
of error-free clauses and the measure of subordination per AS-unit respectively.
Accordingly, these will be included in the main statistical procedures. In addition,
lexical sophistication, indexed by the value for Lambda, will also be included since
it is fundamental to the research design. This leaves the major area of fluency, for
which a wide range of indices were available. Separate factor analyses of this area
suggested three sub-fluency dimensions: end of clause pausing (standardised to
100 words), mid-clause pausing (standardised similarly), and repair, best indexed
in the present case by number of false starts per 100 words. Interestingly, a speed
fluency factor did not emerge. Equally interestingly, the location of pausing gener-
ated separate factors. It appears that the influences on pausing at the end of a clause
are not quite the same as those which are concerned with pausing within a clause.
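On the basis of the factor analysis, therefore, the most useful dependent variables
to include are: the index of error-free clauses (accuracy); subordination per AS-unit
(complexity); Lambda (lexical sophistication); end-of-clause pauses per 100 words;
mid-clause pauses per 100 words; and false starts per 100 words (repair).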
Descriptive statistics for the data are presented in Table 4. The table shows mean
scores, standard deviations, and N sizes for the two groups from the between-subjects
condition, with scores for the four tasks on each variable used.
Given the presence of six dependent variables, as well as a complex three-
way design with both between and within factors, the first step was to conduct a
MANOVA. The initial MANOVA showed clear significance (Pillai’s trace: p < .001).
In addition, tests of sphericity were acceptable. Accordingly, it was permissible to
proceed to the univariate tests. Regarding the between-subject condition of time
perspective, the results for those measures which attained significance are shown
in Table 5.
The significant results are for complexity, with means of 1.28 (Here-and-now)
versus 1.69 (There-and-then); AS boundary pausing (9.09 pauses per 100 words
versus 4.38 pauses); and false starts, as an index of repair (0.87 false starts per
100 words for Here-and-now versus 1.33 for There-and-then). In other words, the
There-and-Then condition produced greater structural complexity, with a very strong
effect. There was also a significant difference with AS-boundary pausing, but not
with mid-clause pausing, suggesting that the difference in pausing between these
two conditions is located at the point where pausing can be considered to be more
‘appropriate’ (Skehan 2009b). Finally, the There-and-then condition produced more
repair, although the absolute values here were not very great.
Next, we turn to the within-subject effects and those for interactions. Once again,
only the significant results are shown. (As it happens, there were no results close to
significance. Results were either clearly significant or clearly non-significant here.) The
relevant results are shown in Table 6.
Structure has a major impact on complexity, with means of 1.51 (Structured tasks)
versus 1.37 (Unstructured tasks), and end-of-AS unit pausing, with means of 6.96 for
Structured tasks versus 7.53 for Unstructured tasks, suggesting that the Structure con-
dition produces much greater subordination, and also that the amount of pausing at
AS Unit boundaries is much reduced. Speakers in this condition appear to produce
much denser language, with complex organisation within propositions. They also
manage to organize this discourse effectively, with fewer pauses at the end of clauses.
In addition, there is an accuracy effect, although this is not so strong (with means of
0.52, the proportion of error free clauses for Structured tasks vs 0.50 for Unstructured
tasks). In other words, the structure condition does support less error (or to put this
another way, the unstructured condition provokes more error), suggesting that a pro-
cessing condition is able to raise both aspects of form. We will return to this below.
However, the correlations between accuracy and complexity for the four tasks do not
provide strong evidence that this was sustained at the individual level, with two posi-
tive and two negative correlations between accuracy and complexity, none of which
reached significance.
The findings for Lambda, as a measure of lexical sophistication, are shown in
Table 7. As before, only variables with significant effects are shown. There was a signifi-
cant difference between the putatively high vocabulary demand and low vocabulary
demand conditions (Table 7), confirming that this condition did have the impact on
performance that the experimental design was expected to produce. In other words,
the two ‘hard vocabulary’ videos, Bathtime and Off the Baa!, did lead to the use of
less frequent vocabulary (with a mean Lambda of 1.87), while
the other two videos, Tooth Fairy and Fetching, were associated with the use of more
frequent vocabulary (with a mean Lambda of 1.55). (In passing, it could be noted that
there was no difference in D, as an index of lexical diversity, confirming the results
reported in Skehan 2009b).
There are also significant effects of vocabulary on other variables, specifically
mid-clause pausing (with significantly more mid-clause pausing for hard vocabulary
tasks) and false starts (with significantly more false starts for hard vocabulary
tasks too). In other words, the condition which provoked the use of less frequent
words was associated with more breakdown in the middle of a clause, and also the
need to use more repair. There was no significant result for end-of-AS unit pausing;
the effect appeared only for pausing where specific lexical choices might be an issue.
Equally interesting is the lack of any significant effect on complexity and accuracy. It does
appear that using less frequent vocabulary had an impact on performance, but this
was related to mid-clause pausing, rather than disrupting structure-building or influ-
encing accuracy. This conflicts with the results reported in Skehan (2009b), although
this might be considered a favourable outcome from a pedagogic perspective, since
it implies that scrutinising teaching tasks for lexical demands may not be as vital as
Skehan (2009b) proposed.
Finally, we turn to interactions. The results are presented in Table 8, and val-
ues are given for Structure-by-Condition, Vocabulary Difficulty-by-Condition, and
Structure-by-Vocabulary Difficulty.
Table 8. Interaction effects for time perspective, structure, and lexical difficulty
[Table body not recovered. Surviving row and column labels: Structure-by-time perspective; Complexity; Accuracy; Lexis; Lexical sophistication; HnN; TnT; Structure-by-vocabulary difficulty; False starts.]
variable of vocabulary, the There-and-then condition leads to the use of an even greater
number of less frequent words. So it accentuates the effect of vocabulary demands. Finally there
is one significant interaction result involving a fluency measure – that of false starts as
an index of repair, and here the relevant variables at work are structure and vocabu-
lary difficulty. Harder vocabulary and an unstructured task lead to a particularly great
amount of repair, while the easy vocabulary unstructured task leads to least repair. The
strength of this effect, though, is not great.
Discussion
Given the length and complexity of the Results section, it may be worth restating the
main findings first:
First, we will discuss the significant main effects. The There-and-then condition pro-
duced more complexity, less pausing at AS boundaries, and more repair; however,
there was no increase in accuracy. So once again, regarding the Cognition Hypothesis,
there are mixed findings. Complexity was higher, as predicted, as was fluency (oppo-
site to the Cognition Hypothesis prediction) and accuracy was unaffected (again, not
consistent with Cognition Hypothesis predictions). In some ways, it is easier to start
with the Here-and-now condition, which did not produce elevated performance in
any way. As discussed earlier, the Here-and-now condition may have the advantage
of presence of material to be described (and so lack of memory burden), but it has
considerable disadvantages. Most important are the dominance of the input and the
way this input, if all attended to, is remorseless while the video is running. The major
consequence of this is that the speaker has little time to repackage ideas, or to be selec-
tive as to what will be said. They are forced to maintain a descriptive immediate level
without an opportunity to shape their contribution. In the There-and-then condition,
in contrast, memory demands may be higher, but the speaker has the opportunity to
be selective, to organise what is being said, and even to indicate causality and character
intentions. This is not because the task (i.e. the video narrative) is more complex: it is
simply because it is crucially different. As a consequence, there is very clearly more
complex language, whether indexed by the AS-based measure of subordination or by
the number of words in clauses. We would argue, in other words, that the results for the
time perspective comparison do not provide support for the Cognition Hypothesis. In
contrast, they are consistent with pressures of psycholinguistic processing, and with
the Trade-off Hypothesis. The Here-and-now condition, in other words, deprives the
Conceptualiser of potential depth, while the There-and-then condition does enable it
to work, and to do better, despite memory limitations.
Lexical difficulty impacts upon three variables across the board: vocabulary itself,
mid-clause pausing, and false starts. This is an interesting effect. Fluency has many
sub-dimensions (breakdown fluency, repair fluency, speed, automatisation: Skehan
2009b). It seems the effect of using less frequent lexis is to disrupt processing mid-
clause, when unexpected problems of lexical selection may be thrust upon the speaker.
Not surprisingly, perhaps, repair is also associated with the engagement of more chal-
lenging lexis. In other words, the need to access less frequent words from the second
language lexicon disrupts automatisation in performance. Retrieving such words, and
the important information contained in lemmas that enables syntax building, has a
price, and this is most clearly reflected in the extent to which language is produced
automatically. (Length of run and overall speed were also significantly affected.)
The effects of structure are also very clear. The two structured tasks lead to higher
complexity and lower end-of-clause pausing, as well as greater accuracy. Both aspects
of form are affected, in fact, together with greater fluency, specifically with regard to
‘normal’ pausing. This does suggest relatively smooth processing, and a capacity to
approach more parallel processing.
As interesting as these main effects may be, the interactions are even more so.
First, there is the interaction between Structure and Time Perspective. Tasks which
are both structured and There-and-then elicit language which is more complex and
more accurate. In other words, under the less demanding processing conditions of
There-and-Then production, and if the task eases Formulator operations by providing
a clear macrostructure, it appears that second language speakers have more attention
available for all aspects of form. Greater lexical and structural complexity, and error
avoidance are the outcome of supportive conditions. Many second language speakers
may want to avoid error, but many things may get in the way. It appears that Here-
and-Now processing or unstructured narratives each got in the way in this particular
study. If neither is operative, the surface of language can be given more attention and
error is reduced. It is interesting that while There-and-then and the structured condi-
tion, as main effects, impact upon complexity, they also have a synergistic, interactive
effect. It is possible that, as in Wang’s study (this volume) with her supported on-line
planning condition, the Conceptualiser and Formulator work in harmony for second
language speakers.
In addition, the correlations between structural complexity and accuracy (a rela-
tionship of some significance to the Cognition Hypothesis) are quite intriguing. The
correlations for the two unstructured tasks are close to zero, suggesting independence
of the two performance areas. For the structured tasks, under the There-and-then
condition, the correlations are 0.31 for Bathtime (difficult vocabulary) and 0.42 for
Tooth Fairy (easier vocabulary), the latter one being significant at the 0.05 level. The
correlation may not be particularly high, accounting for only 16% of the variance,
but this is the first time that a joint effect has been demonstrated plus a correlation at
the individual level. We still argue that these results are the consequence of separate
variables working together. We do not see, for example, why time perspective linked to
structure should lead to greater task complexity, even though we accept that Cognition
Hypothesis supporters might offer a different interpretation.
Equally interesting is the interaction effect between lexical difficulty and time per-
spective, even if this is quite limited. More words of lower frequency are used in the
There-and-then condition, suggesting that the lack of Here-and-now time pressure,
together with the ability to repackage ideas, creates sufficient attentional capacity for
wider lexical retrieval. The greater flexibility in this condition gives second language
speakers greater opportunity to search for less obvious lexical choices. There is also an
interaction between structure and vocabulary, but only for the dependent variable of
repair. The focus for this interaction is a slightly raised repair mean score for the dif-
ficult vocabulary, unstructured combination (i.e. the most challenging combination).
It appears that this combination pushes learners into a greater need to modify the lan-
guage they have produced. This effect though does not impact on any other variables,
such as mid-clause pausing or accuracy.
We now need to look at the results more generally, to determine what these patterns
reveal about the nature of second language speaking. The alternative models we have
considered contrast a viewpoint that task complexity, free of attentional limitations,
drives performance (the Cognition Hypothesis), and a viewpoint that certain influ-
ences on psycholinguistic processes (e.g. through task features and task conditions),
and subject to attentional constraints, lead to systematic differences in performance
(the Trade-off account). In general, the results of the study do not sit well with the Cog-
nition Hypothesis. The Cognition Hypothesis prediction of greater complexity with the
There-and-then condition is fulfilled, but the prediction of accuracy is not, and worse,
the results for fluency are the opposite of prediction. There-and-then conditions elicited
more false starts in learners’ speech performance – in a sense, a less fluent accomplish-
ment, and the opposite to the higher fluency prediction of the Cognition Hypothesis.
In addition, the analysis provided earlier suggests that it is by no means obvious that
the There-and-then condition produces a more complex task. It has been argued that it
is, simply, different, and the contrasts in performance are associated with the difference
rather than greater task complexity (as the next paragraph makes clear). The other vari-
ables in play are not so central to the Cognition Hypothesis. Task Structure has come
to be included in more recent accounts of the Cognition Hypothesis as a resource-
dispersing variable (Robinson 2011), but in a way which is not seen as integrally linked
with task complexity and with facilitating specific form-function mappings. This vari-
able does not straightforwardly plug into clear predictions. Perhaps the role of less fre-
quent lexis would be interpreted as increasing task complexity, in which case it should
raise language complexity and accuracy as the Cognition Hypothesis would predict. If
this is the case, the results of this study are not supportive, since the condition provok-
ing use of less frequent lexis has no impact on complexity and accuracy scores.
A Trade-off interpretation of the differences between Here-and-now and There-
and-then emphasizes different aspects. First, there is the impact of input pressure in
a Here-and-Now condition, since the amount of material which has to be handled,
understood, processed, and expressed is considerable. The input keeps coming, and
so momentary attempts on the speaker’s part to encode things through language are
put under pressure by newly incoming input. Just as one set of propositions may be
assembled, a new set of pressures arrives. These factors have considerable potential for
disruption. In contrast, in the There-and-then condition, there is no immediate pres-
sure from incoming input. The speaker can be selective and choose to encode whatever
they like. In this way, they can, possibly, orient the story to their own strengths and
their linguistic knowledge. They can also repackage the material from the video and
even make links between different sections. They can interpret motives and focus
more clearly on the point of what is happening. (In passing, it should be noted that
there was no difference in overall number of words between the two conditions, and
speakers did, in both conditions, generally try to do justice to the stories concerned.
Neither condition was clearly worse than the other in general ‘narrative quality’.) So,
the difference in burden of input processing was marked. The second issue is memory.
In a sense, the Here-and-Now condition makes fewer demands on memory in general
(although working memory operations are intense), since what is narrated reflects
what is immediately shown on screen. In contrast, the There-and-then condition is
demanding in a different way, because there is nothing to refer to; yet a six-minute
video has to be narrated. The story has to be kept in mind while the retelling proceeds.
This makes demands, but it is not clear how severe these demands are; in addition,
structured narratives by definition are organised, which may facilitate the retelling.
The Cognition Hypothesis sees the memory demands as the crucial aspect that pushes
for greater task complexity. From the perspective of the Trade-off Hypothesis this is
less obvious, because this hypothesis attaches greater importance to the (damaging)
processing demands of the Here-and-now condition.
A Trade-off interpretation is fairly clear when it comes to the other two variables,
structure and less frequent lexis. A general interpretation for structure effects (e.g.
Skehan & Foster 1999; Tavakoli & Skehan 2005) is that the speaker’s capacity to rely
on knowledge of macrostructure removes the need to engage in ‘broad brush’ planning
during performance, since the overall shape of a story is known. As a result, attention is
available to focus on the surface features of language. In other words, a significant part of
what the Conceptualiser component has to do is clear, straightforward, and undemand-
ing, so that Formulator concerns can be prioritized. Interestingly, in the present study,
the general results are modified in two ways. First, there is a main effect of complex-
ity (and some measures of fluency). It appears that participants have responded to the
potential of structured tasks to indicate more complex relationships by using more com-
plex subordination. Second, the accuracy effect only occurs in the There-and-then con-
dition. It appears to be the case that Structure can also have an organising effect which
facilitates language complexity. But where accuracy is concerned, it seems that the input
dominance of the Here-and-now condition washes out any structure effect. For struc-
ture to enhance accuracy, minimum attentional conditions must be able to operate, and
the There-and-then condition provides these, because there is space for the speaker to
use organisation and planning. This is largely a Formulator-based explanation, but it is
clear that structure is not a powerful enough variable to work in all conditions.
The need to use less frequent lexis has a similar Formulator-based explanation.
The Formulator stage in speech production has lexical and then syntactic phases,
where the lemma retrieval from the mental lexicon drives the building of syntactic
frames, assuming rich information (beyond simple word meaning) is available in the
lemma. Consistent with Skehan’s (2009b) post-hoc suggestions from previous work,
the need to access less organised, less robust, and less elaborate lemma information
derails performance, reducing mid-clause fluency, and leading to more repair. The
use of easier vocabulary, at least in the sense of more frequent vocabulary, enables
smoother processing, and perhaps a greater approximation, on the second language
speaker’s part, to parallel processing as opposed to the need to engage in more serial,
effortful processing.
Skehan (2009a) offers an account of second language task-based spoken perfor-
mance organised in terms of the Levelt model of first language speaking. The account
is based on the evidence which has accumulated through the range of task-based
research over the last twenty-five years. The Conceptualiser, Formulator-Lexis and
Formulator-Syntax stages are the ‘spine’ of this account; the various influences are cat-
egorised as complexifying (in that they provoke the use of more complex language);
pressuring (in that they create more demanding processing conditions); easing (which
is essentially the reverse of pressuring); and focussing (in that accuracy is selectively
given greater priority in attention). We can pursue this approach here to try to inte-
grate the findings from the present study into this account.
What is particularly interesting is that two of the variables, time perspective and
structure, each appear twice in Figure 1, where the variables in question are
italicised. We will deal with these first in the discussion.
Figure 1. The impact of time perspective, structure, and lexical frequency on second language
spoken performance (figure not reproduced; surviving labels: There-and-Then, Structure, Conceptualiser)
In contrast, there are two pressuring influences. The first is the need to use less
frequent lexis. This impacts upon the Formulator at the lemma access stage. Effec-
tive, automatised, parallel communication is disrupted when lexical items are needed
which are not instantly available. When these are encountered, the speaker has to do
something to solve a processing problem, and the need for attentional resources to
do this has an impact on the ongoing processing. This is where what would ideally be
parallel processing becomes serial in nature (Kormos 2006). Since communication
has to continue as smoothly as possible, this disruption has a major impact on the
cycles of Conceptualisation, Formulation, and Articulation which normally proceed
in parallel. Instead of a modular system working effectively, particular stages (in
this case the lemma retrieval stage) interfere with ongoing processing. The second
pressuring influence is the need to engage in Here-and-now processing. In this case,
at least in the context of a video-based narrative retelling, the quantity of incom-
ing (non-verbal) input which has to be encoded is considerable, and the (limited)
second language speaker has difficulty in analysing this input, extracting what is
important, and then formulating a response quickly while more input is arriving.
The result is that performance is disrupted (cf. the large phonation time differences,
the large average pause time differences) as the pressures of communication become
too great.
Happily, at least, not all the influences in the present study are so demanding.
Structure has an easing role at the Formulator stage, for reasons given earlier. The
clarity provided by structure eases Formulator operations. The same applies to more
frequent lexis. This too provides accessible material during speaking, so that the sec-
ond language speaker is more likely to be able to sustain parallel processing, at least
some of the time.
Conclusions
But there are also some pedagogic implications. In general, the results add to our
understanding of how task choice, based on, for example, degree of structure and nature
of lexical content, and task conditions, such as time perspective, have systematic
relationships with different performance areas, such as language complexity, accuracy, and
fluency. Hence, if teachers wish to promote one of these areas particularly, then results
such as those reported here can make a contribution. There are, though, some specific
areas which are worth highlighting. The results of this study, while broadly supporting
previous research regarding the way structure promotes accuracy, also suggest that
structure may have some contribution to make to language complexity. This suggests
that structure can function similarly to information integration (Tavakoli & Skehan
2005; Tavakoli & Foster 2009) under certain conditions, making it a very useful tool
for teachers wishing to build a focus on form into their communicative activities.
Time perspective also has some interesting implications. It would appear that where
video-based material is concerned, there is little to be said for the Here-and-now condi-
tion. This seems generally to create conditions which pose problems and depress levels
of achievement. The There-and-then condition seems much more supportive of peda-
gogic work, at least where the intention is that learners should be supported in gaining
control over newer or less salient language. Correspondingly, greater understanding
of what constitutes pressure enables more effective pedagogic decision-making where
the intention is to help learners mobilise their abilities to cope with pressure.
Finally, the results for lexical frequency might have the greatest relevance for peda-
gogic decision-making. If we assume that less frequent lexis is more difficult lexis, then
tasks which draw upon such lexis create difficulties for learners. They seem to lead to
lower levels of automatisation. In interaction with the There-and-then condition, they
also create problems for accuracy. If it is possible to predict which tasks will draw upon
difficult lexis, there is the possibility of pre-teaching such lexis so that when the task is
done, the lexis will become more easily available and the task can run more smoothly. To
put this another way, with tasks in classrooms and with tasks in research, it seems very
important to investigate the lexical demands that will be made. In classrooms, overly dif-
ficult lexical demands might compromise the pedagogic usefulness of tasks. In research,
overly difficult lexis (which is unidentified as such) in a particular experimental condition
can introduce unwanted variance, compromising interpretations of any research results.
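One way of investigating the lexical demands of a task in advance can be sketched as follows. This is only an illustration, not the procedure used in the present study: the function name `infrequent_words`, the frequency-rank dictionary, and the cut-off of rank 2000 are all assumptions introduced here, and the ranks in the usage note are invented toy values rather than figures from any real corpus.

```python
from typing import Dict, Iterable, List


def infrequent_words(task_words: Iterable[str],
                     freq_rank: Dict[str, int],
                     max_rank: int = 2000) -> List[str]:
    """Return words a task is likely to need that fall outside a frequency band.

    freq_rank maps a word to its rank in some frequency list (1 = most
    frequent). Both the list and the max_rank cut-off are assumptions;
    words missing from the list are treated as infrequent.
    """
    flagged: List[str] = []
    for word in task_words:
        key = word.lower()
        rank = freq_rank.get(key)  # None if the word is not in the list
        if (rank is None or rank > max_rank) and key not in flagged:
            flagged.append(key)
    return flagged
```

Given a toy rank list such as `{"room": 300, "call": 250, "friends": 400, "tractor": 5200}` and the expected task words `room`, `call`, `friends`, `tractor`, and `scarecrow`, the function would flag `tractor` and `scarecrow` as candidates for pre-teaching, or, in a research design, as items to balance across experimental conditions.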
Acknowledgments
This work was supported in part by The Research Grants Council, Hong Kong (grant
number 450307). The authors would like to thank Martin Bygate, John Norris, and
Kris Van den Branden for their helpful comments on an early draft of this chapter.
References
Brown, G., Anderson, A., Shilcock, R., & Yule, G. (1984). Teaching talk: Strategies for production and
assessment. Cambridge: CUP.
Bygate, M. (2001). Effects of task repetition on the structure and control of oral language. In
M. Bygate, P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks: Second language learning,
teaching, and testing (pp. 23–48). Harlow: Longman.
Ellis, R. (2003). Task-based language learning and teaching. Oxford: OUP.
Ellis, R. (2009). The differential effects of three types of task planning on the fluency, complexity and
accuracy in L2 oral production. Applied Linguistics, 30, 474–509.
Foster, P., & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18, 299–324.
Foster, P., & Skehan, P. (1999). The influence of source of planning and focus of planning on task-
based performance. Language Teaching Research, 3, 185–214.
Foster, P., & Skehan, P. (2013). Anticipating a post-task activity: The effects on accuracy, complexity
and fluency of L2 language performance. Canadian Modern Language Review, 69(3), 249–273.
Foster, P., & Tavakoli, P. (2009). Native speakers and task performance: Comparing effects on com-
plexity, fluency and lexical diversity. Language Learning, 59(4), 886–896.
Foster, P., & Tavakoli, P. (2011). Task design and second language performance: The effect of
narrative type on learner output. Language Learning, 61(Suppl. 1), 37–72.
Foster, P., Tonkyn, A., & Wigglesworth, J. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21(3), 354–375.
Gilabert, R. (2007). Effects of manipulating task complexity on self-repairs during L2 oral produc-
tion. International Review of Applied Linguistics, 45, 215–240.
Hinkel, E. (2004). TOEFL test strategies with Practice Tests (3rd ed.). Hauppauge, NY: Barron’s.
Housen, A., & Kuiken, F. (2009). Complexity, accuracy, and fluency in second language acquisition.
Applied Linguistics, 30(4), 461–473.
Ishikawa, T. (2006). The effects of task complexity and language proficiency on task-based language
performance. The Journal of Asia TEFL, 3(4), 193–225.
Iwashita, N., McNamara, T., & Elder, C. (2001). Can we predict task difficulty in an oral proficiency
test? Exploring the potential of an information-processing approach to task design. Language
Learning, 51(3), 401–436.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Kuiken, F., & Vedder, I. (2007). Task complexity and measures of linguistic performance in L2 writ-
ing. International Review of Applied Linguistics, 45(3), 261–284.
Kuiken, F., & Vedder, I. (2008). Cognitive task complexity and written output in Italian and French
as a foreign language. Journal of Second Language Writing, 17, 48–60.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: The MIT Press.
MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk (3rd ed.). Mahwah, NJ:
Lawrence Erlbaum Associates.
Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews
using a new measure of lexical diversity. Language Testing, 19, 85–104.
Meara, P., & Bell, H. (2001). P_Lex: A simple and effective way of describing the lexical characteris-
tics of short L2 texts. Prospect, 16, 5–19.
Michel, M., Kuiken, F., & Vedder, I. (2007). The influence of complexity in monologic versus dialogic
tasks in Dutch L2. International Review of Applied Linguistics, 45, 241–259.
Ortega, L. (1999). Planning and focus on form in L2 oral performance. Studies in Second Language
Acquisition, 21, 109–148.
Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam:
John Benjamins.
Rahimpour, M. (1997). Task complexity, task condition, and variation in L2 oral discourse. Unpub-
lished Ph.D. thesis. University of Queensland, Australia.
Richards, B.J., & Malvern, D.D. (1998). A new research tool: Mathematical modelling in the measure-
ment of vocabulary diversity (Award reference no. R000221995). Final Report to the Economic
and Social Research Council, Swindon, UK.
Robinson, P. (1995). Task complexity and second language narrative discourse. Language Learning,
45, 99–140.
Robinson, P. (2001a). Task complexity, task difficulty, and task production: Exploring interactions in
a componential framework. Applied Linguistics, 22(1), 27–57.
Robinson, P. (2001b). Task complexity, cognitive resources, and syllabus design: A triadic framework
for examining task influences on SLA. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 287–318). Cambridge: CUP.
Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects on L2
speech production, interaction, uptake and perceptions of task difficulty. International Review
of Applied Linguistics, 45, 193–214.
Robinson, P. (2011). Task-based language learning: A review of issues. Language Learning,
61(Suppl. 1), 1–36.
Robinson, P., & Gilabert, R. (2007). Task complexity, the cognition hypothesis and second language
learning and performance. IRAL, 45, 161–176.
Sawaki, Y., Stricker, L.J., & Oranje, A.H. (2009). Factor structure of an internet-based test. Language
Testing, 26(1), 5–30.
Skehan, P. (2009a). Modelling second language performance: Integrating complexity, accuracy,
fluency and lexis. Applied Linguistics, 30(4), 510–532.
Skehan, P. (2009b). Lexical performance by native and non-native speakers on language-learning
tasks. In H. Daller, D. Malvern, P. Meara, J. Milton, B. Richards, & J. Treffers-Daller. (Eds.),
Vocabulary studies in first and second language acquisition: The interface between theory and
application (pp. 107–124). London: Palgrave Macmillan.
Skehan, P. (2009c). Models of speaking and the assessment of second language proficiency. In
A. Benati. (Ed.), Issues in second language proficiency (pp. 202–215). London: Continuum.
Skehan, P. (2011). Researching tasks: Performance, assessment, pedagogy. Shanghai: Shanghai Foreign
Language Education Press.
Skehan, P. (manuscript). Conventions for coding complexity, accuracy, fluency and lexis: The use of
TaskProfile. The Chinese University of Hong Kong.
Skehan, P., & Foster, P. (1997). The influence of planning and post-task activities on accuracy and
complexity in task-based learning. Language Teaching Research, 1(3), 16–33.
Skehan, P., & Foster, P. (2008). Complexity, accuracy, fluency and lexis in task-based
performance: A meta-analysis of the Ealing research. In S. Van Daele, A. Housen, F. Kuiken,
M. Pierrard, & I. Vedder. (Eds.), Complexity, accuracy and fluency in second language use,
learning and teaching (pp. 263–284). Brussels: Royal Flemish Academy of Belgium for Sciences
and Arts.
Tavakoli, P., & Foster, P. (2008). Task design and second language performance: The effect of narra-
tive type on learner output. Language Learning, 58(2), 439–473.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–276). Amsterdam:
John Benjamins.
Van den Branden, K. (2006). Task-based language education: From theory to practice. Cambridge:
CUP.
Van den Branden, K., Bygate, M., & Norris, J. (Eds.). (2009). Task-based language teaching: A reader.
Amsterdam: John Benjamins.
Wang, Z. (2009). Modeling L2 speech production and performance: Evidence from five types of planning
and two task structure. Unpublished Ph.D. thesis. The Chinese University of Hong Kong.
Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and online planning on fluency,
complexity, and accuracy in L2 monologic oral production. Applied Linguistics, 24, 1–27.
i. Recall what you have watched; and tell the story as clearly as possible and in the
order that the film was presented.
ii. Since you will not have time to plan how to tell the story, you should be thinking
about the story retelling even when you are watching the film.
iii. Please narrate the story from a 3rd person view point. Instead of using “I”, please
use “Shaun”, “Bitzer” for example, as the names of the main sheep. (A card of the
main sheep characters with their names is provided.)
iv. Please use the past tense. Here are some examples:
Shaun entered the room.
He wanted to call his friends.
i. Since you are required to tell the story simultaneously while the film is playing,
when the film starts (after the theme song), you should be ready to start telling the
story; when the film stops, you should be finishing your story too.
ii. Tell the story as clearly as possible and in the order that the film is presented.
iii. Please narrate the story from a 3rd person view point. Instead of using “I”, please
use “Shaun”, “Bitzer” for example, as the names of the main sheep. (A card of the
main sheep characters with their names is provided).
iv. Please use the present tense. Here are some examples:
Shaun enters the room.
He wants to call his friends.
chapter 7
Introduction
In some earlier second language task-based research, Skehan and Foster (1997, and
see also Foster & Skehan 1996) proposed that tasks which contain structure are associ-
ated with more accurate performance. They made this inference, post-hoc, after ret-
rospectively analysing tasks which were found to contain structure and elicit higher
accuracy in performance. This inference led to the design of a new study (Skehan &
Foster 1999), in which task structure was varied more systematically. In the earlier
studies, the tasks were either personal information exchange tasks or narrative retell-
ings based on a set of cartoons. In Skehan and Foster (1999), the stimulus material
consisted of two Mr. Bean videos, which differed in the structure of the story that was
shown. In one, ‘Mr. Bean plays Crazy Golf’, after a ‘conventional’ beginning, Mr. Bean
hits the golf ball outside the course, and slavishly following an admonition not to
188 Peter Skehan & Sabrina Shum
touch the ball under any circumstances, endures a series of chaotic adventures which
have no development or connection, until he finally completes the course. In con-
trast, ‘Mr. Bean goes to a restaurant’ follows a familiar ‘schema’ (the restaurant script)
in which unpredictable things happen, but against the structure of a familiar set of
events, that is, dining out in a fancy restaurant. The results in this study partly followed
the predictions, but not wholly. There was an increase in accuracy, but not as consis-
tently as was predicted.
Two possible interpretations of this study presented themselves. The first is that
video-based retellings are, for some reason, not so supportive of demonstrating exper-
imental effects of structure. A second possibility is that the concept of structure should
be more carefully analysed than had been done for the Skehan and Foster (1999)
study. Accordingly, Tavakoli and Skehan (2005) analysed structure more subtly. They
suggested a number of ‘levels’ of structure. These were:
It was reasoned that there is a scale of structure in these different forms of story, and
that the degree of looseness is reduced as one works through the ‘scale’.1 In other
words, the second and third steps in the scale have a certain arbitrariness, but con-
ventions dictate how the story will develop, or at least the framework within which
it is told. The fourth and fifth steps, though, have a more logical, story-coherent pro-
gression, and therefore can be considered to be tighter. They derive from discourse
analysis work, e.g. that of the Winter-Hoey analysis of problem-solution structures
(Winter 1976; Hoey 1983), and reflect the way a story develops through a problem
arising from a situation, followed by a solution (or solutions) and then perhaps an
evaluation of the entire story.
Tavakoli and Skehan (2005) designed a study to explore the relevance of the scale
(omitting the third step, since it had been included in Skehan & Foster 1999). The study
used cartoon picture series, designed to represent the other four steps. The results were
partly consistent with the predictions. A scale with four distinct steps of structure
did not emerge clearly, but it was evident that ‘discourse’ structure (i.e. the final two
1. Tavakoli and Skehan (2005) proposed, for cartoon picture series, that this ‘looseness’ can be
operationalised by the number of pictures in the series, other than the first or last, whose order can
be changed without impairing the capacity to tell the story.
Structure and processing condition in video-based narrative retelling 189
steps) did lead to greater accuracy, with or without planning, and at both proficiency
levels included. So there was something of a split between no structure and beginning-
middle-end organisation, on the one hand, and a causal (and problem-solution) struc-
ture, on the other. While the performance under the beginning-middle-end condition
was a little disappointing, more importantly, the notion of tight structure did survive.
Another aspect of the results is interesting. In one of the structured tasks (with a loose
problem-solution structure), there was also an increase in structural complexity. In
this particular narrative, to tell the story effectively, background and foreground infor-
mation in various cartoon pictures had to be integrated. This was necessary in only
one of the four narratives used, and it was associated with the greater degree of subor-
dination in the stories used, suggesting that the details of the story structure can raise
complexity in addition to accuracy. These factors also functioned similarly in Foster
and Tavakoli (2009) and Tavakoli and Foster (2008). These researchers report on a
study which included additional variables. Using cartoon picture series as in Tavakoli
and Skehan (2005), they included the variables of information organisation, native
versus non-native speakerness, and foreign language vs. second language contexts as
independent variables. For our current purposes, it suffices to say that these studies
confirmed the importance of structure as leading to increased accuracy in perfor-
mance, and of information organisation as raising complexity.
Naturally, this raises the question as to why more structured narrative retellings
lead to increased accuracy. Skehan (2009a) draws on Levelt’s (1989, 1999) model of
first language speaking to account for this finding. Levelt distinguishes three broad
stages in speech production: Conceptualisation (where ideas are developed), Formu-
lation (where ideas are transformed into language), and Articulation (where speech
is actually produced). In turn, the Formulation stage consists of a lemma retrieval
stage, followed by a morpho-syntax building stage which is contingent on the preced-
ing lemma-retrieval sub-stage. The model is consistent with much evidence on speech
errors, pausing and hesitation phenomena, and slips of the tongue. The model has also
been widely applied to second language speech (de Bot 1992; Kormos 2006). Draw-
ing on this analysis, Skehan (2009a) (and see also Wang & Skehan this volume, and
Skehan, Chapter 8, this volume), argues that structured tasks provide a clearer macro-
structure for the story retelling. In this way, such tasks first circumscribe what needs
to be said (less demanding Conceptualisation). Next, Skehan (2009a) proposes that in
second language speaking, the basis for more accurate speech comes from the Formulation
stage, and that structured tasks allow the speaker to draw upon the clearer macrostructure
to devote more attention to the Formulator stage and, as a result, have more
attention for lemma retrieval (solving problems of retrieval, enabling more complete
lemma information) and also devote more attention to syntax building and monitoring
of performance. All this implies a view of attention and working memory as limited. In
other words, factors which ease the speech production task (e.g. structured tasks and
clearer macrostructure, which reduce the need for demanding Conceptualiser opera-
tions) release more attention which can be devoted to other speech production stages.
Reflecting on this literature, it seems that clearer effects for structure
have been found with cartoon story retellings than with video-based narratives.
However, the more sophisticated views of structure have largely been confined to such
cartoon retellings. This raises the question as to what would happen with video-based
retellings based on more theoretically defensible characterisations of structure. This is
more than a passing question. Video-based retellings are more demanding in process-
ing terms. As the video is running, the speaker is exposed to a considerable amount of
input, which has to be understood and then repackaged as production. In a theoretical
model based on limited attention capacity, anything which eases processing problems
would be particularly welcome to speakers having to wrestle with the demands on
their attentional resources as part of a video retelling. Accordingly, one aspect of the
present research is to explore whether more fine-grained task features related to struc-
ture will still have an impact even under more demanding processing conditions.
It is also worth discussing the results related to complexity reported in Tavakoli
and Skehan (2005). In this case, the task contained information which, if the story
was to be told well, required the speaker to link different bits of information, and
this seemed to push the speaker to use more subordination. In Tavakoli and Skehan
(2005), the interpretation of results was in terms of information organisation in that
foreground and background information had to be integrated in order to tell the story
effectively. But problem-solution structures themselves require different propositions
to be linked, and their causal connections to be clarified. So, a problem-solution struc-
ture may have multiple effects – the easing of processing through clear macrostruc-
ture leading to greater accuracy, and the push through the structure itself to express
relationships between elements more explicitly, thereby fostering subordination, and
raising measures of complexity based on this.
The previous discussion has concerned the nature of the tasks to be done. But it
is also worth exploring how differences in conditions can impact upon performance.
Once again, here, the starting point is the viewpoint that attentional resources are
limited, and so the conditions under which a task (e.g. a narrative retelling) is done
can be explored in terms of the attentional demands that they make. The most obvi-
ous contrast here is between what Robinson (2011) has termed ‘Here and now’, and
‘There and then’. In the former, the stimulus material is present and can be used while
the task is running. In the latter, the stimulus material is absent, but has been seen, and
so the task is to speak without the input material available. Robinson (2001) uses this
analysis of time perspective to make some challenging predictions about the nature
of performance. Essentially, he characterises the Here-and-now condition as simpler,
and the There-and-then condition as more complex. Then he proposes that the more
complex condition will lead to better performance, and specifically raise accuracy and
complexity. This follows the Cognition Hypothesis claim that greater task complexity
pushes towards higher performance in particular areas (Robinson 2011). However,
most studies motivated by the Cognition Hypothesis have used static visual materials.
Most studies which explore this contrast in time perspective use tasks such as giving
map instructions, where the stimulus material does not change, even though it can
vary in degree of complexity.
The question is raised, then, as to what would be the influence of using video-
based presentations: in this case, the Here-and-now condition is one where the input
material is available and the task is to tell the story in a way which is appropriate
to what is happening on screen. The There-and-then condition would be one where
the video has been seen, and immediately afterwards, the speaker has to retell the
story. All this introduces a new set of psycholinguistic demands, especially concerned
with time pressure. Skehan (2009a) argues that, in this case, there is a genre difference
in time perspective and that the There-and-then condition is simply different from (rather
than more complex than) the Here-and-now condition. In other words, although
memory demands are greater under the There-and-then condition, the opportunities
for shaping the task and repackaging what one wants to say are much greater than in
the Here-and-now condition.
Basically, the driving force for the comparison at issue is the impact of time
pressure. This raises the possibility of retaining the time pressure dimension of Here-
and-now but trying to attenuate it in some way. Two such methods can be proposed
here. The first directly addresses the issue of time pressure itself. Ellis (2005) and Ellis
and Yuan (2004) have proposed the construct of on-line planning to capture what
happens when speakers, who might otherwise be pressured by time, are able to operate
in less demanding time conditions, and so plan and regroup ‘on the fly’ (i.e. as they are
speaking). In other words, rather than using strategic or pre-task planning, they are
able to use planning-while-speaking. Ellis (2005) argues that such on-line planning is
associated with greater accuracy in performance. Wang (this volume) draws attention
to some research design concerns regarding this original research, which she addresses
in her own research by slowing down the videos to be retold (and so standardising the
conditions under which the online planning opportunities are created). But this is not
the only way to provide opportunities for on-line planning, especially as far as video-
based narratives are concerned.
The present study incorporates a different method of supporting preparedness
and opportunity for online planning. Participants can simply be given the power to
stop the video whenever they choose, so that they can compose themselves, ‘clear the
decks’ as it were, and prepare for what they are about to say. In this way they could, at
any point of difficulty, free themselves from the remorseless pressure of synchronising
the development of ideas and their generation of actual speech. While one cannot be
sure what learners will do while the video is paused, the fact that they can pause the
video at will means that they only need to have a fairly limited ambition for what they
hope to achieve from their current pause since they know they can pause again soon.
It is likely, in other words, that they can focus on accuracy.
A contrasting approach to assisting learners while a task is running (and which is
different from strategic planning) is to give them an overview of what the video will
consist of. In other words, they can be provided with the outline of events, and thereby
be given a macrostructure. In this way, what subsequently happens may contain sur-
prises at the level of detail, but not in terms of the general things which will happen
while the video is running. As a result, they are being provided with a structure within
which they can operate. One would predict in these circumstances that knowledge
of this overall structure would assist learners and enable them to use the Formulator
stage more effectively, and so achieve greater levels of accuracy and lexical retrieval,
the aspects of speech Levelt associates with this stage.
Next, we focus on the nature of performance itself, and how it can be measured, a
concern that has permeated all chapters in this book. Most studies have used general
measures, although there are also approaches, such as Crookes (1989) and Robinson
et al. (2009), which use more specific measures. The general measures have shown some
development over the last two decades. Skehan (2009b) discusses how fluency mea-
sures have changed, and how they relate to different aspects of fluency (breakdown flu-
ency, repair fluency, speed). He also discusses pause location as particularly important
for second language speakers, addressing Segalowitz’s (2010) proposal that measures
which distinguish effectively between native and non-native speakers are particularly
useful (Skehan 2009b). Regarding complexity, Norris and Ortega (2009) propose that
the commonest measure in task-based research, the use of some sort of subordina-
tion index, is not ideal and that it does not reflect improvement at higher proficiency
ranges, where complexification may occur through other processes, such as lengthen-
ing of nominal phrases (indeed, research suggests that subordination does not reflect
changes in the complexity of texts among higher proficiency learners, where other mea-
sures do). They propose that researchers should also use a measure based on number
of words per clause, which would reveal such finer-grained distinctions (and they also
suggest using larger-scale measures such as mean length of utterance or T-Unit, which
capture complexity more holistically). In this way, different aspects of complexity can
be captured. At the very least, the use of additional measures seems worth exploring,
alongside the more conventional measure of subordination per speech unit (Foster,
Tonkyn & Wigglesworth 2000; and also see discussion in Chapter 1).
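The three kinds of complexity measure mentioned above can be set side by side in a short sketch. It assumes that a transcript has already been segmented by a human coder into speech units (e.g. AS-units) and clauses; the `SpeechUnit` class and the function names are illustrative inventions, not part of any published coding scheme.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechUnit:
    """One speech unit (e.g. an AS-unit), stored as a list of clauses.

    Each clause is a list of word tokens. Segmentation is assumed to
    have been done by hand before these measures are computed.
    """
    clauses: List[List[str]]


def clauses_per_unit(units: List[SpeechUnit]) -> float:
    """Subordination index: total clauses divided by total speech units."""
    return sum(len(u.clauses) for u in units) / len(units)


def words_per_clause(units: List[SpeechUnit]) -> float:
    """Clause-internal complexity: mean number of words per clause."""
    total_words = sum(len(c) for u in units for c in u.clauses)
    total_clauses = sum(len(u.clauses) for u in units)
    return total_words / total_clauses


def words_per_unit(units: List[SpeechUnit]) -> float:
    """Larger-scale measure: mean number of words per speech unit."""
    total_words = sum(len(c) for u in units for c in u.clauses)
    return total_words / len(units)
```

For two units, one containing the single clause "Shaun entered the room" and one containing "he picked up the phone" plus "because he was bored", the sketch gives 1.5 clauses per unit, 13/3 words per clause, and 6.5 words per unit. The point of computing all three is that a speaker could raise words per clause (for example through longer nominal phrases) without any change in the subordination index.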
The remaining area which is consistently measured in task performance is that
of accuracy. The ‘standard’ general measure has been to compute the percentage of
error-free clauses. But there are alternatives. Gilabert (2007), for example, uses
reformulations as a measure of accuracy. Mehnert (1998) used a measure of errors per 100
words, arguing that this was more appropriate for German, the language used by her
participants. In addition, Skehan and Foster (2005), who were concerned that mea-
sures of the proportion of error-free clauses might be inflated if a speaker used lots of
short clauses correctly, proposed a measure based on the length of clause that can be
accurately handled. In this measure, all clauses from a speaker are ranked for length,
and then accuracy is established for all clauses of particular lengths. Then the greatest
clause length that meets a criterion of accuracy (generally when 70% of the occur-
rences of that length of clause are error free) is taken as the measure of accuracy. This
measure does not correlate with complexity and so is not a confounded measure of
that construct. Instead, Skehan and Foster (2005) propose it as a more valid measure
of this aspect of performance.
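The length-based accuracy measure just described can be sketched as below. This is an illustrative reading of the procedure, not the authors' implementation. It assumes that each clause is represented as a (length in words, error-free flag) pair, that accuracy is computed separately for each exact clause length, and that the score is simply the greatest length meeting the criterion. A stricter reading, in which all shorter lengths must also meet the criterion, is possible and would need a small change.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def length_accuracy(clauses: Iterable[Tuple[int, bool]],
                    criterion: float = 0.70) -> int:
    """Greatest clause length (in words) whose clauses meet the criterion.

    `clauses` holds (length_in_words, is_error_free) pairs for one speaker.
    Returns 0 if no clause length meets the criterion.
    """
    totals = defaultdict(int)   # clauses observed at each length
    correct = defaultdict(int)  # error-free clauses at each length
    for length, error_free in clauses:
        totals[length] += 1
        if error_free:
            correct[length] += 1
    passing = [n for n in totals if correct[n] / totals[n] >= criterion]
    return max(passing, default=0)


# Invented data: length 3 is 2/3 accurate, length 5 is 3/4, length 7 is 1/2.
speaker = [
    (3, True), (3, True), (3, False),
    (5, True), (5, True), (5, True), (5, False),
    (7, True), (7, False),
]
print(length_accuracy(speaker))  # only length 5 reaches 70%
```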
Whichever of these measures is used, however, error-free clauses or errors per
100 words or clause length accuracy, they do not take error gravity into account in
any way. There are two reasons for building in some notion of error gravity to the
coding of second language performance. First, error scores may be misleading
if they reflect many, but superficial, errors. If one speaker produces
a large number of such errors but few serious ones, while another speaker obtains
the same overall error-free clause score with a much higher proportion of serious
errors, the index of accuracy is misleading. Accordingly, separating error
scores as a function of gravity becomes desirable for reasons of validity. But second,
there is the more pragmatic goal of discrimination. Even if the general ratio of serious
to superficial errors is relatively constant across speakers, there is the problem that
lower-level students may achieve quite low scores if any error in a clause causes that
clause to be deemed incorrect. As a result, potential discrimination between speakers
may be lost.
For both these reasons, therefore, it may be desirable to code transcripts not sim-
ply for error, but also for error gravity. The clearest advocates of this procedure are
Foster and Wigglesworth (2010), who suggest that it is appropriate to separate three
levels of error gravity – low level, medium level, and high level, and that these three
levels should be defined in terms of the extent to which communication is impaired.
If an error does not really make the extraction of meaning difficult, they propose that
the error should be regarded as minor. On the other hand, if meaning cannot or can
hardly be extracted, then the error should be regarded as serious. Between these lies a
level of error where meaning can be retrieved, but with some effort, and they
propose that such errors are best regarded as intermediate. Their system was used in
the present study, and so for all candidates we have two types of error score (error-free
clauses, length-of-clause accuracy), with each of these represented at two gravity levels
(since scrutiny of the data suggested that the intermediate and serious errors should
be combined, in that there were relatively few serious errors).
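Scoring error-free clauses at two gravity levels might look like the sketch below. The numeric gravity codes and the data layout (one list of error codes per clause) are hypothetical and chosen only for illustration. The strict score treats any error as disqualifying a clause, while the lenient score ignores minor errors and counts only the combined intermediate and serious category.

```python
from typing import List, Sequence

# Hypothetical gravity codes for the three levels described above.
MINOR, INTERMEDIATE, SERIOUS = 1, 2, 3


def error_free_proportion(clause_errors: Sequence[List[int]],
                          min_gravity: int = MINOR) -> float:
    """Proportion of clauses with no error at or above `min_gravity`.

    `clause_errors` holds, for each clause, the gravity codes of its
    errors (an empty list means the clause contains no error at all).
    """
    if not clause_errors:
        raise ValueError("no clauses supplied")
    clean = sum(
        1 for errors in clause_errors
        if all(gravity < min_gravity for gravity in errors)
    )
    return clean / len(clause_errors)


# Invented coding for five clauses.
coded = [[], [MINOR], [], [INTERMEDIATE], [MINOR, SERIOUS]]
strict = error_free_proportion(coded)                 # any error counts
lenient = error_free_proportion(coded, INTERMEDIATE)  # minor errors ignored
print(strict, lenient)
```

The gap between the two scores shows how much of a speaker's apparent inaccuracy comes from superficial errors.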
Drawing on the previous discussion, we can formulate a number of research ques-
tions, and associated hypotheses for the present study.
Research Question One: In the context of video-based presentation, what will be the
influence of varying the degree of structure in narrative retellings? This leads to:
Hypothesis One: Greater structure in the video-based narratives will lead to greater
accuracy in performance.
Hypothesis Two: Greater structure in the video-based narratives will lead to greater
complexity in performance.
Research Question Two: What will be the effects of varying and attenuating the time
pressure while narratives are being told? This leads to a series of sub-hypotheses:
Hypothesis Three (A): Unmediated narration while a video is running will lead to the
lowest levels of accuracy and complexity.
Hypothesis Three (B): Mediated narration while a video is running will raise
accuracy, but it is an open question whether this will be higher for the condition
providing the opportunity to pause, or the condition giving provision of a summary
(“pre-narration”).
Hypothesis Three (C): There-and-then narration will lead to higher levels of accuracy,
complexity, and fluency.
Method
The present study used a series of Mr. Bean video-based narratives, which participants
then had to narrate, under various conditions.
Materials
A wide range of excerpts from the Mr. Bean television series were examined, and after
trialling, four were selected to capture different degrees of structure. The original nar-
ratives were video-edited to reduce their length so that they ran for some 5–7 minutes.
The four narratives were:
–– Mr. Bean plays Crazy Golf: In this story, Mr. Bean plays a round of Crazy Golf. The
attendant instructs him that he shouldn’t touch the golf ball under any circum-
stances. He then accidentally knocks the golf ball out of the Crazy Golf area, even
outside the park it is situated in. A series of unconnected misadventures result,
which culminate in Bean arriving back at the golf course after dusk, to finally
complete his round. (This video was used in Skehan & Foster 1999.)
–– Mr. Bean at Christmas: In this story, Mr. Bean meets his girlfriend on Christmas
Eve. In the window of a jewellery store, she sees a ring she would like. Afterwards
he goes home and prepares for Christmas rather eccentrically. Mr. Bean wakes up
on Christmas Day, receives the present he sent himself, and then prepares dinner.
His girlfriend arrives, gives him a nice present, and he, having misunderstood
her interest in the ring they looked at on Christmas Eve, gives her a picture and a
hook, both of which he also saw in the window of the jewellery store.
–– Mr. Bean visits the Funfair: On a trip to a funfair, Mr. Bean’s car accidentally gets
hooked to a pram, containing a baby. He takes the baby with him to the funfair,
but still wants to have fun. So he ‘parks’ the baby in a rocking car, putting in lots
of money. He then tries out various rides in the funfair, before coming back to
find the baby who is still crying. He buys some (helium filled) balloons to amuse
the baby, but they carry the baby up into the sky. He responds by using a bow and
arrow to burst the balloons, causing the pram to sink to earth, as it happens right
next to the baby’s mother.
–– Mr. Bean catches a thief: Mr. Bean visits a park and wants to take a photo of himself
with a statue. He fails, and recruits a passer-by who deceives him and steals his
camera. Mr. Bean then searches for the thief, finds him, and puts a rubbish bin
over his head to immobilise him, and jabs the man with a pencil, causing the man
to shout in pain. He fetches a policewoman but the immobilised thief has run
away. Later, he goes to a police station where suspects, including the actual thief,
have been rounded up. In an identity parade he fails to recognise the thief visually,
but then gets the police to put a rubbish bin on each suspect, and jabs each, finally
recognising the thief from his squeals.
The first of these videos was unstructured, but the remaining three were structured, to
different degrees. The Christmas story has a beginning, a middle and an end, but there
is no strong causal structure. The story concerns the relationships between two people,
and how these play out against the backcloth of Christmas conventions. The other two
stories, though, have causal links, and in each case, there is a problem which is eventu-
ally solved. However, in the Funfair story, there are random diversions, such as when
Mr. Bean goes on the roller coaster, or plays darts, before the thread of the kidnapped
baby is resumed. In the Thief story, in contrast, there are no diversions – everything is
concerned with the theft or the catching of the thief.
Materials were also involved in one of the processing conditions – Summary.
Under the Summary condition, participants were provided with a brief summary of
the main events in the story. A typical summary is given below:
Funfair: In his car, Mr. Bean goes to a funfair, on his own. By accident, when he
arrives he takes a baby in its pram away from its mother. He then thinks that the
best thing will be to try to look after the baby at the same time as he is having fun
in the funfair. So when he is going on the different rides and doing different things
in the funfair, he has to think of ways of looking after the baby. In the end, and
with a lot of luck, he is able to return the baby to its very worried mother, again by
accident.
Research design
The present study has two factors, and each of the factors has four values. The study
uses four narrative tasks each chosen to exemplify a particular form of structure, as
indicated in the Materials section. These were (a) no structure, (b) a clear beginning-
middle-end structure, (c) loose problem-solution structure, (d) tight problem-solution
structure. Structure was a within-subject variable, in that all participants completed
all of the tasks, although in counterbalanced order. The second factor is processing
condition, and this is a between-subjects variable. The four conditions were: Watch-
and-Tell; Pausing; Summary; and Watch-then-Tell. In Watch-and-Tell participants
viewed the video and were required simultaneously to tell the story that they were
watching. In the Pausing condition, they were provided with the video control which
enabled them to pause the video whenever they wanted. Otherwise they had to tell the
story as the video was running. In the Summary condition, participants had to view
the video and simultaneously tell the story, as in Watch-and-Tell, but prior to viewing
the video they were provided with a summary of the story (as shown above), and given
as much time as they needed to read and understand this version, which was presented
in English. Finally, in the Watch-then-Tell condition, participants viewed the video,
but were not allowed to make notes. Then they had to recount what they had seen in
the video.
Data processing
Each narrative video retelling was recorded onto an MP3 player, generating a sound
file which typically lasted three or four minutes. Subsequently the sound file was
transferred to computer, and then transcribed, broadly, using Soundscriber. Next,
the transcription was put into modified CHAT format (MacWhinney 2000), with
appropriate headers and formatting. Each AS turn was represented in four lines. The
first was the CHAT line, and simply followed CHAT conventions. The second line
eventually became the part-of-speech coded line, using the Mor and Post subroutines
from within CLAN.2 The third line contained timing information, in milliseconds,
regarding the start and finish times of the AS turn. The fourth line was a repeat of the
transcript from the CHAT (first) line, but was coded according to TaskProfile con-
ventions. The fourth line was processed by the computer software written by the first
author to analyse coded second language task performance. This line contained all
codings for clause boundaries, for error (including error gravity), for repair fluency,
for filled pauses, and for silence-based pausing.
Participants
The speech samples of 28 non-native speakers of English (NNSs) were included in this
study. These were all second-year university students, who, at the time of the data
collection, were studying at South China University of Technology (SCUT), Guangzhou,
China. They were chosen following teacher recommendation that these students were
likely to accomplish the story-telling task. There were a few cases in which the student
had difficulty with the task and failed to produce a satisfactory quantity of speech.
These data were not included in any analysis. The participants all came from main-
land China, with L1s of Cantonese and Mandarin Chinese, and ranged in age from
nineteen to twenty-two, with a mean age of twenty-one. Fifteen were female and thir-
teen were male. Their proficiency, based on their College English Test scores, was low
intermediate.
Measures
The dependent variables in the present study were measures of fluency, accuracy, and
complexity. Fluency was measured with two indices. Previous research (Skehan & Foster 1997;
Tavakoli & Skehan 2005) has indicated a distinction within fluency between measures
based on pausing (i.e. unfilled pauses) and measures based on some sort of inter-
ruption to the message being expressed (e.g. reformulation). These have been termed
breakdown and repair fluency, respectively (Skehan 2001). In this study, breakdown
fluency was measured through the number of AS-boundary and mid-clause pauses,
each standardised per 100 words. A pause was defined as an interruption to the speech
flow of more than 40 milliseconds. Skehan and Foster (2008) show that these mea-
sures are appropriate ones to capture breakdown fluency for non-native speakers.
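The standardisation of pause counts per 100 words can be sketched as follows. The representation of silences as (duration in milliseconds, location) pairs and the location labels "boundary" and "mid" are assumptions made for the example. The threshold is taken from the definition given in the text and is left as a parameter so that other criteria can be tried.

```python
from typing import Sequence, Tuple

PAUSE_THRESHOLD_MS = 40  # threshold as stated in the text


def pauses_per_100_words(silences: Sequence[Tuple[int, str]],
                         n_words: int,
                         location: str,
                         threshold_ms: int = PAUSE_THRESHOLD_MS) -> float:
    """Pauses at one location, standardised per 100 words.

    `silences` holds (duration_ms, location) pairs; the hypothetical
    location labels are "boundary" (AS-boundary) and "mid" (mid-clause).
    Only silences longer than `threshold_ms` count as pauses.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    count = sum(
        1 for duration, where in silences
        if where == location and duration > threshold_ms
    )
    return 100.0 * count / n_words


# Invented data for a 250-word retelling.
silences = [
    (620, "boundary"), (30, "mid"), (450, "mid"),
    (900, "boundary"), (510, "mid"), (700, "mid"),
]
print(pauses_per_100_words(silences, 250, "boundary"))  # 2 pauses
print(pauses_per_100_words(silences, 250, "mid"))       # 3 pauses; the 30 ms silence is excluded
```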
2. These subroutines apply a part-of-speech tagger to the CHAT line in two stages.
The Mor subroutine makes initial tagging judgements, but flags cases of ambiguity,
where more than one part of speech is possible. The Post subroutine then addresses these cases of
ambiguity, generating the final, part-of-speech tagged line.
Analyses
There were two independent variables in this study, structure (with four values – a
within-subject variable), and processing condition (a between-subjects variable – again
with four values). In addition, there were six dependent variables (two measures of
complexity, two of accuracy, and two of fluency). In addition to the calculation of basic
descriptive statistics, the central analysis used in this study was a repeated measures
multivariate analysis of variance, followed by appropriate univariate tests. Effect sizes
are also provided.
Results
To gain a general picture of the results which were obtained, Table 1 shows the descrip-
tive statistics for each combination of the four tasks and the four processing conditions.
Standard deviations are given in parentheses. The dependent measures included are
two of complexity (Clauses per AS-unit, and number of words per AS-unit (ASComp
and WdsCom respectively)); two of accuracy (the proportion of error-free clauses and
the greatest length of clause, in words, that could be handled accurately at the 70%
level (EFC and LenAcc respectively)); and two of fluency (the number of pauses at the
end of a clause, standardised to 100 words, and the number of reformulations per 100
words (EoCPaus and Reform respectively)).
Given the two-dimensional nature of this research design, with a mixture of
between (processing condition) and within (task) factors, as well as the use of six
dependent variables, the data were subjected to a repeated measures multivariate
analysis of variance. The data showed significance at the .001 level for processing
condition (Pillai’s trace = .999; F = 4.56; d.f. = 6), and at .001 for task structure (Pillai’s
trace = .944; F = 9.29; d.f. = 18). There were no significant interactions.
Moving on to the univariate tests, all tests for task structure were significant, as
shown in Table 2, and all of them generated effect sizes which were large.
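For reference, the effect size reported with these univariate tests, partial eta squared, has the standard definition given below. The subscripted symbols are generic notation rather than values from this study: the sums of squares and degrees of freedom for the effect and for its error term.

```latex
\eta_p^2
  = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
  = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
```

The second form follows from the definition of F as the ratio of the two mean squares, and allows the effect size to be recovered from a reported F value and its degrees of freedom.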
The clearest pattern here is for accuracy. Error-free clauses generates an extremely
clear trend, in the order Golf < Christmas < Funfair < Thief, with values of 0.71, 0.77,
0.81, 0.83, with all values significantly different from the others. Length Accuracy 70%
is close to this, with values of 3.96, 5.79, 6.93 and 6.5 respectively, with Golf signifi-
cantly lower than Christmas, and then both of these significantly different from the
remaining two tasks, with the latter not differing from one another. In other words,
there seems to be a clear relationship between structure of task and accuracy of per-
formance, especially as this relates to the two tasks which contain some degree of
Problem-Solution structure, a comparable result to Tavakoli and Skehan (2005).
Table 2. Univariate tests for task structure and associated effect sizes
Measure   F   df   Significance   Effect size (partial eta squared)
The pattern for complexity is reasonably clear, but with one slightly anomalous
result. Regarding AS Complexity, the most widely used measure in the literature, the
results are broadly similar to accuracy. Golf and Christmas, the least structured nar-
ratives, do not differ from one another but they both differ from Funfair which itself
differs from the most structured task, Thief, with the respective values being 1.30, 1.31,
1.37 and 1.53. The pattern with words per clause contrasts with these results. The high-
est score, 6.53, is with the unstructured Golf task, whereas the other three tasks have
lower scores (Christmas, 5.54; Funfair, 5.64; and Thief, 5.71), and these do not differ
from one another. Structure, in other words, is associated with more subordination,
but shorter clauses. So, while there is clearly a trend towards a relationship between
complexity and structure, there are also some ways in which this trend is looser than
that for accuracy, and the results suggest that the two measures of complexity reflect
different facets of this construct, as Norris and Ortega (2009) argue.
The two fluency measures present a mixed picture. Reformulation is clearer, in
that Golf is significantly different to Christmas and both are significantly different to
the other two tasks, which themselves do not differ, with values of 3.17 (Golf) < 3.74
(Christmas) < 4.79 (Funfair) and 3.97 (Thief). In other words, the more structured
the task, the more reformulation there is, which is an intriguing result. The results for
End-of-Clause pausing present an unclear pattern. Here Funfair and Christmas, the
intermediate structured tasks, generate the highest values, which do not differ from one
another, but which do differ from the lower values Golf (unstructured) and Thief (most
structured). These become a set of results which present a challenge for interpretation.
Next, we turn to the between-subjects variable of processing condition. It should
be borne in mind that the cell size for any comparison is only seven, with the result
that the power of the statistical testing is much less than with Task Structure, where
the cell size was twenty-eight. With complexity, the pattern of results suggests that the
two conditions which eased immediate processing, Pause and Watch-then-tell, had the
highest scores with both AS Complexity and Words Complexity, with Watch-then-tell
producing the highest value in each case (consistent with Wang and Skehan (this
volume)). Even so, there are no significant differences for the AS measure, with means of 1.36
(Watch-and-tell), 1.38 (Pause), 1.34 (Summary), and 1.44 (Watch-then-tell). There
are significant comparisons, though, with Words per Clause Complexity, between
Watch-and-tell (7.06) and Summary (7.20) versus Watch-then-tell (8.80). Watch-and-tell
and Summary do not differ from one another, and neither do Pause and Watch-then-tell. In
other words, less pressured conditions appear, broadly, to support
greater complexity in spoken performance, but this relationship is not particularly
strong (once again, a result consistent with Wang & Skehan, this volume).
There is a loose pattern also with accuracy. With the Error-free Clauses measure,
Watch-and-tell (0.75) is significantly different to Summary (0.83), and Pause (0.74)
contrasts with Summary and Watch-then-tell (0.81). Regarding Length Accuracy
70%, Watch-and-tell (4.96) and Pause (4.82) are significantly lower than Watch-then-tell
(7.2). Summary, at 6.5, occupying a sort of middle ground, does not contrast
significantly with any of the other groups. It appears that Watch-then-tell and, to
a lesser extent, Summary support a focus on accuracy while language is being produced,
although again, the relationship is not strong.
The two fluency measures generate exactly the same significant contrasts, but with
interestingly different directions. Essentially Watch-then-tell contrasts with all other
conditions, but the other conditions do not differ from one another. This is shown
most clearly in Table 3.
Discussion
We can start this section by restating the main results which were found.
The first research question, and the associated hypothesis, concerned the relationship
between structure and accuracy. This hypothesis was broadly confirmed. The
unstructured narrative, Golf, consistently produced the lowest level of accuracy. But
beyond this, although all three more structured narratives produced higher levels of
accuracy, there was something of a split between the two problem-solution narra-
tives (Thief, Funfair) and the more conventional story structure narrative (Christmas)
which contained development, but with some arbitrariness. The two problem-solution
narratives produced higher accuracy, with the Christmas video generating accuracy
levels between the unstructured narrative and the two tightest narratives. The ‘scale’
of structure worked very well with the error-free clauses accuracy measure and rea-
sonably well for length-accuracy, although with the latter, there was little difference
between the two problem-solution videos. Obviously, it is interesting that the effect has
been found with video narratives, in addition to what has been reported in previous
studies with cartoon picture series. Once again, one can offer the interpretation that
tasks such as Funfair and Thief provide a clearer macrostructure, so that second lan-
guage speakers can organise what they say in terms of this macrostructure, and do not
need so much to work at the highest discourse level while speaking. This, then, enables
them to allocate more attention to the Formulator and focus on the surface, the accu-
racy of what is being said, even under video-based time pressure.
There is also a clear effect of structure on complexity, particularly with the sub-
ordination measure. In this case, the major opposition is between the two problem-
solution narratives and the two others. The combination of problem-solution structure
and the subordination measure is particularly interesting here. It seems as though
this structure pushes speakers to express connections between elements through
clause relations, pushing up the subordination measure. In contrast, the words-based
measure of complexity is not so sensitive to these task changes, suggesting that the
factors which impact upon needing to use more words within clauses are different
from those which raise subordination. Indeed, with the words-based measure, it is the
Golf (unstructured) task which produces higher values than any except Thief. So in
this case, it seems as though specific task design qualities push the speaker to develop
clauses internally.
Turning to fluency, we have an interesting contrast in the two measures which
were used, end-of-clause pauses and reformulations. In the former case, structure
is associated with greater fluency, with fewer pauses, while in the latter, it is associ-
ated with less fluency, since there are more reformulations. The first case is perhaps
clearer and also more consistent with the available literature. It appears that paus-
ing is more controlled when tasks are structured. It seems as if clear macrostructure
and a straightforward and organised set of events to narrate facilitates organisation
of speech units, and a capacity to pause effectively, that is, not at the end-of-clause
points. So, it seems as though, when tasks are structured, the non-native speakers
are more able to use pausing opportunities while maintaining a good flow of speech.
This, perhaps, suggests effective Formulator functioning. But then there are the greater
reformulations to account for, which at first sight seems to tell a different story, since
these represent interruptions to the speech stream. We believe it is most likely that the
place of reformulation within fluency measures itself is ready for re-analysis. This is
because reformulations can be both negative and positive. In the former case, they are
‘thrust upon’ the speaker as trouble is encountered. However, one can also put a more
positive spin on reformulations. When there is effective overall discourse functioning,
sufficient attention may be available for changes in what is said, within the clause, that
reflect improvements and edits to be more precise in the message being conveyed. It is
possible that this is what is happening here. Structure, then, may be associated with an
effective flow of the discourse, and simultaneously with polishing what is said within
the clause, but without challenging the broader organisational structure of what is
said. If that is the case, then the two fluency measures seem to function as different
sides of the same coin.
Next, we turn to consider the effects of the different conditions under which tasks
were done, and so address the issues raised by Research Question Two and Hypothesis Three in its various forms.
Regarding accuracy, there is a clear contrast here between the Summary and Watch-
then-Tell conditions, on the one hand, which are associated with greater accuracy, and
the Watch-and-Tell and Pause conditions, which are associated with lower accuracy.
The predictions and the actual outcomes are shown in Table 4.
It was hypothesised that the Watch-and-Tell and Watch-then-Tell conditions
would differ most, which is, in fact, what has happened. The former is an online con-
dition, where the latter is not, and this is the difference that underlay the prediction in
favour of the latter. But the remaining conditions are both online in nature, with the
need to retell a video which is running. But compared to the Watch-and-Tell condition,
each is mediated to some degree, to ease the processing conditions relative to
the online demands. So the prediction was exploratory here, and it was not predicted
which would lead to higher accuracy, but it was assumed that they would both generate
higher accuracy than the Watch-and-Tell condition.
Table 4. Predictions and actual outcomes for accuracy
Predicted              Actual
Watch-then-Tell        Watch-then-Tell
Summary: Pause         Summary
Watch-and-Tell         Pause: Watch-and-Tell
In the event, the Summary condition has produced higher accuracy, although not
quite as much as the Watch-then-Tell condition, while the Pause condition has not
generated greater accuracy, and is usually around the same level as the Watch-and-
Tell condition. This is slightly surprising, but nonetheless interesting. We can discuss
the Summary condition first. In this respect, it is worth noting that this condition
produced the lowest complexity values (lower than Watch-and-Tell and Pause), and
slightly higher values for breakdown dysfluency. It appears that the Summary condi-
tion functioned so as to organise what was going to be said, without that content being
challenged, with the result that speakers accepted the outline of their narration, and
thereby were helped to focus on the surface and on the detail. The speakers in this
condition seemed to have the confidence to pause at clause boundaries because they
were surer of the overall organisation of what they were going to say.
In contrast, the Pause condition did not seem to confer any particular advantage
for accuracy. Mediation of the online nature of the task by offering the opportunity to
pause the videotape did not lead to any form of assisted online planning, and indeed
for some of the accuracy measures, the Pause condition is associated with the worst
levels of accuracy! However, this condition does lead to the lowest values for reformu-
lation, suggesting that speakers in this condition were using the opportunity to pause
to achieve some gain. But this was not translated into accuracy, which is something of
a disappointment. The only additional point to make here is that this condition was
an optional one. Participants were provided with the opportunity to pause, but they
didn’t have to avail themselves of this possibility. Therefore, the operationalisation of
this condition may need to be rethought in any future research.
Finally, it is particularly interesting that the effects found for structure and for
processing are remarkably similar (and independent, with no interaction effects). Both
influence accuracy, in broadly similar ways. Both also influence complexity, although
structure has clearer effects with the subordination measure, whereas processing
shows more clearly with the Words per Clause measure. Finally, both have similar
effects on the two fluency measures, bringing about fewer end-of-clause pauses and
more reformulations. In the final section, therefore, we will explore why these patterns
are so similar and what brings structure and processing together.
Conclusions
Narrating a video-based story is a very difficult thing to do, and puts considerable
pressure on a second language speaker. It is the sort of predicament which can be very
revealing about how second language speakers wrestle with communication problems.
The flow of input is considerable, and the time pressure under which it arrives is also
unforgiving. So, anything which mitigates these fundamental influences is of interest
to the psycholinguistics of second language speech production as well as, ultimately, to
pedagogy.
What we have seen in this study is that there are indeed options available which
can mitigate these pressures. First and foremost is how structured the narrative task
is. Structure, we have seen, is related to higher accuracy. This, in turn, leads to the
interesting question as to why this should be the case. Three interconnected factors
are proposed here. First, there is the issue that structured tasks circumscribe what is
to be said. In other words, if there is a structure to a task, the degrees of uncertainty
and unpredictability are reduced. What has to be said is more clearly demarcated,
and so the speaker has the task of being workmanlike, and getting the job done, in
the greater certainty of what the job actually is. In other words, blind alleys are less
likely in the discourse. Moreover, to the extent that the structure is a fairly univer-
sal one, such as problem-solution, the assistance is even greater, since the point of
the communication becomes clearer, and the imagined listener becomes more real.
Second, there is the issue of organisation and framing. If there is a broader structure,
the speaker can make connections much more easily between where they are cur-
rently in the discourse, and the wider goals that they have in telling a story. In other
words, they can focus on what they are currently saying, for example, Mr. Bean try-
ing to find the policewoman, confident that they can make links with the broader
story and line of development once the current sub-section is finished. Both of these
influences, in other words, attenuate the demands placed upon the Conceptualiser.
The third influence, then, follows from the previous two. The speaker, cushioned by
this structure, is likely not to have used considerable attentional resources in deal-
ing with planning, and so can focus more clearly on what is currently being said. In
other words, attention is available to do immediate things like avoiding error and
Structure and processing condition in video-based narrative retelling 207
monitoring performance. This, then, underlies the greater degree of accuracy which
is achieved.
One might go even further with this, and propose that the greater time available
enables even better things to happen, in that time can be used to make choices which
lead to easier discourse, easier lexical selections, and even easier syntax, so that the
favourable conditions are magnified even further. The fluency effects are connected
here. Less pressure, through clearer structure, enables speakers to organise their
contributions more completely, and then pause, appropriately, at clause boundaries.
Within the clause, they are then able to concentrate on the surface of language, on
avoiding error, with more opportunity to reformulate.
This, though, does not do justice to the complexity effects which were found, with
structure and with processing. In the former case, it seems that the specific effects of
problem-solution structure play themselves out through greater subordination (and
not so much through more words per clause), and speakers do justice to the need to
express the relation between different elements in the story. With processing, the lack
of time pressure in the Watch-then-Tell condition (and see Wang and Skehan, this volume)
enables repackaging and greater complexity, but in this case with a clearer effect
with the Words per Clause measure.
What structure seems to do here is produce a relationship between Levelt’s
Conceptualiser and Formulator that is helpful for accuracy in second language speech
performance. It limits the work that the Conceptualiser has to do initially, in develop-
ing a general plan for the story. It also eases the Formulator’s access to Conceptualiser
operations, since these are likely to connect with the broader structure of the task
and change less than might be the case in other communications. So, from the overall
amount of attention available, the Formulator does not have to compete as much as is
usual with the Conceptualiser, and can get on with doing what it does best – shaping
current language, retrieving lemmas, and building syntax. And these can, accordingly,
be done a little better, and lead to higher accuracy rates.
The other factor in the research design fits in well with this analysis. The dif-
ferent processing conditions yielded different results, with the main opposition
between the Watch-and-Tell and Pausing conditions on the one hand, and the
Watch-then-Tell and Summary conditions on the other. As elsewhere (e.g. Wang &
Skehan this volume), the There-and-then condition produces higher performance,
reflecting the lower processing pressure that is involved, associated as this is with
more opportunity for the Formulator stage to function effectively. But even more
interesting are the performances for the Summary condition. What this condition
does, in effect, is to provide the speaker with something akin to structure, in that a
broad outline is given, and then the speaker can take this outline, and use it as ‘the
structure’ to be recounted. It provides the sorts of ‘conceptual hooks’ that structure
can provide on its own. In this way, although there is the constant pressure of the video
208 Peter Skehan & Sabrina Shum
which is rolling, the speaker has the general macrostructure which has been given to
cope more effectively with the potential for derailment that the constant video input
provides. Once again, the issue is the relationship between the Conceptualiser and
Formulator stages.
We can connect this discussion to an even broader perspective. Skehan (2009a)
offers an account of second language speaking which is more general than that pro-
vided here. Using the Levelt model as the spine of this account, he explores groups of
influences which are categorised as Complexifying, Pressuring, Easing, and Focussing
second language performance. Essentially, the variables which have been explored in
this study are all concerned with Easing, in that structure and favourable processing
conditions in this study are those which simultaneously limit the demands coming
from the Conceptualiser, while giving more attentional resources to the Formulator.
They provide a piece in the puzzle for understanding how we can foster more effec-
tive second language performance. They also provide hints as to how, pedagogically,
tasks can be chosen and implemented which enable learners to perform at a higher
level. They also, of course, indicate how, in reverse, the task of the learner/second lan-
guage speaker can be made more difficult. But above all, they do indicate that there are
choices that can be made here, and that knowledge of these options can make peda-
gogy more targeted and effective.
A couple of points are still worth commenting on, one a limitation and the other a
suggestion for further research. The limitation concerns the Processing variable which
has been manipulated in this research. The variable was interesting and suggestive in the
results found, and encouraging for future research, as the earlier discussion indicates.
But one has to admit first that the sample sizes, of only seven per cell, were small, with
the result that any comparisons were fairly weak in statistical power. Even within the
processing variable, the pause condition was particularly problematic. In addition to
small cell size, it is clear that the condition requires more careful monitoring than was
used. There was clearly variation in the extent to which participants availed themselves
of the opportunity to pause, but we do not have data on this, and cannot explore whether
those who exploited this possibility more performed differently than those who did not.
It would be worthwhile to carry out future research which addresses this limitation.
The suggestion for further research concerns the measures of complexity which
were used. What is interesting is the way they were similar, but also how they diverged.
Both showed an impact of structure and of processing, but it is interesting that struc-
ture had a clearer impact with AS subordination and processing with Words per
Clause. The former variable, in its Problem-Solution operationalisation in the pres-
ent research, intrinsically supports more explicit and complex clause relations and the
AS subordination measure picked this up. The processing variable was interesting in
that greater time to build language seemed to push learners to develop more complex
clauses internally. This will be a fascinating contrast to probe in further research.
References
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition,
11, 367–383.
de Bot, K. (1992). A bilingual production model: Levelt’s ‘Speaking Model’ adapted. Applied Linguis-
tics, 13, 1–24.
Ellis, R. (2005). Planning and task-based performance: Theory and research. In R. Ellis (Ed.), Plan-
ning and task performance in a second language (pp. 3–36). Amsterdam: John Benjamins.
Ellis, R., & Yuan, F. (2004). The effects of planning on fluency, complexity, and accuracy in second
language narrative writing. Studies in Second Language Acquisition, 26(1), 59–84.
Foster, P., & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18(3), 299–324.
Foster, P., Tonkyn, A., & Wigglesworth, J. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21(3), 354–375.
Foster, P., & Tavakoli, P. (2009). Lexical diversity and lexical selection: A comparison of native and
non-native speaker performance. Language Learning, 59, 866–896.
Foster, P., & Wigglesworth, G. (2010). Towards a new measure of accuracy in task-based second
language performance. English Department, St. Mary’s University, Twickenham.
Gilabert, R. (2007). Effects of manipulating task complexity on self-repairs during L2 oral produc-
tion. IRAL, 45, 215–240.
Hoey, M. (1983). On the surface of discourse. London: George Allen and Unwin.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Levelt, W.J. (1989). Speaking: From intention to articulation. Cambridge: CUP.
Levelt, W.J. (1999). Language production: A blueprint of the speaker. In C. Brown & P. Hagoort
(Eds.), Neurocognition of language (pp. 83–122). Oxford: OUP.
MacWhinney, B. (2000). The CHILDES Project: Tools for analysing talk, Volume 1: Transcription for-
mat and programs (3rd ed). Mahwah, NJ: Lawrence Erlbaum Associates.
Mehnert, U. (1998). The effects of different lengths of time for planning on second language perfor-
mance. Studies in Second Language Acquisition, 20(1), 83–108.
Norris, J., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA:
The case of complexity. Applied Linguistics, 30(4), 555–578.
Robinson, P. (2001). Task complexity, cognitive resources, and syllabus design: A triadic framework
for examining task influences on SLA. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 287–318). Cambridge: CUP.
Robinson, P. (2011). Second language task complexity, the Cognition Hypothesis, language learning,
and performance. In P. Robinson (Ed.), Second language task complexity: Researching the Cogni-
tion Hypothesis of language learning and performance (pp. 3–38). Amsterdam: John Benjamins.
Robinson, P., Cadierno, T., & Shirai, Y. (2009). Time and motion: Measuring the effects of the conceptual
demands of tasks on second language speech production. Applied Linguistics, 30, 533–554.
Segalowitz, N. (2010). Cognitive bases of second language fluency. London: Routledge.
Skehan, P. (2009a). Modelling second language performance: Integrating complexity, accuracy,
fluency and lexis. Applied Linguistics, 30(4), 510–532.
Skehan, P. (2009b). Models of speaking and the assessment of second language proficiency. In
A. Benati (Ed.), Issues in second language proficiency (pp. 202–215). London: Continuum.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1(3), 185–211.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions on narrative
retellings. Language Learning, 49(1), 93–120.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second
language instruction (pp. 183–205). Cambridge: CUP.
Skehan, P., & Foster, P. (2005). Strategic and on-line planning: The influence of surprise information
and task time on second language performance. In R. Ellis (Ed.), Planning and task performance
in a second language (pp. 193–218). Amsterdam: John Benjamins.
Skehan, P., & Foster, P. (2008). Complexity, accuracy, fluency and lexis in task-based performance:
A meta-analysis of the Ealing research. In S. Van Daele, A. Housen, F. Kuiken, M. Pierrard, &
I. Vedder (Eds.), Complexity, accuracy and fluency in second language use, learning and teaching
(pp. 263–284). Brussels: Royal Flemish Academy of Belgium for Sciences and Arts.
Tavakoli, P., & Skehan, P. (2005). Planning, task structure, and performance testing. In R. Ellis
(Ed.), Planning and task performance in a second language (pp. 239–276). Amsterdam: John
Benjamins.
Tavakoli, P., & Foster, P. (2008). Task design and second language performance: the effect of narrative
type on learner output. Language Learning, 58(2), 439–473.
Winter, E. (1976). Fundamentals of information structure: A pilot manual for further development
according to student need. Hatfield, Herts: The Hatfield Polytechnic Linguistics Group, School
of Humanities.
chapter 8
Limited attentional capacity, second language performance,
and task-based pedagogy
Peter Skehan
St. Mary’s University, Twickenham
This volume has reported a number of original empirical studies of task-based second
language performance. The main site for the studies was Hong Kong, and specifically
the Chinese University of Hong Kong, although some studies were actually conducted
in neighbouring areas (e.g. Macao, Guangdong). All the chapters emanate from the
Task-Based Performance Research Group which functioned at that university for some
six years, from 2004 to 2010. The studies all took a Complexity-Accuracy-Fluency-Lexis
framework as a starting point, and examined performance in these terms (with one
or two extensions here and there). They also all took a Tradeoff perspective to second
language performance, in that they assumed that attentional and working memory
resources are limited, and that the interest in studying such task-based performance is
to better understand the consequences of such limitations, as well as the factors which
lead to higher performance and overcome the limitations.
Sharing assumptions in this way across the different chapters makes this unusual
as an edited volume. Such volumes are often made up of disparate contributions which
may have some loose connections to one another, but essentially make individual and
possibly disconnected contributions. In the present case, the unified viewpoint means
that the different chapters cohere and provide a cumulative perspective on current
developments within the Tradeoff Hypothesis. Accordingly, it is the function of the
present chapter to bring together these different contributions, and to explore how,
taken together, they provide a clearer picture of how we can understand second lan-
guage task-based performance from within this framework.
There are three main sections to the chapter. The first, and by far the longest,
explores major themes that have emerged from the different studies, and covers issues
such as planning, structure, processing and cognition, selective attention, and work-
ing memory. A second brief section attempts to summarise the findings that have
been reported in terms of positive and negative influences. The third and final section
explores implications of the various chapters for pedagogy.
212 Peter Skehan
Major themes
This section will try to summarise the various chapters, but it will not do so sequen-
tially, taking things chapter by chapter. Rather, different themes will be explored and
then illustrated through the contributions that relevant chapters make. In this way,
common themes will be clearer, and studies which explored more than one variable
(i.e. most of them!) will be considered in more than one place. The discussion starts
with planning, which is the main focus for three of the empirical chapters, and then
moves on to explore the major task characteristic researched here, task structure. This
leads to a consideration of processing issues as well as notions of task complexity
and the Cognition-Tradeoff debate, before concluding with a discussion of selective
attention.
Introduction
The literature on planning has grown enormously over the last twenty-five years,
and one wonders what the authors of the two ‘seed’ articles, Ellis (1987) and Crookes
(1989), now think of the ‘beast’ that they have unleashed. So it is timely, given the
range of studies now extant, to stand back a little, at the outset, and think about what
planning consists of. The table below shows the different conceptualisations of plan-
ning that have motivated studies, and also which individual researchers have used the
different conceptualisations (with, of course, some researchers figuring in more than
one place).
Even though the literature on planning is now extensive, the table makes clear
that it is difficult to form generalisations given the range of different interpretations
of pre-task influence that have been used. The question then becomes whether one
can link these antecedents to different performance profiles. The chapters in this book
make some, albeit limited, contributions to clarifying some of these issues. But at the
outset one might even question whether it is appropriate to use the term ‘planning’ to
cover the different possibilities. That is the focus for Bui’s claim (this volume) that the
term ‘readiness’ might be more appropriate and for the proposal in Chapter 1 that we
need to discuss ‘preparedness’. Readiness, for Bui, contrasts task-internal factors (con-
tent familiarity, schematic familiarity, task familiarity) with task-external readiness
(rehearsal, strategic planning, and within-task planning). The former concerns what
the speaker already knows that is relevant for the task (and so readiness is exactly the
correct term here), while the latter is concerned with manipulable factors irrespective
of the familiarity the speaker might have with the task and its content.
Limited attentional capacity, second language performance, and task-based pedagogy 213
The notion of ‘preparedness’ is, of course, very close to this idea of readiness, but
there are some differences in emphasis. The major contrast is between what might be
termed ‘cold’ and ‘involved’ preparedness. With the former, the information to be conveyed
does not particularly relate to the speaker’s previous life or experiences. With
the latter, the speaker has some relationship with what is said. This could come from
the relevance of the task. It could also concern the relationship between previous expe-
rience and what is being talked about, reflecting whether similar things have happened
to the participant or not, or whether what is being said in the task has been spoken
about before, for example, recounting a near-death experience. But essentially whether
we are talking about readiness or preparedness, the intention is the same: to explore
what happens before a task as an organising concept to enable functional relationships
to emerge more clearly. Bui’s demonstration that readiness, in his study, has a lexi-
cal, accuracy, and fluency impact while strategic planning (or preparedness) has more
impact on structural complexity is a clear example of this. We have to wait and see
what future research tells us about the sort of fine-grained relationships with different
aspects of performance that these new concepts might reveal.
model of first language speaking. The codings so produced were then categorised,
into macro and micro codes, lexical-grammatical codes, and metalanguage codes, a
coding scheme which contrasted with that developed by Ortega (2005), which was
heavily influenced by O’Malley and Chamot’s (1990) coding scheme for learning
strategies. Even so, there is quite a lot of overlap between the two approaches. The
major change is that macro and micro codes figure more prominently in Pang and
Skehan’s (this volume) system. Monitoring and evaluation are not emphasised as in
Ortega’s scheme, while Pang and Skehan (this volume) emphasise lexical planning codes
more, as well as some rehearsal codes as these relate to subsequent performance, an
area not explicitly included in the Ortega (2005) study. All the various codes were
also related to performance, and generalisations offered as to the sorts of planning
behaviours associated with success. These were:
Accuracy
–– Positive: Retrieve specific words; structure the story; rehearse in a targeted way
–– Negative: Create lexical difficulties for yourself; make the story more difficult than
you can handle
Complexity
–– Positive: Organise the story; link the big to the small; let the ideas lead the way
–– Negative: Be fancy with words; focus on grammar; be overambitious with planning
Fluency
–– Positive: Be resourceful and flexible with words; avoid a focus on form; rehearse and work
small; organise ideas; avoid difficulty; avoid getting stuck
–– Negative: Be fixed and inflexible with words; concentrate on form; try to be fancy; plan
generally; take notes seriously
Lexis
–– Positive: Focus on words; be general with ideas
–– Negative: Be unambitious; rehearse generally not specifically
The most surprising thing here is how little in these categories is directly positive.
Building structure and being led by ideas are positive, but there is little to assist accu-
racy or even fluency in any direct way. Instead the emphasis is on things to avoid
doing, or to do in moderation. It seems that planning can easily contain pitfalls, so
good use of planning means avoiding the pitfalls just as much as doing ‘useful’ things.
It also helps to have realism about one’s abilities so as not to take on too much. Being
flexible and resourceful when things (inevitably) go wrong helps too.
So with these three chapters on facets of planning, where do we end up? First,
there is the issue of needing a wider framework – either readiness (Bui, Chapter 3), or
preparedness (Skehan, Chapter 1), a framework which subsumes planning but ranges
more widely. Bui (this volume) shows that familiarity is a useful form of preparedness,
whose strongest impact is on lexical sophistication, but which also influences accuracy
and fluency, suggesting a clear Formulator focus. It seems that knowing an area, at least
in his study, enables the second language speaker to deal with pre-verbal messages with
attention, but does not seem to be associated with greater Conceptualiser operation.
Another facet of preparedness is to tell a story one has told before. In the present case
(from Wang’s chapter), that means retelling (repeating) a story which was previously
about something new. (One can imagine all sorts of other retellings of things which
were familiar, to various degrees, in the first place.) In her study, the retelling was imme-
diate, although obviously one can imagine other research designs with different time
intervals between the original and the retelling. In any case, in her study, the results
were very clear, leading to increases in performance of considerable magnitude in accuracy
and fluency, and a smaller, but nonetheless very large, effect in complexity.
Interpretations
The repetition effects in Wang (Chapter 2, this volume) are so strong that we have to
search for an explanation. Effective priming might be one such factor. The priming literature
in psychology is extensive (McDonough & Trofimovich 2009). An immediate
relevant preceding context can have a major impact on ease of access of related words.
It may be that memory traces from the first performance are still having an effect and
facilitate the subsequent processing. But plausible as this is, the size of the effect seems to
require a more powerful justification. In that respect, it is useful to compare rehearsal/
retrieval as processes within strategic planning, versus actual repetition of perfor-
mance. The background here is that (from a Leveltian perspective) the Conceptualiser-
Formulator transition is mediated through access to the (second language) mental
lexicon. As mentioned elsewhere in this volume, the first language mental lexicon is
extensive, and the lemmas within it are rich, elaborate, organised and associative. It
contains information on meaning, phonological form, orthographic form, logical rela-
tionships to other lemmas, associative and collocational information, and so on. In the
second language mental lexicon, this is not the case. Lemmas, even where they exist, are
nothing like as rich, elaborate, organized, and associated (although, of course, at higher
proficiency levels this is much less the case). The implication of this for retrieval and
rehearsal as part of strategic planning is that only limited information may be retrieved,
partly because only limited information is available, and partly because even what is
there may not be so accessible as in first language lemmas, and in any case the process
may be very effortful. But this, from the speaker’s perspective, may well be a lot better
than nothing, and may then raise the level of performance subsequently.
volume; in her basic planning condition) confirmed the general finding from the literature –
a complexity-fluency effect, but nothing for accuracy. Pang and Skehan (this
volume) also researched strategic planning and raise a different set of issues with their
hybrid qualitative-quantitative study. They record self-reports of behaviour of partici-
pants focusing on form, but very often these do not actually relate directly to perfor-
mance. Some of their codes (building structure, ideas) are positive in their impact and
reasonably clear to understand. But many are more complex and highlight avoiding
trouble, dealing with trouble, and the unhelpful effects of being over-ambitious. So
some planning behaviours contribute because they support actual performance, while
others are more concerned with what not to do. There are several implications here,
and most of these bear on the fragility of the accuracy effects in the literature. The most
general point is that it is clear that different people do different things when given the
opportunity to plan, and that some of these things raise performance in different areas,
some are neutral, and some are associated with reduced levels of performance in dif-
ferent areas. This being so, it is likely that if planners do the positive things which link
with accuracy (retrieving specific words, structuring the story, rehearsing in a targeted
manner), then error may be reduced. But if they do other things during planning then
it is unlikely to be. In other words, there is a certain degree of vulnerability in planning
behaviours linked to accuracy, and possibly chance factors in the ones which may be
used. The result is that different studies may report different effects depending on how
participants use their planning time. On some occasions the positive influences will
predominate. On others, the negative. Other performance areas, such as complexity
and fluency, do not seem so vulnerable to this variation, although with each, the quali-
tative data reveal potential negative factors – it seems, though, that they do not carry
the same weight. The general influence of over-extension as a result of over-ambition,
though, is there for all performance areas. It would seem that planners’ views of their
own abilities, and their capacity to match planning ambition to realism which is linked
to ability, may be a key factor, and one which has a particular connection to accuracy.
Related to this discussion is another conclusion one can draw from the Pang &
Skehan (this volume) study. This concerns the issue of memory when we examine
the transition from planning to performance. It appears that ideas, structure, and
organisation fade less, and make the transition better from planning to performance,
whereas form, most of the time, does not endure so well and transfers less effectively
(Bygate & Samuda 2005). So there is the possibility that, if ideas and form receive equal
focus during planning, it is the former far more than the latter which impact on per-
formance. This too may have connections to the smaller and less consistently reported
accuracy effects.
In this regard, there is a link to be made with another facet of the chapter by
Wang (this volume). She showed that on-line planning, in itself, was not effective, but
that her online planning condition, when preceded by a Watched condition (which
enabled preparedness) did raise accuracy, and complexity also, both with large effect
sizes. What we seem to have here is a synergy between ideas to be expressed and
the means to achieve such expression – Conceptualiser and Formulator working in
harmony – the first coming from the opportunity to prepare, and the second, feed-
ing off the first, coming from the supportive conditions. Having less pressure when
speaking is associated with higher levels of performance, but only with the previous
opportunity to plan – not simply when there is more time available. But this suggests
a slightly different interpretation of what happens during the supportive conditions,
which would be that less pressuring performance conditions create a better context to
remember what has been planned. In other words, the key could be better retrieval of
what has been prepared because the performance conditions provide enough time for
this to happen. The planning lays the foundation. The supportive conditions enable the
fruits of the planning (accuracy as well as complexity) to be realised.
Pang and Skehan (this volume) also shed light on this. In their study, self-reported
use of planning time to focus on form did not have strong relationships with actual
performance, and indeed the only codes which did relate positively to accuracy were
retrieving specific words, rehearsing in a targeted way, and organising the story. They
proposed that form is vulnerable to fading more easily than is the planning of ideas.
If this is the case, it fits in well with Wang’s results on supported on-line planning. In
this condition, there has been planning, quite possibly of form, and the gentler perfor-
mance conditions may allow that form to be retrieved and used. More time in itself is
not the key – what matters is that foundations have been laid and there is time. It may
well be that in conventional planning studies, participants have used the planning time
to focus on form, but unless this was very specific, the fruits of planning may not have
transferred to performance. Hence, possibly, the variations in reported results.
To sum up this section, it appears we now have to consider what might be called
a ‘planning efficiency’ factor. Not only do we need to consider what happens during
planning. We also need to consider how the potential benefits from this may or may
not be transferred into actual performance. To be sure, this is speculative, but it is
consistent with the results reported in this volume. What it does, fundamentally, is
suggest a need for more qualitative research (Ortega 2005), but research which is now
more focused in how it explores processes of planning and their subsequent impact
on performance.
points’ which can restart more fluent performance. This point is discussed more exten-
sively in the later section on structure.
The other forms of readiness can be interpreted differently, and all seem directed
to enhancing parallel processing. Knowledge of an area, and familiarity generally (see
Bui, this volume), provide organisation of ideas and, possibly, accessibility of lemmas,
as indexed in the Bui study by faster speech and slightly greater lengths of run. One
assumes that such knowledge, and organization and accessibility of ideas, means that
additional Conceptualiser operations are not so necessary, and so Formulator pro-
cesses can benefit from greater attention, and enhance parallel processing in the sense
that more capacity is available for attending to formulation during the current speech
production process. Previous speech (see the repetition condition in Wang, this vol-
ume) seems to promote effective Conceptualiser readiness with major enhancement of
Formulation and Articulation, at least where immediate repetition is concerned. The
impact on accuracy and fluency in Wang’s study is very clear, suggesting that drawing
on a repertoire which has been primed by the previous performance is very effective
indeed. This is all the more remarkable in that complexity also benefits hugely, sug-
gesting that repetition even creates space for a repackaging of the ideas that are to be
expressed, to some degree (Bygate 2001). Repetition is the only condition where all
three performance areas benefit in a very clear way, and so we can now say, the only
area where additional Conceptualiser activity does not seem to be at the expense of
Formulation and Articulation.
The final area to consider is on-line planning. Ellis (2005) has argued that this
form of planning is particularly effective for raising accuracy. We have seen in the pres-
ent volume, through Wang’s research, that simply slowing down the performance, in
itself, did not produce higher levels of accuracy – this had to be linked with some way
of involving pre-task planning or Conceptualiser work. When this was done, through
Wang’s supported online planning condition, which provided strategic planning
opportunity followed by slowed performance conditions to enable better on-line plan-
ning, there was a clear impact on accuracy and complexity, although not fluency. The
provision of greater processing time here does seem to lead to a greater opportunity
to engage in parallel processing, but only if adequate preparation has been achieved
first. Simply providing more time is not enough – there has to be some guidance to
enable this extra time to be appropriately exploited. But this raises another issue –
the relationship between Conceptualiser and Formulator operations. The supported
online planning condition led to the elusive and desirable result of a joint raising of
complexity and accuracy. What seemed to be happening here is that effective think-
ing for speaking took place in the earlier ‘watched’ phase, and this was then carried
over into a performance in which error could be avoided to a greater extent because
of the actual performance conditions in which the video speed had been slowed. Most
other task research designs find ways of promoting either complexity or accuracy. This
upon this advantage for themselves. More generally, one would expect that the opportunity
to prepare, to get material ready for performance, would ease attentional demands and
therefore make parallel processing more likely. What is particularly interesting is that
this does not apply equally to all the things that could be planned for. More specific
ideas and content seem to fade less than does a concentration on form, and on planning
generally rather than specifically. So decisions by speakers to concentrate on Concep-
tualiser preparation seem to pay bigger dividends, and so might also be more effective
at smoothing the operation of speaking and maintaining parallel processing.
The self-reports on planning also convey an awareness, on the part of second lan-
guage speakers, of their limitations and the need to overcome these. It is clear that
over-ambition, relative to a particular speaker’s level of ability, is a major factor. Many
participants report behaviours which over-extend their performance and cause dif-
ficulty. Or, to put this another way, behaviours which push learners so that serial pro-
cessing is more likely because they are trying to be too ambitious, and cannot sustain
parallel performance. In addition, a number of participants show that they are per-
fectly aware that they will encounter difficulty, and have some idea of how they could
prepare to overcome such difficulties. This too is consistent with the idea that they are
aware that they need to get a higher level of performance (i.e. parallel processing) back
on track. In sum, therefore, the view from ‘the inside’ where planning is concerned
shows a lot of evidence that maintaining flow in performance, maintaining a parallel
mode of speaking, is a major factor, one that is influenced by self-reported behav-
iours during the planning period, and also something that the speakers themselves
are aware of.
This section, however, has to be finished with a major caveat. The studies in this
volume have not systematically explored the entire proficiency range. They have, with
one exception, focused on intermediate level learners, with relatively few of these
being at high intermediate level. This even applies to Pang and Skehan (Chapter 4,
this volume) where, even though two proficiency levels were involved, these did not
go outside the intermediate range. The exception is Bui (Chapter 3, this volume) who
did have some low advanced learners in his study, in his higher proficiency condition.
There are important consequences which follow from this limitation. First, it
reduces any claims for generality. The hope in this section, as well as those which fol-
low, is that results and conclusions are applicable more widely, rather than more nar-
rowly. So it would be preferable to be able to claim that insights regarding planning,
for example, are robust and likely to apply widely. The restricted proficiency range,
while not incredibly narrow (intermediate learners are an important group of language
learners in themselves), does, though, mean that claims can only be made about
performance on spoken language tasks by this group.
There is, though, an even more specific version of this caveat. The main part of this
section has explored issues connected to serial-parallel performance, and proposed
that it is differences between the first and second language mental lexicons that are
vital in altering the serial-parallel balance. Clearly, as proficiency increases, it is likely
that the second language mental lexicon will develop (have more elements, better
organization, richer information in lemmas) and that the serial-parallel balance will
be strongly affected. This is likely to become very important as high intermediate and
low advanced levels of proficiency are reached. Such a conclusion simultaneously lim-
its the power of generalizations that can be made and also indicates the urgency of the
need for research in this area.
Structure
Introduction
As in the previous section, we can start by asking what structure in a task is. Of course,
taken more broadly this connects with the literature on discourse analysis (Winter
1976; Hoey 1983) and the psycholinguistics of text structure (Kintsch 1994), both
spoken and written. Discourse analysts have tried to explore how some texts have
analysable and familiar structure, a discourse framework which has stood the test of
time and acquired some universality. Psychologists have explored the importance for
comprehension of concepts such as scripts and schemas, and how these influence our
expectations about what is likely to be said, and how, if there are major cross-cultural
differences, they mislead us and make processing difficult. Discourse analysts have
explored how discourse structure might impact on the ways we organise what we
say, as in things like restaurant scripts, or descriptions of a house or apartment. What
happens in these descriptions is not arbitrary, and follows predictabilities which are
important for listener as well as speaker. The different structures which have been
researched demonstrate that we benefit from working within them, since they remove
unpredictability and provide a shell within which the interplay of ideas and language
is facilitated.
The literature that explores relatively brief oral task-based performance has been
concerned with only a subset of the different ways of characterising structure, and has
concentrated on simple narrative structures (beginning-middle-end) or the problem-
solution structure (Hoey 1983; Winter 1976). (Structure has not particularly figured
in research into interactive tasks.) With simple narrative structures, the sequence gen-
erally consists of some sort of contextualisation, then a development through events,
with some sort of resolution to bring things to a close. The important point here, with
simpler narrative structures, is that there is arbitrariness. There is a need in narra-
tives for development, and a satisfying resolution which brings together and com-
ments upon the events (possibly amusingly). But there is also freedom in what might
happen, and while the development is unfolding all one knows is that there will be an
ending. In contrast, in structures such as problem-solution, the development which
takes place is constantly being related to the problem which has been ‘announced’,
and so the degree of arbitrariness is considerably reduced. The final resolution, when
it comes, is then a comment on the satisfactoriness of the solution that is posed to the
problem which was set. The parameters within which things are judged are therefore
far more precise, and correspondingly, the expectations when the narrative is listened
to or produced are much clearer. What this means for the speaker is that development
consists of handling sub-sections of the structured narrative before one returns to the
broader structure that is motivating the story. In other words, there is a ‘license’ to get
on with minor developments insofar as these minor developments mesh nicely with
the broader development of the story. The speaker, in other words, knows their place
in the story. As a consequence, given the existence of a narrative frame, Conceptualiser
operations are considerably eased, and more attention is available for the Formulator.
processing condition, with a Summary given in one case, and the opportunity to pause
the video in the other. So these factors have to be borne in mind when interpretations
of the effects of structure are provided.
A third point is to consider how comparable the video clips were that were used
in each case. Skehan and Shum (this volume) used four Mr. Bean video clips. Wang
and Skehan (this volume) used four ‘Shaun the Sheep’ clips. There was no difference in
length, no difference in amount of overt dialogue (virtually none). Obviously Mr. Bean
contains ‘real’ characters while Shaun the Sheep is an animated cartoon. Yet one could
easily argue that the sheep, dogs, and pigs in Shaun the Sheep have more convincing
human characteristics than many of the people who appear in Mr. Bean! The key comparison
we can make, however, is in terms of structure. Skehan and Shum (this volume)
range their four videos along a scale of structure, going from no structure (Crazy Golf)
to beginning-middle-end structure (Christmas) to loose problem-solution structure
(Funfair) to tight problem-solution structure (Thief). Tight problem-solution struc-
ture is characterised by clear conformity to Winter’s four step structure of Situation-
Problem-Solution-Evaluation, with no deviation or significant extraneous material,
or sub-threads within the narrative (and this is what distinguishes ‘Thief’ from ‘Funfair’).
Wang and Skehan (this volume) contrast two structured videos (Tooth Fairy,
Bathtime) with two unstructured videos (Fetching, Off the Baa!). The two structured
videos are most comparable to Thief from Skehan and Shum, in that Winter’s problem-solution
structure is clearly present in each case, and although there are many amusing
events along the way, they are all tightly bound into the problem-solution sequence (at
the level of abortive trial solutions, or slightly extended solutions). So, for present pur-
poses, it is fair to locate Wang and Skehan’s two videos as reflecting a tight problem-
solution structure.
We can now explore the results from the two chapters which directly focus on
structure. Skehan and Shum (this volume) report an effect of structure on complexity,
accuracy, and fluency, with structure leading to raised performance in each of
these areas. With complexity, the major contrast is between the most structured video
(Thief) and all the others. With accuracy, each degree of task structure raises accuracy,
although perhaps the major difference is between no structure (Golf) and all tasks that
have some structure, whatever it might be. The impact on fluency (end of clause paus-
ing) resembles that on complexity, in that the most structured video produces clearly
the greatest degree of fluency, and performance on the other videos does not form
a scale. So it is clear that structure is beneficial here, and the obvious point is that the
effect has been found with a video-based narrative retelling, with the time pressure
that that implies. Wang and Skehan (this volume) report a similar but far from identi-
cal story. There are main effect influences for complexity and end-of-clause pausing,
but the major effect reported is that structure has its clearest effect in the There-and-
then condition in that study. In other words, when there is processing pressure, as
in Here-and-now, the effect of structure, while there, is not strong with complexity
and fluency, and not really evident at all with accuracy and lexis. So, despite the tight
structure which characterises the videos used, the structure condition does not really
overcome processing pressure to any degree. Possibly it is the mediated Here-and-now
performances in Skehan and Shum (this volume) which have generated the significant
effect for structure there. But at least with Wang and Skehan (this volume) structure
does have an effect on complexity and fluency (though not great), and a fairly large
effect on these areas for There-and-then. Structure also has an effect on accuracy, but
only with There-and-then, and this is not particularly large.
A final note in covering the results concerns Pang and Skehan (Chapter 4, this
volume). They did not directly investigate structure, since theirs was a qualitatively-
driven study, and so what came up in the qualitative results depended on what the par-
ticipants said. But it is interesting that some participants report using planning time, in
a study based on cartoon picture narrative retellings, to impose some organisation and
structure on the narrative they were going to produce. They seemed spontaneously to
regard structure as valuable. In addition, very importantly, this use of planning time
was associated with higher levels of performance, confirming the results reported by
Skehan and Shum (this volume) and Wang and Skehan (this volume).
Interpretations
The next task is to try to account for these results, and to understand what is going on
psycholinguistically. At the simplest level, it appears that the existence of a macrostruc-
ture does release attention for the speaker, and that the focus for such free attention
will be form, but it would seem now that different aspects of form may receive priority.
Some second language speakers may exploit the released attention to achieve greater
accuracy, in the way that has been predicted in the past. The reduced pressure that follows
from having a clearer macrostructure means that capacity can be directed towards bet-
ter online planning, and also monitoring and repair. Hence the accuracy effects which
have been found – attention has been directed to the Formulator. But equally, any
available attention can also be directed towards rethinking ongoing conceptualisation
and achieving a higher level of structural density, or complex phrasal structure. In fact,
the notion of structure may be directly linked here, in that if the task itself is struc-
tured, there may be more scope to use more complex language, for instance through
subordination, to bring out more clearly the connections between different aspects
of the material in the task. This may be akin to the way planning time interacts with
task complexity, with planning having greater effects when tasks have greater poten-
tial through more elements or the need to transform elements. Task structure may be
functioning a little in the same way – giving learners something more challenging to
express. The greater complexity is a response to the demands of the task, and to the
attentional resources available.
It is also helpful to revisit Levelt’s model for L1 speech, and to explore how it
operates differently in a second language context. As mentioned earlier, in the first
language case, one assumes parallel processing, in that the different modules work
together, so that each is ‘doing its work’ at the same time, dealing with developing
ideas (the Conceptualiser), clothing the ideas in language, in lexis and in syntax (the
Formulator), and then producing that language as actual speech (the Articulator).
The model implies a sequence of speech production, but where the product of each
module is then handed over to the next to accomplish actual speech. All the while
the Conceptualiser continues to produce pre-verbal messages to replace the ones that
the other modules have completed work on, in turn. The only other essential point to
make is that the mental lexicon is accessed to help in the overall process of translating
ideas into language, so that the lemmas in that lexicon (a) are an essential element and
(b) contain the information that is needed to enable language production, including
rich information about meaning, syntax, morphology, collocation, and even articula-
tion. With the efficient operation of this system, the result is that parallel processing is
achieved, in that stages do their jobs very quickly, and so do not consume attentional
resources, in the normal case, enabling the way each stage is simultaneously operative.
The central difficulty in the case of second language performance is that the differ-
ent stages can each encounter difficulties, independently, but in each case with the
result that the parallel flow of operation is disrupted since attention has to be allocated
disproportionately to one of the stages. (It is assumed, though, that the Conceptual-
iser, which is more language independent, is least vulnerable to such disruption.) As
a result, a serial mode of performance is frequently necessary, as the flow and cyclical
progression cannot be sustained.
This analysis of the pitfalls of second language speech production leads to two
central questions:
– For the second language case, what conditions make it most likely that parallel
functioning can be maintained smoothly?
– For the second language case, when there is breakdown, and serial processing
results, how can parallel processing be re-established?
A first major advantage of discourse structure is that it eases the relationship between
the Conceptualiser and the Formulator. Ongoing speech requires parallel operations
in which the speaker simultaneously has to keep track of immediate propositional
content, and also the relationship between that content and the wider discourse. As
a result, Conceptualiser operations can be demanding, and the relationship between
Conceptualiser and Formulator is ongoing and often attention consuming. On occa-
sions where there is structure, however, the speaker has a clear overall framework
within which to speak and so is more able to give attention to the more detailed level
of ongoing pre-verbal messages. The result is that, other things being equal, more
attention is available for Formulation, and so there is a little bit of spare capacity in the
system should occasions arise, as they will, where accessing the mental lexicon, and
retrieving and exploiting lemma information is more demanding. As a result of this, it
is less likely that the Formulator will require excessive attention, and parallel process-
ing can be maintained more often.
But the second advantage of structure to consider relates to retrieval from break-
down. It is a problem when one is speaking a second language that when things go
awry, it can be difficult to retrieve or keep track of the general plan one was trying to
put into operation. The second language mental lexicon is not as extensive, or elabo-
rate, or as well organised, or as fast-access, as the first language lexicon. The demands
placed upon it during continuous speech cannot easily be met, since its operation is
effortful and slow. The result is that one has to find attentional resources not only for
ongoing communication, but also for the route back, so to speak, to enable the original
plan to be recovered and executed. This can have a very considerable effect on the
harmonious production of language.
A major advantage of structure in the speech production process is that a clear
macrostructure to what is being said provides the speaker with multiple opportunities
to restart from a reasonably well-defined point. For example, if a narrative is based on
a tight situation-problem-solution structure, a particular sub-section may cause problems,
but then when the next section is reached, the decks are cleared, as it were, at a
well-defined point in the overall structure, and smoother speech production can be
restarted. This support-through-structure avoids the need to have to reorganise the big
picture of what is being said (a difficult task indeed) and enables the speaker to get on
with relatively local proposition expression. In other words, although a parallel mode
of processing may have been disrupted, a new starting point can enable parallel pro-
cessing to be regained. The result is that once again Conceptualiser-Formulator rela-
tions become more harmonious, and the second language mental lexicon has attention
available to handle the pre-verbal message demands made by the Formulator. And of
course, this can be done more than once! So we see that structure can make a use-
ful contribution to both the questions posed earlier in relation to the maintenance
of parallel processing – easier availability of attentional resources for the Formulator,
and also multiple potential re-entry points to recover parallel processing in the wider
discourse.
All the chapters in this volume have subscribed to a view of attentional and working
memory capacities as limited and have explored consequences of such limitations.
The next section tries to bring together issues which derive from a central aspect of
retelling, or by having the ability to pause the video. These two conditions were asso-
ciated, slightly, with higher levels of performance, with the provision of a summary
before the narrative was done having a good effect on accuracy. The clearest effects
here were with the There-and-then condition which generated raised performance in
all areas, but especially fluency, where end-of-clause pausing was significantly lower
and with reformulations, which were higher, i.e. more dysfluent, but where one could
argue that the greater number of reformulations indicated a greater engagement with
the discourse that was being produced. However, and this is a very important quali-
fication, the cell numbers for the comparisons of processing condition were small, so
the conclusions one can draw from this study are only tentative.
Pang and Skehan (this volume), in their qualitative study of reported planning
behaviours, draw attention to two types of report which bear upon processing. First,
there were participants who reported, in effect, overdoing their ambition when they
were planning, so that the subsequent performance was pressured because of their
own behaviour. Second, there were reports of participants anticipating the problems
of pressure, and preparing for it, to some degree. This study does not directly address
processing influences, but it is interesting that some of the participants themselves
seemed to be aware of the importance of processing pressure, and better performance
was associated with those participants who were realistic and prepared.
The major study in this volume which directly addressed the processing issue is
Wang and Skehan (this volume). As we have just seen, they used a research design
which manipulated structure (discussed above), vocabulary difficulty, and time per-
spective (as the variable bearing upon processing). They report a general and strong
effect for time perspective, with There-and-then performances having more structurally
complex, more accurate, more lexically complex, and more fluent language.
In addition, there is an effect for structure, but only with the There-and-then condition,
in that Structured There-and-then conditions produce the highest performance of all.
This is a much clearer effect than in Skehan and Shum (this volume) and it shows that
processing pressure, which comes from the need to retell a narrative while a video is
running and therefore providing considerable and ever-changing input, has a clear
effect on performance, and one which is not good, in any performance area.
Broadly, then, processing pressure is an issue. One minor and two major factors
will be proposed here as relevant. The minor factor is that of vocabulary difficulty.
It was interesting to manipulate this factor for the first time, but unfortunately, its
impact was limited. Earlier post-hoc work with previous studies had suggested that
vocabulary difficulty can have a disruptive effect on processing, and push second lan-
guage speakers into more serial processing modes, as second language mental lexicon
problems have a strong impact. The evidence in Wang and Skehan (this volume) was
not supportive. Further research here may be warranted, but in a perverse way, we can
conclude for the present that vocabulary burdens for second language speakers were
not as troublesome as was expected, which may have its attractions, pedagogically! If
task input lexical demands had a strong impact on performance irrespective of other
variables, considerable effort would have to be put into scrutinizing pedagogic tasks
in order to avoid accidental points of difficulty. The first major factor is the issue of
quantity/speed of input that has to be handled for a task to be effectively transacted.
The faster and more extensive the input, the greater the problem that is posed for the
speaker. In a sense, this is the opposite of the claim made in the previous section about
structure – that it enables effective restarts after trouble has been encountered. Here,
the problem is the remorselessness of input, and the way this poses serious problems
for the second language speaker. More input means that the speaker has to struggle
more and more to keep up with the input, and its lexical and propositional demands,
with the result that a serial mode of speaking dominates. Restarting is possible, but
not with anything other than a reactive mode. There are no opportunities to regroup,
and link to any general discourse macrostructure – the speaker simply has to deal with
whatever is new in the input, with the result that effective parallel processing becomes
close to impossible.
Related to this is the second major factor – that of non-negotiability. In simultane-
ous narrative tasks, the input has to be heeded. But with other types of task, there are
times when the speaker can shape what is to be said, and in so doing, make things eas-
ier for themselves. In a way, an aspect of the difficulty of simultaneous narrative tasks
is that they are based on input that is non-negotiable and with no time to interpret or
reframe it – the speaker then has the role of describing that input in its own terms. In
other tasks, selection, reorganisation, and alternative orientations become possible.
These have the considerable advantage that the speaker can then play to their strengths
and away from their weaknesses (Foster & Skehan 2013). Since narratives provide less
scope for this to happen, it is clear that they make serial processing more likely, inde-
pendent of amount of input – they deprive the speaker of methods of personalising
the task. Perhaps discourse structure has a role here. At least with some awareness of
the overall structure the speaker can free themselves of input dominance, and decide
on narrative paths of their own, thereby enabling them to find a pathway through the
task that is strategically easier to manage, and so making parallel processing just a little
more likely. In that respect it is worth noting that the chapters in the present volume
(with the exception of Li, this volume) only used narrative tasks. It is likely that inter-
active tasks are going to be much less susceptible to this influence of non-negotiability.
They are much less input-based, and also more likely to be ‘shaped’ by the directions
suitable to the participants.
The results reported in Wang and Skehan (this volume) do provide some
encouragement for Cognition Hypothesis advocates. The There-and-then condi-
tion, proposed as more complex by this hypothesis, did generate higher accuracy
and complexity, as predicted by the Cognition Hypothesis. Unfortunately for the
We have looked, so far, at task features and conditions which impact on performance.
The previous section was concerned with processing issues more directly. We con-
tinue that emphasis here, but the focus is more on the act of speaking itself rather
than the input conditions which ‘surround’ that speaking. The broad theme is how
attention can be used selectively during task performance, and specifically how it can
be ‘nudged’ towards a focus on form. Two related questions underlie this discussion.
First, we have to ask whether attentional functioning and prioritising is influenced
by the learner, or by the task, or the conditions under which a task is done, or some
combination of all of these. Second, we need to explore whether, for performance
on second language speaking tasks, there is any sort of default position regarding
where attention will be directed – to meaning and fluency, to lexical or structural
complexity, or to accuracy.
Learner characteristics have been remarkably under-researched in the task
literature. Working memory has been the most examined, but there are many
other possibilities. Skehan (1986) has discussed analysis-oriented learners versus
memory-oriented learners, and there are many other learner characteristics which
might be relevant here (Skehan 1989; Dörnyei 2005), such as learning style, field
independence, personality, and so on. For now we can only recognise the possibil-
ity that these various factors might have some impact on typical approaches to
attentional priorities, and we can hope for future research to explicate matters.
More central to task explorations have been research studies exploring how dif-
ferent task characteristics can influence performance. Robinson (2001), through
the Cognition Hypothesis, proposes task complexity as the driver here, with its
joint impact on accuracy and complexity. Skehan (2009a), using a limited capac-
ity account, prefers to look for specific potential influences, explored separately or
in combination, to see if there are generalisations that can be made regarding task
quality-by-performance associations. For example, with complexity, tasks which
require information integration (Foster & Skehan 1996) or tasks with problem-
solution structure lead the speaker to encode the relationships which character-
ise the task in more complex syntax, essentially a resource-directing influence. In
fact, the limited attention approach is perfectly comfortable with the concept of a
resource-directing influence: what it does not extend to is a joint influence of task
complexity on accuracy and structural complexity, as a bundle. In any case, the
literature contains many reviews of systematic task influences on performance (e.g.
Skehan 2001; Ellis 2009). The basic point being made here is that task characteris-
tics induce speakers to devote attention to particular areas of performance, and so
the decision to choose a particular task, with its attendant qualities, is a decision to
push for higher levels of performance in particular dimensions. Similar arguments
can be put forward regarding task conditions, as we have seen in this volume, and
attention, but instead to try to guide attention in some way, so that accuracy, specifically,
is fostered and made more important for the speaker. The two studies to be
reviewed in this section are concerned with exactly this challenge, and they explore
ways in which that most difficult aspect of performance to influence, accuracy, can be
nurtured deliberately, rather than simply as a somewhat lucky consequence of a less
demanding task.
Wang’s study (this volume) manipulated several variables, as we have seen. Rel-
evant to the present section is her supported online planning condition. She reported
that when online planning was facilitated, with a slowed video retelling, there was no
impact on performance. However, when this online planning condition was preceded
by the opportunity to engage in strategic planning, it produced higher accuracy and
complexity. In other words, simply having more time was not enough here. It was also
necessary that there be some push towards an orientation to form. As we saw
in Wang’s chapter, the planning probably pushed speakers to have more ideas that they
wanted to express. It may also have pushed them to retrieve and rehearse material,
material which could then be more effectively recalled and used because of the less
pressured performance conditions. A default view of attention here would have been
to argue that simply having more time should have led to an increase in accuracy. This
did not happen. It is clear that something more was needed to bring out the potential
for a focus-on-form. Language had first to be mobilised, and then it assumed greater
importance for the speaker. In other words, a form of guiding was necessary to exploit
the greater attentional potential that the on-line planning condition permitted. Wang’s
on-line planning condition when supported by pre-task planning raised form selec-
tively. Here it appeared that the organisation and preparation facilitated by the pre-
task planning needed the ‘space’ provided by the on-line planning condition (a slowed
video) to release attention which could raise accuracy and complexity. There was noth-
ing in the conditions here which oriented learners towards form – it was rather the
availability of attentional resources (released by the strategic planning) which could
be channelled in this way. One assumes that attentional availability was used to enable
more effective monitoring to be carried out.
The one chapter which directly addresses the issue of selective attention is Li
(Chapter 5). Following earlier research by Skehan and Foster (1997) and Foster and Skehan (2013), she explored
whether anticipating the need to do a post-task transcription of one’s own performance
would have a selective impact on accuracy in performance. Confirming Foster and
Skehan (2013), she demonstrated such an accuracy effect. This suggests that within the
attentional resources that are available during task performance, where tasks are of the
appropriate level of difficulty, it is possible to prioritise particular areas. Li (this vol-
ume) showed that learners who were anticipating post-task transcriptions produced
significantly more accurate language. Knowing that they would be confronted by their
own voices seemed to alert them to the problem of error and the way mistakes they made
–– To placate the researcher: We have just mentioned payment, and more generally,
participants may view a ‘minimum’ level of cooperation as being involved, and
even calculate how little they can get away with. Alternatively, if the researcher has
engaged their interest, they may try harder.
–– To handle the input: Obviously this only involves input-heavy tasks, but we have
seen several of these in the present volume.
–– To say something you want to say: The clear starting point here is that the chosen
task engages the speaker, who has relevant things that can be said. A decision-
making task could be of this nature, if the decisions connect with the value system
of the participant. Planning might make this more likely if it enabled the speaker
to bring to bear on the task their own personal opinions more cogently.
–– To look good, or at least, to avoid looking silly: In effect this would concern situ-
ations where the speaker has awareness of their own speaking, and might want to
look good in the most obvious way – to avoid error. A post-task condition might
fit in here.
–– To do better: Clearly the starting point here is to ask: better than what? This
brings out that the repetition condition in Wang (this volume) would illustrate
this situation – a reference performance has been established which is still likely
to be in the memory of the speaker.
–– To get a good grade: Here the focus is clearly on being tested. This motive is worth
including because it draws attention to the split between findings in the task lit-
erature itself, and studies which have been done with tasks-as-tests (Iwashita et al.
2004). The difference may well be linked to differences in perceived purpose in
doing a test-task.
This discussion of purposes for doing tasks frames the discussion of the two studies
reviewed in this section. The discussion is proposing, essentially, that it is important to
have ‘drivers’ for tasks which are done, influences, that is, which inject some purpose
(Bygate & Samuda 2009). The purpose may come from the participant, or the purpose
may be contrived through the conditions which are used. When there is such a purpose,
either inherent to the participant, or contrived by the researcher, it is assumed that the
degree of focus that the participant brings to doing the task is heightened, and perfor-
mance may be influenced. In Wang (this volume), under the supported on-line condi-
tion, we see that the strategic planning gives the speaker something to say, something
which then energises the performance later, so that later the supportive time condi-
tions are exploited to do greater justice to what has been planned, under the conditions
where retrieval and rehearsal are more accessible. With Li (this volume), the post-task
conditions signal effectively to the speaker that doing the task is not everything, and
that there will be consequences later. As a result, anticipation of what will come later
is the motive which causes speakers to allocate attention to the aspects of performance
that they have been induced to value because of the later condition. An accuracy effect
was found for all conditions, but in addition, it was interesting that individual tran-
scription led to more impressive lexical performance, that pair-based transcription
pushed for greater complexity, and that transcription with revision strengthened the
accuracy effect. In broad terms, Li (this volume) is consistent in her findings with the
same broad class of influence reported in Skehan and Foster (1997) and Foster and
Skehan (2013), although the detail of her different conditions adds to our knowledge
considerably.
It is important to stress here that we assume that the tasks involved in both these
studies are neither very easy nor excessively difficult. But for speakers of the interme-
diate level of proficiency involved, they do constitute a challenge. The speaker has to
focus engendered by a post-task activity. Second, and even more interesting, though,
is that there is now evidence that when a decision-making task is concerned, there
can also be effects for structural complexity. In other words, the focus on form is not
simply on accuracy, but also includes a sensitivity to using more complex language.
Once again there is no correlation between accuracy and complexity, suggesting that
while form is in focus, typically speakers can only manage to achieve more highly in
one area, not both.
We can return to the earlier discussion of the accuracy effect found in Wang's
study for supported online planning. There it was argued that such a condition makes
it more likely that there will be a Conceptualiser-Formulator balance, and that clar-
ity about the general task will free up attention during performance if conditions
are appropriate, as when the video on which the narrative is based is slowed. It may
be that decision-making tasks, because of their turn-and-turn-about nature, create
similar conditions. The opportunity to regroup while one interlocutor is speaking
can give a speaker the opportunity to create the equivalent of a structured task, and
with each new turn, embark on a parallel processing approach to speech. It appears
that with Li's revision condition, the interactive nature of the decision-making task
gives participants time to focus on language, and the revision condition directs this
attention to avoiding error. It is a combination of circumstances which produces a
particular result. So in their way, Wang (this volume) and Li (this volume) have achieved
a similar thing – a good Conceptualiser-Formulator balance, and greater chances of
second language speakers engaging in parallel-process-based speech production.
A problem here though is that there are too many explanations for this one effect.
Li (2010) proposes a sociocultural account – participants collaborate and build an
encounter which takes them collectively further than they would go if operating alone.
At a lower level, and focussing on interactive opportunities, Foster and Skehan (2013)
propose that decision-making tasks enable ‘stealing’ of the interlocutor's language, a
finding which would perhaps apply to complexity and accuracy. They also suggest that,
within interaction, when it is the turn of the interlocutor, one could feign listening,
and thereby finesse planning time. Possibly, also, the immediate and obvious presence
of an interlocutor pushes for precision, and therefore greater accuracy. Finally, if one
has an interactive task, which is not driven by input in the same way as a narrative,
then there is greater negotiability as to what might be said, with the result that difficul-
ties can be avoided and strengths exploited.
This list of possibilities has been briefly enumerated so that we can make a link to
Wang's (this volume) study. Her supported on-line planning condition linked strategic
planning (and opportunity for Conceptualiser work linked to Formulator retrieval and
rehearsal) with an opportunity to speak where unpressured conditions enabled what
had been prepared for to be retrieved and utilised. Here, within interaction, we have
opportunity to plan, during interaction when one's interlocutor is speaking, or oppor-
tunity to steal (similarly while one's interlocutor is speaking) and then immediately
an occasion to use what has been prepared for during this rest period from speaking.
Given the focus injected into the task towards a focus on form (through the post-task
condition) plus the useful ‘planning to performance’ conditions, it is perhaps not so
surprising that interactive tasks are a good locus for experimental effects to be found.
These, following the above analysis, can often involve both complexity and accuracy,
but not often both for the very same participant.
We have now reviewed the studies which form the heart of this volume. The survey has
been extensive, not least because the range of variables which have been studied has
also been extensive. Accordingly, Table 2 shows, in summary form, the range of find-
ings from the studies. Setting them out in this way can then be a prelude to a reflection
on the nature of second language performance and the ways it can be supported, but
equally the ways that difficulties can be caused.
| | Effects | Interpretation |
|---|---|---|
| Planning | | |
| Familiarity | Associated with greater lexical sophistication. Little effect on other performance areas. | More familiar topics enable more specialist vocabulary to be accessed. |
| Conventional pre-task planning | Greater structural complexity and fluency. Few effects on accuracy. | Ideas make the transition from planning to performance most generally and most dependably. Dangers of excessive ambition. |
| Supported on-line planning | Greater structural complexity and greater accuracy. | Good Conceptualiser engagement, and then good Formulator conditions for use of rehearsal and retrieval from planning. |
| Repetition | Strong effects on complexity, accuracy, and fluency. | First performance (a) enables ideas and language to be made more salient, and (b) triggers ‘deep’ lemma activation which is still available for subsequent performance. |
| Structure | Raises accuracy and complexity, especially under There-and-then condition. | Clarifies what Conceptualiser needs to do. Releases attention for Formulator. Enables ‘restarts’ after serial processing. |
| Processing | Accuracy and complexity benefit when processing pressure is reduced, through There-and-then tasks. | Less input pressure enables more focus on form, as does the opportunity to choose how to narrate the story, with freedom from input dependence. |
| Selective Attention | | |
| Supported on-line planning | Raises accuracy and complexity. | Less pressured conditions are vital because the planning has prepared the ground, and the retrieval conditions exploit this planning. |
| Post-task conditions | Raise accuracy, and also lexical and structural complexity, the latter especially with an interactive task. | Pedagogic norms are emphasised through anticipation of the post-task, and attention, even though under some pressure, is directed towards form, especially accuracy. |

Skehan (2009a), revised in Skehan, Bei, Li and Wang (2012), attempts to charac-
terise influences on second language spoken performance through two interlocking
systems. First, influences are organised according to Levelt’s stages of speech produc-
tion (Conceptualisation, Formulation-lexical, Formulation-morphosyntactic, Articu-
lation), and second, task and task condition influences are grouped as leading (a) to
complexification, (b) to pressured performance, (c) to eased performance, and (d) to
focussed/monitored performance. This same approach underlies Table 2. The table
attempts to organise what we have learned about influences on task performance,
largely following a Tradeoff account. The discussion which follows recapitulates the
evidence and argument from several publications (Skehan 2009a, 2009b; Skehan et al.
2012), but emphasises what is new, especially what derives from the chapters in the
present volume.
It is helpful to restate the basics of the Levelt model. Conceptualisation is con-
cerned with the ideas to be expressed, and delivers what is termed the ‘pre-verbal
message’. This is the starting point for Formulator operations, involving retrieval
from the (second language) mental lexicon, the building of morphosyntax, and the
preparation of phonological representations. Finally, the Articulator takes the out-
put of the Formulator and produces actual phonological realisations to capture what
the speaker is trying to say. Each component in the model is meant to function in
modular fashion, so that all three components are working simultaneously, but on
different things. Importantly, of course, this process is ongoing, as speaking contin-
ues, so that Conceptualisation continues to drive forward Formulator and Articula-
tor operations.
A major aspect of Conceptualiser operations, especially for the second language
case, is that it has both a general and also an ongoing-specific aspect. If, for example,
there is planning, the speaker effectively tries to load the Conceptualiser with mate-
rial which will continue to have an impact later in performance. The Conceptualiser,
in other words, has a ‘slow burn’ impact on communication that is not immediate.
On other occasions, and this is obviously the norm, cycles of operation mean that
Conceptualiser operations at Time 1 are passed on to the Formulator at Time 2 (and
to the Articulator at Time 3) while new material will be occupying the Conceptualiser
at Time 2, and so on. In this case, the ideal scenario is that the demands of the
pre-verbal message delivered by the Conceptualiser are met by a rich mental lexicon, and speaking
modules proceed in parallel. As we have seen, this is often not the case with second
language speakers, especially those below advanced levels of proficiency. For them,
speaking is often a process of rescue, as the ideas they would like to express have to be
modified or expressed much more slowly.
However, there is a third manner in which the Conceptualiser can have an impact
on performance which is mid-way between the two just outlined. This occurs when
the Conceptualiser, through planning or through quick-thinking, is able to exploit
macrostructure in what is being said. Conceptualiser operations are partly concerned
with retaining the general structure, and the speaker’s place in it, and partly with the
ongoing detail of current speech. In this case, there is considerable potential for Con-
ceptualiser operations to span several time periods if we are thinking about the macro
planning role that it may discharge.
Against this background, we can discuss the findings in this volume as they impact
upon the second language speaker’s balance of parallel and serial speech performance.
It would seem that the following influences promote a parallel mode of functioning:
–– doing tasks which are structured, with this impacting on Formulator operations,
as there can be focus on what is being said at a particular moment because the
speaker does not have to wrestle with wider organisational issues
–– doing structured tasks in which the speaker is ‘pushed down’ to serial processing,
but where the task structure enables parallel functioning to be regained, since a
fresh starting point can be identified through the clarity of task structure
–– preparedness, which can promote parallel processing variously
   – through retrieval and rehearsal operations which are recalled during the actual
     performance, and which then ease Conceptualiser and Formulator operations
   – through supported online planning, which combines effective use of planning
     with unpressured retrieval conditions while speaking
   – familiarity, through greater access to relevant lemmas and the information
     that they contain
   – immediate repetition, which activates all aspects of performance, and which
     specifically triggers lemma access to the greatest extent possible, thereby
     advantaging Conceptualiser, Formulator, and Articulator
–– unpressured performance conditions, where there is not a constant (and possibly
rapid) flow of new input (included here would be There-and-then time perspec-
tive tasks)
Conversely, of course, there are conditions which make it more likely that serial
processing will be engaged in. At the risk of repetition, since these points simply
reverse those covered above, they are:
–– ineffective preparedness
   – where the speaker has been overly ambitious in the planning which is
     done, with the result that the speaker tries to take on language which is too
     demanding
   – where the speaker has tended to focus on material (e.g. specific forms) which
     tends to fade and then not be usable in actual performance
–– unstructured tasks, in which there is little clear overall structure that the speaker
can use for guidance, or as the basis for retrieving parallel processing after it has
become unsustainable
–– heavier processing pressure, such as quantity or speed of input, typical, in fact, of
Here-and-now conditions
This long section has surveyed the findings from the different research studies in the
book, and shown that portraying performance in terms of complexity, accuracy, lexis,
and fluency is still viable and useful. It has also brought out what progress has been
made through the different chapters. Finally it has attempted to relate these findings to
the way second language speakers are supported or frustrated in achieving the sort of
parallel processing that is the norm for first language speakers. We turn next though
to application, and how the sort of research which has been described might have an
impact within the classroom.
Pedagogy
There are two parts to the Pedagogy section. The first explores issues operative at the
within-lesson level and tries to relate the research-based discussions from the rest
of the book to decisions which have to be made in this context. The second section
outlines some wider principles for the use of tasks over more extended pedagogic
sequences. However, note that it is not the intent of this chapter or book to provide a
detailed discussion of issues of task sequencing (Robinson 2007) or of how a series of
tasks can be linked, within or across lessons, as in a scheme of work (Van Den Branden
2006), or of how project work could be a wider framework within which tasks could
operate (Skehan 1998, 2013). The focus here is on how the kinds of tasks utilized in the
research reported in this volume, and the findings from that research, might provide
one basis for informing task-based pedagogy, though certainly not the only basis.
One additional feature of a task-based approach is important. There are assumptions
here about the role of the teacher and the pedagogic activities s/he orchestrates. It
is assumed, for example, that the teacher is able, at the pre-task stage, to devise and
organise activities that are relevant to task completion other than explicit presentation
and teaching. In addition, at the during-task stage, it is assumed that the teacher
will not be intrusive, but will nonetheless be very alert and in some way paying attention
to the language which is used, and even possibly finding ways of recording it
without interfering in the way tasks are completed. Finally, at the post-task stage, it
is assumed the teacher will have some knowledge of what has happened while the
task was taking place and can draw on this knowledge effectively when any focussed
language work is carried out. The discussion which follows will presuppose such a
(fairly obvious) set of teaching possibilities and teacher behaviour and will link different
stages to ways in which tasks can be used more effectively. This treatment is obviously
quite restricted; a good and broader account of task-based teaching
can be found in Norris (2009).
The framework for this discussion is a series of stages which can be proposed for
second language acquisition (Skehan 2002). The stages are intended to capture how
new language develops, and then how progressively greater control is achieved over
that language. The sequence implied is meant to apply to any particular element in
an emerging interlanguage system, but it is assumed that different elements of the
language being learned will be at different points on this sequence. The sections which
follow will clarify each of these stages. They are:
–– noticing
–– hypothesising
–– complexifying/extending
–– restructuring/integrating
–– repertoire creation, availability, accessibility
–– achieving supported control, avoiding error
–– automatizing
–– lexicalising
We can discuss each of these in turn. The importance of noticing has been recognised
through the work of Schmidt (1990), particularly for noticing in input, and Swain
(1985, 1995) for noticing in one’s own output. Schmidt (1990) argued that noticing is
a necessary precursor to subsequent acquisition – that which will be acquired has first
to be noticed. He emphasised the way input may lead to noticing, and argued that if
conditions can be created where noticing is more likely to occur, then there are greater
possibilities for intake (Corder 1981) and processing. Swain, in contrast, was concerned
with the idea of ‘noticing the gap’ where, through communication, a speaker becomes
aware of a deficiency, and only then may do something to address this deficiency.
Clearly, for each of these possibilities, noticing is only a starting point, but it is a very
important starting point. Two additional points are worth bringing out in that regard.
First, to repeat: noticing is a necessary but not a sufficient condition. More needs to
happen, particularly in developing and consolidating what has been noticed (and see
below for more discussion of this). Second, noticing meshes rather neatly with notions
of developmental readiness (Pienemann 2003). If one assumes, following much sec-
ond language acquisition research, that there are sequences of development, then it is
important not simply to notice, but also to notice the right thing, as it were, in terms
of development. In this way, the learner is more likely to be able to make progress with
what has been noticed, as opposed to something coming into awareness which then
leaves awareness just as quickly.
A benign view of second language acquisition would be that interaction contains
all that is necessary for development to proceed. I am assuming that this is not the
case – a cornerstone of the proposals being made here is that something more needs
to happen. In that light, what is important is that noticing can be built on, and nur-
tured (Skehan 2013). The noticing has to come from the learner, but the teacher can
attempt to trigger and/or elicit and certainly respond to such noticings, and return
to them, with the broad aims of developing and consolidating them. In the stages
indicated above, noticing could easily occur at the pre-task stage, for example, when
planning is taking place or when pre-task input, such as text, is being provided. This
perhaps would be more likely to be a noticing-the-hole in projected output for the
task-to-come. It could also occur at the actual task stage, as input is received from
an interlocutor, input perhaps which is particularly salient because it is important in
task fulfillment, and so the form-function mapping of particular input will be clearer.
In any case, at either of these stages, pre-task or task, noticing could easily occur. We
return then to the role of the teacher. It is important that the noticing does not occur
and then disappear – teacher activity can be good at reminding learners about their
own insights, and then working with these insights.
Similar considerations apply with hypothesising. Once again, it is entirely likely
that hypothesising will take place at the pre-task or task stages. Learners may want
to say something (or they hear something) and realise that this prompts reflection
on interlanguage structure. Once again the motive is the language made salient by
the need to do the task. For example, they may realise that they can extrapolate from
some particular item of language because they see how it may be connected to a wider
rule. Context, as with noticing, is the key, since the language is related to what they
are trying to achieve in doing the task, but hypothesising is potentially more power-
ful than noticing. It indicates a greater breadth and depth regarding the target lan-
guage. And the key here is even more clearly the post-task stage. Of course, during
task preparation the learner may formulate a hypothesis and do so very well, so that
little more needs to be said, but it is more likely that there is a tentative or even unnec-
essarily circumscribed nature to the hypothesis. For these reasons, teacher-focus on
this hypothesis afterwards is crucial. Given that the language involved has been made
salient, and given that there is every chance of readiness, since it was the learner who
formulated the hypothesis, the moment is ripe for teacher follow-up. In other words,
in these circumstances, where the language has been made salient by the learner doing
a task, it may now be appropriate for the teacher to be explicit (where being explicit
earlier risked flirting with the irrelevant). The teacher in other words can now rein-
force the hypothesis, extend it, link it with other parts of language, or even correct a
mistaken hypothesis. Naturally the teacher will have to be judicious in judgements that
are made – one learner’s hypothesis risks being another learner’s boredom or confu-
sion. But assuming the teacher can make good judgements here, the post-task stage,
once again, is vital for ensuring that a good insight, a perceptive hypothesis, is not
abandoned but built upon.
So far, with noticing and hypothesising, the use of tasks has been essentially as a
vehicle so that certain processes occur, and then these processes are exploited at the
post-task stage. At this stage, the teacher can actually be a teacher! In fact, we continue
this pattern (although it will change soon!) with the next couple of stages. Learning a
language is a complex undertaking. Languages are complex systems and sub-systems
and so making inroads into such systems is not easy. Noticing and hypothesising are
good, but only go so far. They are also likely to be limited in scope. If what has been
noticed or hypothesised about is small in scope and self-contained, then perhaps little
more is involved in real development, but often (think of the development of tense
systems, or modality) what is noticed fits into a larger whole. So while the noticing is
essential, what is even more important is that the outcome of that noticing is extended
and connected with other parts of the developing interlanguage system. In other
words, following noticing, there may be a need to complexify, and to see that what is
new bears a relationship to other aspects of the language being learned.
This analysis, though, does not cover all types of development. Sometimes prog-
ress means understanding that previous understanding was partial, and that a larger
system is involved, which pushes the learner to restructure and reorganise. The past
tense in English would be a good example, where at some point the coexistence of
regular and irregular past has to be organised into a more complex system than the
separate item-based or rule-based systems that were previously dominant. In other
words, there is a need to take two steps back to go three steps forward. In such
cases restructuring a developing system, or integrating what was regarded as a self-
contained and independent system into some other larger system may be a fairer way
to capture what is going on in development. Once again, the pre-task and task phases
may provide the insights which are important, particularly if we still think in terms of
readiness. But it is likely to be the post-task phase which is most effective. Then lan-
guage which has been made salient during earlier phases (provided that some record
is available of that language) can come into focus and enable the teacher to help learn-
ers deal with what was uncertain and only partially understood. Once again one has
to emphasise the importance of this being the agenda announced by the learner so
that the teacher, in helping restructuring and integration to occur, is ‘counterpunch-
ing’ to the input that is relevant to the learner. The importance of the post-task phase
cannot be overstated for these developments to occur. Equally, the central way in
which learners can recall the insights that emerged in earlier preparation or com-
munication is vital. They are what drive the usefulness of teacher contributions at the
post-task stage. This is where the teacher can contribute expertise about language,
and consolidate and clarify so that learners have some confidence in the learning that
has taken place (Willis & Willis 2007).
To this point the reader may be thinking that this is an odd presentation of what
happens in task-based approaches to instruction. The focus has been on using tasks,
certainly, and preparing for these tasks, but then the real ‘action’ seems to come at the
post-task stage. The tasks have been important as vehicles to enable useful language to
emerge, but then the significant work is deferred until later. The reason for this is the
problem of new language, sometimes erroneously regarded as a deficiency in a task-
based approach (Swan 2005). What the previous discussion brings out is that there are
ways in which such new language can be brought into focus, something which it seems
has to be made clear for critics of a task-based approach. The other side of this is that
it is vital, in the methodology outlined above, that what
is ‘announced’ as the focus for such language work is material that makes sense to the
learner and which the learner is ready for. The problem in instruction (task-based or
otherwise) is not new language in itself, but which new language. The view taken here
is that it is important that it is the learner, not the syllabus designer or materials writer,
who is influencing what will be done. It is considered that this is crucial, and consistent
with contemporary second language acquisition research. That the treatment comes a little
later than the need was ‘announced’ is not the issue (although, of course, the teacher
may also respond to learners mid-task, if that is appropriate, and not disruptive or too
extensive). The central point is that the language concerned makes sense given the
learners’ current stage of development (Skehan 2007).
Now we can move on, and in so doing, bring conventional approaches to task
performance more into focus. What we have done so far is look at the language which
emerges from completing tasks, and how that language can be made less transitory
and instead contribute to an evolving and complexifying interlanguage system. How-
ever, now we are assuming that some new language has been noticed, hypotheses have
been formulated, and complexification, extension, and integration have taken place,
wherever they are appropriate. In other words, the learner is now clearly aware of
features of the target language of which he or she was not aware before.
Essentially, the next stage consists of exploring how a degree of control can be
achieved with this new language. How, that is, can language which has been wres-
tled with, possibly laboriously, be converted into language which can be readily used,
appropriately, and with reasonable speed and lack of error? The first stage in acquiring
greater control is to be able to use such new language under supportive conditions
(and possibly not totally accurately at first). In other words, it is assumed that when
some new language is apprehended and linked to previous interlanguage, there is no
magic way in which this language is suddenly available for correct use in real time in
a range of situations. It has to be nurtured and control developed gradually. It is here
that much of the previous discussion on tasks is relevant. We have seen (and this is
captured in Table 2) that a whole range of influences enhance levels of accuracy and
fluency. In effect, these are the conditions which are needed to make the development
of greater control a reality. So, for example:
are all very important here. By choosing tasks which maximise accuracy and fluency,
and task conditions likewise, the learner is being supported to achieve greater levels of
control. This can assist first stages in making the transition from halting speech pro-
duction to the capacity to use language in real-time. In effect we are dealing here, more
generally, with either creating tasks and conditions for more attentional resources to
be available for the speaker, so that greater accuracy, for example, can be achieved; or
for tasks and conditions to push learners to higher performance in particular areas,
again highlighting accuracy; or for a situation where attention is ‘nudged’ in particular
directions. So knowledge of research into second language production is relevant to
helping learners to follow desirable pedagogic directions.
An important point needs restating here. It is clear that the use of tasks in this way
draws on the importance of implicit learning and even of practice, since the learner
is being given (supportive) opportunities to gain control over language. We are now
beyond the stages where new language has been noticed, complexified, integrated, and
instead are concerned with making this newly-acquired knowledge implicit. A pro-
ponent of a presentation-practice-production (3Ps) approach might then claim that
this view of tasks is no different from the role ascribed to them by task critics such as
Swan (2005) or Bruton (2002), or even task sympathisers such as Littlewood (2004).
The key point, though, is the issue of what language is being used, and that in turn
connects with the issue of pre-selection. A 3Ps approach is characterised by selections
being made by the teacher/materials writer, and then the presentation phase is a phase
working on something selected by someone other than the learner. A task approach
is one where the language which was earlier selected for treatment was selected by
the learner or emerged from learner performance. The use of task-informed criteria
here to promote control is consistent with this view that the selection comes from the
learner. There is no pre-selection of forms for task performance – that is the learner’s
choice. The decisions linked to task selection and task implementation are intended
to create conditions to support control of the language which is chosen by the learner.
In fact, developing this point, we can return to the usefulness of the post-task
stage. Where the language that is being focussed on has emerged from the learner,
there is no reason, if a teacher deems this appropriate, for the post-task stage not to
include practice activities. So far we have regarded this stage as one where noticing,
hypothesising, complexifying, and restructuring generate a fairly cognitive view of
language itself, of its patterns and of the emerging rule-governed system, but
developing language skill, as we all know, involves more than insight and understanding.
It also involves performance, and if a teacher decides that some aspect of language
is emerging, but could benefit from more traditional practice activities, then there is
no reason not to use them at this later stage in a teaching sequence. The point, a bit
laboured by now, is only that what is being practised is a response to learner need, not
syllabus prescription.
Continuing this analysis of tasks in terms of the development of control, the next
two stages are essentially extensions of what we have already seen. Automatising, fol-
lowing Anderson (2004), consists of speeding up performance while eliminating error.
Supported control, the previous stage, is likely to be characterised by slow perfor-
mance, and the intrusion of errors. Automatisation does not really involve much that
is different but it does lead to a greater degree of confidence in performance, and even
robustness in the face of contextual difficulties. With tasks, a similar analysis operates. A
range of influences are relevant in creating the conditions in which automatisation is
more likely to occur, and in some ways, these are an extension of what was mentioned
with supported control. But another factor which is important is, in a sense, the reverse
argument. Choosing tasks and task conditions to nurture automatisation is one thing,
but it does not serve learners if they only develop the capacity to use language in sup-
portive conditions. So another goal, where automatisation is concerned, is to choose
tasks and task conditions to put pressure on learners so that they feel more comfort-
able functioning in the wider range of circumstances they will encounter in the real
world. So automatisation here is a speed factor, but also a generalisation challenge,
which can only be attempted when some degree of automatisation has been achieved.
The final stage which is given here is lexicalisation. This stage is proposed (Skehan
1998) to reflect a dual-mode system in which on the one hand we can use rule-based
language produced quickly as highly automatised, or on the other, the products of such
rule-based language can be lexicalised and then produced as exemplars or chunks.
Such a mode enables not simply speed of processing but also the advantage that there
are not many computational demands, so that attention is, to some extent, freed up
while performance is ongoing (Skehan 2013).
Clearly using tasks is compatible with fostering fluency on the basis of lexicalisa-
tion. Using tasks which support Formulator operations, for example, will help in any
process of lexicalisation that might occur. But in all truth, it is likely that it is ask-
ing too much of tasks to expect them to contribute significantly to any such process.
Of course they will do much more than many other teaching methodologies, but the
amount of communication that is necessary for any process of lexicalisation to occur is
probably too great for any task-based approach to deliver. There simply is not enough
time for things to develop in this way. Lexicalisation is a desirable goal, but one that
probably can only be achieved by long and extensive exposure to the target language
in question. Task-based syllabuses developed to cover several years may achieve this,
although such a claim is currently speculative. Shorter-term task use would struggle to
achieve wide-ranging lexicalisation.
The sequence which has just been described runs from the new, from something
that is perhaps not understood completely and is used haltingly and sometimes incorrectly,
to a point where the formerly new language is now well-integrated into a developing
system and can be used in real time, fluently and correctly, and even without
undue processing effort. But there is another aspect of the sequence which we have
temporarily left out and which now needs further consideration. This is the issue of saliency
or repertoire creation, what in French can be referred to as disponibilité. The previous
discussion has assumed there is some aspect of the language system and its choice
for use is self-evident – if something is known, it will be used. But such an approach
misses a very important point about language learning – one may know a great deal
of language that one does not use. So in addition to trying to teach new things, a goal
of teaching has to be to increase the access that the learner has to what is known, but
whose relevance and usefulness may not be appreciated. In other words, if learners
have found methods of solving communicative problems which are not pretty, or help-
ful for development, but are nonetheless effective, they may learn new aspects of the
target language, but not use them. They may plateau at a certain, unnecessary level
because they can get by using older methods of solving problems. So the teaching chal-
lenge is not simply to introduce new forms effectively, but also to get those new forms
to supplant older language or at least to become part of a communicative repertoire.
For this goal, a task-based approach is very well suited. Tasks, especially tasks
which are reflected upon afterwards, can support learners to develop such a repertoire,
and for them to see how what has been learned has communicative utility. In this case,
the range of variables which have been shown to influence performance can be related
to the promotion of accessibility. The emphasis here will be on Formulator operations,
rather than Conceptualisation, so that either attention is made more available when the
surface structure of language is being put in place, or there is a focus on accuracy to
some degree (e.g. through post-task activities, or through monitoring). These can make
it more likely that the newer forms will not simply be kept for a rainy day, so to speak, but have
sufficient salience that they can become usable even in more difficult circumstances.
The last section has tried to clarify the pedagogic contributions that a task-based
approach can uniquely make. The section, though, was driven by the sequence of what
What this set of principles tried to address is the tension between two statements:
Second language acquisition demonstrates that internal factors have a strong influence
on patterns of development, such that learners do not necessarily learn what teachers
teach
vs.
Some degree of system and completeness in what is being learned is preferable
The principles tried to strike a balance between these two statements. Unrestrained
applications of a task-based approach based on relatively brief speaking tasks would
risk over-valuing the first statement at the expense of the second. Applications of tra-
ditional approaches would risk over-valuing the second at the expense of the first. So,
if one regards the first two principles above as somewhat preliminary, the third and
the fourth attempt to nudge learners towards a focus-on-form, and the fifth, final, and
very important principle suggests that teachers have an important role in monitoring
the development of their students and designing pedagogic activities which deal with
the lacunae in learning, and orient the input (e.g. through pre-task work, task selec-
tion) towards areas which have not been developed.
But there is vagueness in these principles, and I would like to modify these pro-
posals slightly. First, I would now add to the third and fourth principles a set of sub-
principles. These are:
–– complexifying
–– pressuring
–– easing
–– focussing
–– monitoring
Note: This proposal is not driven by any precept based on learners’ functional needs (such as Long &
Crookes 1992). It could accommodate such an approach, but in fact offers greater freedom to the
teacher or course-designer regarding the basis for task choice.
In other words, to make the general principles more accessible, I would suggest using
the sorts of outcomes captured in Table 2, based as this table is on a range of empiri-
cal results, to make more specific the sorts of things that could promote balanced goal
development, and a strong focus on form. We have learned quite a lot from research
as to how a general focus on form can be promoted. So the principles are not quite so
abstract now as they were when first proposed in 1998, and the claims follow from a
range of research results.
Second, I would also like to add a new principle, and ideally place it as a new fifth
principle (pushing the old last principle, use cycles of accountability) down to sixth
position. The new principle would be:
5. Use the post-task phase to nurture language made salient by the task, through:
–– explanation
–– extension
–– integration
–– practice and consolidation
As we have seen in earlier discussion, the post-task phase is vital as the place to capi-
talise on the language which has been made salient by the task. The language which
emerges in the task is the language which is relevant to learners. But the operations
on that language orchestrated by the teacher can enable the sixth principle, the use of
cycles of accountability, to function more effectively, because it is here that the teacher
can select from the range of language made salient that particular language which it is
most propitious to work on, safe in the knowledge that it will be learner-led language.
This may be developmental language, or it may be language which needs further con-
solidation and practice. It could even be new language, in the sense that a task may
have created a need to mean, and then the teacher can supply that need in a focussed
manner. This is quite a challenge for the teacher. Learners differ; experiences differ.
As a result, there may be a range of candidate language elements to pick up on at the
post-task stage, not all of which can receive focus. It falls to the teacher’s professionalism
and training to decide which of these to work with, which to defer until possibly later,
and which to ignore.
We can now represent the set of principles in a more complete, comprehensive
form:
–– easing
–– focussing
–– monitoring
4. Maximise the chances of a focus on form through attention manipulation, through:
–– complexifying
–– pressuring
–– easing
–– focussing
–– monitoring
5. Use the post-task phase to nurture language made salient by the task, through:
–– explanation
–– extension
–– integration
–– practice and consolidation
6. Use cycles of accountability
So far, we have taken what could be considered to be a micro stance towards pedagogy.
Sub-principles for 3 and 4 concern relatively small-scale tasks, and the task cycle that
is envisaged here would be completed within one or two lessons (and could be broadly
similar to the methodology proposed by Willis & Willis 2007). But teaching extends
over more than just a short time span, obviously, and so, if pedagogic planning is to be
effective, it needs to have means of organising these longer stretches of time.
Project work is one such method of linking a series of tasks in a way that retains
the focus on meaning that tasks provide, but at the same time is susceptible to longer
stretches of planned teaching. Projects, and series of projects, can be designed to
occupy long stretches of teaching. But if that is done, the post-task work which has
been described so far needs to be conceptualised slightly differently, since it is here
that the sixth principle becomes important. ‘Micro’ post-task work takes what has
emerged from a task or group of tasks, and responds to the needs and opportunities
which emerge (Principle 5). In a sense, the teacher’s decision is to examine what is
available, what has become salient through the task, and from the range of possibilities,
choose those which would sensibly be worked on. If, though, one has more extensive
task-based performances to work with, which extend over time, then there is the need
to keep records, explore what has been achieved over longer timespans by particular
learners, and make decisions accordingly, decisions which can be collaboratively nego-
tiated and made with students. In other words, the notion of using cycles of account-
ability, where responsibility is shared between learners and teachers, becomes more
important. It can become clearer, with reflective post-task work of this sort, where there
are still gaps and what needs to be focussed on in the future. The broad parameters of
learners not necessarily learning what teachers teach still apply, but the reflection can
give insights as to what tasks might best be chosen and how they might be exploited. In
this way, the capacity to ensure some degree of systematicity is enhanced very considerably,
and we have a bridge between micro and macro perspectives on tasks.
References
Anderson, J.R. (2004). Cognitive psychology and its implications (6th ed.). New York, NY: Worth.
Bruton, A. (2002). From tasking purposes to purposing tasks. English Language Teaching Journal,
56, 280–288.
Bygate, M. (2001). Effects of task repetition on the structure and control of oral language. In M. Bygate,
P. Skehan, & M. Swain (Eds.), Researching pedagogic tasks (pp. 23–48). London: Longman.
Bygate, M. (2006). Areas of research that influence L2 speaking instruction. In E. Uso-Juan &
A. Martinez-Flor (Eds.), Current trends in the development and teaching of the four language
skills (pp. 159–186). Berlin: Mouton de Gruyter.
Bygate, M., & Samuda, V. (2009). Creating pressure in task pedagogy: The joint roles of field, purpose,
and engagement within the interactional approach. In A. Mackey & C. Polio (Eds.), Multiple
perspectives on interaction (pp. 90–116). New York, NY: Routledge.
Bygate, M., Skehan, P., & Swain, M. (Eds.) (2001). Researching pedagogic tasks. London: Longman.
Corder, S. Pit (1981). Error analysis and interlanguage. Oxford: OUP.
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition,
11, 367–383.
Dornyei, Z. (2005). The psychology of the language learner: Individual differences in second language
acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.
Ellis, R. (1987). Interlanguage variability in narrative discourse: Style shifting in the use of the past
tense. Studies in Second Language Acquisition, 9, 12–20.
Ellis, R. (2005). Planning and task-based performance: Theory and research. In R. Ellis (Ed.), Plan-
ning and task performance in a second language (pp. 3–34). Amsterdam: John Benjamins.
Ellis, R. (2009). The differential effects of three types of task planning on the fluency, complexity, and
accuracy in L2 oral production. Applied Linguistics, 30(4), 474–509.
Foster, P., & Skehan, P. (1996). The influence of planning on performance in task-based learning.
Studies in Second Language Acquisition, 18, 299–324.
Hoey, M. (1983). On the surface of discourse. London: George Allen and Unwin.
Kintsch, W. (1994). The psychology of discourse processing. In M.A. Gernsbacher (Ed.), Handbook
of psycholinguistics (pp. 721–740). San Diego, CA: Academic Press.
Kormos, J. (1999). Monitoring and self-repair. Language Learning, 49, 303–342.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum Associates.
Levelt, W.J. (1989). Speaking: From intention to articulation. Cambridge: CUP.
Levelt, W.J. (1999). Language production: A blueprint of the speaker. In C. Brown & P. Hagoort
(Eds.), Neurocognition of language (pp. 83–122). Oxford: OUP.
Li, Q. (2010). Focus on form in task based language teaching: exploring the effects of post-task activities
and task practice on learners’ oral performance. Unpublished Ph.D. thesis. Chinese University
of Hong Kong.
Littlewood, W. (2004). The task-based approach: Some problems and suggestions. English Language
Teaching Journal, 58(4), 319–326.
Long, M.H., & Crookes, G. (1992). Three approaches to task-based syllabus design. TESOL Quar-
terly, 26, 27–55.
Loschky, L., & Bley-Vroman, R. (1993). Grammar and task-based methodology. In G. Crookes &
S. Gass (Eds.), Tasks and language learning: Integrating theory and practice. Clevedon: Multilin-
gual Matters.
McDonough, K., & Trofimovich, P. (2009). Using priming methods in second language research.
London: Routledge.
Norris, J. (2009). Task-based teaching and testing. In M.H. Long & C. Doughty (Eds.), Handbook of
language teaching (pp. 578–594). Oxford: Blackwell.
O'Malley, J.M., & Chamot, A.U. (1990). Learning strategies in second language acquisition. Cambridge:
Cambridge University Press.
Ortega, L. (2005). What do learners plan? Learner-driven attention to form during pre-task planning.
In R. Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam:
John Benjamins.
Pienemann, M. (2003). Language processing capacity. In C. Doughty & M. H. Long (Eds.), The hand-
book of second language acquisition (pp. 679–714). Oxford: Blackwell.
Pinter, A. (2005). Task repetition with a 10-year-old. In C. Edwards & J. Willis (Eds.), Teachers explor-
ing tasks in English language teaching (pp. 113–126). Basingstoke: Palgrave Macmillan.
Robinson, P. (2001). Task complexity, cognitive resources, and syllabus design: A triadic framework
for examining task influences on SLA. In P. Robinson (Ed.), Cognition and second language
instruction (pp. 287–318). Cambridge: CUP.
Robinson, P. (2011). Second language task complexity, the Cognition Hypothesis, language learning,
and performance. In P. Robinson (Ed.), Second language task complexity: Researching
the Cognition Hypothesis of language learning and performance (pp. 3–38). Amsterdam: John
Benjamins.
Robinson, P., & Gilabert, R. (2007). Task complexity, the Cognition Hypothesis, and second language
learning and performance. International Review of Applied Linguistics, 45, 161–176.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11,
17–46.
Skehan, P. (1986). Cluster analysis and the identification of learner types. In V. Cook (Ed.), Experi-
mental approaches to second language acquisition (pp. 81–94). Oxford: Pergamon.
Skehan, P. (1989). Individual differences in second language learning. London: Edward Arnold.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: OUP.
Skehan, P. (2001). Tasks and language performance. In M. Bygate, P. Skehan, & M. Swain (Eds.),
Researching pedagogic tasks: Second language learning, teaching, and testing (pp. 167–185).
London: Longman.
Skehan, P. (2007). Task research and language teaching: Reciprocal relationships. In S. Fotos &
H. Nassaji (Eds.), Form-focused instruction and teacher education: Studies in honour of Rod Ellis
(pp. 55–69). Oxford: OUP.
Skehan, P. (2009a). Modelling second language performance: Integrating complexity, accuracy,
fluency and lexis. Applied Linguistics, 30(4), 510–532.
Skehan, P. (2009b). Models of speaking and the assessment of second language proficiency. In
A. Benati (Ed.), Issues in second language proficiency (pp. 202–215). London: Continuum.
Skehan, P. (2013). Nurturing noticing. In J. Bergleitner & S. Fotos (Eds.), Festschrift in honour of
Richard Schmidt. Honolulu, HI: National Foreign Language Resource Center.
Skehan, P., Bei, X., Li, Q., & Wang, Z. (2012). The task is not enough: Processing approaches to
task-based performance. Language Teaching Research, 16(2), 170–187.
Skehan, P., & Foster, P. (1997). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1(3), 185–211.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions on narrative
retellings. Language Learning, 49(1), 93–120.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and compre-
hensible output in its development. In S. Gass & C. Madden (Eds.), Input in second language
acquisition. Rowley, MA: Newbury House.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer
(Eds.), Principle and practice in applied linguistics (pp. 245–256). Oxford: OUP.
Swan, M. (2005). Legislating by hypothesis: The case of task-based instruction. Applied Linguistics,
26, 376–401.
Tannen, D. (1989). Talking voices: Repetition, dialogue, and imagery in conversational discourse.
Cambridge: CUP.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–276). Amsterdam:
John Benjamins.
Wang, Z. (2009). Modelling speech production and performance: Evidence from five types of plan-
ning and two task structures. Unpublished Ph.D. thesis. Chinese University of Hong Kong.
Willis, J. (1996). A framework for task-based learning. London: Longman.
Willis, D., & Willis, J. (2007). Doing task-based teaching. Oxford: OUP.
Winter, E. (1976). Fundamentals of information structure: A pilot manual for further development
according to student need. Hatfield, Herts: The Hatfield Polytechnic Linguistics Group, School
of Humanities.
Author Biodata
BUI Hiu Yuet, Gavin obtained his Ph.D. in applied linguistics from The Chinese
University of Hong Kong. Currently he is Assistant Professor at the English Depart-
ment of Hang Seng Management College in Hong Kong where he teaches linguistics
and applied linguistics courses with some occasional addition of EAP/ESP classes.
Dr. Bui’s research interests include task-based language teaching, psycholinguistics,
and second language acquisition.
LI Qian, Christina, obtained her Ph.D. in applied linguistics from the Chinese
University of Hong Kong. Currently, she is an assistant research professor in English
at Guangdong University of Foreign Studies, China. Her interests include task-based
language teaching and research, the acquisition of formulaic sequences by L2 speakers
and bilingual lexicography. Her most recent articles appeared in Language Teaching
Research (2012) (coauthored with Skehan, Bei, Wang) and Foreign Language Teaching
and Research (2013).
PANG Soi Meng, Francine, obtained her Ph.D. in applied linguistics from the Chinese
University of Hong Kong. She is currently Associate Professor at Macao Polytechnic
Institute, and before that she was Assistant Professor and Postdoctoral Fellow at the
Chinese University of Hong Kong and the University of Macao. Dr. Pang has lectured
in applied linguistics, psycholinguistics and Business English. Dr. Pang’s research
interests include second language acquisition, second language reading, and second
language task planning behaviour.
WANG Zhan (Jan) is a postdoctoral researcher in the Learning Research and Devel-
opment Center (LRDC), University of Pittsburgh. She works on projects related to
fostering second language fluency and first language reading development, funded by
the NSF at the Pittsburgh Science of Learning Center (PSLC). She received her Ph.D.
in Applied Linguistics from the Chinese University of Hong Kong.
Index
A C content-based instruction 89
accuracy ix, 2–4, 7, 10, 12–18, causal structure 195 content familiarity 212, 237
34–37, 41–42, 44–53, 63–65, 68, CHAT 15–16, 21, 42, 101, 138, control ix, 2–3, 33–35, 37–40,
71–72, 76–80, 82–90, 99–100, 166, 196–197 101, 147, 181, 202, 248, 251–253
102, 107–111, 116–121, 131–133, CHILDES 15, 72 controlled processing 49
138, 140–150, 155–163, 166–178, Chinese University xi, x, xii, critical period 32
187–194, 197–207, 213–223, 9–10, 211, 261–262 cycles of accountability 255–257
227–228, 231–244, 252, 254 chunks 22, 79, 81–82, 161, 253
accuracy clause length 76–77, 83 CLAN 15, 21, 41–42, 138, D
accuracy versus complexity 166–167, 197 declarative memory 29, 32
effects 96 clause boundary pausing 19, default view of attention 238
analysis-oriented learners 236 113, 118 disponibilité 254
anticipating post-task clause-end pauses 71, 74, 75 dual-mode system 253
transcription 238 cloze test 135 easing 13–14, 35, 52, 157, 178–180,
190, 208, 235, 237, 255, 257
articulation 10, 27–29, 36–38, 47–49, 79, 221–223, 229, 244
articulator 5, 28–30, 34, 218, 244–245
AS-unit 16, 138, 142–143, 147, 168–169, 198–199, 221
assessment 1, 237
attention xi, 3, 10, 34–35, 48–53, 82, 96, 126–127, 129–132, 145–150, 178–179, 189–191, 203–204, 206–207, 211–212, 214, 221–223, 228–231, 236–244, 252–255, 257
attentional limitations 3, 13, 156–157, 176
authenticity 218
automatic processing 29, 31
automatisation 21, 175, 181, 253
automatising 253
avoiding error 206, 215, 241–242, 248

B
background information 156, 190
beginning-middle-end structure 196, 227
breakdown 19–20, 41, 71, 73–76, 80, 83, 85–87, 132, 167–168, 172, 175, 192, 197, 205, 229–230
breakdown fluency 19, 71, 73–76, 80, 83, 85–87, 167, 175, 192, 197
British National Corpus 22
coding scheme 11, 95, 97–99, 102–104, 116, 125, 215–216
Cognition Hypothesis ix, xii, 3–4, 7–9, 13, 68, 96, 120–121, 155–161, 163, 174–177, 191, 231–232, 234–236, 241
Cognition-Tradeoff debate 212, 231
cognitive comparison 31, 145, 148
Cohen’s d 42, 72–77, 83, 86, 141–144
collaborative dialogue 130
collaborative transcribing 133–134
communicative language teaching (CLT) 1, 32, 130
complexity ix, x, 2–7, 9–10, 12–16, 32–37, 41–42, 44–50, 63–65, 68, 71–72, 77–78, 80–88, 90, 96, 102–103, 111, 116–121, 131–132, 140–144, 146–150, 155–163, 166–179, 181, 189–194, 197–199, 201–208, 211–223, 227–228, 230–232, 234–244, 246
complexifying 9, 14, 157, 178, 208, 248, 251, 253, 255–257
conceptualisation 10, 99, 107, 157, 179–180, 189, 213, 221, 223, 228, 244, 254
Conceptualiser 5, 79, 95, 107, 175–176, 178–179, 190, 206–208, 215, 217, 220–222, 224, 226, 229–230, 242–245
conjoint influence 13, 235
effect size 42–43, 45, 73–74, 76–77, 139, 142–143, 201
emerging rule-governed system 253
encapsulated 221
encoding specificity principle 79
end-of-clause pausing 21, 175, 187, 201, 227, 233
error correction 148
error free clauses 41, 46, 107–108, 110, 142, 166, 168, 171, 192, 198, 201
error gravity 18, 193, 197
errors per 100 words 17–18, 76–77, 138, 141, 192–193
exemplars 253
exemplar-based system 79
extended pedagogic sequences 246

F
factor analysis 167, 169
false starts 20, 71, 73, 76, 168–170, 172–176
familiarity 5–7, 14, 51, 63, 65–71, 73–90, 104, 212, 214, 217–218, 222, 237, 243, 245
feedback 2, 7, 14, 40, 130, 132, 137, 147–148
filled pauses 19–21, 71, 197
flow 2, 18, 20, 28, 160, 197, 204, 224, 229, 232, 235
notice the gap 32
noticing 129, 132, 145–146, 248–250, 253
noticing the hole 146
noun phrase complexity 82–84
nudged 236, 252

O
on-line planning 8, 10–11, 14, 27, 33, 35–37, 39–40, 42, 45–48, 50–53, 63, 65, 79–80, 87, 96, 118, 159, 176, 191, 219–220, 222, 231, 238, 242–244, 252
operating principles 121
opportunity to negotiate 246
organisational structure 204, 221
over-ambition 95, 111, 219, 224
overt speech plan 28, 52

P
pair-based transcription 239–240
pair transcription 134
pair transcribing 137–138, 140, 142–144, 146–147, 149–150
parallel (mode of) processing
partial lemma access
pausing 18–21, 73, 102, 107, 109, 111–115, 117–119, 168–172, 174–176, 187, 189, 196–197, 201, 204, 207, 227, 233
pause location 192
pedagogic norms 244
pedagogic principles 14
pedagogy x, 1, 4, 13–14, 27, 52, 88, 129, 131–132, 149, 151, 155, 206, 208, 211, 218, 246–247, 257
phonation time 20, 73, 180
phonological plans 215
pickup points 221
planning xi, x, 3, 5, 7–14, 27, 33–40, 42–53, 60–61, 63–71, 73–90, 95–107, 109, 111, 114–122, 124–127, 130, 136, 156, 158–159, 176, 178, 189, 191–192, 205–206, 211–224, 228, 231–233, 237–240, 242–247, 249, 252, 257, 261
planning efficiency 220
planning time 11, 34–36, 39–40, 45, 50, 65, 73, 75, 81, 83–85, 87, 89, 98, 100–101, 111, 118–120, 124, 127, 158–159, 214–215, 219–221, 223, 228, 231, 242
planning-as-familiarity 214
planning-as-organisation 221
planning-as-time 214
planning-while-speaking 191
PLex 22, 165
post xi, 3, 5, 8–9, 12, 51, 96, 129, 131–134, 136–139, 141–148, 150–151, 156, 238–244, 247, 249–252, 254, 256–257
post-task activities 5, 8, 51, 84, 129, 131–133, 137, 150, 156, 239, 247, 254
post-task focus stage 32,
post-task manipulation 12
post-task phase 3, 9, 14, 132, 250, 256–257
post-task stage 129, 131, 136, 144–146, 151, 249–252, 256
post-task transcribing condition 129, 138
post-task transcription 12, 133–134, 138, 144, 148, 150, 238–239
practice 32, 66, 136, 146, 165, 252–253, 256–257
practice activities 253
prefabricated expressions 81
preparedness 5, 7–8, 11, 14, 65, 89, 191, 212–213, 217–218, 220, 223, 245–246
pre-selection 252
presentation-practice-production 252
pre-task influence 212
pre-task planning 3, 7, 10, 33, 35–36, 49–51, 64–65, 70, 75–76, 80–82, 130, 191, 214, 221–222, 238, 243
pre-verbal message 28, 49, 52, 78–79, 107, 157, 230, 244–245
pre-watching 33–36, 39–40, 47, 49–50
pressuring 14, 157, 178, 180, 208, 220, 255–257
prime 223
primed 78, 222
priming 217, 223
problem of new language 251
problem-solution structure 164, 187–190, 196, 200, 203, 207, 225, 227, 236
procedural memory 29, 32
processing ix, x, xi, xii, 1–3, 5, 8, 10, 13–14, 27–32, 37, 39, 48–49, 51–52, 64, 68, 79–82, 107, 119, 145, 155, 157, 160–163, 171, 175, 177–180, 187, 190, 195–196, 198–202, 205–208, 211–212, 215, 217–218, 220–237, 242–243, 245–246, 248, 253–254
processing approach xi, 68, 242
processing capacity 64, 79, 82
processing conditions 8, 10, 13, 162, 175, 178, 187, 190, 195, 199–200, 202, 205, 207–208, 215, 223, 226, 231–232
processing limitations 3
processing pressure(s)
proficiency range 224
project work 246, 257
propositional demands 234
pruned words 72–73, 168
pseudo-filled pauses 20–21
pseudo-filled pausing
psycholinguistic processes 4–5, 21, 157, 176, 180

R
readiness 11, 63–68, 80–82, 84–90, 212–213, 217, 220, 222, 248–250
reasoning demands 157–158, 231
re-entry points 230
reformulation 41, 44–45, 71, 73, 101, 133, 166, 168, 197–198, 201, 203–205
rehearsal 33, 36, 52, 63–68, 86, 89, 97, 107, 116–117, 120, 127, 159, 212, 216–218, 221, 223, 240, 242–243, 245
rehearsal strategies 97
repair 51, 71
repair fluency 20, 73, 76, 78, 83, 85–86, 166–167, 175, 192, 197–198
repetition 5, 7, 10–11, 20, 27, 34, 36–40, 42–44, 46–48, 50, 52–53, 61, 65–67, 73, 84, 86, 89, 101, 156, 166–168, 203, 214–215, 217–218, 222–223, 240, 243, 245–246
replacement 20, 73, 167–168
resource deficits 30–31