Butler 2013
Elizabeth J. Marsh
Duke University
Among the many factors that influence the efficacy of feedback on learning, the information
contained in the feedback message is arguably the most important. One common assumption is that
there is a benefit to increasing the complexity of the feedback message beyond providing the correct
answer. Surprisingly, studies that have manipulated the content of the feedback message in order to
isolate the unique effect of greater complexity have failed to support this assumption. However, the
final test in most of these studies consisted of a repetition of the same questions from the initial test.
The present research investigated whether feedback that provides an explanation of the correct
answer promotes superior transfer of learning to new questions. In 2 experiments, subjects studied
prose passages and then took an initial short-answer test on concepts from the text. After each
question, they received correct answer feedback, explanation feedback, or no feedback (Experiment
1 only). Two days later, subjects returned for a final test that consisted of both repeated questions
and new inference questions. The results showed that correct answer feedback and explanation
feedback led to equivalent performance on the repeated questions, but explanation feedback
produced superior performance on the new inference questions.
Feedback is a critical component of any learning process because it allows learners to reduce the discrepancy between actual and desired knowledge (Black & Wiliam, 1998). Although prior research has identified many factors that influence the efficacy of feedback (for reviews, see D. L. Butler & Winne, 1995; Hattie & Timperley, 2007; Kluger & DeNisi, 1996; Shute, 2008), the content of the feedback message is arguably the most important aspect of any feedback procedure. The information supplied in the feedback message is critical because it enables learners to correct errors (e.g., Pashler, Cepeda, Wixted, & Rohrer, 2005) and maintain correct responses (e.g., A. C. Butler, Karpicke, & Roediger, 2008). Thus, a primary objective of feedback research is to determine what information the feedback message should contain in order to be maximally effective.
On the basis of a wealth of studies in the literature, the current answer is that the feedback message should contain the correct answer. At the most basic level, feedback must convey information about the veracity of the learner's response (i.e., correct vs. incorrect). However, many studies have shown little or no benefit of providing verification feedback relative to no feedback (e.g., Pashler et al., 2005; Plowman & Stroud, 1942; Roper, 1977; but see Fazio, Huelser, Johnson, & Marsh, 2010). Including the correct answer in the feedback message substantially increases the efficacy of feedback because it provides the information that learners need to correct their errors. Indeed, the vast majority of studies that have compared correct answer feedback with verification feedback have shown a superiority of correct answer feedback (e.g., Pashler et al., 2005; Phye & Sanders, 1994; Roper, 1977; Travers, Van Wagenen, Haygood, & McCormick, 1964; Whyte, Karolick, Neilsen, Elder, & Hawley, 1995).
This article was published Online First December 17, 2012.
Andrew C. Butler, Psychology and Neuroscience, Duke University; Namrata Godbole, Department of Psychology, University of North Carolina-Greensboro; Elizabeth J. Marsh, Psychology and Neuroscience, Duke University.
This research was supported by a Collaborative Activity Award from the James S. McDonnell Foundation's 21st Century Science Initiative in Bridging Brain, Mind and Behavior awarded to Elizabeth J. Marsh. We thank Katherine Rawson and the members of the Marsh Lab for their helpful comments and suggestions on drafts of the article.
Correspondence concerning this article should be addressed to Andrew C. Butler, Psychology and Neuroscience, Duke University, Box 90086, Durham, NC 27708-0086. E-mail: andrew.butler@duke.edu
Is it beneficial for the feedback message to include other information in addition to the correct answer? A common assumption among educators and researchers is that providing students with additional information in the feedback message will improve learning. The umbrella term elaborative feedback is often used to describe any type of feedback that is more complex than correct answer feedback, and there are many ways of elaborating the feedback message (for a taxonomy of feedback messages, see Kulhavy & Stock, 1989). Examples of elaborative feedback include providing an explanation of why a particular response is correct or incorrect (explanation feedback) and re-presenting the original learning materials (restudy feedback). Due to the assumption that elaborative feedback is helpful to students, it is often included as a component in methods of instruction, such as intelligent tutoring systems (Corbett, Koedinger, & Anderson, 1997) and computer-assisted instruction programs (Gibbons & Fairweather, 1998). For instance, the AutoTutor is an intelligent tutoring system that helps learners solve complex physics problems by providing many different types of feedback, including hints, corrections, and explanations (Graesser, Chipman, Haynes, & Olney, 2005). However, elaborative feedback is just one of many components combined to enhance student learning in such systems, and its independent contribution to learning is not assessed.
Surprisingly, studies that have directly compared elaborative feedback with correct answer feedback have found little or no benefit to increasing the complexity of the feedback message (for a review, see Kulhavy & Stock, 1989; for a meta-analysis, see Bangert-Drowns, Kulik, Kulik, & Morgan, 1991). For example, many studies have found that there is no benefit of providing explanation feedback relative to correct answer feedback (e.g., Gilman, 1969; Kulhavy, White, Topp, Chan, & Adams, 1985; Mandernach, 2005; Pridemore & Klein, 1995; Sassenrath & Gaverick, 1965; Smits, Boon, Sluijsmans, & van Gog, 2008; Whyte et al., 1995). Similarly, other studies have shown that providing restudy feedback yields equivalent performance to correct answer feedback (e.g., Andre & Thieman, 1988; Kulhavy et al., 1985; Peeck, 1979). Critically, the content of the feedback message was manipulated as an independent variable in these studies, which allowed the unique effect of greater complexity (or lack thereof) to be isolated.
The lack of empirical support for the efficacy of elaborating the feedback message is surprising, but these null effects may be due to how learning was assessed on the final test. Almost all of the studies on elaborative feedback have used a final test that assessed retention of the correct answer by repeating the same questions from the initial test. If the learner only needs to remember the correct answer to perform well on the final test, then the additional information contained in elaborative feedback is superfluous. However, this additional information may be important for fostering better comprehension of the material. For example, providing an explanation of why a response is correct (i.e., explanation feedback) might help the learner to move from superficial factual knowledge to a more complex understanding of the concept. Thus, elaborative feedback might be expected to facilitate performance on a final test that assesses understanding rather than retention of the correct response. One hallmark of superior understanding is the ability to transfer knowledge to new contexts. Transfer can be broadly defined as "the influence of prior learning (retained until the present) upon the learning of, or response to, new material..." (McGeoch, 1942, p. 394). In the present study, we assess understanding by investigating learners' ability to transfer their knowledge on a final test that involves making inferences using previously learned concepts.
Experiment 1
The goal of the first experiment was to investigate the hypothesis that the efficacy of elaborative feedback depends on how learning is assessed. Subjects studied a set of passages and then took an initial test on critical concepts from the passages. After each question, they received explanation feedback, correct answer feedback, or no feedback. Two days later, they took a final test that assessed both retention (via repeated questions from the initial test) and transfer (via new inference questions). We predicted that explanation feedback would lead to better transfer relative to correct answer feedback, but the two types of feedback would produce equivalent retention of the correct answers.
Method
Participants. Sixty Duke University students participated for either course credit or payment. Four additional subjects were excluded because they failed to follow experimental instructions.
Design. The experiment had a 3 (type of feedback: no feedback, correct answer, explanation) × 2 (type of final test question: repeated, new) mixed factorial design. Type of feedback was manipulated between subjects, and type of final test question was manipulated within subjects, between materials.
Materials and counterbalancing. Materials consisted of 10 passages about a variety of topics (e.g., the respiratory system, tropical cyclones, etc.) and associated questions. Six of the passages and the associated questions were adapted from A. C. Butler (2010), and the rest were created to match. Each passage consisted of 500 words of text and contained two critical concepts (see Appendix A for sample passages). Thus, there was a total of 20 critical concepts. A concept was operationally defined as a piece of information that must be abstracted from multiple sentences. Two questions were associated with each concept: a definition question and an inference question. All of the definition questions were used on the initial test to assess memory for the 20 concepts. The definition question was repeated on the final test for 10 of the concepts in order to assess retention of the correct answer. The inference question was given on the final test for the other 10 concepts in order to assess transfer of knowledge. The materials were counterbalanced by creating two versions of the final test. In each version, one of the two concepts for each passage was tested by repeating the definition question, whereas the other concept was tested with a new inference question. Thus, each concept was tested equally often in each final test condition across subjects.
Two feedback messages were created for each definition question: a correct answer message and an explanation message. The correct answer message consisted of a statement of the correct answer, whereas the explanation message consisted of the correct answer as well as two additional sentences elaborating on the correct answer. The two additional sentences in the explanation feedback message were taken from the passage and helped to explain the concept. The explanation feedback did not contain any new information and it did not provide the answer to the inference question. Appendix B contains sample questions and feedback.
Procedure. The experiment consisted of two sessions spaced 2 days apart. Individual PCs running MediaLab software (Jarvis, 2004a, 2004b) were used to present all the materials and collect the responses. In Session 1, subjects were randomly assigned to one of the three feedback conditions (no feedback, correct answer feedback, and explanation feedback). Regardless of condition, they studied the 10 passages in a random order determined by the computer. Each passage was divided into two paragraphs, and each paragraph was presented for 80 s (pilot testing showed this amount
of time to be sufficient to read the entire paragraph once). Next, subjects engaged in a distractor task for 5 min (solving visuospatial puzzles). Finally, they completed a self-paced short-answer test on the critical concepts that consisted of the 20 definition questions. The questions were presented one at a time in a random order, and subjects were required to generate a response to each question. If they did not know the answer, they were instructed to make a plausible guess. Immediately after answering each question, subjects received the type of feedback that they had been assigned (no feedback, correct answer feedback, or explanation feedback). The question was always re-presented with the feedback message to provide context. Feedback was provided regardless of whether the response was correct or incorrect, and subjects were required to
[Figure 1. Legend: No Feedback, Correct Answer, Explanation.]
Results
All results were significant at the .05 level unless otherwise stated. Pairwise comparisons were Bonferroni-corrected to the .05 level. Eta-square and Cohen's d are the measures of effect size reported for all significant effects in the analysis of variance (ANOVA) and t-test analyses, respectively.
Scoring. Two coders independently coded the responses as correct or incorrect according to a scoring rubric. Both coders were blind to condition and coded all the responses for a given question together to increase consistency. Cohen's kappa was calculated to assess interrater reliability. Reliability was high (κ = .89), and the first author (ACB) resolved the disagreements in scoring.
Initial test performance. Initial test performance was relatively low (grand M = .43), which was desirable for investigating the effects of feedback. A one-way ANOVA showed no effect of feedback condition (F < 1).
Final test performance. Figure 1 shows the proportion of correct responses on the final test as a function of feedback condition on the initial test and type of final test question. When subjects received correct answer or explanation feedback on the initial test, they performed better on the repeated definition questions relative to when they did not receive feedback. A one-way ANOVA confirmed this observation by revealing a significant main effect of type of feedback, F(2, 57) = 6.54, MSE = .05, η² = .19. Follow-up pairwise comparisons showed that both the correct answer and explanation feedback conditions led to a greater proportion of correct responses on repeated questions relative to the no-feedback condition (.62 vs. .43), t(38) = 2.63, SED = .07, d = .85; and (.66 vs. .43), t(38) = 3.34, SED = .07, d = 1.06, respectively. However, there was no significant difference between the correct answer and explanation feedback conditions (t < 1).
On the new inference questions, subjects performed best when they had received explanation feedback relative to when they got correct answer or no feedback. A one-way ANOVA showed a significant main effect of type of feedback, F(2, 57) = 6.55, MSE = .04, η² = .19. Pairwise comparisons confirmed that the explanation feedback condition produced a significantly greater proportion of correct responses on the new inference questions relative to both the correct answer feedback and no-feedback conditions (.45 vs. .30), t(38) = 3.13, SED = .05, d = .90; and (.45 vs. .28), t(38) = 2.97, SED = .05, d = 1.09, respectively. The correct answer and no-feedback conditions did not differ (t < 1). In addition, an item analysis was conducted for the critical comparison between the correct answer and explanation feedback conditions by computing a t test with items as the unit of observation instead of subjects. This item analysis confirmed that explanation feedback produced superior transfer, t(40) = 2.04, SED = .07, d = .61.
Discussion
The results of Experiment 1 showed that the benefits of explanation feedback depend on how learning is assessed. Replicating the findings of previous studies, explanation feedback produced equivalent performance relative to correct answer feedback when retention was assessed with repeated questions on the final test (e.g., Gilman, 1969; Kulhavy et al., 1985; Mandernach, 2005; Pridemore & Klein, 1995; Sassenrath & Gaverick, 1965; Smits et al., 2008; Whyte et al., 1995). However, when the final test assessed understanding by requiring subjects to transfer their knowledge of the concept to a new context, explanation feedback led to better performance than correct answer feedback. If it can be replicated, this novel finding is important because it opens the door to a promising new direction for future research: the use of elaborative feedback to promote transfer of learning.
Experiment 2
One of the goals of Experiment 2 was to replicate the novel finding from Experiment 1 that explanation feedback produced better transfer to new inference questions than did correct answer feedback. A second goal was to investigate a potential explanation for this finding. As described in the introduction, the ability to transfer knowledge to new contexts requires understanding; however, transfer also requires retention, especially if the ability to
transfer knowledge is assessed after a delay, such as in Experiment 1. One way of conceptualizing the process of transfer involves breaking it down into three steps: (1) The learner must recognize that previously acquired knowledge is relevant, (2) the learner must recall that knowledge, and (3) the learner must apply that knowledge to the new context (see Barnett & Ceci, 2002). In this conceptualization, the first two steps in the transfer process reflect retention of knowledge, whereas the third step reflects understanding.
In Experiment 1, the first step (recognition) was unlikely to have been a problem: All subjects were instructed that the final test questions were about information that they had read in the passages, and therefore they recognized that they had to recall and apply their knowledge about the passages. Thus, the difference between the two feedback conditions in the ability to transfer knowledge must have been due to differences in recall, application, or both of these steps. Each explanation feedback message re-presented some information from the passage about the critical concept, so it is possible that this re-presentation boosted later recall of that information on the final test. In contrast, subjects who received correct answer feedback might have been less likely to recall this information because they had only studied it once when they read the passages. Although it is possible that differences in recall (Step 2) may have contributed to the results, we believe it is more likely that explanation feedback fostered a deeper understanding of the concepts, which facilitated the application of that knowledge to complete the final step.
In order to investigate this idea, a second phase was added to the final test in which all subjects reanswered the inference questions with the explanation feedback present (i.e., regardless of whether they had received explanation or correct answer feedback on the initial test). The rationale for the inclusion of the "reanswer" phase was that it would separate the recall and application steps in the transfer process (see Table 1 for a schematic explanation of the logic). As described above, any difference in performance between the correct answer and explanation feedback conditions when answering the new inference questions could be due to recall, application, or both of these components. By allowing subjects to consult the explanation feedback during the subsequent reanswer phase, the need to retain the information would be eliminated. Thus, any difference in performance between the two feedback conditions in the reanswer phase would reflect the subjects' ability to apply their knowledge (i.e., their depth of understanding).
In addition to the inclusion of the reanswer phase, a few other changes were made for Experiment 2. First, the type of feedback variable was manipulated within subjects in order to show that this finding would generalize across experimental designs. Second, the no-feedback condition was dropped in order to maximize the number of items in the explanation and correct answer feedback conditions. Third, the final test consisted of only new inference questions (i.e., no repeated questions) in order to focus on replicating the key finding from Experiment 1.
Method
Participants. Twenty-four Duke University students participated for either course credit or a payment. One additional subject was excluded for not following the instructions.
Design. A single variable (type of feedback: correct answer, explanation) was manipulated within subjects, between materials.
Materials. The materials from Experiment 1 were used again.
Procedure. The procedure was the same as Experiment 1 except for the following changes. First, subjects received correct answer feedback on 10 of the definition questions on the initial test and explanation feedback for the other 10 questions. Second, the final test consisted of 20 new inference questions (no questions were repeated from the initial test). Third, the final test consisted of two phases. In Phase 1, subjects answered the new inference questions in the same manner as Experiment 1. In Phase 2, they were given the opportunity to reanswer each inference question while also viewing the relevant explanation feedback (i.e., regardless of whether they had seen the explanation feedback on the initial test or not). Subjects were told that they could re-enter their initial response or modify their response based on the information presented in the explanation feedback.
Results
Scoring. Again, two coders independently scored the responses. Reliability was almost perfect (κ = .98), and the first author (ACB) resolved the few disagreements.
Initial test performance. Overall, subjects correctly answered a little less than half the questions (grand M = .44), and there was no significant difference between the two feedback conditions (t < 1).
Final test performance. The left panel of Figure 2 shows the proportion of correct responses on the initial answer phase of the final test as a function of feedback condition on the initial test.
Table 1
The Logic Behind the Two-Phase Final Test Used in Experiment 2
                                                       Final test phase
Step in the transfer process    Initial answer to new inference question    Reanswer with explanation feedback
Recognition                     EX = CA                                     EX = CA
Recall                          EX > CA                                     EX = CA
Application                     EX > CA                                     EX > CA
Note. When initially answering the new inference questions, there should be no difference between the two feedback conditions with respect to the recognition component of the feedback process; however, the explanation feedback condition could lead to better recall and/or application. In the reanswer phase, both recognition and recall are equated; thus, the superiority of explanation feedback over correct answer feedback must be due to the application component. EX = explanation feedback; CA = correct answer feedback.
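The logic summarized in Table 1 can be written out as a small Python sketch. This is an illustrative encoding only, not part of the original study: the names `STEPS`, `CAN_DIFFER`, and `candidate_sources` are hypothetical, and the sketch simply records, for each phase of the final test, which steps of the transfer process could still favor explanation feedback (EX > CA) and which are equated (EX = CA).

```python
# Hypothetical encoding of Table 1 (names are illustrative assumptions).
STEPS = ("recognition", "recall", "application")

# True  -> the step could favor explanation feedback in that phase (EX > CA).
# False -> the step is equated across feedback conditions (EX = CA).
CAN_DIFFER = {
    "initial_answer": {"recognition": False, "recall": True, "application": True},
    "reanswer": {"recognition": False, "recall": False, "application": True},
}


def candidate_sources(phase):
    """Return the steps that could explain an EX > CA difference in `phase`."""
    return [step for step in STEPS if CAN_DIFFER[phase][step]]


print(candidate_sources("initial_answer"))  # ['recall', 'application']
print(candidate_sources("reanswer"))        # ['application']
```

Under this encoding, a difference that survives into the reanswer phase has only one remaining candidate source, the application step, which is the inference the two-phase design is meant to license.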
Figure 2. Proportion of correct responses on the final test as a function of feedback condition on the initial test in the initial answer (left side) and reanswer (right side) phases of the final test in Experiment 2. Error bars represent 95% confidence intervals. [Legend: Correct Answer, Explanation. Horizontal axis: Final Test Phase, with Initial Answer to Inference Question (Phase 1) and Re-Answer with Explanation Feedback (Phase 2).]
Replicating the key result from Experiment 1, explanation feedback led to a significantly greater proportion of correct responses on the new inference questions relative to correct answer feedback (.37 vs. .27), t(23) = 4.18, SED = .02, d = .50.
The right panel of Figure 2 shows the proportion of correct responses on the reanswer phase of the final test as a function of feedback condition on the initial test. Overall, the opportunity to reanswer the inference questions with the explanation feedback present improved performance in both the explanation feedback and correct answer feedback conditions; however, explanation feedback still produced a significantly greater proportion of correct responses than correct answer feedback (.46 vs. .37), t(23) = 2.64, SED = .04, d = .37. In order to compare performance on the two phases, a 2 (final test phase: initial answer, reanswer) × 2 (type of feedback: correct answer, explanation) ANOVA was conducted. This analysis revealed significant main effects of final test phase, F(1, 23) = 14.50, MSE = .02, η² = .39, and type of feedback, F(1, 23) = 15.86, MSE = .01, η² = .41. However, the interaction was not significant (F < 1). In addition, an item analysis was conducted by computing the same 2 × 2 ANOVA with items as the unit of observation instead of subjects. This item analysis revealed the same pattern of results: significant main effects of phase, F(1, 19) = 18.45, MSE = .01, η² = .15, and type of feedback, F(1, 19) = 5.98, MSE = .01, η² = .15, but no significant interaction (F < 1).
Discussion
Experiment 2 replicated the key novel finding from Experiment 1. When subjects received explanation feedback on the initial test, they were more successful at transferring their knowledge on the new inference questions than when they received correct answer feedback. The additional information contained in the explanation feedback message fostered better understanding of the critical concepts, which enabled subjects to apply this knowledge to answer new inference questions. Importantly, this result also shows that the effect generalizes across experimental design: type of feedback was manipulated between subjects in Experiment 1 and within subjects in Experiment 2.
When subjects had the opportunity to reanswer the inference questions with the explanation feedback present, the results were intriguing. Performance improved in the explanation feedback condition, which suggests that some of the information from the feedback had been forgotten; once subjects were re-presented with this information, they were able to successfully apply this knowledge to answer the inference questions. Performance also improved in the correct answer feedback condition. This improvement presumably also reflects the recall component of the transfer process: because subjects did not receive the explanation feedback on the initial test, they may not have retained this information (unless they remembered it from the passage). Practically speaking, this finding is important because it shows that giving explanation feedback after a delay can still help to improve transfer, which is consistent with recent research that shows a benefit of feedback even when its presentation is delayed (e.g., A. C. Butler, Karpicke, & Roediger, 2007; Metcalfe, Kornell, & Finn, 2009).
Most importantly, the difference in performance between the two feedback conditions for the initial answers to the new inference questions was also observed in the subsequent reanswer phase. In both feedback conditions, the presence of the explanation feedback while reanswering the inference questions meant that the burden to recall this information was removed, and any difference between the two conditions had to be due to their ability to apply their knowledge. Receiving explanation feedback on the initial test may have enabled subjects to acquire a deeper understanding of the critical concepts, which helped them to correctly answer more inference questions in the reanswer phase. Furthermore, this finding suggests that it may be particularly important to receive the explanation feedback soon after retrieving a concept from memory because the difference between the two feedback conditions persisted in the reanswer phase when the explanation feedback was always present. We turn now to discussing the importance of these findings in the context of the broader feedback literature.
General Discussion
The present research helps to resolve a paradox about elaborative feedback. Although elaborative feedback is assumed to benefit learners and it is often included in instructional methods (e.g., Corbett et al., 1997; Gibbons & Fairweather, 1998), reviewers of the feedback literature had concluded that increasing the complexity of the feedback message does not benefit learning (e.g., Bangert-Drowns et al., 1991; Kulhavy & Stock, 1989). With respect to the existing evidence in the literature, this conclusion was warranted: many studies that have isolated the effects of greater feedback complexity have found no benefit of elaborative feedback relative to correct answer feedback (e.g., Andre & Thieman, 1988; Gilman, 1969; Kulhavy et al., 1985; Peeck, 1979; Pridemore & Klein, 1995; Sassenrath & Gaverick, 1965; Whyte et al., 1995). However, all of these studies assessed retention of the correct response to a previously presented question rather than deeper understanding of the material. When understanding was assessed in the present study, explanation feedback produced better performance than correct answer feedback. This finding suggests the need for a fundamental reevaluation of how elaborative feedback affects learning.
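As a rough consistency check on the effect sizes reported with the subject-level ANOVAs in Experiments 1 and 2, eta-squared can be recovered from an F ratio and its degrees of freedom. The Python sketch below rests on an assumption that the article does not state explicitly: that the reported values correspond to (partial) eta-squared, computed as F × df_effect / (F × df_effect + df_error). The function name `partial_eta_squared` is hypothetical; the inputs are the F ratios and degrees of freedom reported in the Results sections.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover (partial) eta-squared from an F ratio and its degrees of freedom.

    Assumes eta^2 = (F * df_effect) / (F * df_effect + df_error), which equals
    ordinary eta-squared in a one-way between-subjects ANOVA.
    """
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# Subject-level F ratios as reported in the article: (label, F, df_effect, df_error).
reported = [
    ("Exp. 1, repeated questions", 6.54, 2, 57),
    ("Exp. 1, new inference questions", 6.55, 2, 57),
    ("Exp. 2, final test phase", 14.50, 1, 23),
    ("Exp. 2, type of feedback", 15.86, 1, 23),
]

for label, f_value, df_effect, df_error in reported:
    print(f"{label}: {partial_eta_squared(f_value, df_effect, df_error):.2f}")
# Exp. 1, repeated questions: 0.19
# Exp. 1, new inference questions: 0.19
# Exp. 2, final test phase: 0.39
# Exp. 2, type of feedback: 0.41
```

Under that assumption, the recomputed values match the effect sizes given for these four tests (.19, .19, .39, and .41).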
Why did explanation feedback produce superior performance on new inference questions relative to correct answer feedback? One might expect to find an answer to this question among the various theories that have been proposed to explain how feedback affects learning. However, many of these theories do not address this question at all because they seek to describe the effects of feedback at a more complex level than that of a single task (e.g., D. L. Butler & Winne, 1995; Hattie & Timperley, 2007; Kluger & DeNisi, 1996). Such "macrolevel" theories model the influence of feedback on various student behaviors, such as self-regulation, learning strategies, and motivation, during a continuous process of learning that includes repeated presentations of feedback. Although other theories provide a "microlevel" account of learning from feedback during a single task, these theories are either too general (e.g., Bangert-Drowns et al., 1991) or focus on explaining other feedback phenomena (e.g., the relationship between response confidence and feedback processing; Kulhavy, 1977). Kulhavy and Stock (1989) put forth the only theoretical framework that specifically addresses the effects of elaborating the feedback message beyond providing the correct answer. Despite their efforts to develop a coherent account of how elaboration affects learning, they were "unable to reach any useful conclusion regarding how the elaborative component of the feedback operates" (Kulhavy & Stock, 1989, p. 289). Recent microlevel reviews of the feedback literature describe many of these theories but offer no new ideas regarding elaborative feedback (e.g., Mory, 2004; Shute, 2008).
Given the dearth of existing feedback theory upon which to draw, we looked to theories in other domains in order to develop an explanation for our findings. One relevant theory is the framework proposed by Barnett and Ceci (2002) to explain the process of transfer and the factors that influence whether it will occur. As described above, they conceptualize the process of transfer in terms of three steps: recognition, recall, and application. Both correct answer and explanation feedback can improve the retention of specific knowledge, which would facilitate later recall of the information (i.e., the second step in the transfer process); this conclusion is supported by the finding that the two types of feedback produced equivalent performance on the definition ques-
ports the transfer of knowledge. Within the context of the present study, processing the explanation feedback after an initial retrieval attempt may have helped subjects to improve their situation model of the text and achieve a deeper understanding. A more developed situation model would be expected to enable superior transfer of knowledge to the new inference questions, which were aligned with this representational level. In contrast, the repeated questions used to assess retention were aligned with memory for the textbase, and thus explanation feedback would not be expected to benefit performance on these items relative to correct answer feedback.
One remaining puzzle is why explanation feedback was effective at facilitating understanding when it was given on the initial test, but it did not have the same effect on the correct answer condition when it was presented during the reanswer phase of the final test. Although additional research will be needed to further explore this finding, one potential explanation revolves around the concept of memory reconsolidation. In general, practice retrieving the critical concepts from memory would be expected to help subjects to better retain these concepts and transfer them to new contexts, regardless of feedback condition (e.g., A. C. Butler, 2010; Roediger & Butler, 2011). However, retrieval may also reopen a memory so that it must be reconsolidated, meaning that the memory enters a labile state in which it can be altered (e.g., Hupbach, Gomez, Hardt, & Nadel, 2007; for a review, see Dudai, 2006; Lee, 2009). For example, a recent study by Finn and Roediger (2011) showed that postretrieval processing of new information results in the integration of this information into the existing memory, thereby enhancing retention. In the present study, postretrieval processing of the explanation feedback on the initial test may have resulted in the information being integrated into the memory of the concept, thus building a deeper understanding (i.e., a more developed situation model). Retrieval during the final test should also involve reopening the memory, giving a chance for both groups of subjects to integrate the explanation feedback presented in the reanswer phase into their memories; however, it may be that the memory must be successfully reconsolidated (over time) before a deeper understanding is developed. Although admittedly somewhat speculative, this reconsolidation hypothesis
tions that were repeated on the final test in Experiment 1. How- provides a potential starting point for follow-up studies.
ever, explanation feedback may also enable leamers to better The present findings open the door for new research that inves-
comprehend the concepts, thus facilitating the application of that tigates the role of feedback in promoting transfer of knowledge.
knowledge to new contexts (i.e., the third step in the transfer The need for this research is apparent with respect to all types of
process). The results of the reanswer phase in Experiment 2 elaborative feedback, but also more generally with other factors
support this conclusion. When subjects reanswered the inference that influence the efficacy of feedback. The vast majority of
questions with the explanation feedback present, the superiority of feedback studies in the literature use final tests with repeated
explanation feedback persisted even though the recall demands questions to assess retention of knowledge. Although retention is
were removed, suggesting that the locus of the effect is the certainly an important leaming outcome, so too is understanding.
application step of the transfer process. Thus, there is a great need for research on how feedback affects
Another way of framing our findings is through the lens of transfer for both theoretical and pedagogical purposes. If under-
text-processing theories that conceptualize the development of standing is ignored as a leaming outcome, many promising meth-
understanding as a process that requires representing a text on ods of providing feedback may be misconceived and overlooked.
multiple levels (for a review, see Graesser, Millis, & Zwaan, 1997; For example, one method that may help to produce substantial
Kintsch, 1998). Such theories often differentiate between three understanding is to give students correct answer feedback and then
levels of representation: the surface level—the specific words and have them generate their own explanations for why their response
syntax used in the text; the textbase—an abstract representation of is correct or incorrect. Previous studies have not found a benefit of
the ideas and their connections; and the situation model—a per- such a procedure relative to simply providing correct answer
sonal interpretation of the text that often includes preexisting feedback (e.g., McDaniel & Fisher, 1991); however, these studies
knowledge. According to most theories, the situation model is the have measured retention rather than understanding. In summary,
representational level that reflects deep understanding and sup- the findings of the present study indicate tiiat transfer of knowl-
296 BUTLER, GODBOLE, AND MARSH
edge represents a fmitful new frontier for feedback research—it is Jarvis, B. G. (2004b). DirectRT (Version 2004.1.0.55) [Computer soft-
time for feedback researchers to move beyond measuring retention ware]. New York, NY: Bmpirisoft Corporation.
and investigate how feedback affects understanding. Kintsch, W. (1998). Comprehension: A paradigmfor cognition. Cam-
bridge, UK: Cambridge University Press.
Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions
References on performance: A historical review, a meta-analysis, and a preliminary
feedback intervention theory. Psychological Bulletin, 119, 254-284.
Andre, T., & Thieman, A. (1988). Level of adjunct question, type of doi: 10.1037/0033-2909.119.2.254
feedback, and leaming concepts by reading. Contemporary Educational Kulhavy, R. W. (1977). Feedback in written instruction. Review of Edu-
Psychology. 13, 296-307. doi:10.1016/0361-476X(88)90028-8 cational Research, 47, UX-l'il.
Bangert-Drowns, R. L., Kulik, C. C , Kulik, J. A., & Morgan, M. (1991). Kulhavy, R. W., & Stock, W. A. (1989). Feedback in written instruction:
The instructional effect of feedback in test-like events. Review of Edu- The place of response certitude. Educational Psychology Review, 1,
cational Research, 61, 213-238. 279-308. doi:10.1007/BF01320096
Bamett, S. M., & Ceci, S. J. (2002). When and where do we apply what we Kulhavy, R. W., White, M. T., Topp, B. W., Chan, A. L., & Adams, J.
leam? A taxonomy for far transfer. Psychological Bulletin, ¡28, 612- (1985). Feedback complexity and corrective efficiency. Contemporary
637. Educational Psychology, 10, 285-291. doi:10.1016/0361-
Black, P., & Wiliam, D. (1998). Assessment and classroom leaming. 476X(85)90025-6
Assessment in Education: Principles, Policy, & Practice, 5, 1-1 A. Lee, J. L. C. (2009). Reconsolidation: Maintaining memory relevance.
Butler, A. C. (2010). Repeated testing produces superior transfer of leam- Trends in Neurosciences, 32, 413-420. doi:10.1016/j.tins.2009.05.002
ing relative to repeated studying. Joumal of Experimental Psychology: Mandernach, B. J. (2005). Relative effectiveness of computer-based and
Learning, Memory, and Cognition, 36, 1118-1133. doi:10.1037/ human feedback for enhancing student leaming. The Joumal of Educa-
a0019902 tors Online, 2, 1-17.
Butler, A. C , Karpicke, J. D., & Roediger, H. L., m . (2007). The effect of McDaniel, M. A., & Fisher, R. P. (1991). Tests and test feedback as
type and timing of feedback on leaming from multiple-choice tests. leaming sources. Contemporary Educational Psychology, 16, 192-201.
Journal of Experimental Psychology: Applied, 13, 273-281. doi: doi:10.1016/0361-476X(91)90037-L
10.1037/1076-898X.13.4.273
McGeoch, J. A. (1942). The psychology of human leaming: An introduc-
Butler, A. C , Karpicke, J. D., & Roediger, H. L., ffl. (2008). Correcting a tion. New York, NY: Longmans, Green and Co. doi: 10.2307/2262568
meta-cognitive error: Feedback enhances retention of low confidence
Metcalfe, J., Komell, N., & Finn, B. (2009). Delayed versus immediate
correct responses. Journal of Experimental Psychology: Learning, Mem-
feedback in children's and adult's vocabulary leaming. Memory &
ory, and Cognition, 34, 918-928. doi:10.1037/0278-7393.34.4.918
Cognition, 37, 1077-1087. doi:10.3758/MC.37.8.1077
Butler, D. L., & Winne, P. H. (1995). Feedback and self regulated leaming: A
Mory, E. H. (2004). Feedback research review. In D. Jonassen (Ed.),
theoretical synthesis. Review of Educational Psychology, 65, 245-281.
Handbook of research on educational communications and technology
Corbett, A. T., Koedinger, K. R., & Anderson, J. R. (1997). Intelligent
(pp. 745-783). Mahwah, NJ: Erlbaum.
tutoring systems. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.),
Pashler, H., Cepeda, N. J., Wixted, J. T., & Rohrer, D. (2005). When does
Handbook of human-computer interaction (2nd ed., pp. 849-874). New
feedback facilitate leaming of words? Joumal of Experimental Psychol-
York, NY: Elsevier.
ogy: Leaming, Memory, and Cognition, 31, 3-8. doi:10.1037/0278-
Dudai, Y. (2006). Reconsolidation: The advantage of being refocused.
7393.31.1.3
Current Opinion in Neurobiology, 16, 174-178. doi:10.1016/j.conb
Peeck, J. (1979). Effects of differential feedback on the answering of two
.2006.03.010
types of questions by fifth- and sixth-graders. British Joumal of Edu-
Fazio, L. K., Huelser, B. J., Johnson, A., & Marsh, E. J. (2010). Receiving
cational Psychology, 49, 87-92. doi:10.111 l/j.2044-8279.1979
right/wrong feedback: Consequences for leaming. Memory, 18, 335-
350. doi:10.1080/09658211003652491 .tb02401.x
Phye, G. D., & Sanders, C. E. (1994). Advice and feedback: Elements of
Finn, B., & Roediger, H. L., III. (2011). Enhancing retention through
reconsolidation: Negative emotional arousal following retrieval en- practice for problem solving. Contemporary Educational Psychology,
hances later memory. Psychological Science, 22, 781-786. doi: 10.1177/ 19, 286-301. doi:10.1006/ceps.l994.1022
0956797611407932 Plowman, L., & Stroud, J. B. (1942). Effect of informing pupils of the
Gibbons, A. S., & Fairweather, P. G. (1998). Computer-based instruction: correctness of their responses to objective test questions. Joumal of
Design and development. Englewood Cliffs, NJ: Educational Technology. Educational Research, 36, 16-20.
Gilman, D. A. (1969). Comparison of several feedback methods for cor- Pridemore, D. R., & Klein, J. D. (1995). Control of practice and level of
recting errors by computer-assisted instruction. Journal of Educational feedback in computer-assisted instruction. Contemporary Educational
Psychology, 60, 503-508. doi:10.1037/h0028501 Psychology, 20. 444-450. doi:10.1006/ceps.l995.1030
Graesser, A. C , Chipman, P., Haynes, B. C , & Olney, A. (2005). Auto- Roediger, H. L., Ill, & Butier, A. C. (2011). The critical role of retrieval
Tutor: An intelligent tutoring system with mixed-initiative dialogue. practice in long-term retention. Trends in Cognitive Sciences, 15. 20-27.
IEEE Transactions on Education, 48, 612-618. doi:10.1109/TE.2005 doi:10.1016/j.tics.2010.09.003
.856149 Roper, W. J. (1977). Feedback in computer assisted instruction. Pro-
Graesser, A. C , Millis, K. K., & Zwaan, R. A. (1997). Discourse compre- grammed Leaming and Educational Technology, 14, 43-49.
hension. Annual Review of Psychology, 48, 163-189. doi:10.1146/ Sassenrath, J. M., & Gaverick, C. M. (1965). Effects of differential feed-
annurev.psych.48.1.163 back from examinations on retention and transfer. Joumal of Educa-
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of tional Psychology, 56. 259-263. doi:10.1037/h0022474
Educational Research, 77, 81-112. doi: 10.3102/003465430298487 Shute, V. (2008). Focus on formative feedback. Review of Educational
Hupbach, A., Gomez, R., Hardt, O., & Nadel, L. (2007). Reconsolidation Research, 78, 153-189. doi:10.3102/0034654307313795
of episodic memories: A subtle reminder triggers integration of new Smits, M. H. S. B., Boon, J., Sluijsmans, D. M. A., & van Gog, T. (2008).
information. Leaming & Memory, 14, 47-53. doi:10.1101/lm.365707 Content and timing of feedback in a web-based leaming environment:
Jarvis, B. G. (2004a). Medialab (Version 2004.2.87) [Computer software]. Effects on leaming as a function of prior knowledge. Interactive Leam-
New York, NY: Empirisoft Corporation. ing Environments, 16, 183-193. doi:10.1080/10494820701365952
Travers, R. M. W., Van Wagenen, R. K., Haygood, D. H., & McCormick, M. (1964). Learning as a consequence of the learner's task involvement under different conditions of feedback. Journal of Educational Psychology, 55, 167-173. doi:10.1037/h0048319
Whyte, M. M., Karolick, D. M., Neilsen, M. C., Elder, G. D., & Hawley, W. T. (1995). Cognitive styles and feedback in computer-assisted instruction. Journal of Educational Computing Research, 12, 195-203. doi:10.2190/M2AV-GEHE-CM9G-J9P7
Appendix A
Sample Passages Used in Experiments 1 and 2
These passages are associated with the sample questions and feedback provided in Table 1.
The Respiratory System
Humans breathe in and out anywhere from 15 to 25 times per minute. The main function of the respiratory system is gas exchange between the external environment and the circulatory system. A gas that the body needs to get rid of, carbon dioxide, is exchanged for a gas that the body can use, oxygen. The lungs are the most critical component of the respiratory system because they are responsible for the oxygenation of the blood and the concomitant removal of carbon dioxide from the circulatory system. Gas exchange occurs in tiny, thin-walled air sacs called alveoli, which lie at the end of the many branches of tubes in the lungs. Within the alveoli, gas exchange occurs as a result of diffusion. Diffusion is the movement of particles from a region of high concentration to a region of low concentration. The oxygen concentration is high in the alveoli, so oxygen diffuses across the alveolar membrane into the pulmonary capillaries, which are small blood vessels that surround each alveolus. The hemoglobin in the red blood cells passing through the pulmonary capillaries has carbon dioxide bound to it and very little oxygen. The oxygen binds to hemoglobin and the carbon dioxide is released. Since the concentration of carbon dioxide is high in the pulmonary capillaries relative to the alveolus, carbon dioxide diffuses across the alveolar membrane in the opposite direction. The exchange of gases across the alveolar membrane occurs rapidly, usually in fractions of a second.
Humans do not have to think about breathing because the body's autonomic nervous system controls it. The respiratory centers that control the rate of breathing are located in the pons and medulla oblongata, which are both part of the brainstem. The neurons that live within these centers automatically send signals to the diaphragm and intercostal muscles to contract and relax at regular intervals. Neurons in the cerebral cortex can also voluntarily influence the activity of the respiratory centers. A region within the cerebral cortex, called motor cortex, controls all voluntary motor functions, including telling the respiratory center to speed up, slow down, or even stop. However, the influence of the nerve centers that control voluntary movements can be overridden by the autonomic nervous system. Several factors can trigger such an override. One of these factors is the concentration of oxygen in the blood. Specialized nerve cells within the aorta and carotid arteries called peripheral chemoreceptors monitor the oxygen concentration of the blood. If the oxygen concentration decreases, the chemoreceptors signal to the respiratory centers in the brain to increase the rate and depth of breathing. These peripheral chemoreceptors also monitor the carbon dioxide concentration in the blood. Another factor is chemical irritants. Nerve cells in the airways can sense the presence of unwanted substances like pollen, dust, water, or cigarette smoke. If chemical irritants are detected, these cells signal the respiratory centers to contract the respiratory muscles, and the coughing that results expels the irritant from the lungs.
Vaccines
A vaccine is a biological preparation that establishes or improves immunity to a particular disease. Most vaccines are prophylactic, which means that they prevent or ameliorate the effects of a future infection by any natural pathogen. However, vaccines have also been used for therapeutic purposes, such as for alleviating the suffering of people already afflicted with a disease. The early vaccines were inspired by the concept of variolation, which originated in Asia during the 13th century. Variolation is a technique in which a person is deliberately infected with a weak form of a disease by inhaling it through the nose or mouth. Upon recovery, the individual was immune to the disease. A small proportion of the people who were variolated died, but nowhere near the proportion that died when they contracted the disease naturally. By the 18th century, knowledge of variolation had spread to Europe where medical researchers Edward Jenner and Louis Pasteur transformed the ancient technique into the modern-day practice of inoculation with vaccines. Inoculation represented a major breakthrough because it reduced the risk of vaccination, while maintaining its effectiveness. Inoculation is the practice of deliberate infection through a skin wound. This new technique produces a smaller, more localized infection relative to variolation in which inhalation of viral particles spreads the infection more widely. The smaller infection works better because it is sufficient to stimulate immunity to the virus, but it keeps the virus from replicating enough to reach levels of infection likely to kill a patient.
Vaccines work because they prepare the immune system to deal with pathogens that it may encounter in the future. When a vaccine is given, the immune system recognizes the vaccine agents as foreign, destroys them, and then "remembers" them. When the real version of the virus comes along, the body recognizes it and destroys the infected cells before they multiply. Of course, vaccines do not guarantee complete protection against the disease because sometimes a person's immune system does not respond for various reasons. Still, even when a vaccinated individual does develop the disease vaccinated against, the disease is likely to be milder than without vaccination. Overall, the invention of vaccines has led to a marked decrease in the prevalence of deadly diseases, such as smallpox, polio, measles, and typhoid. As long as the vast majority of people are vaccinated, it is much more difficult for an outbreak of disease to occur and spread because of herd immunity. Herd immunity describes a type of immunity that occurs when the vaccination of a portion of the population (or herd) provides protection to unvaccinated individuals. Herd immunity theory proposes that for diseases passed from person-to-person, it is more difficult to maintain a chain of infection when large numbers of a population are immune. The higher the proportion of individuals who are immune, the lower the likelihood that a susceptible person will come into contact with an infected individual. Despite potential protection from herd immunity, mainstream medical opinion is that everyone should be vaccinated.
Appendix B
Sample Questions and Feedback Taken From Passages on the Respiratory System and Vaccines, Respectively
Retention questions were used on the initial test and repeated on the final test, whereas transfer questions were only used on the final test.
The Respiratory System
Retention Question: What is the process by which gas exchange occurs in the part of the human respiratory system called the alveoli?
Correct Answer Feedback: Gas exchange occurs within the alveoli through diffusion.
Explanation Feedback: Gas exchange occurs within the alveoli through diffusion. Diffusion is the movement of particles from a region of high concentration to a region of low concentration. The oxygen concentration is high in the alveoli and the carbon dioxide concentration is high in the pulmonary capillaries, so the two gases diffuse across the alveolar membrane in opposite directions towards lower concentrations.
Inference Question: If people are having trouble breathing, they are often given pure oxygen to inhale. How does breathing pure oxygen facilitate gas exchange relative to regular air?
Answer: Breathing pure oxygen increases the oxygen concentration in the alveoli, so oxygen will diffuse more rapidly across the alveolar membrane into blood in the pulmonary capillaries.
Vaccines
Retention Question: What vaccination technique did Edward Jenner and Louis Pasteur develop that improved upon the ancient practice of variolation?
Correct Answer Feedback: Edward Jenner and Louis Pasteur developed the technique of inoculation to improve upon the ancient practice of variolation.
Explanation Feedback: Edward Jenner and Louis Pasteur developed the technique of inoculation to improve upon the ancient practice of variolation. Inoculation is the practice of deliberate infection through a skin wound, whereas variolation involves inhaling a weak form of the disease. The new technique produces a smaller, more localized infection that is adequate to stimulate immunity to the virus, but keeps it from replicating enough to be dangerous.
Inference Question: The recently developed nasal spray flu vaccine, which is inhaled through the nose, contains weakened viruses that only cause infection at the cooler temperatures found within the nose. In what sense does this new method of vaccination combine the techniques of inoculation and variolation?
Answer: The nasal spray flu vaccine is similar to inoculation in that it produces a smaller, more localized infection, but also like variolation in that the virus is inhaled.
Received May 16, 2011
Revision received September 24, 2012
Accepted October 11, 2012
Copyright of Journal of Educational Psychology is the property of American Psychological Association and its
content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's
express written permission. However, users may print, download, or email articles for individual use.