Exceptional Children
Vol. 75, No. 3, pp. 321-340.
©2009 Council for Exceptional Children.
An Examination of the Evidence
Base for Function-Based
Interventions for Students
With Emotional and/or
Behavioral Disorders Attending
Middle and High Schools
KATHLEEN LYNNE LANE
JEMMA ROBERTSON KALBERG
JENNA COURTNEY SHEPCARO
Peabody College of Vanderbilt University
ABSTRACT: The authors field-tested the core quality indicators and standards for evidence-based practices for single-case design studies developed by Horner and colleagues (2005) by applying them to the literature exploring functional assessment-based interventions conducted with secondary-age students with emotional and/or behavioral disorders (EBD). First, we evaluated this knowledge base by applying the indicators to determine if the studies identified (n = 12) were of acceptable methodological quality. Second, we analyzed studies meeting the recommended quality indicators to determine whether function-based interventions with students with EBD might be considered an evidence-based practice. Results reveal that only 1 study addressed all proposed quality indicators, suggesting that function-based interventions are not yet an evidence-based practice for this population per these indicators and standards. Limitations and recommendations are posed.
Students with emotional and/or behavioral disorders (EBD) represent between 2% and 20% of the school-age population and are among some of the most challenging students to teach (Walker, Ramsey, & Gresham, 2004). By definition, students with EBD have behavioral, social, and academic deficits that pose challenges within and beyond the school setting (Kauffman, 2005). For example, they have impaired social skills that strain relationships with teachers and peers (Gresham, 2002). In addition, students with EBD have broad academic deficits that, at best, remain stable over time (Nelson, Benner, Lane, & Smith,
2004). Unfortunately, outcomes do not improve
when students with EBD leave the school setting, as evidenced by employment difficulties, contact with
the juvenile justice system, limited community
involvement, and high rates of access to mental
health services (Bullis & Yovanoff, 2006).
During the past 30 years, schools have responded with a range of interventions to support
these youngsters including schoolwide primary
prevention efforts (e.g., antibullying programs);
secondary prevention efforts (e.g., small group instruction in conflict resolution skills); and tertiary
prevention efforts (e.g., individualized intervention efforts; Horner & Sugai, 2000). One tertiary intervention effort that has met with demonstrated success, particularly with elementary-age students with EBD, is function-based interventions (Conroy, Dunlap, Clarke, & Alter, 2005;
Kern, Hilt, & Gresham, 2004; Lane, Umbreit, &
Beebe-Frankenberger, 1999).
Function-based interventions refer to interventions designed based on the reasons why problem behaviors occur (Umbreit, Ferro, Liaupsin, &
Lane, 2007). The motive for a given behavior is
derived through a functional behavioral assessment. In brief, descriptive (e.g., interviews, direct
observations of behavior, rating scales) and experimental (e.g., functional analysis) procedures are
used to identify the antecedent conditions that
prompt a target behavior (e.g., disruption) to
occur and the consequences that maintain the behavior. These data are used to generate a hypothesis statement regarding the function of the
behavior.
In general, all behaviors occur to either obtain (positive reinforcement) or avoid (negative
reinforcement) attention; activities or tasks; or
tangible or sensory conditions (Umbreit et al.,
2007). Often, the hypothesis statement is tested
by systematically manipulating environmental
conditions to identify or confirm maintaining
consequences. Next, an intervention is designed
based on the function of the target behavior with
a goal of teaching the student a more reliable, efficient method of meeting his or her objective (e.g.,
escaping a too difficult or too easy task; Umbreit, Lane, & Dejud, 2004). This is done by constructing an intervention that (a) adjusts antecedent
conditions that prompt the problem behavior, (b)
increases reinforcement rates for the replacement
behavior, and (c) extinguishes reinforcement for
the target behavior.
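The three-part intervention logic just described can be sketched as a small data structure. This is illustrative only; the class, field, and category names below are ours (following the obtain/avoid and attention/tasks/tangible-sensory taxonomy attributed to Umbreit et al., 2007), not a tool used by the authors.

```python
# Illustrative sketch of a hypothesis statement and the three intervention
# components described in text. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class HypothesisStatement:
    antecedent: str               # condition that prompts the target behavior
    target_behavior: str          # e.g., "disruption"
    maintaining_consequence: str  # consequence sustaining the behavior
    motive: str                   # "obtain" (positive reinf.) or "avoid" (negative reinf.)
    reinforcer: str               # attention, activities_or_tasks, or tangible_or_sensory

    def intervention_components(self):
        """The three components of a function-based intervention."""
        return [
            f"Adjust the antecedent conditions: {self.antecedent}",
            f"Increase reinforcement for a replacement behavior that lets the "
            f"student {self.motive} {self.reinforcer.replace('_', ' ')}",
            f"Extinguish reinforcement for the target behavior: {self.target_behavior}",
        ]

hypothesis = HypothesisStatement(
    antecedent="too-difficult independent seatwork",
    target_behavior="disruption",
    maintaining_consequence="removal from the task",
    motive="avoid",
    reinforcer="activities_or_tasks",
)
for step in hypothesis.intervention_components():
    print(step)
```

The point of the sketch is only that the intervention is derived from the hypothesized function, not from the topography of the behavior.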
Functional assessment procedures were originally developed in clinical settings with individuals with developmental disabilities (Iwata, Dorsey,
Slifer, Bauman, & Richman, 1982). Since that
time, functional assessment-based interventions
have been used to shape a variety of behaviors in a
range of educational settings (e.g., general education classes, self-contained classrooms, self-contained schools), with students with a range of
conditions including severe disabilities (Sasso,
Reimers, Cooper, & Wacker, 1992); attention
deficit disorders and behavioral concerns (Ervin,
DuPaul, Kern, & Friman, 1998); and emotional
and/or behavioral problems (Kern, Childs, Dunlap, Clarke, & Falk, 1994; Kern, Delaney, Clarke,
Dunlap, & Childs, 2001).
In fact, functional behavioral assessments
have been endorsed by the National Association
of School Psychologists, National Association of State Directors of Special Education, and National Institutes of Health, and mandated in the Individuals
With Disabilities Education Act (IDEA; first in
1997 and again in 2004) when certain disciplinary circumstances occur (Kern et al., 2004).
Namely, school personnel must conduct a functional behavioral assessment when (a) a student is
placed in an alternative placement for behavior
deemed to be dangerous to self or others; (b) a
student is placed in an alternative setting for 45
days due to drug or weapons violations; or (c) a
student's suspension or alternative setting placement extends beyond 10 days or constitutes a
change in placement (Drasgow & Yell, 2001).
Given the behaviors typical of students with
EBD, many of these students may require function-based interventions.
Yet, several researchers contend that such a
mandate may not be entirely appropriate (e.g.,
Fox, Conroy, & Heckaman, 1998; Gresham,
2004; Kern et al., 2004; Quinn et al., 2001; Sasso,
Conroy, Seichter, & Fox, 2001). Specifically, there
are concerns of a generalization error in the sense
that existing functional assessment procedures,
which were originally developed for persons with
developmental disabilities, have not been validated
for use with students with EBD (Fox et al.; Kern
et al., 2004; Sasso et al., 2001). At best, there is a modest body of literature exploring the effectiveness of function-based interventions for students
with EBD, with most of the studies conducted in
elementary grades (Lane et al., 1999; Quinn et
al.). In the reviews of function-based interventions
conducted with students with and at risk for EBD,
the populations have been predominantly male,
with limited inquiry with secondary-age students
(Conroy et al., 2005; Kern et al., 2004; Lane et
al., 1999; Sasso et al., 2001).
Therefore, questions arise as to the efficacy of function-based interventions, particularly for older school-age students. It is possible that designing, implementing, and evaluating function-based interventions with this population will prove to be a highly formidable task given the increased importance of the peer group (Morrison, Robertson, Laurie, & Kelly, 2002); topographical changes in discipline problems (e.g., covert acts of aggression, internalizing behaviors; Loeber, Green, Lahey, Frick, & McBurnett, 2000; Morris, Shah, & Morris, 2002); and difficulties in identifying meaningful reinforcers that can compete with the reinforcing value of the undesired, target behavior (e.g., truancy). Thus, the question arises: Are functional assessment-based interventions an evidence-based practice for secondary-age students with EBD?

Answers to questions about intervention efficacy have become more complex as researchers have sought to define what constitutes evidence-based practices. Gersten et al. (2005) and Horner et al. (2005) introduced criteria for determining whether a practice is evidence-based using group design and single-case experimental investigations, respectively. These research teams developed quality indicators for group design and single-case design inquiry that can be used to determine the extent to which a given study meets requisite criteria, thereby establishing the study as a reputable, appropriate study. Further, each team offered guidelines for evaluating bodies of reputable studies that meet the quality indicators to determine if the practice is evidence based.

The goal of this review was to field-test these quality indicators by applying them to the body of literature exploring functional assessment-based interventions conducted with secondary-age students with EBD. Specifically, the intent was threefold. First, given that this body of literature focused on single-case methodology, we evaluated this knowledge base by field-testing the quality indicators posed by Horner et al. (2005) to determine if the studies identified in a systematic literature review met the recommended quality indicators. Second, we analyzed studies that met the recommended quality indicators to determine whether function-based interventions with secondary-age students with EBD are an evidence-based practice according to Horner et al.'s proposed standards. Third, we discussed the extent to which the quality indicators represent reasonable standards and offered considerations for future application and evaluation of the quality indicators.

METHOD

ARTICLE SELECTION PROCEDURES

We conducted a systematic search of psychology and educational databases (PsycINFO and Educational Resources Information Center, ERIC) to identify function-based intervention studies conducted with secondary-age students with or at risk for EBD. Search terms included all possible combinations and derivatives of the following sets of terms: (a) functional assessment, functional analysis, assessment based, intervention, and procedures; and (b) seriously emotionally disturbed, emotional and/or behavioral disorders, at risk, and problem behavior (Lane et al., 1999). The title and abstract of each article from the electronic search was evaluated to determine if the article should be read in its entirety to evaluate inclusion eligibility. Next, a master list of journals that published the included studies was created. We conducted hand searches of those journals that published two or more of the articles from 1980 to present to gather any other articles that met inclusion criteria. Searches were conducted in the following journals: Behavioral Disorders, Education and Treatment of Children, Journal of Applied Behavior Analysis, Journal of Emotional and Behavioral Disorders, Journal of Positive Behavior Interventions, and School Psychology Review. Finally, we compared our search results with other reviews of function-based interventions (e.g., Dunlap & Childs, 1996; Heckaman, Conroy, Fox, & Chait, 2000; Lane et al., 1999).
Thirty-three articles, all of which employed single-subject designs, were identified as appropriate for further review using the procedures previously stated. Each article was read in its entirety to determine if the article met the following inclusion criteria.
INCLUSION CRITERIA
The intent of this review was to evaluate the extent to which function-based interventions conducted with secondary-age students with or at
risk for EBD met the recommended indicators
and to determine if function-based interventions
are an evidence-based practice for this population.
Studies were included in this review only if (a) the
participants were diagnosed with or were at risk
for EBD, (b) the participants were educated in a
secondary school setting, (c) an intervention
derived from a functional assessment was implemented and evaluated using single-case methodology, (d) intervention results included a graphic
display of student outcomes, and (e) the study
was published in a refereed journal.
Participants included in the studies had to be
adolescents, defined as students ages 13 to 18,
with or at risk for EBD. This group included students with
• EBD, an inclusive term to describe students with behavioral concerns.
• EBD and another disability specified in IDEA (e.g., learning disability, other health impairment, speech and language disorder), except for students with a dual diagnosis of moderate mental retardation or developmental disabilities (e.g., Cole, Davenport, Bambara, & Ager, 1997; O'Reilly et al., 2002) as these students typically participate in a functional skills curricula rather than traditional core curricula (Heckaman et al., 2000; Lane et al., 1999).
• A label of emotional disturbance (ED), as specified by IDEA (2004).
• Psychiatric diagnoses specified in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR; American Psychiatric Association, 2001) such as conduct disorder (CD) or oppositional defiant disorder (ODD).
• A general behavioral concern (e.g., noncompliance) and attention deficit/hyperactivity disorder, a group of students with attention and behavioral concerns that place them at heightened risk for behavior disorders.
• A psychiatric (e.g., ODD, CD) or educational (ED) diagnosis that co-occurred with an attention disorder (e.g., Ervin et al., 1998; Lane et al., 1999).
Second, all function-based interventions
needed to take place in a secondary school setting,
inclusive of middle, junior high, and high
schools. If the study reported function-based interventions implemented in multiple school levels
(e.g., elementary and middle), only results of the
investigation taking place at the secondary school
were included (e.g., DePaepe, Shores, Jack, &
Denny, 1996; Gunter, Jack, Shores, Carrell, &
Flowers, 1993; Stage et al., 2006). Interventions
implemented in clinics, day treatment centers, diagnostic centers, or residential day treatment centers (e.g., Platt, Harris, & Clements, 1980) were
excluded as the purpose of this review was to examine school-based interventions conducted in
secondary schools. If the school level was not
stated, the article was excluded unless the student
was 13 years or older as there was very limited
possibility that a 13-year-old would still be in elementary school.
Third, a functional assessment had to be conducted, yielding a hypothesis regarding the reason why the target behavior occurred. Functional assessment procedures—descriptive (e.g., interview, behavior rating scales, direct observation) or experimental (e.g., functional analysis)—must have been delineated. Consistent with other review articles, at least one of the preceding functional assessment procedures must have been employed in the methodological procedures and a hypothesis statement generated from the functional assessment results (Heckaman et al., 2000; Lane et al., 1999). Further, the article needed to include an intervention based on functional assessment results (Heckaman et al.) and evaluated using single-case methodology. Articles that included only functional assessment results, functional analyses that did not lead to sustained interventions, or those with interventions not based on functional assessment results were excluded (e.g., DePaepe et al., 1996; Ervin et al., 2000).
Fourth, the studies must have reported a graphic display of student outcomes for individual students. Studies reporting only narrative outcomes (e.g., Sterling-Turner, Robinson, & Wilczynski, 2001) were excluded. We viewed this visual display as essential to evaluate the accuracy of treatment-outcome results and the analytical tools (e.g., stability, level, trend) employed. Further, studies reporting graphic display of group outcomes (e.g., Center, Deitz, & Kaufman, 1982) were excluded as they did not allow for inspection of individual outcomes.

Finally, only articles published in peer-reviewed journals were included in this review. Dissertations, book chapters, and monographs were excluded because our goal was to draw conclusions based on information that had withstood the peer review process.
Of the 33 articles identified in the initial search, 12 articles met the inclusion criteria as determined by all three authors. These articles were coded independently by the first and third authors as described in the following section.
CODING PROCEDURES FOR QUALITY INDICATORS

Articles that met the inclusion criteria were read in their entirety by all three authors and coded by the first and third authors. Each article was coded along the 21 components constituting the seven quality indicators specified by Horner et al. (2005; see Table 1): (a) describing participants and settings; (b) dependent variable; (c) independent variable; (d) baseline; (e) experimental control/internal validity; (f) external validity; and (g) social validity. Specifically, each component was evaluated as being present or absent according to the guidelines in the sections that follow.

[Table 1 reports each included study's present/absent ratings, with coding reliability values, on the 21 quality indicator components; the table is not legible in this copy.]

Describing Participants and Setting. Per Horner et al. (2005), this indicator contained three components: (a) participant description, (b) participant selection, and (c) setting description. To meet the first component, more than a general definition (e.g., EBD) was required. Participants had to be described in sufficient detail that included (a) the specific disability as well as (b) the method used to determine the disability. Participant selection criteria needed to be defined precisely enough to allow replication (e.g., quantifiable data to indicate replication selection criteria). Setting description required a description of the physical setting that also included sufficient details (e.g., number of adults present, room arrangement) that allowed others to recruit similar participants from similar settings. The coding reliability was as follows: participant description 83.33%; participant selection 100%; and setting description 91.67%.

Dependent Variable. Horner et al. (2005) identified five components to determine the quality of the dependent variables. First, the description of each dependent variable had to be operationally defined. If more than one dependent variable was reported, and both variables were not defined precisely, then this component was considered absent. Second, each dependent variable needed to be measured using a procedure that produced a quantifiable index such as the frequency of a given behavior per minute. Third, the measurement of the dependent variable needed to be valid and described with sufficient precision to allow for replication (e.g., appropriate system of measure dependent on the nature of the target behavior [whole interval for variables such as engagement, partial interval for variables such as disruption] with details of the data collection procedures provided).

Fourth, the dependent variable needed to be measured repeatedly over time. We further defined this component to require a minimum of 3 data points per condition (Kennedy, 2005). As mentioned by Kennedy, 3 data points per phase is an acceptable standard among researchers employing single-case methodology. Fifth, data needed to be reported regarding the reliability or interobserver agreement (IOA) of each dependent variable. Further, Horner et al. indicated that IOA levels had to meet the following minimum standards: IOA = 80% and Kappa = 60%. Because some articles reported ranges as well as means, we further defined this component to require only the means to be at or above these criteria. Namely, the component was considered present if the mean met criteria, regardless of whether the range of scores reported included values below the minimum criteria. If the article reported IOA and Kappa values (e.g., March & Horner, 2002), then both minimum criteria had to be met to be considered present. Finally, it was necessary for IOA or Kappa values to be presented for each measure and each phase. The coding reliability was as follows: dependent variable description 100%; quantifiable measurement 100%; valid and well-defined measurement 100%; measured repeatedly 100%; and IOA 91.67%.
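The decision rule for this IOA component (means at or above IOA = 80% and, when Kappa is also reported, Kappa = .60, with values presented for every measure and phase) can be sketched as follows. The function and parameter names are hypothetical, not part of the coding protocol:

```python
# Sketch of the present/absent decision for the IOA component described in
# text: mean IOA >= 80% per measure/phase, and mean Kappa >= .60 whenever
# Kappa is also reported. Names are our own.
def ioa_component_present(mean_ioa_by_phase, mean_kappa_by_phase=None):
    """Return True if the IOA quality-indicator component is met.

    mean_ioa_by_phase: mean percent agreement per measure/phase, e.g. [85.0, 92.5]
    mean_kappa_by_phase: optional mean Kappa per measure/phase, e.g. [0.65, 0.71]
    """
    if not mean_ioa_by_phase:
        return False  # IOA must be reported for each measure and phase
    if any(m < 80.0 for m in mean_ioa_by_phase):
        return False
    if mean_kappa_by_phase is not None:
        # If Kappa is reported, both minimum criteria must be met.
        if any(k < 0.60 for k in mean_kappa_by_phase):
            return False
    return True

# Ranges dipping below 80% do not matter so long as the means meet criteria.
print(ioa_component_present([85.0, 92.5]))    # True
print(ioa_component_present([85.0], [0.55]))  # False: Kappa below .60
```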
Independent Variable. Horner et al. (2005) delineated three components as being necessary for single-subject studies to meet the independent variable quality indicator. First, the independent variable needed to be described precisely to allow for replication. This included documentation of required materials and explicit reporting of specific procedures. General descriptions (e.g., token economy) were considered insufficient to meet the expectation. Second, the independent variable needed to be systematically manipulated by the intervention agent (e.g., teacher, paraprofessional). Third, fidelity of implementation was considered "highly desirable" (Horner et al., p. 174). Horner et al. defined this as continuous direct measurement of the independent variable or a parallel form of assessment. To further define this component, we added that fidelity needed to be both measured explicitly and the data reported (Lane et al., 1999). The coding reliability was as follows: independent variable description 100%; systematically manipulated 100%; and fidelity of implementation 91.67%.

Baseline. Horner et al. (2005) indicated that baseline conditions needed to include repeated measurement, with an established pattern of responding that could be used to anticipate or predict future behavior in the absence of an intervention. They reported that baseline phases should include multiple data points. Specifically, Horner et al. stated the following: "five or more, although fewer data points are acceptable in specific cases" (Horner et al., p. 168). In addition, they required that either (a) a trend in the predicted direction of the intervention effect not be present or (b) that the trend be countertherapeutic. For this component, we established a minimum of 3 data points rather than 5, reasoning that 5 may be an unnecessarily high number. As Kennedy (2005) stated, "[A] baseline needs to be as long as necessary but no longer. The goal of baseline is to establish patterns of behavior to compare to intervention. Therefore, a baseline needs only be long enough to adequately sample this pattern" (p. 38). The second component necessary to establish the quality of baseline was that baseline conditions be described with sufficient detail to allow for replication. We clarified this indicator by establishing that the baseline description needed to include information on "who did what to whom, where were those actions taken, and when did those actions occur" (Lane, Wolery, Reichow, & Rogers, 2006, p. 226). The coding reliability was as follows: repeated measurement/established pattern 100% and description 100%.

Experimental Control/Internal Validity. Horner et al. (2005) established experimental control/internal validity as being evident when

the design documents three demonstrations of the experimental effect at three different points in time with a single participant (within-subject replication), or across different participants (inter-subject replication). An experimental effect is demonstrated when predicted change in the dependent variable covaries with manipulation of the independent variable. (p. 168)

They indicated that three components needed to be addressed to establish experimental control. First, the design must include at least three demonstrations of experimental effect at three different time points. In instances in which fewer than three demonstrations were documented in a given experiment, this component was considered absent. Second, the design needed to control for common threats to internal validity. In addition to requiring established designs that met this criterion (e.g., ABAB; BABA; changing criterion; multiple baseline with three legs; alternating treatment), we also required that treatment integrity be assessed and reported given that the absence of treatment integrity poses a severe threat to internal validity (Gresham, 1989).
Finally, Horner et al. (2005) required a pattern of responding that documented experimental control. They indicated that visual analysis techniques, which involve interpretation of level, trend, and variability of performance during each phase as well as other techniques (e.g., immediacy of effects, magnitude of change, percentage of overlapping data points, and consistency of data patterns), should be used to determine if this component was met. For our coding procedures, we determined that authors did not need to discuss each element (level, trend, variability) in text. If a graph with individual student-level data was displayed and the reader could examine level, trend, and variability and the graph suggested a functional relation between the introduction of the independent variable and corresponding changes in the dependent variables, the component was coded as present. The coding reliability was as follows: three demonstrations of experimental effect 100%; internal validity 91.67%; and pattern of results 100%.
External Validity. Horner et al. (2005) recommended documenting external validity by replicating experimental effects across participants, settings, behaviors, or materials. Consistent with Tankersley, Cook, and Cook (in press), we interpreted this quality indicator to require replication across one of the following: participants, setting, behavior, or materials. To further clarify criteria for external validity, we required studies to (a) include three replications in one of those categories as recommended by Horner et al., and (b) meet all three previously stated criteria for experimental control/internal validity to be considered as possibly having external validity given that internal validity is essential to establishing external validity (Wolery, personal communication, 2007). The coding reliability was as follows: external validity 100%.
Social Validity. Horner et al. (2005) identified social validity as the final quality indicator, which referred to the social significance of the goals, social acceptability of the treatment procedures, and the social importance of the effects (Baer, Wolf, & Risley, 1968). They identified four components for this indicator. First, the dependent variables needed to be socially valid. Second, the change in the dependent variable had to be socially important, defined as a "demonstration that the intervention produced an effect that met the defined, clinical need" (Horner et al., p. 172). We further defined this component as being present if (a) there was a measure of social validity and the evidence from that measure reported a socially meaningful change in the desired direction or (b) a functional relation was evident between the introduction of the independent variable and change in the target behavior (e.g., reduction in aggression).

Third, the independent variable was practical and cost effective. We clarified this component by stating that cost effectiveness must be stated explicitly. Practicality was defined as a study conducted in a typical setting with traditional intervention agents and materials typically found in the identified setting. Finally, use in typical contexts was defined as the following:

demonstration that typical intervention agents (a) report procedures to be acceptable, (b) report the procedures to be feasible within available resources, (c) report the procedure to be effective, and (d) choose to continue use of the intervention procedures after formal support/expectation of use is removed. (Horner et al., 2005, p. 172)

We coded this component as present if any one of these four practices was reported. The coding reliability was as follows: social importance of the dependent variable 100%; change in dependent variable is socially important 75%; independent variable is practical and cost effective 100%; and used in typical contexts 100%.

An overarching framework when coding the quality indicators was to evaluate the studies based on what the researchers reported either in text or in visual display, and not on our interpretations. The modifications to the components constituting the quality indicators were developed to (a) refine definitions to increase consistency across raters and (b) allow more transparent criteria for the reader.

EVALUATION PROCEDURES FOR DETERMINING EVIDENCE-BASED PRACTICE USING SINGLE-SUBJECT RESEARCH

We then applied the five standards for an evidence-based practice proposed by Horner et al. (2005) to the body of literature examining the effectiveness of function-based interventions conducted with secondary-age students with and at risk for EBD. The goal was to determine if single-subject research studies document this practice as evidence based. The five standards necessary to document a practice as evidence-based included the following:

1. The practice was defined operationally in text.
2. The authors defined the context and outcomes associated with the practice.
3. The practice was implemented with fidelity.
4. Findings document the introduction of the practice as functionally related to change in the dependent variables.
5. Experimental effects are replicated across a sufficient number of peer-reviewed studies (n = 5) published in refereed journals, across three different researchers at three different geographical locales, and include at least 20 participants from five or more studies.

RESULTS

In the results section, we address the first two purposes of this review by answering the following questions: To what extent do the studies identified for inclusion meet the quality indicators posed by Horner et al. (2005)? To what extent do the studies addressing the quality indicators support function-based interventions for secondary-age students with EBD as an evidence-based practice?

STUDIES

indicator for describing participants and setting (Smith & Sugai, 2000; see Table 1). Three studies addressed two of the three components by reporting descriptions of the participant and setting that were precise enough to facilitate replication (Ervin et al., 1998; Hoff, Ervin, & Friman, 2005; Liaupsin, Umbreit, Ferro, Urso, & Upreti, 2006). However, despite the thorough description of the students' disabilities or condition and the procedures used to determine their disabilities, studies did not describe the process used to select participants with replicable precision. Of the 8 remaining studies, all but 1 (Schloss, Kane, & Miller, 1981) met at least one component constituting this quality indicator. Four studies reported the critical features of the setting, but these were not precise enough in describing the participants or the participant selection process so were not included (Gunter et al., 1993; Knapczyk, 1988, 1992; Penno, Frank, & Wacker, 2000). In contrast, the remaining three studies provided detailed descriptions of the participant selection process, but these did not provide enough detail in describing the participants or setting to allow for replication (Ingram, Lewis-Palmer, & Sugai, 2005; March & Horner, 2002; Stage et al., 2006). In some instances, the level of precision for describing the participant selection process was particularly detailed. For example, March and Horner stated the following:

Three participants were selected based on (a) no decrease in their rate of discipline contacts following involvement with the BEP program, (b) documentation of at least five office discipline referrals during the first 4 months of the new academic year, (c) nomination by BEP team members, (d) student
OF
F U N C T I O N assent
and
parent consent, (p. 162)
BASED
I N T E R V E N T I O N S
FOR
SECO N OA RY-AG E
Seccing was the mosc frequencly addressed,
wich 8 ouc of 12 studies meeting the coding criteria for this componenc. In contrasc, only 4 sttidies
FINDINGS
OF A FIELD
TEST
described
parcicipancs wich sufficienc detail to afOF QUALITY
INDICATORS
ford ceplicacion (Kevin ec al., 1998; Hoffet al.,
Quality Indicator I: Describing Participants 2005; Liaupsiti et al., 2006; Smith & Sugai,
and Setting. Results revealed that only 1 of che 12 2000). Similarly, 4 scudies described parcicipant
scudies reviewed mec all chree components (par- selection criceria wich replicable precision (Ingram
ticipant description, parcicipant selection criteria, et al., 2005; March & Hornee, 2002; Smich &
and setting descripción) constituting che quality Sugai; Scage ec al., 2006).
S T U D E N T S
WITH
EBD
Spring 2009
Quality Indicator 2: Dependent Variables. Five studies met the quality indicator for dependent variables as evidenced by addressing the five components (description, quantifiable measurement, valid and well-described measurement, repeated measurement, and IOA) constituting this indicator (Gunter et al., 1993; Knapczyk, 1992; Liaupsin et al., 2006; Penno et al., 2000; Smith & Sugai, 2000). Four studies met coding criteria for all but one component (Ervin et al., 1998; Hoff et al., 2005; Knapczyk, 1988; March & Horner, 2002); 1 study met criteria for three components (Ingram et al., 2005); and 2 studies met criteria for two components: quantifiable measurement and repeated measurement (Schloss et al., 1981; Stage et al., 2006).

In all studies, each dependent variable was measured in a manner that produced a quantifiable index (e.g., percentage of intervals on-task; Ervin et al., 1998), and all but two studies (Schloss et al., 1981; Stage et al., 2006) operationally defined all dependent variables. In the latter study, all behavior codes were stated, but not all terms were operationally defined. The majority of studies (n = 9) reported a valid and well-described measurement system. For example, Liaupsin et al. (2006) described data collection of on-task behavior as follows: "30-s whole interval recording procedure. Observations were 20 min in length and began 5 to 10 min after the assignment of independent class work or reading" (p. 584). In addition, nine studies measured the dependent variables repeatedly over time according to coding criteria (minimum of 3 data points per phase). In instances when this component was not met, there were typically fewer than 3 data points in a phase. For example, in the Ervin et al. (1998) study, one of the students, Joey, had just 1 datum point in the return to baseline phase. Finally, criteria for the IOA component (IOA > 80%; Kappa > 60%) were met in eight studies. However, in some cases IOA was reported as an overall mean, but not for each dependent variable individually (e.g., Ingram et al., 2005; Stage et al.). In other cases, the criterion for IOA was met, yet the criterion for Kappa was not met (e.g., March & Horner, 2002).
Quality Indicator 3: Independent Variable (IV). Six studies met the quality indicator for independent variable as evidenced by addressing the three components (IV description, systematically manipulated, fidelity of implementation) constituting the quality indicator (Ervin et al., 1998; Gunter et al., 1993; Ingram et al., 2005; Liaupsin et al., 2006; Penno et al., 2000; Smith & Sugai, 2000). Three studies met two components, independent variable description and systematic manipulation of the independent variable (Hoff et al., 2005; March & Horner, 2002; Stage et al., 2006); yet, these studies did not address implementation fidelity. The final three studies addressed one out of three components, with all three studies systematically implementing the independent variable (Knapczyk, 1988, 1992; Schloss et al., 1981).

In all studies (n = 12) the independent variable was systematically manipulated by the experimenter; of these studies, 9 described the intervention procedures with replicable precision. Six studies measured and reported treatment fidelity. The 3 studies not meeting expectations for fidelity were published between 1981 and 1992 (Knapczyk, 1988, 1992; Schloss et al., 1981). However, it should be noted that the importance of treatment integrity was not emphasized in the literature until the 1980s, as documented in articles written by Yeaton and Sechrest (1981) and Gresham (1989). March and Horner (2002) addressed the lack of treatment fidelity data as a limitation, stating "a final limitation lies in the absence of treatment integrity data . . . the only process for documenting fidelity of procedural implementation was the weekly observation and feedback to teachers by the first author" (p. 168). Although Hoff et al. (2005) stated that "Kevin's teacher implemented all of the intervention strategies" (p. 50), they did not mention how (or if) they collected fidelity data. Finally, Stage et al. (2006) did monitor fidelity, but they reported poor fidelity of implementation (e.g., "In Gale's case, there was a complete lack of treatment fidelity within the general education setting," p. 468), thereby not meeting this component.
Quality Indicator 4: Baseline. Seven studies met the quality indicator for baseline as evidenced by addressing the two components (repeated measurement and established pattern description) constituting the quality indicator (Gunter et al., 1993; Knapczyk, 1988, 1992; Liaupsin et al., 2006; March & Horner, 2002; Penno et al., 2000; Smith & Sugai, 2000). The remaining five studies met at least one of the two criteria for the baseline quality indicator. More specifically, three studies met expectations for an established pattern and repeated measurement (Ingram et al., 2005; Schloss et al., 1981; Stage et al., 2006). The other two studies met expectations for description of baseline conditions (Ervin et al., 1998; Hoff et al., 2005).

Ten studies met the criteria for reporting a baseline phase that included three or more data points and an established pattern of repeated measurement of a dependent variable that supported a patterned responding predictive of future behavior. However, 2 studies included fewer than the requisite number of data points in the return to baseline phase (Ervin et al., 1998; Hoff et al., 2005), although Hoff and colleagues acknowledge this as a "brief withdrawal of the intervention and return to baseline" (p. 51). Nine studies met the requisite criteria for describing the baseline condition. The remaining 3 studies did not describe the baseline condition precisely enough for replication (Ingram et al., 2005; Schloss et al., 1981; Stage et al., 2006).
Quality Indicator 5: Experimental Control/Internal Validity. Two studies (Gunter et al., 1993; Smith & Sugai, 2000) met the three components constituting this quality indicator: three demonstrations of experimental effect, internal validity, and pattern of results. Four studies met two components, three demonstrations of experimental effect and pattern of results, with internal validity not established (Knapczyk, 1988, 1992; March & Horner, 2002; Schloss et al., 1981). Six studies did not meet any of the components.

In terms of the components, six studies demonstrated experimental effect as evidenced by at least three demonstrations across participants (e.g., March & Horner, 2002; Schloss et al., 1981); setting (Knapczyk, 1988, 1992); or via an ABAB design (Gunter et al., 1993; Smith & Sugai, 2000). Based on coding criteria, experimental effect was scored as absent if there was an insufficient number of data points in a phase (e.g., Ervin et al., 1998; Hoff et al., 2005; Ingram et al., 2005) or if there were only two or fewer demonstrations evident (e.g., Liaupsin et al., 2006; Penno et al., 2000). Only two studies (Gunter et al.; Smith & Sugai) established internal validity according to the posed criteria. Several studies did not meet this component due to the absence of treatment integrity (e.g., Hoff et al.; Knapczyk, 1992; March & Horner; Schloss et al.; Stage et al.). Finally, six studies met the component of pattern of results that supported experimental control (Gunter et al.; Knapczyk, 1988, 1992; March & Horner; Schloss et al.; Smith & Sugai). The absence of sufficient data points in each phase prohibited studies from satisfying this component (e.g., Ervin et al.; Hoff et al.; Ingram et al.), as did the absence of sufficient demonstrations (e.g., Liaupsin et al.; Penno et al.; Stage et al.).

Quality Indicator 6: External Validity. Only one study established external validity according to the coding procedures (Smith & Sugai, 2000). In most studies, external validity was not established given that we defined the presence of internal validity as a prerequisite to external validity. Namely, the study needed to meet all components constituting the experimental control/internal validity indicator to have the possibility of experimental control. Thus, only two studies (Gunter et al., 1993; Smith & Sugai) had the possibility of meeting this indicator.

Quality Indicator 7: Social Validity. One study met the quality indicator for social validity as evidenced by addressing the four components (dependent variable is socially important, change in dependent variable is socially important, independent variable is practical and cost effective, and practice is used in typical contexts) constituting the indicator (Smith & Sugai, 2000). Seven studies met all components save for the third component, which required cost-effectiveness to be stated (Ervin et al., 1998; Gunter et al., 1993; Hoff et al., 2005; Ingram et al., 2005; Knapczyk, 1988, 1992; March & Horner, 2002). The remaining four studies met two of the four components, with three studies establishing the dependent variable as socially important and employing the independent variable in typical contexts (Liaupsin et al., 2006; Penno et al., 2000; Stage et al., 2006). The fourth study established the dependent variable as socially important and reported a change that was socially important (Schloss et al., 1981).

All studies established the dependent variable as socially important, and 11 reported use of the independent variable in typical contexts. Nine established the change in the dependent variable as socially important. Yet, only 1 study (Smith & Sugai, 2000) specifically stated that the intervention was both practical and cost effective, reporting that the intervention was "conducted in [an] actual classroom with minimal time or use of additional resources" (p. 215).
FUNCTION-BASED INTERVENTIONS FOR SECONDARY-AGE STUDENTS WITH EBD: DETERMINATION OF AN EVIDENCE-BASED PRACTICE
Given that only one study (Smith & Sugai, 2000) met all seven quality indicators, it is clear that function-based interventions conducted with secondary-age students with and at risk for EBD cannot yet be documented as an evidence-based practice according to Horner et al.'s (2005) standards. As a practice, function-based interventions involve (a) using descriptive and, in some cases, experimental tools to identify the function of the target behavior; (b) designing an intervention linked to functional assessment data to adjust antecedent conditions and maintaining consequences so that the student can acquire a more reliable, more efficient, functionally equivalent behavior; and (c) implementing the intervention with fidelity using an experimental design (e.g., multiple baseline, ABAB) that ensures experimental control. However, in the studies reviewed, the number of quality indicators met in entirety ranged from 0 to 7 (see Table 1). Moreover, only one study met four indicators (Gunter et al., 1993); two studies met three indicators (Liaupsin et al., 2006; Penno et al., 2000); one study met two indicators (Knapczyk, 1992); and four studies met just one indicator (Ervin et al., 1998; Ingram et al., 2005; Knapczyk, 1988; March & Horner, 2002).
In addition, it should be noted that despite the specification of inclusion criteria, there was still variability in the functional assessment tools employed, student characteristics, and instructional setting. For example, although all studies reviewed met the inclusion criteria of having one functional assessment tool, a hypothesis, and an intervention linked to the functional assessment data, there still was variability in the functional assessment process used to identify the maintaining function of the target behavior (see Table 2). Some studies involved both teacher and student interviews (e.g., Ervin et al., 1998; Hoff et al., 2005; Ingram et al., 2005; Liaupsin et al., 2006; March & Horner, 2002; Penno et al., 2000; Smith & Sugai, 2000; Stage et al., 2006), yet other studies involved only teacher interviews. Likewise, several studies involved functional analyses of behavior (e.g., Ervin et al.; Hoff et al.; Penno et al.; Stage et al.). Second, the articles reviewed contained students with different facets of EBD as described in the article selection process. Finally, although all studies were conducted in school-based settings (e.g., self-contained schools, self-contained classrooms), and not in clinical settings, there was still heterogeneity in the settings. Thus, it should be noted that there was still variability in terms of target population, context, and functional assessment processes. Even if the results supported functional assessment-based interventions as an evidence-based practice for adolescents with or at risk for EBD according to quality indicators posed by Horner et al. (2005), the actual practice evaluated still may have contained variability in the components constituting the practice despite the inclusion criteria specified in this review.
DISCUSSION

Students with EBD pose significant challenges to parents, teachers, and society as a whole (Kauffman, 2005). Function-based interventions are one tertiary-level, idiographic approach employed to meet the multiple needs of this population, particularly for elementary-age students (Lane et al., 1999). However, function-based interventions have not yet been established as an evidence-based practice for secondary-age students with EBD according to the criteria specified by Horner et al. (2005). This is unfortunate given that function-based interventions are mandated per IDEA for students with specific disciplinary circumstances (Kern et al., 2004).

In this analysis, a systematic literature review identified 12 studies of function-based interventions conducted with middle and high school students with and at risk for EBD in school settings.
TABLE 2
Functional Assessment Components

Functional Assessment      Schloss, Kane,    Knapczyk   Knapczyk   Gunter, Jack, Shores,       Ervin, DuPaul, Kern,   Penno, Frank, &
Component                  & Miller (1981)   (1988)     (1992)     Carrell, & Flowers (1993)   & Friman (1998)        Wacker (2000)

Direct observations        no                yes        yes        yes                         yes                    yes
Teacher interview          yes               yes        yes        no                          yes                    yes
Student interview          yes               no         no         no                          yes                    yes
Parent interview           yes               no         yes        no                          no                     no
Other interview            no                no         no         no                          no                     no
Rating scales              no                no         no         yes                         yes                    no
Record search              no                no         no         no                          no                     yes
Functional analysis        no                no         no         no                          yes                    yes
Hypothesis statement       yes               yes        yes        yes                         yes                    yes
Intervention linked to     yes               yes        yes        yes                         yes                    yes
  assessment data

Functional Assessment      Smith & Sugai   March & Horner   Hoff, Ervin, &   Ingram, Lewis-Palmer,   Liaupsin, Umbreit, Ferro,   Stage, Jackson, Moscovitz,
Component                  (2000)          (2002)           Friman (2005)    & Sugai (2005)          Urso, & Upreti (2006)       Erickson, Thurman, & Jessee, et al. (2006)

Direct observations        yes             yes              yes              yes                     yes                         yes
Teacher interview          yes             yes              yes              yes                     yes                         yes
Student interview          yes             yes              yes              yes                     yes                         yes
Parent interview           no              no               no               no                      no                          yes
Other interview            no              no               no               no                      no                          no
Rating scales              no              no               yes              no                      no                          yes
Record search              yes             yes              no               no                      yes                         no
Functional analysis        no              no               yes              no                      no                          yes
Hypothesis statement       yes             yes              yes              yes                     yes                         yes
Intervention linked to     yes             yes              yes              yes                     yes                         yes
  assessment data
Application of the core quality indicators for single-subject research revealed only one study (Smith & Sugai, 2000) as meeting all 21 components constituting the seven quality indicators posed by Horner and colleagues (2005). Given that only one study met this rigorous set of indicators, there is an insufficient number of studies conducted that meet the requisite standards for qualifying a practice as "evidence-based" according to the criteria set forth by Horner and colleagues. However, we contend that this assessment may be based on indicators that may be somewhat too rigorous. In the sections that follow we (a) offer illustrations of how some of the indicators may exceed reasonable standards and (b) propose a different approach to evaluating a given study against the posed quality indicators.
QUALITY INDICATORS: REASONABLE STANDARDS?
As we coded the articles in the review, we discussed certain components that may be so stringent that they excluded studies that do, in fact, make a meaningful contribution to the knowledge base. Specifically, we felt that the requirements for describing participants, establishing repeated measurement of the dependent variable, repeated measurement and established pattern for baseline, and stating cost-effectiveness as a component of the social validity indicator may need to be reconsidered.
Describing Participants. For example, in Quality Indicator 1: Describing Participants and Settings, the first component focused on participant description. To meet requisite criteria for this component, the authors needed to report the specific disability or condition and the "specific instrument and process used to determine their disability" (Horner et al., 2005, p. 167). It may be that the latter component is beyond reasonable at this time. Although it is important to ensure precision for purposes of replication, it may be more reasonable to require that the process (e.g., as determined by a multidisciplinary team) be acceptable rather than requiring specific instruments. This is particularly true given information available in cumulative files and space limitations associated with publication efforts.

Repeated Measurement. As part of Quality Indicator 2: Dependent Variable, component four required that dependent variables be measured repeatedly over time, and Quality Indicator 4: Baseline established the need for 5 data points in baseline, which we altered to require a minimum of 3 data points. Yet, according to Kennedy (2005), the "goal of baseline is to establish patterns of behavior to compare to intervention. Therefore, a baseline needs only be long enough to adequately sample this pattern" (p. 38). Consider the study by Ervin et al. (1998), in which fewer than 3 data points were collected during the reversal phases. One could argue that because of the dramatic change in level, additional data points were not warranted in the return to baseline phase. However, for purposes of this review, all articles with fewer than 3 data points were reported as not meeting the component of repeated measurement.
Failure to meet the requisite number of data points per phase also influenced the extent to which the internal validity indicator was met. Again looking at the Ervin et al. (1998) study, the return to baseline phase for Joey had but 1 datum point, which did not meet criteria for the baseline requirement. Thus, this study did not meet the internal validity criteria. Because internal validity is required to establish external validity (Wolery, 2007, personal communication), this also precluded this study from meeting the external validity components. Yet, despite the limited number of data points, the argument could be made for experimental control given the clear changes in level.

These same ramifications were recognized when coding the study conducted by Hoff et al. (2005). The brief return to baseline (2 data points) did not meet our minimum criteria of 3 data points per phase. Therefore, the article was coded as not having at least three demonstrations of experimental effect, and the pattern of results was not sufficient given that only 2 data points were in the return to baseline condition. Because internal validity was not established, external validity was absent as well according to our coding procedures. However, in inspecting the graph, there was a very clear change in level, and possibly trend, when the intervention was withdrawn. This serves as another illustration of the possibility that some of the components defining each quality indicator (e.g., requirement of a minimum of 3 data points) may be too stringent. Thus, some studies may be excluded that do lend support for a given practice.
Cost Effectiveness. The third component of Quality Indicator 7: Social Validity required that the intervention be "practical and cost effective" (Horner et al., 2005, p. 174). Concordant with Tankersley, Cook, and Cook's (in press) efforts to evaluate Horner et al.'s quality indicators based on information reported in text, our coding system required the cost-effectiveness of an intervention to be stated explicitly. Yet this requirement may be too rigorous because only one study (Smith & Sugai, 2000) explicitly mentioned cost-effectiveness of the intervention. Moving forward, it may be wise to offer clarifying points for evaluating cost-effectiveness, as many studies may indeed be cost-effective in the sense that the benefits outweigh the costs (e.g., time, resources), even though cost-effectiveness is not computed or discussed explicitly. One could argue that a practice's cost-effectiveness could be assessed indirectly by looking at social validity or treatment integrity data. Namely, if the intervention was too costly in terms of time or resources, then it would be apt to receive a negative social validity rating or be implemented with low fidelity (Lane & Beebe-Frankenberger, 2004).
We do recognize that it is difficult to develop indicators and coding practices that can successfully capture the contribution and qualities of all studies. For example, the Penno et al. (2000) study did not meet our criteria for establishing a socially important change in the dependent variable. However, it should be noted that the authors reported "of particular importance is the finding that behavior problems were reduced for 2 of three participants even though the instructional modifications were designed to enhance academic performance" (Penno et al., p. 341). The coding system we applied overlooked this finding. Further, as we evaluated studies that were published more than 2 decades ago, it is important to note that standards for research shift over time. For example, the three studies not mentioning or reporting fidelity of the independent variable, required as part of Quality Indicator 3: Independent Variable, were published between 1981 and 1992 (Knapczyk, 1988, 1992; Schloss et al., 1981)—prior to the emphasis placed on treatment integrity. Finally, the study by Stage et al. (2006) reported three cases, but article selection procedures restricted coding to only secondary-age students. Consequently, the two other applications to younger students—which met many of the indicators—were not reported in this review.
APPLICATION OF THE INDICATORS: A MODIFIED APPROACH
Rather than evaluating only those studies that met all indicators in entirety, another approach might be to impose an 80% minimum criterion with "credit" or recognition of the components that were addressed in a given quality indicator. For example, the dependent variable quality indicator contains five components that need to be addressed. Moving forward, we may want to consider weighting each component, with each component contributing an equal proportion of the quality indicator. In the case of the dependent variable quality indicator, each component would be weighted as contributing 20% of the total score for the indicator. To illustrate, consider the article by Ervin and colleagues (1998). This study met the requirements for description, quantifiable measurement, valid and well-described measurement, and IOA. Yet it did not meet the requirement for measured repeatedly. Rather than scoring this indicator as a zero for omitting one of the five components, a weighted scoring could be employed as follows:

DV quality indicator = ((description)(1)(.20)) + ((quantifiable measurement)(1)(.20)) + ((valid and well-described measurement)(1)(.20)) + ((measured repeatedly)(0)(.20)) + ((IOA)(1)(.20)) = .80

In this case, rather than applying an absolute coding system of "met" or "not met," the study could receive "partial credit" for the components that were addressed. In the above illustration of the dependent variable indicator, the study would receive an overall score of .80 rather than receiving a zero.
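The weighted scoring just described is simple arithmetic; as a minimal sketch (the function and the dictionary layout are our own shorthand, not part of Horner et al.'s indicators), it could be computed as follows:

```python
# Partial-credit score for a single quality indicator: each of the
# indicator's components contributes an equal share (1/5 = .20 for the
# five-component dependent variable indicator) when it is met.
def indicator_score(components_met):
    """components_met maps component name -> 1 (met) or 0 (not met)."""
    weight = 1 / len(components_met)
    return sum(met * weight for met in components_met.values())

# Dependent variable indicator for Ervin et al. (1998), per the text:
# every component met except repeated measurement.
ervin_dv = {
    "description": 1,
    "quantifiable measurement": 1,
    "valid and well-described measurement": 1,
    "measured repeatedly": 0,
    "IOA": 1,
}
score = indicator_score(ervin_dv)  # 4 of 5 components met -> .80
```

Because the weight is derived from the component count, the same function applies unchanged to indicators with two, three, or four components.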
If this method were applied to all indicators for this study, then the overall quality indicator composite score for the Ervin et al. (1998) study with partial credit would be 3.72 (describing participants = 0.67; dependent variable = 0.80; independent variable = 1.00; baseline = 0.50; experimental control = 0.00; external validity = 0.00; social validity = .75) as opposed to the current score of meeting one out of seven indicators. Such a scoring system would reveal a more precise, detailed description of the critical components addressed in the study.

In Table 2, we present a total score for each article when scored using the presence or absence of each indicator as well as the partial-credit scoring system explained above. If we set a goal of studies achieving 80% of the indicators (80% x 7 indicators), then studies with a total score of 5.60 could be considered rigorous enough to be evaluated in the decision of whether or not a practice is evidence-based. In this review, no additional studies would have been included for evaluating the evidence base. However, it is possible that such a coding procedure could influence the number of studies included in other literature reviews.
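Assuming the per-indicator partial-credit scores quoted above for Ervin et al. (1998), the composite score and the 80% benchmark work out as follows (a sketch of the arithmetic only; the variable names are ours):

```python
# Per-indicator partial-credit scores for Ervin et al. (1998), as
# reported in the text.
ervin_scores = {
    "describing participants": 0.67,
    "dependent variable": 0.80,
    "independent variable": 1.00,
    "baseline": 0.50,
    "experimental control": 0.00,
    "external validity": 0.00,
    "social validity": 0.75,
}
composite = round(sum(ervin_scores.values()), 2)  # 3.72

# 80% of the seven indicators sets the proposed benchmark of 5.60.
benchmark = round(0.80 * 7, 2)  # 5.60
meets_benchmark = composite >= benchmark  # False: 3.72 falls short
```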
Yet another consideration would be to differentiate between the value of each indicator.
Namely, are certain quality indicators (e.g., internal and external validity) more important than
other indicators (e.g., social validity)? Some may
argue that violating internal validity is a more serious concern than omitting social validity. If so,
should the weighted value of each indicator or
each component within each indicator be considered? Also, should the value of the indicator be
dependent on the type of study (efficacy or effectiveness) being conducted? As we move toward
conducting studies in more applied settings with
less university support, should the value of certain
indicators be viewed as more or less necessary?
studies included interviews from teachers and students (e.g., Penno et al., 2000); some included
functional analyses (e.g.. Hoffet al., 2005); and
some included record searches (e.g., Liaupsin et
al., 2006). Thus, although the interventions derived from functional assessment data were evaluated in terms of the quality indicators, the
functional assessment process was not standardized (Kern et al., 2004; Sasso et al., 2001). We
recommend that future reviews be considered in
which a particular method of conducting function-based interventions, such as the model posed
by Umbreit et al. (2007) be evaluated to determine if the specific model is an evidence-based
practice.
Despite these considerations, this article offers an initial application of the core qualiry indicators and standards for evidence-based practices
proposed by Horner et al. (2005) for single-case
CONCLUDING COMMENTS: CONSIDERATIONS AND FUTURE DIRECTIONS

As we conclude the task of applying the quality indicators and standards posed by Horner et al. (2005) to function-based interventions with secondary-age students with EBD, we offer the following comments. First, we applaud Horner and colleagues for the effort placed into developing quality indicators for single-case research. This was clearly a formidable and necessary task that will continue to influence how research proposals and subsequent investigations will be conducted. We value the concept of setting standards and hope that our goal of offering input as to where these indicators may be too stringent and in need of modification is received in the spirit intended: to establish scientifically valid, yet reasonable indicators for evaluating single-subject work.

Finally, in this field testing of the proposed quality indicators and standards for evidence-based practice, we want to point out that all articles met the inclusion criteria of having employed at least one functional assessment procedure, stating a hypothesis, and linking the intervention to the assessment results. Yet, there was still variability in the functional assessment process. Namely, some … methodology to functional assessment-based interventions conducted with secondary-age students with EBD or at risk for developing EBD. Findings suggest that, when assessed using the criteria proposed, this practice cannot be considered an evidence-based practice at this time. However, we contend that this practice holds promise. Certainly, additional high-quality research may result in the practice being considered evidence-based for the target population using these or similar standards. Weighting the criteria, assigning partial credit, or weighting indicators depending on the focus of the study may also be possible directions for refining the application of indicators; in this way, researchers are certain to include all meaningful and trustworthy studies of the practices and ensure that important contributions to this body of literature are not eliminated based on criteria being unattainable. In the years to come, it will be important to be thoughtful and careful as scholars and stakeholders use the proposed indicators. There is a delicate balance between maintaining high scientific rigor and potentially eliminating or ruling out the use of promising practices, such as function-based interventions for adolescents with and at risk for EBD, that are associated with improved behavioral and academic performance.
REFERENCES

References marked with an asterisk indicate experiments included in the analysis.

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.

Baer, D. M., Wolf, M. M., & Risley, T. R. (1968). Some current dimensions of applied behavior analysis. Journal of Applied Behavior Analysis, 1, 91-97.

Bullis, M., & Yovanoff, P. (2006). Community employment experiences of formerly incarcerated youth. Journal of Emotional and Behavioral Disorders, 14, 71-85.

Center, D. B., Deitz, S. M., & Kaufman, M. E. (1982). Student ability, task difficulty, and inappropriate classroom behavior. Behavior Modification, 6, 355-374.

Cole, C. L., Davenport, T. A., Bambara, L. M., & Ager, C. L. (1997). Effects of choice and task preference on the work performance of students with behavior problems. Behavioral Disorders, 22, 65-74.

Conroy, M. A., Dunlap, G., Clarke, S., & Alter, P. J. (2005). A descriptive analysis of positive behavioral intervention research with young children with challenging behavior. Topics in Early Childhood Special Education, 25, 157-166.

*DePaepe, P. A., Shores, R. E., Jack, S. L., & Denny, R. K. (1996). Effects of task difficulty on the disruptive and on-task behavior of students with severe behavior disorders. Behavioral Disorders, 21, 216-225.

Drasgow, E., & Yell, M. L. (2001). Functional behavioral assessment: Legal requirements and challenges. School Psychology Review, 30, 239-251.

Dunlap, G., & Childs, K. E. (1996). Intervention research in emotional and behavioral disorders: An analysis of studies from 1980-1993. Behavioral Disorders, 21, 125-136.

*Ervin, R. A., DuPaul, G. J., Kern, L., & Friman, P. C. (1998). Classroom-based functional and adjunctive assessments: Proactive approaches to intervention selection for adolescents with attention deficit hyperactivity disorder. Journal of Applied Behavior Analysis, 31, 65-78.

*Ervin, R. A., Kern, L., Clarke, S., DuPaul, G. J., Dunlap, G., & Friman, P. C. (2000). Evaluating assessment-based intervention strategies for students with ADHD and comorbid disorders within the natural classroom context. Behavioral Disorders, 25, 344-358.

Fox, J., Conroy, M., & Heckaman, K. (1998). Research in functional assessment of the challenging behaviors of students with emotional and behavioral disorders. Behavioral Disorders, 24, 26-33.

Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, C., & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71, 149-164.

Gresham, F. M. (1989). Assessment of treatment integrity in school consultation and prereferral intervention. School Psychology Review, 18, 37-50.

Gresham, F. M. (2002). Social skills assessment and instruction for students with emotional and behavioral disorders. In K. L. Lane, F. M. Gresham, & T. E. O'Shaughnessy (Eds.), Interventions for children with or at risk for emotional and behavioral disorders (pp. 242-258). Boston: Allyn & Bacon.

Gresham, F. M. (2004). Current status and future directions of school-based behavioral interventions. School Psychology Review, 33, 326-343.

*Gunter, P. L., Jack, S. L., Shores, R. E., Carrell, D. E., & Flowers, J. (1993). Lag sequential analysis as a tool for functional analysis of student disruptive behavior in classrooms. Journal of Emotional and Behavioral Disorders, 1, 138-148.

Heckaman, K., Conroy, M., Fox, J., & Chait, A. (2000). Functional assessment-based intervention research on students with or at risk for emotional and behavioral disorders in school settings. Behavioral Disorders, 25, 196-210.

*Hoff, K. E., Ervin, R. A., & Friman, P. C. (2005). Refining functional behavioral assessment: Analyzing the separate and combined effects of hypothesized controlling variables during ongoing classroom routines. School Psychology Review, 34, 45-57.

Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165-179.

Horner, R. H., & Sugai, G. (2000). School-wide behavior support: An emerging initiative. Journal of Positive Behavior Interventions, 2, 231-232.

Individuals With Disabilities Education Act Amendments of 1997, Pub. L. No. 105-17, § 20, 111 Stat. 37 (1997). Washington, DC: U.S. Government Printing Office.

Individuals With Disabilities Education Improvement Act of 2004, 20 U.S.C. §§ 1400 et seq. (2004).

*Ingram, K., Lewis-Palmer, T., & Sugai, G. (2005). Function-based intervention planning: Comparing the effectiveness of FBA function-based and non-function-based intervention planning. Journal of Positive Behavior Interventions, 7, 224-236.

Iwata, B. A., Dorsey, M. F., Slifer, K. J., Bauman, K. E., & Richman, G. S. (1982). Toward a functional analysis of self-injury. Analysis and Intervention in Developmental Disabilities, 2, 3-20. (Reprinted in Journal of Applied Behavior Analysis, 27, 197-209, 1994)

Kauffman, J. M. (2005). Characteristics of emotional and behavioral disorders of children and youth (8th ed.). Upper Saddle River, NJ: Pearson Merrill Prentice Hall.

Kennedy, C. H. (2005). Single-case designs for educational research. Boston: Allyn & Bacon.

Kern, L., Childs, K. E., Dunlap, G., Clarke, S., & Falk, G. D. (1994). Using assessment-based curricular intervention to improve the classroom behavior of a student with emotional and behavioral challenges. Journal of Applied Behavior Analysis, 27, 7-19.

Kern, L., Delaney, B., Clarke, S., Dunlap, G., & Childs, K. (2001). Improving the classroom behavior of students with emotional and behavioral disorders using individualized curricular modifications. Journal of Emotional and Behavioral Disorders, 9, 239-247.

Kern, L., Hilt, A. M., & Gresham, F. (2004). An evaluation of the functional behavioral assessment process used with students with or at risk for emotional and behavioral disorders. Education and Treatment of Children, 27, 440-452.

*Knapczyk, D. R. (1988). Reducing aggressive behaviors in special and regular class settings by training alternative social responses. Behavioral Disorders, 14, 27-39.

*Knapczyk, D. R. (1992). Effects of developing alternative responses on the aggressive behavior of adolescents. Behavioral Disorders, 17, 247-263.

Lane, K. L., & Beebe-Frankenberger, M. E. (2004). School-based interventions: The tools you need to succeed. Boston: Allyn & Bacon.

Lane, K. L., Umbreit, J., & Beebe-Frankenberger, M. (1999). A review of functional assessment research with students with or at risk for emotional and behavioral disorders. Journal of Positive Behavior Interventions, 1, 101-111.

Lane, K. L., Wolery, M., Reichow, B., & Rogers, L. (2006). Describing baseline conditions: Suggestions for study reports. Journal of Behavioral Education, 16, 224-234.

*Liaupsin, C. J., Umbreit, J., Ferro, J. B., Urso, A., & Upreti, G. (2006). Improving academic engagement through systematic, function-based intervention. Education and Treatment of Children, 29, 573-591.

Loeber, R., Green, S. M., Lahey, B. B., Frick, P. J., & McBurnett, K. (2000). Findings on disruptive behavior disorders from the first decade of the Developmental Trends Study. Clinical Child and Family Psychology Review, 3, 37-59.

*March, R. E., & Horner, R. H. (2002). Feasibility and contributions of functional behavioral assessment in schools. Journal of Emotional and Behavioral Disorders, 10, 158-170.

Morris, R. J., Shah, K., & Morris, Y. P. (2002). Internalizing behavior disorders. In K. L. Lane, F. M. Gresham, & T. E. O'Shaughnessy (Eds.), Interventions for children with or at risk for emotional and behavioral disorders (pp. 223-241). Boston: Allyn & Bacon.

Morrison, G. M., Robertson, L., Laurie, B., & Kelly, J. (2002). Protective factors related to antisocial behavior trajectories. Journal of Clinical Psychology, 58, 277-290.

Nelson, J. R., Benner, G. J., Lane, K., & Smith, B. W. (2004). An investigation of the academic achievement of K-12 students with emotional and behavioral disorders in public school settings. Exceptional Children, 71, 59-73.

O'Reilly, M., Tiernan, R., Lancioni, G., Lacey, C., Hillery, J., & Gardiner, M. (2002). Use of self-monitoring and delayed feedback to increase on-task behavior in a post-institutionalized child within regular classroom settings. Education and Treatment of Children, 25, 91-102.

*Penno, D. A., Frank, A. R., & Wacker, D. P. (2000). Instructional accommodations for adolescent students with severe emotional or behavioral disorders. Behavioral Disorders, 25, 325-343.

Platt, J. S., Harris, J. W., & Clements, J. E. (1980). The effects of individually designed reinforcement schedules on attending and academic performance with behaviorally disordered adolescents. Behavioral Disorders, 5, 197-205.

Quinn, M. M., Gable, R. A., Fox, J., Rutherford, R. B., Jr., Van Acker, R., & Conroy, M. (2001). Putting quality functional assessment into practice in schools: A research agenda on behalf of E/BD students. Education and Treatment of Children, 24, 261-275.

Sasso, G. M., Conroy, M. A., Stichter, J. P., & Fox, J. J. (2001). Slowing down the bandwagon: The misapplication of functional assessment for students with emotional or behavioral disorders. Behavioral Disorders, 26, 282-296.

Sasso, G. M., Reimers, T. M., Cooper, L. J., & Wacker, D. P. (1992). Use of descriptive and experimental analyses to identify the functional properties of aberrant behavior in school settings. Journal of Applied Behavior Analysis, 25, 809-821.
*Schloss, P. J., Kane, M. S., & Miller, S. (1981). Truancy intervention with behavior disordered adolescents. Behavioral Disorders, 6, 175-179.

*Smith, B. W., & Sugai, G. (2000). A self-management functional assessment-based behavior support plan for a middle school student with EBD. Journal of Positive Behavior Interventions, 2, 208-217.

*Stage, S. A., Jackson, H. G., Moscovitz, K., Erickson, M. J., Thurman, S. O., Jessee, W., & Olson, E. M. (2006). Using multimethod-multisource functional behavioral assessment for students with behavioral disabilities. School Psychology Review, 35, 451-471.

Sterling-Turner, H. E., Robinson, S. L., & Wilczynski, S. M. (2001). Functional assessment of distracting and disruptive behaviors in the school setting. School Psychology Review, 30, 211-226.

Tankersley, M., Cook, B. G., & Cook, L. (in press). A preliminary examination to identify the presence of quality indicators in single-subject research. Education and Treatment of Children, 31.

Umbreit, J., Ferro, J., Liaupsin, C., & Lane, K. (2007). Functional behavioral assessment and function-based intervention: An effective, practical approach. Upper Saddle River, NJ: Prentice-Hall.

Umbreit, J., Lane, K. L., & Dejud, C. (2004). Improving classroom behavior by modifying task difficulty: Effects of increasing the difficulty of too-easy tasks. Journal of Positive Behavior Interventions, 6, 13-20.

Walker, H. M., Ramsey, E., & Gresham, F. M. (2004). Antisocial behavior in school: Evidence-based practices (2nd ed.). Belmont, CA: Wadsworth/Thomson Learning.

Yeaton, W., & Sechrest, L. (1981). Critical dimensions in the choice and maintenance of successful treatments: Strength, integrity, and effectiveness. Journal of Consulting and Clinical Psychology, 49, 156-167.

ABOUT THE AUTHORS

KATHLEEN LYNNE LANE (CEC TN Federation), Associate Professor, Department of Special Education; JEMMA ROBERTSON KALBERG, Graduate of the Department of Special Education at Peabody College of Vanderbilt University; and JENNA COURTNEY SHEPCARO (CEC TN Federation), Graduate of the Department of Special Education at Peabody College of Vanderbilt University, Nashville, Tennessee.

We thank Robert Horner, Edward Carr, James Halle, Gail McGee, Samuel Odom, and Mark Wolery for their seminal article, The Use of Single-Subject Research to Identify Evidence-Based Practice in Special Education. Further, we thank Jessica Weisenbach and M. Annette Little for their contribution to the literature search.

Address correspondence to Kathleen Lynne Lane, Department of Special Education, Peabody College, Vanderbilt University, Nashville, TN 37203, (615) 322-8179 (e-mail: kathleen.lane@vanderbilt.edu).

Manuscript received August 2007; accepted March 2008.