
Chapter 22

Operant Variability
Allen Neuringer and Greg Jensen

During almost the entirety of a documentary film featuring Pablo Picasso (Clouzot, 1956), the camera is focused on the rear of a large glass screen that serves as a canvas. As viewers watch, each paint stroke appears and a theme emerges, only to be transformed in surprising ways. In many of the paintings, what started out as the subject is modified, sometimes many times. Erasures; new colors; alterations of size; and, indeed, the very subject of the painting flow into one another. Each painting can readily be identified as a Picasso, but the process seems to be filled with unplanned and unpredictable turns. Uncertainty and surprise characterize other arts as well. A fugue may be instantly recognizable as a composition by J. S. Bach, and yet the transitions within the fugue may astound the listener, even after many hearings. Leonard Bernstein wrote of the importance of establishing expectancies in musical compositions and then surprising the listener. Fiction writers describe their desires to complete novels to find out what the characters will do because the authors can be as uncertain about their creations as are their readers.

Everyday behaviors may similarly be described in terms of generativity and unpredictability. Listen carefully to the meanderings of a conversation. Watch as an individual walks back and forth in his or her office, the tilt of the head, the slight changes in pace or direction. Monitor the seemingly unpredictable transitions in one's daydreams or images. Science is generally thought to be a search for predictable relationships—if A then B—but throughout history, some have argued that to understand the world, including mind and behavior, scientists must appreciate the reality of unpredictability. From Epicurus in the 3rd century BC, who hypothesized random swerves of atoms, to contemporary quantum physicists, some have posited that nature contains within it unpredictable aspects that cannot be explained by if-A-then-B causal relationships, no matter how complex those relationships might be.

In this chapter, we discuss the unpredictability of behavior and focus on one aspect of it. When reinforcers are contingent on variability, or more precisely on a level of operant–response variability (with levels ranging from easily predictable responding to randomlike), the specified level will be generated and maintained. Stated differently, response unpredictability can be reinforced. Stated yet another way, variability is an operant dimension of behavior. Operant dimension implies a bidirectional relationship between behavior and reinforcer. Responses influence (or cause) the reinforcers, and reinforcers influence (or cause) reoccurrence of the responses. The same bidirectional relationship is sometimes true of response dimensions as well. For example, when food pellets are contingent on rats' lever presses, a minimum force must be exerted in a particular direction at a particular location. Force, direction, and location are response dimensions that are involved in the control of reinforcers and come to be controlled by the reinforcers. Variability is related to reinforcement in the same way. We refer to this capacity as the operant nature of variability or by the shorthand operant variability.

DOI: 10.1037/13937-022
APA Handbook of Behavior Analysis: Vol. 1. Methods and Principles, G. J. Madden (Editor-in-Chief)
Copyright © 2013 by the American Psychological Association. All rights reserved.

The very idea of operant variability is surprising to many and, at first blush, seems counterintuitive. Does variability not indicate noise? How can noise be reinforced? In fact, does reinforcement not constrain and organize responses—by definition—and is that definition not confirmed by observation? As we show, the answers are sometimes yes, but not always. Operant variability provides an important exception, one that may be a factor in the emission of voluntary operant responses generally.

The chapter is organized broadly as follows. We discuss

■ Experimental evidence showing that reinforcers and discriminative stimuli control behavioral variability;
■ Relationships between reinforcement of variability and other influences;
■ Explanations: When variability is reinforced, what in fact is being reinforced?
■ How operant variability applies in such areas as creativity, problem solving, and psychopathology; and
■ How reinforced variability helps to explain the voluntary nature of operant behavior generally.

Reinforcement of Variability

As a way to describe the phenomenon, we begin with descriptions of some of the methods that have been successfully used to reinforce variability.

Recency-Based Methods

Imagine that a response earns a reinforcer only if it has not been emitted recently. Page and Neuringer (1985) applied this recency method, based on Schwartz (1980, 1988), to pigeons' response sequences across two illuminated keys, left (L) and right (R). Each trial consisted of eight responses, yielding 256 (or 2⁸) different possible patterns of L and R, for example, LLRLRRRR. In the initial variability-reinforcing (or VAR) phase of the experiment, a pattern was reinforced only if it had not occurred for some number of trials, referred to as the lag. A trial terminated with food only if the sequence of eight L and R responses in that trial differed from those in each of the previous 50 trials (as evaluated across a moving window). This contingency was referred to as lag 50. If the current sequence repeated any one (or more) of the previous 50, then a brief time out (darkening of all lights) resulted, and food was withheld. After food or time out, the keylights were again illuminated, and another trial initiated. Sequences during the first 50 trials of a session were checked against the trials at the end of the previous session, that is, performance was evaluated continuously across sessions. Approximately 25 sessions were provided under these VAR contingencies.

Let us consider some possible outcomes. One would be that the birds stopped responding—responding extinguished—because the lag 50 requirement was too demanding. At the other end of the possibility spectrum, the birds cycled across at least 51 patterns and by so doing were reinforced 100% of the time, with each sequence being different from every one of the previous 50. Although unlikely for pigeons, one way to solve lag contingencies is to count in binary, with L = 0 and R = 1, then LLLLLLLL followed by LLLLLLLR, followed by LLLLLLRL, then LLLLLLRR, and so on. Lag procedures have also been used with human participants, and therefore such sophisticated counting behavior must be considered. A third possible result would be alternations between a few preferred sequences (these would not be reinforced because of their high frequencies but would fill the lag window) and an occasional "do something else" (leading to reinforcement). The fourth possibility would be that L and R responses were generated in randomlike fashion, or stochastically,¹ as if the birds were flipping a coin.

¹Throughout this chapter, we use stochastic and random interchangeably.

This last alternative best describes the results. One piece of evidence was that reinforcement occurred on approximately 70% of the trials (with the other 30% leading to time-outs; Page & Neuringer, 1985). The pigeons' performances were compared with the results of a simulated model in which a computer-based random-number generator produced L and R responses under exactly the same reinforcement contingencies experienced by the pigeons. The simulation showed that the model was reinforced on 80% of trials because, by chance, response sequences were repeated within the window of the lag 50 approximately 20% of the time. Thus, the pigeons' performances were similar to, although not quite as good as, that of a random response model.
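The contingency and the random baseline are easy to simulate. The following is a minimal sketch, not code from the study; the function name, window handling, and trial count are our own, and the roughly 80% figure emerges because a random responder repeats one of the previous 50 sequences about 20% of the time.

```python
import random
from collections import deque

def lag_contingency(sequence, window, lag=50):
    """Reinforce only if `sequence` differs from each of the
    previous `lag` sequences, evaluated over a moving window."""
    reinforced = sequence not in window
    window.append(sequence)
    if len(window) > lag:
        window.popleft()  # drop the oldest trial from the window
    return reinforced

# A stochastic model: eight random L/R responses per trial.
window, wins, trials = deque(), 0, 100_000
for _ in range(trials):
    seq = "".join(random.choice("LR") for _ in range(8))
    wins += lag_contingency(seq, window)
print(f"Random model reinforced on {wins / trials:.0%} of trials")  # ~80%
```

Replacing the random generator with a binary counter (LLLLLLLL, LLLLLLLR, . . .) drives reinforcement to 100%, which is why such counting strategies must be ruled out when interpreting the results.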
A second source of support for approximations to random generation was provided by statistical analyses of the sequences, namely by the U statistic (Page & Neuringer, 1985). That statistic is a measure of uncertainty or entropy and is calculated from the relative frequencies of a set of items using the equation

$$U = \frac{-\sum_{i=1}^{n} RF_i \cdot \log(RF_i)}{\log(n)}. \tag{1}$$

Here, RF_i refers to the relative frequency of element i, out of n total elements. As a convention, every RF_i = 0.0 is considered to contribute a value of 0.0 to the sum, without an attempt to resolve log(RF_i). When all elements occur with equal frequency, U is maximal with a value of 1.0; if any single element has a frequency of 1.0 (and all others are 0.0), then U is minimal with a value of 0.0.
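Equation 1 translates directly into code. A sketch of ours follows, using natural logarithms; the base cancels in the ratio, so any base yields the same U.

```python
import math
from collections import Counter

def u_value(observed, n_alternatives):
    """U statistic of Equation 1: normalized entropy of the relative
    frequencies of the observed items. Items that never occur
    (RF = 0.0) simply contribute nothing to the sum."""
    counts = Counter(observed)
    total = len(observed)
    rfs = (count / total for count in counts.values())
    return -sum(rf * math.log(rf) for rf in rfs) / math.log(n_alternatives)

# The three levels of analysis used by Page and Neuringer (1985):
# single responses (n = 2), dyads (n = 4), and triads (n = 8).
responses = "LRLLRRLRLLRRRLLRLRRL"  # illustrative response string
for size, n_alt in ((1, 2), (2, 4), (3, 8)):
    ngrams = [responses[i:i + size] for i in range(len(responses) - size + 1)]
    print(f"U{size} = {u_value(ngrams, n_alt):.2f}")
```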
In the Page and Neuringer (1985) study, three levels of U value were analyzed at the end of each session. U value was calculated for the relative frequencies of L and R; for relative frequencies of dyads (namely LL, LR, RR, and RL); and for triads (e.g., LLL, LLR, LRL . . .). The birds' U values were compared with those from the random model. As expected, the random model produced U values close to 1.0 at each level of analysis, and the pigeons' U values also approached 1.0, although not quite as closely as the random model. Thus, in this case, the results can best be described as randomlike but discriminably different from a true random source. Rather than responding equiprobably, the birds demonstrated biases, for example, favoring one key over another or favoring repetition over switching. We return later to a more detailed discussion of whether operant responses can be generated stochastically when the reinforcement contingencies are more demanding.

If Page and Neuringer's (1985) experiment had stopped at this point, there would be uncertainty as to why responses varied. The issue is tricky because it involves more than whether the lag procedure resulted in response variability (which it clearly did). Rather, was variability directly reinforced, or could the results be explained differently? Variability could have resulted from extrinsic sources (noise in the environment) or intrinsic sources (within the organism), or it could have been caused by experimental error, an insufficient flow of reinforcers, or any number of other things. To show that variability depended on the "if vary, then reinforce" contingency, a control procedure provided reinforcers after some eight-response trials and time outs after others, just as in the VAR condition, but these reinforcers were now unrelated to the pigeon's sequence variations. Under this control, reinforcers and time outs were yoked to those delivered during the VAR phase. In other words, the yoke condition was identical to the VAR condition (eight responses per trial, and trials were followed by food or time out at exactly the same rates as during VAR), except that the pigeon received food and time outs whether or not the variability contingency had been met. Each pigeon's terminal six sessions under the lag 50 VAR contingencies provided the trial-by-trial schedule for its reinforcements and time outs under the yoke condition.

The yoke procedure produced large and consistent effects. Levels of variability fell rapidly and remained low under the yoke condition. U values that had approached the 1.0 of a random model in the VAR condition dropped to values closer to 0.50, indicating substantially more sequence repetition. Other statistics confirmed the increased repetitiveness and predictability of responding. These effects were replicated with an A-B-A-B design (VAR–yoke–VAR–yoke), yielding a conclusion that direct reinforcement of variability was responsible for the high variability (see Figure 22.1).
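The yoke logic is worth spelling out in outline (a sketch of ours, not the experiment's software): outcomes recorded trial by trial during VAR are simply replayed, so reinforcement rate and distribution are held constant while the variability requirement is removed.

```python
def yoke_schedule(var_outcomes):
    """Replay the VAR phase's trial-by-trial outcomes: deliver food
    or a time out on the same trials as before, regardless of what
    the subject's current sequence looks like."""
    for was_reinforced in var_outcomes:
        yield "food" if was_reinforced else "timeout"
```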
Lag contingencies have been used with other species, including humans, monkeys, rats, budgerigars, and fish (see Neuringer, 2002, for a review). With rats, four-response trials across L and R levers are often used, in which case 16 (or 2⁴) different sequences are possible. Here, too, high sequence variability is observed.

Figure 22.1. Three levels of the U statistic, an index of behavioral variability (U1 based on number of left [L] and right [R] responses; U2 on LL, LR, RR, and RL dyads; and U3 based on triads), during the lag 50 reinforcement-of-variability phases and yoked-VR phases. VR = variable ratio; F = first session in each phase; L = last session in each phase. Adapted from "Variability Is an Operant," by S. Page and A. Neuringer, 1985, Journal of Experimental Psychology: Animal Behavior Processes, 11, p. 445. Copyright 1985 by the American Psychological Association.

In an example of a human procedure, Stokes and Harrison (2002) presented on a computer screen a triangle consisting of one location at the top, two locations in the next row down, three in the third, and so on until the sixth row, which contained six locations. A trial involved moving from the top row to the bottom, thereby requiring five responses, with 32 possible patterns. These five-response sequences were reinforced under lag contingencies, and high levels of variability often resulted. In this procedure, however, as well as others with humans, some (albeit rarely observed) participants use a different strategy of cycling through a subset of sequences, such as a binary counting strategy. Another problem with lag procedures is that they never reinforce repetitions, and a random generator sometimes repeats (e.g., if 16 sequences are possible, then for a random generator, the probability of two identical back-to-back sequences is .0625). An alternative method that bases reinforcement on response frequencies (rather than recency) provides a partial solution, and we describe it next.

Frequency-Based Methods

In these procedures, reinforcement is contingent on low overall relative frequencies. As one example, rats' responses were reinforced on the basis of four-response trials (across L and R levers), with, as we indicated, 16 different possible sequences (LLLL, LLLR, LLRL, LLRR, etc.; Denney & Neuringer, 1998). Frequencies of these sequences were updated throughout each session in 16 separate counters, or bins, and a sequence was reinforced only if its relative frequency—the number of times that it was emitted divided by the total number of sequences—was less than some designated threshold value. Reinforcement of low relative frequency sequences has the advantage of permitting occasional reinforcement of repetitions. This procedure has several technical aspects. For example, after each trial, all bins are multiplied by an exponent, for example, 0.95, which results in recent sequences having more weight than those emitted in the past, a kind of memory decay (see Denney & Neuringer, 1998, for details). The important point is that, as with lag, highly variable responding was generated. The procedure has been used in many experiments with yoke serving as the control (Neuringer, 2002).
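A sketch of the frequency-based bookkeeping, assuming one plausible ordering of the update and the test (the threshold of .05 is illustrative; Denney and Neuringer, 1998, should be consulted for the published details):

```python
def frequency_contingency(sequence, weights, threshold=0.05, decay=0.95):
    """Reinforce `sequence` only if its weighted relative frequency
    falls below `threshold`. Decaying every counter by `decay` after
    each trial weights recent trials more heavily, a kind of memory
    decay."""
    for key in weights:
        weights[key] *= decay  # old emissions fade over trials
    weights[sequence] = weights.get(sequence, 0.0) + 1.0
    relative_frequency = weights[sequence] / sum(weights.values())
    return relative_frequency < threshold
```

Because the test is on relative frequency rather than recency, a repetition can still be reinforced whenever that particular sequence has lately been rare.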
In an interesting variant of the procedure, Donald Blough (1966) reinforced variable interresponse times (IRTs). Blough's goal was a difficult one, namely, to see whether pigeons could learn to behave like an emitter of atomic particles, the most random of physical phenomena (or, put another way, to respond as would a Geiger counter). For a random-in-time responder, the likelihood of a response is independent of whether a previous response had occurred recently. To accomplish this, Blough created a series of IRT bins, such that a random responder would be expected to have an equal number of IRTs in each bin. A moving window of 150 responses was analyzed in real time, with each response allocated into one of 16 bins, depending on its IRT. Blough then only reinforced an IRT falling in the bin with the lowest current relative frequency, that is, he reinforced only for the least frequent² IRT in a given window. The procedure resulted in the pigeons learning to approximate the IRT distributions of a truly random generator, that is, to distribute pecks (with some minor exceptions resulting from double pecks) much as a Geiger counter would respond.

²As such, this contingency is called a least frequent contingency.
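Blough's least frequent contingency can be sketched the same way. Here the bin edges would be chosen so that a random-in-time (Poisson) responder fills all bins equally; the edge formula and rate below follow that logic but are our reconstruction, not Blough's exact values.

```python
import bisect
import math
from collections import deque

def make_least_frequent_contingency(bin_edges, window_size=150):
    """Reinforce a response only if its interresponse time (IRT)
    falls in whichever bin currently holds the fewest of the last
    `window_size` IRTs."""
    window = deque(maxlen=window_size)

    def evaluate(irt):
        bin_index = bisect.bisect_left(bin_edges, irt)
        counts = [0] * (len(bin_edges) + 1)
        for filled in window:
            counts[filled] += 1
        reinforced = counts[bin_index] == min(counts)
        window.append(bin_index)
        return reinforced

    return evaluate

# Edges at exponential quantiles, so a Poisson responder with this
# rate is expected to land in each of the 16 bins equally often.
rate = 1.0  # responses per second; illustrative
edges = [-math.log(1 - k / 16) / rate for k in range(1, 16)]
evaluate = make_least_frequent_contingency(edges)
```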

Statistical Feedback Methods

If the goal is to test whether animals and people can respond truly randomly, then both recency and frequency methods have a potential weakness. As indicated earlier, systematic strategies, such as binary counting, can provide higher frequencies of reinforcement than responding stochastically. Although sometimes present in animals, this type of strategy is most commonly observed in human participants. Note that exploiting weaknesses in the variability-reinforcing contingencies—by responding in a systematic way that maximizes reinforcement—is not a sign of insensitivity to the schedule requirements. If anything, it is precisely the opposite. An additional problem is that all statistical tests of randomness (and therefore reinforcement contingencies based on randomlike responding) have certain blind spots that result in false positives. Therefore, researchers sought an alternative procedure, one that was better at detecting, and therefore not reinforcing, strategic or patterned responding. It seemed reasonable to hypothesize that if a reinforcement contingency was based on a multiplicity of different measures of variability, it might be less likely to reward exploitative strategies and potentially lead more reliably to approximations to random outputs, especially in human participants.

Before we describe the relevant experiment, note that the attempt to reinforce randomness flies in the face of more than 50 years of research in which people were asked to generate random sequences (e.g., "Pretend you are flipping a coin"). The consistent conclusion from this large body of studies was that people do not respond randomly when so requested (Brugger, 1997), and indeed, some researchers concluded that people cannot respond randomly. (The literature on human randomness rarely references nonhuman animal studies.) This conclusion is fundamentally important because randomness implies absence of identifiable causes and independence from determination. Most psychologists assume that all behaviors are strictly determined—by inheritance, experiences, stimuli, responses, and the like—and therefore that random behavior is not possible, certainly not when voluntarily attempted. However, none of the earlier studies tried to reinforce randomlike behavior directly. This approach was accomplished with a procedure that required students to enter tens of thousands of responses at a computer terminal (Neuringer, 1986). A trial consisted of 100 responses across two keys (which we refer to as 1 and 2) with feedback, based on common statistical tests of randomness, presented at the end of each trial. At first, satisfying one statistical test was reinforced, then two tests had to be satisfied, and then three, and so on—with graphical feedback showing each statistic relative to that expected from a true random source—until participants were passing 10 evaluations of random sequences.

The challenge confronting the participants was even greater than just described. Participants were required to generate a distribution of statistical values that would be expected from a random source (Neuringer, 1986). To take a simple example, across many trials of 100 responses each, random generation of 1s and 2s shows a distribution of proportions of 1s and 2s. The most likely outcome would be approximately equal numbers of 1s and 2s, or 50% each, but some trials would occur in which, for example, there were 40% 1s and 60% 2s, or vice versa, and fewer trials of, say, 30% 1s (or 2s). The participants' task, therefore, was not simply to match the average of a random distribution but more precisely to approximate the randomly generated distributions. This task required many weeks of training, but all participants learned to approximate the random model according to 10 simultaneously applied statistics, and some participants (but not all) were able to pass additional tests as well.
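The chapter does not list the 10 statistics, so the following sketch substitutes two illustrative tests (proportion of 1s, and rate of alternation, each compared with the 95% band expected of a fair coin) simply to show the shape of such end-of-trial feedback:

```python
import random

def proportion_ok(trial, z=1.96):
    """Is the proportion of 1s within the binomial 95% band
    around 0.5 for a trial of this length?"""
    p = trial.count(1) / len(trial)
    return abs(p - 0.5) <= z * (0.25 / len(trial)) ** 0.5

def alternation_ok(trial, z=1.96):
    """Is the switch rate near the 0.5 expected of a random source?
    (People asked to be random typically switch too often.)"""
    pairs = len(trial) - 1
    switches = sum(a != b for a, b in zip(trial, trial[1:]))
    return abs(switches / pairs - 0.5) <= z * (0.25 / pairs) ** 0.5

trial = [random.choice((1, 2)) for _ in range(100)]  # one 100-response trial
tests = (proportion_ok, alternation_ok)
print(f"Tests passed: {sum(test(trial) for test in tests)} of {len(tests)}")
```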

How should this research be interpreted? Because so much training was necessary, it would seem that the ability to respond unpredictably is unnatural, but that would be a misinterpretation. If you, the reader, were to call out 100 instances of heads and tails and try to do so unpredictably, it is unlikely that an observer of your behaviors, even with access to sophisticated statistical analyses, could predict your responses with a high degree of accuracy. You can, without training, respond quite unpredictably. The requirement to pass 10 statistical tests, however, demands equivalence, over the long run, of instances, dyads, triads, and the like and absence of all biases. To use a rat's operant response as an analogy, it is quite easy to train a rat to press a lever for food pellets. Indeed, that can often be accomplished in an hour-long session. To train a rat to respond precisely at some given force, or with precise interresponse intervals, may take weeks or months of training. In similar fashion, approximating true randomness is difficult to attain, but responding variably, and to large extent unpredictably, is readily achieved. In rats, for example, highly variable operant responding is obtained within a few sessions (McElroy & Neuringer, 1990).

Novelty-Based Methods

To evaluate variability, each of the methods discussed to this point requires that a set of possibilities be explicitly defined—responses, sequences, paths, or times. Mathematical definitions of randomness, statistical analyses, and reinforcement contingencies depend on such specification. However, in many outside-of-lab situations, the set may not be known, and an alternative method allocates reinforcers for novel, or not previously emitted (e.g., within a session or ever) responses. Reinforcement of novel responses was first used by Pryor, Haag, and O'Reilly (1969) in research with porpoises. At the beginning of each session, Pryor et al. waited until they observed some behavior not previously emitted by the porpoise and then selected the new behavior for consistent reinforcement during the session. This procedure resulted in the porpoise

    emitting an unprecedented range of behaviors, including aerial flips, gliding with the tail out of the water, and "skidding" on the tank floor, some of which were as complex as responses normally produced by shaping techniques, and many of which were quite unlike anything seen in . . . any other porpoise. (p. 653)

One problem with long-term use of this procedure, however, is that, over time, it becomes increasingly difficult for the subject to produce never-before-seen behaviors and increasingly difficult for the observer to discriminate among the various behaviors being emitted. At least over the short run, however, reinforcing novel responses led to an exceedingly high level of unpredictable behaviors.

The Pryor et al. (1969) study was followed by an analogous one with preschool children (Goetz & Baer, 1973). The children were rewarded for block constructions that differed from any that had previously been observed during the session. As training proceeded, the children built increasingly varied forms, including ones never before made by the child. Similar results were obtained with the drawing of color pictures as the target behavior (Holman, Goetz, & Baer, 1977).

The evidence from many methods has therefore shown control over response variability by directly contingent reinforcers (see also Hachiga & Sakagami, 2010; Machado, 1989, 1992, 1997). Variability is highest when reinforcers follow high variability. In the next section, we show that reinforcers exert even more precise control than that: Levels of variability can be specified, levels that span the range from response repetitions (or stereotypy) to response unpredictability. As such, variability parallels other operant dimensions in which reinforcers influence exactly how fast to respond or when, with what force, or at which location.

Levels of Variability

Other experiments in the Page and Neuringer (1985) article described earlier applied different lag values in different phases, from lag 1 (the current sequence of eight responses had to differ from the single previous sequence) to lag 50 (the current sequence had to differ from each of the previous 50 sequences). As the lag increased, requiring that sequences be withheld for an increasing number of trials, responses generally became increasingly unpredictable (as assessed by U values, number of different sequences per session, and other statistics; see also Machado, 1989). Frequency-based methods show similar control over levels.

For example, Grunow and Neuringer (2002) used a different threshold reinforcement criterion with each of four groups of rats: one that required rats to distribute three-response sequences (across three different operanda) in a way that paralleled a random generator (high variability), another that required medium-high variability, another that required medium-low variability, and the last that permitted frequent repetitions. Levels of variability were again controlled by these specific requirements, as shown by the leftmost points in Figure 22.2 (the other points in the figure are discussed later). Several additional studies have demonstrated reinforcement control over precise levels of variability in pigeons (Neuringer, 1992) and people (G. Jensen, Miller, & Neuringer, 2006).

Figure 22.2. U value as a function of reinforcement frequencies. Each line represents a different group: .037 = very high variability (var) required for reinforcement; .37 = very low variability required; .055 and .074 = intermediate levels required. CRF = continuous reinforcement, or reinforcement every time variability contingencies were met; VI 1 = variable-interval reinforcement for meeting variability contingencies no more than once per minute, on average; VI 5 = variable-interval reinforcement no more than once every 5 minutes. From "Learning to Vary and Varying to Learn," by A. Grunow and A. Neuringer, 2002, Psychonomic Bulletin and Review, 9, p. 252. Copyright 2002 by the Psychonomic Society, Inc. Adapted with permission.

Precisely controlled levels of behavioral (un)predictability can be observed in many natural situations. Variable behaviors are used to attract attention, as when male songbirds increase the variability of their songs in the presence of a receptive female (Catchpole & Slater, 1995). During play and games, animals and people modulate levels of variability as a function of the reactions of their playmates. When entertaining a child, the actions of an adult sometimes include surprises, such as tickling, as well as repetitions, such as repeatedly bouncing a child on one's lap, and the child's reactions influence the action's (un)predictability. Similarly, in conversations, the speaker is (often) sensitive to the reaction of the listener with variations in topic as well as in prosody, loudness, and speed. Unpredictability is particularly important in competitive situations. Consider the example of table tennis: When a skilled player (S) plays with a beginner (B), S will often return the ball in a way that B can easily predict, but as B becomes increasingly capable, S will vary ball placement and speed until a high level of unpredictability is (sometimes) manifest. Precise control of levels of unpredictability plays a substantial role in game theory, under the rubric of mixed strategies (see Glimcher, 2003; Smith, 1982).

These examples are only a few of the commonplace variations in levels of response (un)predictability that characterize many real-world operant behaviors, variations that are controlled by consequences. We discuss additional real-world applications later.

Orthogonal Dimensions

As indicated in the introduction, reinforcement often depends on a combination of many aspects of a response. For example, a child may receive a reinforcer for saying "thank you" but only when the child (a) speaks slowly and (b) makes eye contact. Because responses can vary across many dimensions independently from one another, one can readily imagine circumstances in which it might be functional to vary some dimensions of behavior while keeping others highly predictable.

A demonstration of the independent reinforcement of variability and predictability along independent dimensions was provided by Ross and Neuringer (2002). They instructed college students to earn points in a video game involving drawing rectangles on a computer screen. Three dimensions of the rectangles were evaluated: area (the number of pixels enclosed by the rectangle), location (the position of its center point), and shape (its height-to-width ratio). To be reinforced, the rectangles had to vary along two of these dimensions while repeating along the third. The participants were told nothing about these criteria, and the only instructions were to gain points by drawing rectangles. Participants were randomly assigned to one of three groups, with rewards delivered in one group when the areas of the drawn rectangles were approximately the same, trial after trial, but locations and shapes varied. The other two groups had analogous contingencies, but for one, locations had to repeat, and for the other, shapes had to repeat. All participants learned to meet their respective three-part contingencies, varying and repeating as required (Figure 22.3). Thus, binary feedback—reinforcement or not—influenced variability and repetitions along three orthogonal dimensions and did so independently, thereby highlighting the precise, multifaceted way in which reinforcers control variability.

Figure 22.3. U values for each of three dimensions of rectangles drawn by participants in three separate groups. One group was required to repeat the areas of their rectangles while varying shapes and locations (left set of bars), a second group was required to repeat shape while varying areas and locations (middle set of bars), and a third group was required to repeat location while varying areas and shapes (right set of bars). Error bars indicate standard errors. From "Reinforcement of Variations and Repetitions Along Three Independent Response Dimensions," by C. Ross and A. Neuringer, 2002, Behavioural Processes, 57, p. 206. Copyright 2002 by Elsevier B.V. Adapted with permission.
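One way to picture the three-part contingency is as a check over the trial history; the tolerance and lag below are hypothetical stand-ins for the published criteria, and location is reduced to a single number for brevity:

```python
def rectangle_contingency(rect, history, repeat="area",
                          tolerance=0.1, vary_lag=10):
    """Sketch of a Ross and Neuringer (2002)-style contingency.
    `rect` maps each dimension (area, location, shape) to a number.
    The `repeat` dimension must stay within `tolerance` (relative)
    of the previous trial; each other dimension must differ by more
    than `tolerance` from every one of the last `vary_lag` trials."""
    reinforced = True
    if history:
        previous = history[-1]
        if abs(rect[repeat] - previous[repeat]) > tolerance * previous[repeat]:
            reinforced = False  # failed to repeat the fixed dimension
        for dim in rect:
            if dim == repeat:
                continue
            for old in history[-vary_lag:]:
                if abs(rect[dim] - old[dim]) <= tolerance * old[dim]:
                    reinforced = False  # failed to vary this dimension
    history.append(rect)
    return reinforced
```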

As we show in the next section, reinforcers exert other simultaneous influences: They select the set or class of responses from which instances emerge and, simultaneously, the required level of variation.

Response Sets and Variations

Whenever variability is reinforced, a set of appropriate responses is also strengthened. Reinforcers select the set from which variations emerge. Mook and Neuringer (1994) provided experimental evidence for this point. In the first phase, rats' variable four-response sequences across L and R levers (lag schedule) were reinforced. In the second phase, only sequences that began with two right responses, RR, were reinforced. Thus, now only RRLL, RRLR, RRRL, and RRRR patterns were effective. In the first phase, all 16 possible sequences were emitted, whereas in the second phase, most sequences began with two right responses, RR. Thus, the reinforcement contingency generated behaviors that satisfied the appropriate set definition while simultaneously producing a required level of variability within that set. In another experimental example (Neuringer, Kornell, & Olufs, 2001), rats responded in chambers containing five operanda: left lever, right lever, left key, center key, and right key. In one phase, reinforcers were contingent on variations across only three of the operanda (left and right levers and center key), and the rats learned to respond variably across only those three. A binary event—reinforce or not—can function simultaneously to define a response class and levels of (un)predictability along multiple dimensions of the within-class instances. This result shows an extraordinary diversity of control by simple reinforcement operations.

Discriminative Stimuli

Operant responses are generally influenced by discriminative stimuli, that is, cues that indicate reinforcer availability. If pigeon pecks are intermittently reinforced when the keylight is red but not when it is green, the birds learn to peck almost exclusively when the keylight is red. Discriminative stimuli control levels of variability as well. For example, Page and Neuringer (1985, Experiment 6) reinforced repetitions of a single sequence of key pecks, LRRLL, in the presence of blue keylights, whereas variable sequences were reinforced in the presence of red keylights (lag schedule). Blue and red alternated after every 10 reinforcers under what is referred to as a multiple schedule (two different reinforcement contingencies presented successively, each correlated with a distinct stimulus). The birds learned to repeat in the presence of blue and to vary in the presence of red, and when the stimulus relationships were reversed, the birds varied in the presence of blue while repeating in the presence of red. In another experiment, rats learned to emit variable four-response sequences across L and R levers in the presence of one set of lights and tones and repeated a single pattern, LLRR, in the presence of different stimuli (Cohen, Neuringer, & Rhodes, 1990). In an even more stringent test by Denney and Neuringer (1998), rats' variable sequences were reinforced in one stimulus, whereas in a yoke stimulus, reinforcers were delivered at exactly the same rate and distribution but independent of variability. The cues came to exert strong differential control, and when variability was required, the animals varied; when variability was not required but permitted in yoke, response sequences became more repetitive and predictable. These results indicate that an individual may behave in a habitual and predictable manner in one context, whereas in a different context, perhaps occurring only a few moments later, the same individual will respond unpredictably or in novel ways. The results further indicate (along with other VAR–yoke comparisons described earlier) that to engender highly variable behaviors, it may be necessary to reinforce variability explicitly rather than, as in laissez-faire environments, simply permit individuals the freedom to vary. To the extent that individual freedom depends on the possibility of variation, reinforcement plays an important role (a topic to which we return in the final sections of this chapter).

Endogenous Stimulus Control

The discriminative stimuli described in the previous section were external to the organism and publicly observable. Another form of discriminative control depends on the interactions of an organism with a reinforcement schedule. An example of such endogenous stimulus control is seen in the pauses that follow reinforcers under fixed-interval schedules.

The reinforcers serve as indicators that for some period of time, reinforcement is not possible. Hopson, Burt, and Neuringer (2002) showed that response–reinforcer relationships exert discriminative control over levels of variability as well (see also Neuringer, 2002). Rats' responses were reinforced under a schedule in which two periods alternated, VAR and repetition (REP), but these periods were not cued by external stimuli (technically, a mixed schedule). In the VAR period, four-response sequences of L and R lever presses were reinforced if they met a threshold variability contingency; in the REP period, only repetitions of LLLL were reinforced. (Probabilities of reinforcement were equalized in the two periods by intermittent reinforcement of LLLL in REP.) After the schedule transitioned into the VAR component, responding began to vary within a few trials, and variations continued until the schedule transitioned into REP, with responding soon reverting to LLLL. These results indicate that the variability produced when reinforcement is withheld for short periods, as when a new response is being shaped, may partly be discriminatively controlled despite absence of external cues; that is, animals and people may learn when it is functional to vary, and some of the cues may come from response–outcome relationships.

Noncontingent Effects

Until this point, we have focused on contingencies that directly relate reinforcers to variability. All operant responses are also influenced by events that are not directly contingent on the responses, sometimes referred to as eliciting or inducing influences, respondents, or establishing operations. For example, levels of deprivation, injections of drugs, and ambient room temperature can all influence learning and maintenance of operant responses. Even noncontingent aspects of the reinforcement operation itself may have important effects, for example, attributes such as the quality and quantity of food (referring here to when these do not change as a function of behavior). Thus, to understand operant responding, including operant variability, these other influences must be considered. We turn to a discussion of effects of noncontingent events on operant variability. As we describe at the end of this section, noncontingent influences often interact in important ways with variability-contingent reinforcers.

Random Events

Many behavioral trajectories are initiated by the accidental confluence of the organism with one or more environmental happenings. Hurricanes, earthquakes, and wars change behaviors in ways that cannot readily be anticipated. Winning a lottery is a happier example. Another might be happening to sit next to a particular individual on a cross-country flight, which leads to a long-term romantic relationship (see Bandura, 1982; Taleb, 2007). These events, although randomly related to the individual's behavior, have important long-term influences.

Random events have been purposively used throughout history to guide behaviors, for example, throws of dice, randomly selected sticks, cards, bones, or organs. Today, a referee flips a coin at the beginning of a football game to decide which team can choose whether to kick the ball; a computer's random-number generator assists scientists with avoiding biases in assigning subjects to experimental groups; and aleatoric events are used in modern art, music, and literature. The Dice Man by Rhinehart (1998) provides a fictional example of intentional use of random artifacts. The protagonist, bored with life, writes a number of possible actions on slips of paper and then periodically selects one blindly and behaves accordingly. These examples show that random events that are independent of an individual's actions may be used to avoid biases, engender unlikely responses, and break out of behavioral ruts.

Evolved Responses

Modes of unpredictable behavior have evolved that permit organisms to escape from or avoid predators or aggressors. These behaviors have been referred to as protean behaviors that are "sufficiently unsystematic in appearance to prevent a reactor predicting in detail the position or actions of the actor" (Driver & Humphries, 1988, p. 36). Examples include the random zigzags of butterflies, stickleback fish, rabbits, and antelopes when being attacked. One consequence of evolved protean behavior is that it interferes with a predator species' evolving a response to a specific escape or avoidance pattern.

In brief, protean behaviors demonstrate evolved randomlike responses to eliciting stimuli.

Schedules of Reinforcement and Expectancy

Both in the laboratory and in the real world, it is common for responses to be intermittently (or occasionally) reinforced. Much operant conditioning research is devoted to documenting the effects of such schedules. To take one example, under a fixed-ratio schedule of reinforcement, a fixed number of responses (say, 30) is required to gain access to a pellet of food. After receipt of each pellet, it is impossible to obtain another immediately, because 30 additional responses are required. As was the case for the fixed-interval schedules mentioned earlier, pauses are generally observed after reinforcement, or lower rates of responding, as compared with later in the ratio, when access to reinforcement is possible.

In addition to these effects on response rate, response variability is also found to change under similar reinforcement schedules. In the cases discussed here, variability plays no role in the contingency, that is, reinforcers do not depend on response variations. However, responding tends to become increasingly repetitive and predictable as an anticipated reinforcer is approached in time or number. This tendency was shown for variability across two levers when a fixed sequence of responses was the operant (Cherot, Jones, & Neuringer, 1996), for variability of lever-press durations also under ratio schedules (Gharib, Gade, & Roberts, 2004), and for variability of movements across a chamber space when access to a potential sexual reinforcer is approached (Atkins, Domjan, & Gutierrez, 1994). In each of these cases, variability is relatively high when reinforcers are distant with respect to effort, time, or space, and responding becomes more predictable as reinforcers are neared (see also Craig, 1918). These changes in response predictability are said to be induced by the schedule of reinforcement.

Another variable shown to induce differences in response variability is reinforcement frequency. In general, response variability is high when reinforcers are infrequent and low under high-density reinforcement (Lee, Sturmey, & Fields, 2007). One interpretation of these effects is that low expectation (or anticipation) of reinforcers induces variability (Gharib et al., 2004). Whatever the explanation, it is important to be able to identify whether variability is selected by reinforcers or pushed by states of the body (endogenous inducers) or environmental events, including noncontingent effects of reinforcers. Discriminating between selection and induction will facilitate the modification of variability when that is desirable.

Experience

Thorndike (1911) and Guthrie and Horton (1946) described the responses of cats that had been confined in a puzzle box and who received food contingent on escape. Response topographies were highly variable at first but, over trials and rewards, became increasingly predictable and stereotyped. Antonitis (1951) studied nose pokes by rats along a long horizontal slit. When pokes produced access to food, location variability decreased across trials. Notterman and Mintz (1965) measured the force exerted by rats on a response lever and found that across training, force decreased, approaching the minimum necessary to operate the lever, with force variability decreasing as well. Brener and Mitchell (1989) extended these results to the total energy expended by a rat in an operant conditioning chamber. A last example comes from Vogel and Annau (1973), who reinforced pecking three times on a left key and three times on a right key, in any order. Across sessions, a marked increase occurred in the predictability (stereotypy) of the pigeons' patterns of response. A general consensus has therefore emerged: Variability of operant behavior decreases with experience. This conclusion, however, may apply mainly to situations in which every response or sequence leads to a reinforcing consequence and to situations in which high variability is not differentially reinforced.

Extinction

After long-term experience in which responses produce reinforcers, suddenly withholding reinforcers—referred to as extinction of responding—increases variability.

In the experiment by Antonitis (1951) noted earlier, after the rats were accustomed to producing food reinforcers by poking their noses anywhere along a horizontal opening, food was withheld, which caused an increase in location variability. Extinction-induced variability has been seen along many other response dimensions: location (Eckerman & Lanson, 1969), force (Notterman & Mintz, 1965), topography (Stokes, 1995), and number (Mechner, 1958). One contrary result is often cited, namely a study by Herrnstein (1961) in which variability of the location of pigeon pecks along a continuous strip was reported to decrease during a period of extinction. However, the extinction in that study followed a phase in which every response was reinforced (continuous reinforcement) in an A-B design (first a reinforcement phase, then extinction, without return to the first phase). Experience may therefore have confounded the results. In general, extinction causes variability to increase.

The variations induced by extinction generally emerge from the class of responses established during original learning. For example, if lever pressing produced food pellets, a rat may vary the ways in which it presses when food is withheld, but much of the behavior will be directed toward the lever (e.g., Stokes, 1995). Neuringer et al. (2001) quantified the bounded nature of extinction-induced variability that was observed after rats had been rewarded for repeating a single sequence across two levers and a key: left lever, key, right lever (LKR), in that order. The top panel of Figure 22.4 shows the distribution of the relative frequencies of each of the possible sequences (proportions of occurrences) during the conditioning, or reinforcement, phases (filled circles) and during extinction (open circles). The LKR sequence was, of course, most frequent during the reinforcement phase, with other, somewhat similar sequences falling off in terms of their frequencies. The LKR sequence was also most frequent throughout the extinction phase—during which time response rates fell to low levels—with the two curves being quite similar. (Note that these curves show relative frequencies. Absolute rates of response were much lower during extinction than during the reinforcement phase.) Also shown at the bottom of the figure are the ratios of response proportions during the reinforcement and extinction phases (i.e., the ratio of the two curves in the upper graph). The take-home message is that the basic form of the behavior was maintained during extinction, and variability increased because of the generation of unusual or highly unlikely sequences (for related findings, see Bouton, 1994).

Figure 22.4. The top graph shows the proportion (or probability) of occurrences of the three-response patterns shown along the x-axis during a period when a single sequence (left lever, key, right lever) was being reinforced (filled circles) and during a period of extinction, when reinforcers were withheld completely (open circles). The bottom graph shows the ratio of responding during extinction (EXT) to responding during reinforcement (REIN; i.e., the ratio of the two curves in the upper graph). Together, the graphs show that patterns of responding during extinction were similar to those during reinforcement, but high-frequency sequences decreased and low-frequency sequences increased during the extinction phase. Adapted from "Stability and Variability in Extinction," by A. Neuringer, N. Kornell, and M. Olufs, 2001, Journal of Experimental Psychology: Animal Behavior Processes, 27, p. 89. Copyright 2001 by the American Psychological Association.

Extinction was therefore characterized as resulting in a "combination of generally doing what worked before but occasionally doing something very different. . . . [This] may maximize the possibility of reinforcement from a previously bountiful source while providing necessary variations for new learning" (Neuringer et al., 2001, p. 79). Knowledge of these effects can be applied to one's own behavior as well as to others. When in a rut, or unproductive or dissatisfied, avoiding those reinforcers that had been produced by habitual behaviors may help.

Interactions

Noncontingent inducers often interact with variability-contingent reinforcers to control levels of response variability. Additional phases in the Grunow and Neuringer (2002) experiment described in the Levels of Variability section provide one example. In the first phase of that experiment, recall that high, medium-high, medium-low, and low levels of response-sequence variability were reinforced across groups of rats, resulting in different levels of response variability across the four groups. Two additional phases followed in which, although the four different variability criteria were unchanged, overall frequencies of reinforcement were systematically lowered by providing reinforcement only intermittently. In particular, a variable-interval (VI) schedule of reinforcement was superimposed on the variability contingency: first a VI 1 minute (such that food pellets were limited to an average of once per minute, with unpredictable gaps of time between food deliveries) and then VI 5 minute (limiting food pellets to no more than once, on average, every 5 minutes). Under the VI schedules, after an interval elapsed, the first trial to meet the variability contingency ended with a reinforcer. All other trials ended with a brief time out (whether or not the variability requirement had been satisfied).

As reinforcement frequencies were lowered, response rates fell in all four groups and did so equally, that is, all groups responded much more slowly when varying sequences were reinforced on average once every 5 minutes than when they were reinforced each time that they met the contingencies. However, different results were obtained for variability, indicated by the U values in Figure 22.2. The individual curves represent the four variability thresholds, and the x-axis represents frequencies of reinforcement. The four thresholds exerted primary control, that is, the groups differed in variability throughout the experiment. Effects of reinforcement frequency were more subtle and depended on the threshold requirements, an interaction effect. When the contingencies were lenient and low levels of variability sufficed for reinforcement, variability increased as reinforcement rates fell (from continuous reinforcement to VI 1 to VI 5). When the contingencies were demanding and high levels of variability were reinforced, the opposite occurred, that is, variability decreased with decreasing reinforcements. The intermediate groups showed intermediate effects. A similar interaction was obtained when delays were introduced between the end of a varying sequence and reinforcement (Wagner & Neuringer, 2006). Thus, when reinforcers are contingent on variability, the contingency exerts a strong—and often primary—effect, but that effect is modified by noncontingent influences, including reinforcement rates and delays. Levels of response variability depend on both contingent and noncontingent influences.

Interactions between variability-contingent and variability-noncontingent reinforcement may help to explain effects seen outside of the lab. Repetitive behaviors are required for many workers (e.g., factory workers, mail carriers, fare collectors), but for others, variable (and unpredictable) behaviors are the norm (e.g., inventors, fashion designers, artists). Lowering pay or withholding positive feedback may affect behaviors differently in these two cases. Thus, to predict effects on behavioral variability, one must know both contingent and noncontingent relationships.

Cherot et al. (1996) described a different interaction that may also help to illuminate real-world effects. In that experiment, repeated response sequences across two levers were reinforced in one group of rats (REP) and sequence variability was reinforced in another group (VAR). Not every sequence that met the VAR or REP contingency gained reinforcement, however; rather, a superordinate fixed-ratio 4 also had to be satisfied.

That is, the REP group had to successfully repeat a sequence four times to get a single reinforcer, and the VAR group had to successfully vary a sequence the same number of times. For example, in the REP group, one animal may have emitted LLLR in the first trial after a reinforcer, but that correct sequence caused only a signal indicating correct. No food was given. If the next trial also contained LLLR, the correct signal was again provided. If the following trial was LLLL, then a brief time out was given. This process continued until the fourth correct LLLR sequence produced the signal plus a food pellet. Exactly the same procedure was in place for the VAR animals, except they were correct only when they met a lag variability contingency.

As shown in Figure 22.5 (bottom), the main result was that the VAR animals responded much more variably overall than did the REP animals, again indicating primary control by the variability and repetition contingencies. However, as reinforcement was approached (i.e., as the last of the four successful sequences was neared), levels of variability decreased for both VAR and REP groups. Recall the expectancy-of-reinforcement effects described in the section Schedules of Reinforcement and Expectancy earlier in this chapter. In this case as well, variability decreased as reinforcers were approached, thereby facilitating correct responding in the REP group but interfering with it in the VAR group (Figure 22.5, top panel). Let us pause for a moment to consider this surprising finding. Despite the fact that variability was being reinforced in the VAR group, as the reinforcer was neared, the likelihood of varying decreased. It is important to note again that reinforcement of variability generated much higher levels of variation overall than did reinforcement of repetitions, a variability-contingent effect, but superimposed was an expectancy-inducing decrease in variability. Similar interactions may help to explain effects of reinforcers on other types of behavior, including creative behaviors, a topic that we discuss in the Applications section.

Figure 22.5. The top graph shows percentages of sequences that met variability (VAR) or repetition (REP) contingencies as a function of location within a fixed-ratio 4 (FR 4) schedule. The lines connect means for groups of rats, and the error bars indicate standard deviations. The lower graph shows U values, an index of sequence variability, for the two groups across the FR schedule. Adapted from "Reinforced Variability Decreases With Approach to Reinforcers," by C. Cherot, A. Jones, and A. Neuringer, 1996, Journal of Experimental Psychology: Animal Behavior Processes, 22, p. 500. Copyright 1996 by the American Psychological Association.

Memorial and Stochastic Explanations

We have described types of events that result in variable behaviors. Now, we examine two commonly discussed explanations of operant variability, namely memorial and stochastic processes. According to the memorial explanation, each response can be related to or predicted from prior stimuli or responses.

According to the stochastic-generator hypothesis, individual responses are unpredictable because of the nature of a random process. That is, individual responses do not have identifiable individual causes, a hypothesis that many consider problematic. We consider each of these explanations and argue that the evidence for both of these hypotheses is good and therefore that behaving (more or less) unpredictably derives from multiple sources.

Memory-Based Variability

Memory is a shorthand way to refer to the influence of past events that are separated in time from a current response. The term is not intended to connote conscious awareness (although that might be involved) but rather potentially identifiable influences (or causes). To the extent that memorial processes are responsible for variability generation, prediction of individual responses is possible, even when the overall output is variable; thus, each member of a variable sequence could be said to be determined by prior events.

At the outset of this chapter, we indicated that under lag 50 schedules, in which the current response sequence must differ from each of the previous 50 sequences, responding was highly variable and, indeed, approached that expected of a stochastic generator. However, behaviors are often quite different under lag 1 or 2 schedules. In these cases, the current sequence must differ from only the previous one or two, and memory-based response strategies frequently emerge: Animals and people sometimes cycle repeatedly through two or three sequences, apparently basing a current response sequence on the just-emitted sequences. The cycling strategy produces reinforcement for every sequence, which is a better rate of return than responding in a stochastic-like manner.³ The advantage is, however, only conferred when the memory demands are within the subject's capacity.

³For example, stochastic responding on two alternatives under a lag 1 contingency earns 50% reinforcement, whereas alternating earns 100%.

In a demonstration of the latter point, Machado (1993) studied pigeons pecking L and R keys under a frequency-dependent variability contingency. Under this schedule, if the sequence is composed of just one response, then pecking the key that had been pecked least frequently in the past will be reinforced. By alternating, LRLRLR . . . , every response is reinforced, and birds developed just such an alternation strategy. When the sequence consisted of two responses, the birds again developed memory-based sequences, for example, repeating RRLLRRLL. However, when the sequence was increased to three responses, such that reinforcement was given for responses in the least frequent three-response bin, the birds apparently could not develop the optimal fixed pattern of RRRLRLLL . . . but instead reverted to randomlike behavior (Machado, 1993, p. 103). Thus, a memory-based strategy was used when that was possible, but when the memory demands became too high, stochastic responding emerged. A similar pattern was seen with songs generated by a songbird, a budgerigar, under lag contingencies of reinforcement (Manabe, Staddon, & Cleaveland, 1997). Under lag 1, the birds tended to alternate between two songs; under lag 2, they cycled among three songs. When the lag was increased to 3, however, song diversity and variability increased appreciably. Thus, under recency- and frequency-based methods of variability reinforcement, variable responses are generated via memorial processes when possible, but reversion to stochastic-like emission is seen when memory requirements exceed the organism's capacity.

Chaotic Responding

Memory-based strategies can be used in other ways as well. For example, chaotic processes generate outputs that are so noisy that it is exceedingly difficult to distinguish them from stochastically generated ones. "Chaos [is] a technical term . . . refer[ring] to the irregular, unpredictable behavior of deterministic, nonlinear systems" (R. V. Jensen, 1987, p. 168). Chaotic behavior is both highly variable and precisely controlled by prior events (Hoyert, 1992; Mosekilde, Larsen, & Sterman, 1991; Townsend, 1992). A study by Neuringer and Voss (1993) asked whether people could learn to generate chaotic-like responses. They used one example of a chaotic function, the logistic difference function:

$$R_n = t \cdot R_{n-1} \cdot (1 - R_{n-1}). \tag{2}$$

527
Neuringer and Jensen

Here, Rn refers to the nth iteration in a series, each R is a value between 0.0 and 1.0, and t is a constant between 1.0 and 4.0. The current value of the logistic difference function (Rn) is based on the previously generated value (Rn−1). The process begins with an arbitrary seed value, R0, between 0 and 1, which is used to calculate R1. Apart from the initial seed value, the function is completely self-generated, with each value determined by the just-prior value, together with the constant parameters.
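A minimal Python sketch of this iteration (with t = 3.98, the value discussed in the next paragraph):

    def logistic_sequence(r0, t=3.98, n=1000):
        # Iterate Equation 2: each value is computed only from the
        # just-prior value and the constant t.
        values = [r0]
        for _ in range(n):
            r = values[-1]
            values.append(t * r * (1.0 - r))
        return values

    series = logistic_sequence(r0=0.3)
    print(series[:5])  # fully deterministic, yet randomlike in appearance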
Chaotic outputs have two identifying characteris-
tics. First, given a constant value for t that approaches
4, for example, 3.98, the generated sequence approxi-
mates a random one, that is, it passes many tests for
randomness. Outputs are noisy and apparently
unpredictable. However, second, if the current value
of Rn is plotted as a function of the just-prior value, Rn−1, a predictive structure can be identified. In the
particular case of the logistic difference function, the
form of this autocorrelated relationship is a parabola
(different chaotic functions show different types of
internal structures). Thus—and this is the identifying
attribute of chaotic processes—a deterministic math-
ematical function can generate randomlike outputs
with prediction of each value of the function possible
given precise knowledge of parameters and prior val-
ues. The outputs are extremely noisy and, at the same
time, identifiably determined.
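This identifiable determinism can be demonstrated directly: a quadratic fit of Rn against Rn−1 recovers the generating parabola for logistic-map output but finds no comparable structure in a truly random stream. A sketch of that check (numpy is assumed to be available):

    import numpy as np

    t, r = 3.98, 0.3
    series = [r]
    for _ in range(2000):
        r = t * r * (1.0 - r)
        series.append(r)
    series = np.array(series)

    # Chaotic output: Rn = t*Rn-1 - t*Rn-1**2, so the fitted quadratic
    # coefficients should be close to (-3.98, 3.98, 0.0).
    print(np.polyfit(series[:-1], series[1:], 2))

    # A genuinely stochastic stream shows no comparable structure.
    rng = np.random.default_rng(0)
    noise = rng.random(series.size)
    print(np.polyfit(noise[:-1], noise[1:], 2))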
In the Neuringer and Voss (1993) study, college
students were shown, after each response, the differ-
ence between their responses and that of the iterated
logistic difference model. With training, the students
became increasingly adept at responding in chaotic-
like fashion—the students’ responses matched
closely the iterations of the logistic function—and
their autocorrelations increasingly approximated a
parabola (Figure 22.6). Because each iteration in the logistic difference sequence is based on the prior output, Neuringer and Voss hypothesized that the human subjects also remembered prior responses. Put simply, the subjects may have learned (or memorized) a long series of “if the previous response was value A, then the current response must be value B” pairs, a suggestion made by Metzger (1994) and by Ward and West (1994).

Figure 22.6. Responses in trial n as a function of responses in trial n − 1 during the first 120 responses of the experiment (left column) and final 120 responses (right column). Each row of graphs represents data from a single participant. The drawn lines show the best-fitting parabolas. From “Approximating Chaotic Behavior,” by A. Neuringer and C. Voss, 1993, Psychological Science, 4, p. 115. Copyright 1993 by the Association for Psychological Science. Adapted with permission.
Ward and West (1994).
To test this memory hypothesis, Neuringer and
Voss (2002) interposed pauses (IRTs) between each response, that is, they slowed responding (see also Neuringer, 2002). As IRT durations increased, the difference between the subjects’ sequences and the model’s chaotic output increased, and the underlying parabolic structure was disrupted, providing evidence that the highly variable responding was memory based.

Stochastic Generation

Stochastic generation has been hypothesized at numerous points in the chapter. Here we discuss in more detail what the stochastic hypothesis involves and possible ways to test it. The issue is complex, difficult, and important. If variable operant responses are generated stochastically, then it may not be possible to predict individual responses at greater than chance levels. Stochastic generation may also be relevant to operant responses generally and to explanations of their voluntary nature, as we discuss later. A researcher confronts many problems, however, in attempting to decide whether a particular response stream is random or not and confronts additional difficulties when trying to determine whether it has been generated by a random process (see Nickerson, 2002).

To get an intuitive sense of what random implies, imagine an urn filled with 1,000 colored balls. The urn is well shaken, and one ball is blindly selected. After selection, the ball’s color is noted, the ball is returned to the urn, and the selection process is repeated. If the urn contains an equal number of blue and red balls, then prediction of each ball’s color will be no better than chance; that is, the probability of a correct prediction would be .50. The repeated selections represent a random process4 with the resulting output being a random sequence. Note that predictions can be better than 50% for random processes, as shown by the following: If the urn was filled with an uneven number of different colored balls, prediction could become increasingly accurate. For example, if the urn contained 900 red balls and 100 blue balls, then prediction accuracy would rise to .90 (if one always predicted red). However, the process and output are still referred to as stochastic. Thus, stochastic outputs are more or less predictable depending on the relative frequencies of the items (the two colors, in our example). It is also true that the greater the number of different item classes, for example, different colors, the less predictable any given instance will be. If the urn contained equal numbers of 20 different colors, for example, then the chance level of prediction would be .05 (rather than .50 in the two-color case). Discussion of these concepts in historical context can be found in Gigerenzer et al. (1989).

4. Specifically, this process is “random with replacement” on account of the act of returning the ball to the urn. All discussions of randomness in this chapter refer to this type of randomness.

When trying to ascertain whether a finite sequence of outputs was randomly generated, the best one can do is to estimate the probability that a random process is involved. For example, if 100 selected balls were all blue, it would be unlikely but not impossible that the balls were selected randomly from an urn containing an equal number of red and blue balls. Given a random generating process, any subsequence of any length is possible, and every particular sequence of outcomes of a given length is exactly as likely as any other (see Lopes, 1982). These considerations indicate the impossibility of proving that a particular finite sequence deviates from random: The observed sequence may have been selected from an infinite random series (see Chaitin, 1975). However, the probability of 100 blue balls is extremely low in our example, and the probability is much higher for sequences that contain approximately 50% red and 50% blue. Thus, one can evaluate the probability that a given output was generated by a stochastic process having particular parameters.

A second problem is that as demonstrated by chaos theory, seemingly random outputs may be generated by nonrandom processes. Another example is given by iteration of the digits of pi. Someone could memorize the first 100 digits of pi and use those to generate a randomlike sequence. Thus, behavioral outputs can be highly variable but (given sufficient knowledge by an observer) predictable.
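A short simulation of the urn example makes the point that a process can be fully stochastic and still support above-chance prediction; the function and variable names here are illustrative only.

    import random

    def urn_prediction_accuracy(n_red, n_blue, trials=10_000, seed=1):
        # Sample with replacement and always guess the majority color,
        # the best fixed strategy available to an observer.
        rng = random.Random(seed)
        urn = ["red"] * n_red + ["blue"] * n_blue
        guess = "red" if n_red >= n_blue else "blue"
        hits = sum(rng.choice(urn) == guess for _ in range(trials))
        return hits / trials

    print(urn_prediction_accuracy(500, 500))  # ~.50, chance level
    print(urn_prediction_accuracy(900, 100))  # ~.90, stochastic yet predictable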


How, then, can one test whether highly variable operant responses derive from a stochastic process? The test must involve a comparison—which of two hypotheses is most likely to account for the data?—and the most likely alternative is the one already discussed, namely, memory-based processes under which each response can be predicted from knowledge of prior stimuli or responses. A powerful tool for making this comparison is memory interference, that is, the degrading of control by prior events. Availability of memory-interfering procedures leads to the following reasoning: When approximation to a stochastic output is reinforced, if a memory-interfering event degrades performance, it provides evidence against a stochastic generation process. Absence of memory-interfering effects provides evidence consistent with stochastic generation. We have already seen evidence for stochastic-like responding when demands on memory were high (Machado, 1993; Manabe et al., 1997) and turn next to additional tests of the stochastic hypothesis.

Neuringer (1991) compared the effects of memory interference on responding by two groups of rats. One group obtained reinforcers by repeating a single pattern, LLRR. Once that pattern was well learned, blackouts were introduced between each response, the durations ranging from 0.1 second to 20 seconds across different phases of the experiment. Responses were ineffective during the blackout periods. As blackout durations increased, errors increased and reinforcement rates fell. Neuringer hypothesized that the interposed blackouts degraded performance because each response in the LLRR sequence depended in part on memory for the just-prior response.

Effects of the same blackouts were assessed in a second group of rats that obtained reinforcers for varying four-response sequences under lag contingencies. Neuringer (1991) reasoned that if variable responses were generated by a memory-based process, then performances would be degraded as blackout durations increased, as was the case for the LLRR group. In fact, performances by the variability group actually improved with increasing blackout durations, resulting in higher rates of reinforcement. Some have suggested that absence of memory for prior events is necessary for random responding (Weiss, 1965), implying that memory interferes with random generation. In any event, the results were clearly inconsistent with the memory hypothesis.

In a related study, alcohol was administered to a single group of rats that had learned to respond variably when one stimulus was present and repeat LLRR sequences given a second stimulus (Cohen et al., 1990). The two stimuli alternated throughout each session under a multiple schedule. As alcohol doses increased, performance of the LLRR sequence was seriously impaired, whereas varying under the lag contingency was unaffected (Figure 22.7; see also Doughty & Lattal, 2001).

Figure 22.7. Percentages of reinforced, or correct, sequences as a function of ethanol dosage for each of five rats. Varying sequences were reinforced under one stimulus condition (left graph), and repetitive LLRR sequences were reinforced under another stimulus (right graph). The lines connect averages of the five subjects (BG, BR, RG, B, and P). From “Effects of Ethanol on Reinforced Variations and Repetitions by Rats Under a Multiple Schedule,” by L. Cohen, A. Neuringer, and D. Rhodes, 1990, Journal of the Experimental Analysis of Behavior, 54, p. 5. Copyright 1990 by Society for the Experimental Analysis of Behavior, Inc. Adapted with permission.


Thus, within a single session, the drunk rats failed to repeat accurately but were highly proficient when required to vary. Both interposed time delays and alcohol, two ways to affect memory for prior responses, degraded performances of fixed-pattern sequences and either improved operant variability or left it unaffected.

Additional evidence for the difference between memory-based and stochastic responding was provided by Neuringer and Voss (2002; see also Neuringer, 2002). College students learned to generate chaotic-like sequences (according to the logistic difference chaos function described in Equation 2) as well as to generate stochastic-like sequences (given feedback from eight statistical tests, as in Neuringer, 1986). These two ways of responding variably were alternated, under stimulus control, throughout each session. Memory interference was later introduced in a rather complex way. In the chaos phase of the experiment, subjects were required to generate four separate chaotic functions, each differing from the other. In the stochastic phase, four uncorrelated response sequences were required. In essence, chaotic responses were used to interfere with one another, and stochastic responses were used similarly. Results showed that performances were significantly degraded during the four-segment chaotic phases: Chaotic responses interfered with one another. A different result was obtained from the stochastic portion of the experiment. For one subject, seven of eight statistics were closer to a random model during the interference phase than at the end of the original training with a single sequence; for the second subject, all eight statistics were closer. These results are consistent with the hypothesis that a memory-based process controls chaotic variability and that a stochastic process, not dependent on memory, controls stochastic variability. (For additional evidence, see G. Jensen et al., 2006; Page & Neuringer, 1985.) The importance of this experiment is that it demonstrated the two modes of variability generation, one memory based, the other stochastic, under a procedure that controlled for extraneous variables.

Applications

We have described experiments on the reinforcement of predictable and unpredictable responding and the underlying processes. Results from these laboratory experiments may help to explain unpredictable operant behaviors in many nonlaboratory cases in which variability contingencies occur naturally. In this section, we continue to describe laboratory-based studies but ones with direct relevance to real-world conditions.

Training New Responses

Skinner (1981) hypothesized a parallel between evolutionary processes and selection by reinforcers of operant responses from a substrate of varying behaviors (see also Baum, 1994; Hull, Langman, & Glenn, 2001; Staddon & Simmelhag, 1971). As described in the Reinforcement of Variability section earlier in this chapter, variable behaviors can be generated by reinforcers whose delivery is directly contingent on that variability, something not anticipated by Skinner. One question that has potential importance for applications is whether reinforced variability facilitates acquisition of new responses, especially difficult-to-learn ones.

Neuringer, Deiss, and Olson (2000) reinforced variable five-response sequences across L and R levers while concurrently reinforcing a target sequence that rats find difficult to learn, namely RLLRL. Reinforcement for varying was limited to once per minute, whereas the target sequence RLLRL was reinforced whenever it occurred. Thus, if the rats learned to emit the target, reinforcement was much more frequent than if they only varied. The question was whether concurrent reinforcement of variations would facilitate acquisition of the target sequence, and the answer was obtained through a comparison with two other groups of rats. In one group, the same RLLRL target sequence was reinforced whenever it occurred, but varying was never reinforced. These target-only animals stopped responding altogether—responses extinguished because reinforcer frequencies were low or zero—and they did not learn the target (shown by the control [CON] data in Figure 22.8, top panel). In a second control group, the RLLRL target was also reinforced whenever it occurred. In addition, five-response sequences were reinforced at a rate yoked to that obtained by the experimental animals; note that these reinforcers could follow any sequence and
did not depend on variations. These animals continued to respond throughout (the yoke reinforcers maintained high response strength) but as shown by the independent-of-variability (ANY) data in the top panel of Figure 22.8, these rats too did not learn the target. Only the experimental animals who concurrently received reinforcers for variable responding (VAR) learned to emit the RLLRL sequence at high rates. Thus, it appeared that concurrent reinforcement of variations facilitated acquisition of a difficult-to-learn sequence, a potentially important finding. The experiment was replicated with a second difficult sequence with the same results (bottom panel of Figure 22.8) and in a separate study with rats as well (Neuringer, 1993).

Figure 22.8. Rates of emission of a difficult-to-learn target sequence (RLLRL on top and LLRRL on bottom) for three groups of rats as a function of blocks of sessions (each session block shows the average of five sessions). In all groups, the target sequence was reinforced whenever it occurred. For one group, reinforcement was additionally arranged for varying sequences (VAR); for a second group, the additional reinforcers occurred at the same rate as in VAR but independent of variability (ANY); a third group did not receive additional reinforcement for any sequence other than the target sequences (CON). Adapted from “Reinforced Variability and Operant Learning,” by A. Neuringer, C. Deiss, and G. Olson, 2000, Journal of Experimental Psychology: Animal Behavior Processes, 26, p. 107. Copyright 2000 by the American Psychological Association.

However, attempts in two laboratories to replicate these effects with human participants failed (Bizo & Doolan, 2008; Maes & van der Goot, 2006). In both cases, the target-only group (with no additional reinforcers presented) learned most rapidly. Several possible explanations have been suggested, including differences in relative frequencies of reinforcements for target responses versus variations, differences in levels of motivation in the animal versus human studies, and the “figure out what’s going on” type of instructions provided to the human participants, but why or when concurrent reinforcement of variations facilitates versus interferes with learning of new responses is not yet clear (see Neuringer, 2009).

Problem Solving

Arnesen (2000; see also Neuringer, 2004) studied whether a history of explicit reinforcement of variations would facilitate later problem solving. Using a rat model, she provided food pellets to rats in an experimental group for varying their responses to arbitrarily selected objects. For example, a soup can was placed in the chamber, and responding to it in a variety of ways was reinforced. Each session provided a different object, with response variability being reinforced throughout. Members of a yoked control group experienced the same objects but received food pellets independent of their interactions. A second control group was simply handled for a period each day. After training, each rat was placed alone in a problem space, a room approximately 6 feet by 8 feet, on the floor of which were 30 objects—for example, a toy truck, metal plumbing pipes, a hair brush, a doll’s chest of drawers—arbitrarily chosen but different from those used
during the training phase. Hidden in each object was a small piece of food, and the hungry rats were permitted to explore freely for 20 minutes. The question was how many food pellets would be discovered and consumed. The experimental animals found significantly more pellets than either of the control groups, which did not differ from one another. Furthermore, the experimental rats explored more—they seemed bolder—and interacted more with the objects than did the control rats, many of whom showed signs of fear. Thus, prior reinforcement of response variations transferred to a novel environment and facilitated exploration of novel objects and discovery of reinforcers. The advantages incurred by variations are discussed in the human literature (e.g., brainstorming), but tests of direct reinforcement-of-variability procedures for problem solving more generally have been few.

Creativity

Although creative production requires more than variation, Donald Campbell (1960) argued that variations, and indeed random variations, are necessary. If so, then operant variability may make important contributions to creativity. Support comes from studies in which creativity was directly reinforced (e.g., Eisenberger & Armeli, 1997; Holman et al., 1977; Pryor et al., 1969; see also Stokes, 2001). Other studies, however, have indicated that reinforcement interferes with, or degrades, creative output (e.g., Amabile, 1983). This literature is deeply controversial and has been reviewed in several articles (e.g., Cameron & Pierce, 1994; Deci, Koestner, & Ryan, 1999; Lepper & Henderlong, 2000), but the research listed earlier may contribute to a resolution. As shown by Cherot et al. (1996) and others (Wagner & Neuringer, 2006), reinforcement of variations has two effects. As a reinforcer is approached, variability declines. Thus, situations that potentiate the anticipation of consequences on the basis of completion may interfere with creative activities. The contingencies may at the same time, however, maintain high overall levels of creativity. Consideration of both induced effects (anticipation of reinforcement) and contingency effects (reinforced variability and creativity) may help explain reinforcement’s contribution to creativity (see Neuringer, 2003).

Psychopathology

Behavioral and psychological disabilities are sometimes associated with reduced control of variability. In autism and depression, for example, behaviors tend to be repetitive or stereotyped even when variations are desirable. In attention-deficit/hyperactivity disorder (ADHD), the opposite is true, with abnormally high variability observed when focused and repetitive responses are adaptive. All three of these disorders share a common characteristic, however: an apparent inability to move from one end or the other of the variability continuum. One question is whether reinforcement contingencies can modify abnormal levels of variability. The answer to this question may differ with respect to depression and autism, on the one hand, and ADHD, on the other.

Depression. Hopkinson and Neuringer (2003) asked whether the low behavioral variability associated with depression (Channon & Baker, 1996; Horne, Evans, & Orne, 1982; Lapp, Marinier, & Pihl, 1982) could be increased by direct reinforcement. College students were separated into mildly depressed and not depressed on the basis of Center for Epidemiological Studies Depression Scale scores (Radloff, 1991). Each participant played a computer game in which sequences of responses were first reinforced independently of variability or probabilistically (PROB), as in the yoke procedures we have described, after which variable sequences were directly reinforced (VAR). Figure 22.9 shows that under PROB, the depressed students’ variability (U values) was significantly lower than that of the nondepressed students. When variability was explicitly reinforced, however, levels of variability increased in both groups and to the same high levels. This result, if general, is important because it indicates that variability can be explicitly reinforced in people manifesting mild depression (see also Beck, 1976).

Autism. In an experiment conducted by Miller and Neuringer (2000), five individuals diagnosed with autism and nine control subjects received reinforcers independent of variability in a baseline phase (PROB), followed by a phase in which sequence variations were directly reinforced. Subjects with autism behaved less variably than the control subjects in both phases; however, variability increased
significantly in both groups when it was reinforced. Thus, individuals with autism, although relatively repetitive in their responding, acquired high levels of operant varying. Ronald Lee and coworkers (Lee, McComas, & Jawor, 2002; Lee & Sturmey, 2006) extended this work. Under a lag schedule, individuals with autism received reinforcers for varying verbal responses to questions, and two of three participants in each of two experiments learned to respond appropriately and nonrepetitively. Thus, the experimental evidence, although not extensive, has indicated that the behavior of individuals with autism can benefit from reinforcers contingent on variability. Stated differently, the abnormally low levels of variability characteristic of individuals with autism may at least in part be under the influence of operant contingencies.

Figure 22.9. Levels of variability (indicated by U values) for depressed and nondepressed college students when reinforcers were provided independent of response variability (PROB phase) versus when variations were required (VAR phase). Standard errors are shown by the error bars. From “Modifying Behavioral Variability in Moderately Depressed Students,” by J. Hopkinson and A. Neuringer, 2003, Behavior Modification, 27, p. 260. Copyright 2003 by Sage Publications, Inc. Adapted with permission.

Attention-deficit/hyperactivity disorder. Things may differ for individuals diagnosed with ADHD. Here, the abnormal levels of variability are at the opposite end of the continuum, with high variability a defining characteristic (Castellanos et al., 2005; Rubia, Smith, Brammer, & Taylor, 2007). A second common identifier is lack of inhibitory control (Nigg, 2001). Can such behavior be influenced by direct reinforcement? The evidence has indicated that unlike the case for autism, variability may result mainly from noncontingent (i.e., inducing) influences. One example is provided by the beneficial effects of drugs such as methylphenidate (Ritalin). Another is the fact that variability in individuals with ADHD is higher than in control subjects when reinforcement is infrequent, but not when it is frequent (Aase & Sagvolden, 2006). Methylphenidate reduces variability. Low reinforcement frequencies induce high variability, and the effects on those with ADHD may be independent of direct reinforcement-of-variability contingencies. Similarly, when reinforcement is delayed, the responses of subjects with ADHD weaken more than those of control subjects, possibly because of induced increases in variability (Wagner & Neuringer, 2006). Thus, variability may be induced in individuals diagnosed with ADHD by different attributes of reinforcement, but to date little evidence has indicated sensitivity to variability-reinforcing contingencies.

Operant Variability and the Emitted Operant

Reinforced variability may help to explain some unique attributes of operant behavior. Operants are often compared with Pavlovian reflexes, and the two can readily be distinguished at the level of the procedures used to establish them. In Pavlovian conditioning, a conditional relationship exists between a previously neutral stimulus, such as a bell, and an unconditioned stimulus, such as food. The result is that the neutral stimulus becomes a conditioned stimulus that elicits a conditioned response. One view is that operant responses differ in that they depend on a conditional relationship between response and reinforcer. Thus in one case, a conditional relationship exists between two stimuli (if conditioned stimulus, then unconditioned stimulus), whereas in the other, the relationship is between response and reinforcer.

However, according to Thorndike, Guthrie, and others, when a response is made in the presence of a
particular stimulus and the response is reinforced, then over trials, the stimulus takes on the power of an elicitor (Bower & Hilgard, 1981). This finding led some researchers to conclude that both operant and Pavlovian responses were elicited by prior stimuli. That is, in both cases stimulus–response relationships were critical to predicting and explaining the observed behaviors.

Skinner (1935/1959) offered a radically different view of the operant. Skinner’s position is difficult to grasp, partly because at times he assumed the point of view of an environmental determinist, whereas at other times he proposed probabilistic (and possibly indeterministic) outcomes. According to Skinner, eliciting stimuli could not be identified for the operant. Although discriminative stimuli signaled the opportunity for reinforcement, no discrete environmental event could be identified to predict the exact time, topography, or occurrence of the response. Skinner described operants as emitted to distinguish them from elicited Pavlovian reflexes.

But how is one to understand emission? The term is ambiguous, derived from the Latin emittere, meaning “to send out.” To be sent out might imply being caused to leave, but there is a sense of emergence, rather than one-to-one causation, as in the emission of radioactive particles. More important, the term captures, for Skinner and others, the manifest variability of all operant behaviors. Skinner interpreted that variability as follows.

An individual operant response is a member of a class C of instances, a generic class, made up of functionally similar (although not necessarily physically similar) actions (Skinner, 1935/1959). An example may help to explain this point. Jackie, a young child, desires a toy from a shelf that is too high for her to reach. Jackie might ask her mom to get the toy, jump to try to reach it, push a chair next to the shelf to climb up to the toy, take a broom from the closet and try to pull the toy from the shelf, or cry. Each of these acts, although differing in physical details, is a member of the same operant class because each potentially serves the same functional relationship between the discriminative stimulus (out-of-reach toy) and the goal (toy in hand). Some responses may be more functional than other members of the class, and cues may indicate which of these responses is most likely to be reinforced. For example, if Jackie’s mother is nearby, the “Mommie, get my toy” response might be most likely. Alternatively, if the toy is just beyond reach, the child might be most likely to jump to get it. In many cases, however, the behavior appears to be selected with equal probabilities, and prediction of the instance becomes difficult.

As just suggested, members of a particular class of behaviors may be divided into subclasses, and even here variability may characterize aspects of the response. For example, if “ask for the toy” is the activated subclass, the exact moment of a verbal request, the particular words used, or the rhythm or loudness may all be difficult to predict. Similarly, when a rat is pressing a lever to gain food pellets, the characteristics of the press (one paw vs. both, with short or long latency, with high or low force, etc.) are sometimes predictable, but often are not. Thus, according to a Skinnerian model, functionally equivalent instances emerge unpredictably from within a class or subclass, as though generated by a stochastic process (Skinner, 1938; see also Moxley, 1997). To state this differently, there is variance within the operant, manifested as the emission of instances from a set made up of functionally related but often physically dissimilar behaviors.

Behavioral variability occurs for many reasons, as we have discussed. It decreases with training and experience. It is low when reinforcers are frequent and higher under intermittent schedules of reinforcement. It decreases with expectancy of and proximity to reinforcement. However, consequence-controlled variability may play a special role in explaining the emitted nature of the operant. To see why, we next turn to experiments on volition. The operant is often referred to as the voluntary operant, in contrast to the Pavlovian reflex. The question is what about the operant indicates (and helps to explain) volition.

Operant Variability and Voluntary Behavior

Attempts to explain volition have been ongoing for more than 2,000 years, and heated debates continue to this day in philosophy (Kane, 2002), psychology
(Maasen, Prinz, & Roth, 2003; Sebanz & Prinz, 2006; Wegner, 2002), and physiology (Glimcher, 2005; Libet, Freeman, & Sutherland, 1999). These debates often concern the reality of volitional behavior or lack thereof and, if real, how to characterize it. Research on operant variability has suggested that the descriptive term voluntary can be usefully applied; that is, voluntary behaviors can be distinguished from accidental reactions, such as stumbles; from elicited responses, such as reflexes, both unconditioned and Pavlovian; from induced ones, such as those caused by drinking alcohol or anticipating a reinforcer; and many other cases. The research has also indicated important ways in which voluntary actions differ from these others.

In large part, the difficulty surrounding attempts to explain voluntary behavior comes from an apparent incompatibility between two often-noted characteristics. On the one hand, voluntary acts are said to be intentional, purposeful, goal directed, rational, or adaptive. These characteristics indicate the functionality of voluntary behaviors, and we use that term as a summarizing descriptor. On the other hand, voluntary actions are described as internally motivated and autonomously controlled. Unpredictability, demonstrated or potential, is offered as empirical evidence for such hypothesized autonomous control. Thus, unpredictability is thought to separate voluntary acts from other functional behaviors (e.g., reflexes) and to separate their explanation from Newtonian causes and effects. Proposed explanations of the unpredictability run the gamut from a soul or a mind that can function apart from physical causes to quantum-mechanical random events, but they are all ultimately motivated by the presumed inability of a knowledgeable (perhaps even supremely knowledgeable) observer to anticipate the particulars of a voluntary act.

How can unpredictability (perhaps even unpredictability in principle) be combined with functionality? That is the critical question facing those of us who would argue that voluntary is a useful classification. The problem derives from the (erroneous) assumption that functionality necessarily implies potential predictability. That assumption goes something like this: If an observer knows what an individual is striving for, or attempting to accomplish, then together with knowledge of the individual’s past experiences and current circumstances, at least somewhat accurate predictions can be made about the individual’s future goal-directed actions. Thus, because functionality is thought to require an orderly relationship to environmental variables, predictions must be (at least theoretically) possible. Again, though, voluntary acts are often characterized by their unpredictability, with this serving as a sign of autonomous control.

An added complication is that unpredictability alone does not characterize voluntary actions. Researchers do not attribute volition to random events, such as the throw of dice or emission of atomic particles (Dennett, 2003; Popper & Eccles, 1977), and truly random responding would often be maladaptive. Yet another problem is that voluntary behaviors are not always unpredictable—they are quite predictable some of the time and, indeed, exist across the range of predictability. For example, when the traffic light turns red, a driver is likely to step on the brake. When you are asked for your name, you generally answer veridically, and so on. But even in cases of predictable behaviors, if voluntary, these responses can be—and sometimes are—emitted in more or less unpredictable fashion. The red light can cause speeding up, slowing down, or cursing. The name offered might be made up so as to fool the questioner, for example, during a game. In brief, voluntary responses have the potential to move along a variability continuum from highly predictable to unpredictable. A characteristic of all voluntary behaviors is real or potential variations in levels of variability.

Operant variability helps to explain volition by combining functionality with variations in levels of variability. Operant responses are goal directed and functional, and the same holds for voluntary behaviors. (In some cases, researchers say that the voluntary response—and the operant—is intended to be functional because it is governed by previous experiences and because in a variable or uncertain environment, what was once functional may no longer be so.) Operant responses are more or less variable, depending on discriminative stimuli and reinforcement contingencies, and the same is true for voluntary behaviors.

Thus, for both operant and voluntary behaviors, the ability of a knowledgeable observer to predict future occurrences will depend on the circumstances. Voluntary behavior is behavior that is functional (or intended to be so) and sometimes highly predictable, other times unpredictable, with predictability governed by the same functionality requirement as other attributes of operant behavior. We have just summarized a theory of volition referred to as the operant variability and voluntary action (OVVA) theory (Neuringer & Jensen, 2010). In the following sections, we provide experimental evidence consistent with OVVA theory. We begin with a discussion of choices under conditions of uncertainty, partly because choices are generally thought to be voluntary and partly because concurrent schedules of reinforcement, a method used to study choice, provided the means to test OVVA theory.

Choice Under Uncertainty

In some choice situations, one (and only one) of many options provides reinforcement (e.g., the third key from the left in a row of eight keys), and both people and other animals learn to choose correctly and to do so repeatedly. In other cases, a particular pattern of choices is required (e.g., LLRR in a two-lever chamber), and that pattern is learned and repeated. Individual choices in these situations are readily predicted. In many situations, though, fixed choices and patterns are not reinforced, and reinforcer availability is uncertain, both in time and place. As we discuss, these conditions often result in stochastic responding.

Choices under conditions of reinforcement uncertainty have commonly been studied in behavioral laboratories with concurrent schedules of reinforcement. Reinforcers are independently programmed for two (or sometimes more) options, and subjects choose freely among them. Consider the example of concurrent VI schedules. In a VI 1 minute–VI 3 minute procedure, each schedule is applied to one of two response alternatives, left and right. Under this procedure, a reinforcer becomes available (or “sets up”) on average once per minute for responses on the left and independently on average every 3 minutes for choices of the option on the right. Once a reinforcer has set up, it is delivered on the next response to that alternative. Because time to reinforcement is unpredictable, and the two alternatives are independent of one another, every response has the possibility of producing a reinforcer. However, in general, the left alternative is three times more likely to have reinforcement waiting than the right alternative.

The VI values (or average times between reinforcer setups) generally differ across phases of an experiment. For example, a 1:3 ratio of setup time left to right in one phase might be followed by a 3:1 ratio in another, and a third might use a 2:2 ratio. When the ratios across these alternatives are systematically manipulated, an often observed finding is that the overall ratios of left-to-right choices are functionally related to ratios of left-to-right obtained reinforcers, a relationship commonly described as a power function and referred to as the generalized matching law (Baum, 1974):

CX/CY = (kX/kY) · (RX/RY)^s.  (3)

In Equation 3, CX refers to observed choices of alternative X, and RX corresponds to delivered reinforcers (CY and RY correspond to alternative Y, accordingly). The parameter kX refers to bias for X, with biases—because of side preferences, differences in the operanda, or any number of variables—not thought to be influenced by the reinforcer ratios. The s parameter refers to the sensitivity of choice ratios to reinforcement ratios. When s = 1.0, choice ratios exactly match (or equal) reinforcement ratios. With s parameter values less than 1.0, choice ratios are not as extreme as the ratio of reinforcers, with the opposite for s more than 1.0 (see the Psychophysical Test section later in this chapter). To the extent that the generalized matching law provides an accurate description (and there is much support for it), it permits predictions of the molar distribution of choice allocation; that is, overall ratios of choices can accurately be described as a function of obtained reinforcer ratios (Davison & McCarthy, 1988).
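A sketch of Equation 3 in Python, showing how the s parameter moderates predicted preference for a 3:1 obtained-reinforcer ratio with no bias:

    def choice_ratio(r_x, r_y, k_ratio=1.0, s=1.0):
        # Equation 3: C_X/C_Y = (k_X/k_Y) * (R_X/R_Y)**s
        return k_ratio * (r_x / r_y) ** s

    for s in (0.5, 1.0, 1.5):
        print(s, round(choice_ratio(3, 1, s=s), 2))
    # 0.5 -> 1.73 (undermatching: less extreme than the reinforcer ratio)
    # 1.0 -> 3.0  (strict matching)
    # 1.5 -> 5.2  (overmatching: more extreme)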


Another observation from studies of concurrent VI schedules, however, is that individual choices are difficult to predict. Even when they conform to Equation 3, they often appear to be emitted stochastically (Glimcher, 2003, 2005; G. Jensen & Neuringer, 2008; Nevin, 1969; see also Silberberg, Hamilton, Ziriax, & Casey, 1978, for an alternative view). In the VI 1 minute–VI 3 minute example given earlier, an observer might accurately predict that the left option will be chosen three times more frequently than the right but be unable to accurately predict any given choice. A recent example of such stochasticity was observed when pigeons’ choices were allocated across three concurrently available sources of reinforcement (G. Jensen & Neuringer, 2008). Figure 22.10 shows that run lengths—defined as the average number of choices on one key before switching to a different key—approximated those expected from a stochastic process.5 Thus, at the same time that overall choice proportions can readily be predicted, individual choices cannot. This combination of functionally related choice proportions and stochastic emission provided the means to assess the relationship between operant variability and volition. In particular, we asked whether functionally varying behaviors yielded a perception of voluntary action.

Figure 22.10. Mean run lengths by pigeons on each of three response keys as a function of the proportion of responses to that key. The drawn line is the expected function if responses were emitted stochastically. Adapted from “Choice as a Function of Reinforcer ‘Hold’: From Probability Learning to Concurrent Reinforcement,” by G. Jensen and A. Neuringer, 2008, Journal of Experimental Psychology: Animal Behavior Processes, 34, p. 44. Copyright 2008 by the American Psychological Association.

Psychophysical Test

OVVA theory predicts that responses will appear to be voluntary when levels of (un)predictability vary functionally (purposefully, adaptively). Choices under concurrent schedules of reinforcement provided a way to test this claim. Neuringer, Jensen, and Piff (2007) had human participants observe six different virtual actors (hereinafter called agents) as each agent made thousands of choices. The agents differed in how they went about choosing among the available options (the strategies are described in a subsequent paragraph). Each agent’s choices were shown separately on an individual computer, with six computers located close to one another on small desks in a laboratory. The participants were free to walk among the computers to compare the agents’ choice strategies.

To minimize extraneous cues, such as whether the agent resembled a human figure, choices were represented in a simple manner, namely as dots moving around the screens. Participants were instructed that the agents were choosing among three alternative gambles, similar to slot machine gambles, with each gamble’s choice represented by the dot’s movement in one of three directions. Whenever a choice led to reinforcement—the agent won that gamble—the dot’s color changed as a sign of success. Thus, participants could observe how choices were made in relationship to the reinforcers received. Participants were asked to judge how well the choices made by the agents represented voluntary choices made by a real human player.

Unknown to the participants, the agents’ choices were controlled by iterating the generalized matching power function (Equation 3) that was extended to a three-alternative situation (G. Jensen & Neuringer, 2008). Thus, the agents chose probabilistically among the three options on the basis of the proportions of reinforcers that they had received from the three alternatives. These calculations were done in real time, with current choice probabilities depending on previously obtained reinforcers.
Psychological Association. depending on previously obtained reinforcers.

5. See Jensen and Neuringer (2008) for discussion of these findings, including the small divergence of data from the theoretical curve.
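The theoretical curve referred to in Footnote 5 follows from a standard result, assuming each choice is an independent Bernoulli draw (our reading of the stochastic process in Figure 22.10): runs on a key chosen with probability p end with probability 1 − p, so run lengths are geometric with mean 1/(1 − p). A sketch:

    def expected_run_length(p):
        # Mean run length on a key chosen with probability p per trial,
        # assuming independent (stochastic) emission of each choice.
        return 1.0 / (1.0 - p)

    for p in (0.25, 0.50, 0.75):
        print(p, expected_run_length(p))  # 1.33, 2.0, 4.0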


Choices by the six agents differed only with respect to the s exponent of the power functions governing the choices: Some agents had high values for their sensitivity parameters, and others had low values. Participants were told only that the dot movements represented choices of gambles and that their objective was to rate how closely those movements resembled the voluntary choices of real human players. Next, we describe how reinforcer availability was programmed and the effects of s values on the generated choices.

Reinforcers set up probabilistically (and remained available until collected, as in concurrent VI schedules) for each of the three gambles. There were six different combinations of set-up rates, which participants were told constituted six different games. Thus, in some games, the agent’s X choices were most frequently reinforced; in other games, Y choices were most frequently reinforced; in others, the reinforcers were more equally distributed; and so on. Participants were free to observe each agent playing each of the six games for as long as needed to make their evaluation. After observing the choices in all games, the participants judged the degree to which each agent’s responses appeared to be those of a human player who was voluntarily choosing among the options. The key question was whether the agents’ different choice strategies—caused by differences in the s exponents—generated systematic differences in participants’ judgments of volition.

The s values, and their effects on the agents’ choice allocations, were as follows: For one agent, s equaled 1.0, and choice proportions therefore strictly matched proportions of received reinforcers. Assume, for example, that this agent had gained a total of 100 reinforcers at some point in the game: 50 reinforcers for Option X, 30 for Option Y, and 20 for Option Z. The probability of the agent’s next X choice would therefore equal 0.5 (50/100); a Y choice, 0.3 (30/100); and a Z choice, 0.2 (20/100). The s = 1.0 actor therefore distributed its choices probabilistically in exact proportion to its received reinforcers.

Another agent was assigned an s value of 0.4, the consequence of which was that it tended to choose among the three options with more equal probabilities than indicated by the reinforcement ratios throughout the six games. In the preceding example, this agent would choose X with probability of .399 (rather than .5 for the exact matcher), choose Y with probability of .325 (rather than .3), and choose Z with probability of .276 (rather than .2). In general, algorithms with s values less than 1.0 are referred to as undermatchers: They distribute choices more equally—and therefore more unpredictably—across the available options than the exact matcher. The opposite was the case for agents with s values more than 1.0, whose preferences were more extreme than indicated by the reinforcer ratios and were referred to as overmatchers. Over the course of several experiments, a wide range of s values was presented, spanning a range from 0.0 (extreme undermatcher) to 6.0 (extreme overmatcher) in one experiment, a range from 0.1 to 2.0 in another, and a range from 0.1 to 1.9 in a third.

Results were consistent and clear: The strict matcher (s = 1.0) was judged to best represent volitional choices. Figure 22.11 shows data from two of the experiments. In one experiment, participants were informed in advance that all of the agents’ choices were generated by computer algorithms, and they were asked to rate the algorithms in terms of volitional appearance. In the second, participants were told that some agents’ choices were based on computer algorithms, that others depicted voluntary choices of real humans, and that their task was to identify the humans.6

6. This task was inspired by the Turing test, considered by many to be the gold standard of artificial intelligence.
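One way to write the three-alternative extension described above (a sketch; bias parameters are omitted) weights each option’s obtained reinforcers by the exponent s and normalizes. With s = 0.4 it reproduces the probabilities given in the text:

    def choice_probabilities(reinforcers, s=1.0):
        # p_i = R_i**s / sum_j(R_j**s): exact matching when s = 1.0,
        # undermatching when s < 1.0, overmatching when s > 1.0.
        weights = [r ** s for r in reinforcers]
        total = sum(weights)
        return [round(w / total, 3) for w in weights]

    print(choice_probabilities([50, 30, 20], s=1.0))  # [0.5, 0.3, 0.2]
    print(choice_probabilities([50, 30, 20], s=0.4))  # [0.399, 0.325, 0.276]
    print(choice_probabilities([50, 30, 20], s=6.0))  # nearly all on Option X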


Figure 22.11. Judgments of how closely agents’ responses approximated voluntary human choices (on left y-axis) and probabilities (prob.) of identifying agents as a voluntarily choosing human player (on right y-axis) as functions of the agents’ s-value exponents. From “Stochastic Matching and the Voluntary Nature of Choice,” by A. Neuringer, G. Jensen, and P. Piff, 2007, Journal of the Experimental Analysis of Behavior, 88, pp. 7, 13. Copyright 2007 by Society for the Experimental Analysis of Behavior, Inc. Adapted with permission.

As s values approached 1.0, the agents were rated as providing increasingly good representations of voluntary human choice, suggesting a continuum of more or less apparent volition. From the perspective of the participants, the s = 1.0 strict matcher sometimes responded unpredictably (when reinforcers were equally allocated across the three alternatives), at other times highly predictably (when most reinforcers were obtained from one alternative), and at yet other times at intermediate levels. In each case, however, the agent’s choices seemed to be governed by the reinforcement distribution in a particular game environment, an indicator of functional changes in behavior. The undermatchers tended to respond less predictably throughout, as we indicated earlier, and the overmatchers more predictably. Thus, the undermatchers demonstrated that unpredictability alone was not sufficient for apparent volition: It was necessary that agents display functional variations in levels of (un)predictability to receive the highest volitional ratings.

A series of control experiments evaluated alternative explanations. For example, rates of reinforcement were overall slightly higher (across games) for the s = 1.0 matcher than for any of the other agents, and one control showed that differences in reinforcement rate were not responsible for the volitional judgments. In the experiment, agents who cheated (i.e., those who appeared to know where to respond for reinforcers) were compared with the strict—probabilistically choosing—matcher, and the matcher was evaluated as substantially more volitional in appearance, despite obtaining fewer reinforcers than the cheaters. An observer might appreciate the individual who gains more reinforcement than another, but that fact alone will not convince the observer that the individual is choosing in a voluntary manner.

Another control experiment tested whether matching alone implied volition (Neuringer et al., 2007). The question was whether the more or less (un)predictable responding contributed at all to the judgments. Stated differently, did matching or predictability or both generate the volitional judgments? Participants were therefore asked to compare two agents, both of which exactly matched choice proportions to reinforcer proportions; however, one agent matched by stochastically allocating its choices (as was done in all of the experiments described to this point), whereas the other agent allocated its choices in an easily predictable fashion. For example, if the stochastic matcher had received reinforcers in a ratio of 5:3:2, it responded to the left alternative with a .5 probability, to the center with a .3 probability, and to the right with a .2 probability. Because they were emitted stochastically, individual choices could not be predicted above chance levels. By contrast, the patterned matcher also matched exactly but did so in a patterned and therefore readily predictable way. In the example just given, it would respond LLLLLCCCRR, again and again cycling through the same 5:3:2 strings of responding until there was a change in obtained reinforcer proportions, at which point it would adjust the length of its strings accordingly.
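The two agents can be sketched in a few lines; both match a 5:3:2 reinforcer distribution, but only one is predictable choice by choice (names and details here are ours, not the original implementation):

    import random

    def stochastic_matcher(weights, n, seed=0):
        # Matches proportions by drawing every choice independently.
        rng = random.Random(seed)
        return "".join(rng.choices("LCR", weights=weights, k=n))

    def patterned_matcher(counts, n):
        # Matches the same proportions with a fixed, cyclic string.
        block = "L" * counts[0] + "C" * counts[1] + "R" * counts[2]
        return (block * (n // len(block) + 1))[:n]

    print(stochastic_matcher([5, 3, 2], 20))  # unpredictable ordering
    print(patterned_matcher([5, 3, 2], 20))   # LLLLLCCCRRLLLLLCCCRR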


proportions, at which point it would adjust the instance may be difficult or impossible to predict,
length of its strings accordingly. Because both especially for large response classes.
agents matched, both received identical rates of Unpredictability, real or potential, is emphasized
reinforcement. The participants judged the stochas- in many discussions of volition. Indeed, the size of
tic matcher to significantly better represent a volun- the active set can be exceedingly large—and func-
tary human player than the patterned one, showing tionally so—because if someone was attempting to
that both functionality (matching, in this case) and prove that he or she is a free agent, the set of possibil-
stochasticity were jointly necessary for the highest ities might consist of all responses in that person’s
ratings of volition. repertoire (see Scriven, 1965). We return to the fact,
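The contrast between the two agents is easy to make concrete in simulation. The following minimal sketch is ours, written for illustration rather than taken from Neuringer et al. (2007); the class names and the 5:3:2 example are illustrative only. Both agents allocate choices in the obtained reinforcer proportions, but only the stochastic matcher is unpredictable trial by trial.

import itertools
import random

class StochasticMatcher:
    """Matches overall proportions by sampling each choice independently."""

    def __init__(self, proportions):
        # e.g., {"L": 0.5, "C": 0.3, "R": 0.2} for a 5:3:2 reinforcer ratio
        self.options, self.weights = zip(*proportions.items())

    def choose(self):
        # An independent draw each trial: no single choice is predictable
        # beyond the probabilities themselves.
        return random.choices(self.options, weights=self.weights)[0]

class PatternedMatcher:
    """Matches exactly by cycling a fixed string such as LLLLLCCCRR."""

    def __init__(self, counts):
        # e.g., {"L": 5, "C": 3, "R": 2} yields the repeating block LLLLLCCCRR
        self._cycle = itertools.cycle("".join(k * n for k, n in counts.items()))

    def choose(self):
        # Perfectly predictable: each choice is fixed by its place in the cycle.
        return next(self._cycle)

stochastic = StochasticMatcher({"L": 0.5, "C": 0.3, "R": 0.2})
patterned = PatternedMatcher({"L": 5, "C": 3, "R": 2})
print("".join(stochastic.choose() for _ in range(20)))  # e.g., LCLLRLLCLC... (varies)
print("".join(patterned.choose() for _ in range(20)))   # LLLLLCCCRRLLLLLCCCRR

Over many trials the two agents emit L, C, and R in identical 5:3:2 proportions, so an observer tracking only choice distributions cannot distinguish them; they differ solely in the predictability of individual choices, which is exactly the contrast on which the participants' judgments turned.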
The combination of choice distributions (matching) and choice variability (more or less predictability) provided evidence for voluntary behavior. Choice distributions alone did not lead responses to be evaluated as highly voluntary, nor did choice unpredictability alone. Choices were most voluntary in appearance when probabilities and distributions of stochastic responses changed with distributions of reinforcers. According to OVVA theory, functionally changing variable behaviors are voluntary behaviors. Stated differently, voluntary behaviors are members of a class characterized by ability to vary levels of response (un)predictability in a functional manner. The psychophysical evidence just reviewed is consistent with OVVA theory.

To review, the facts of operant variability show that levels, or degrees, of behavioral (un)predictability are guided by environmental consequences. A theory of volition, OVVA, proposes that exactly the same is true for voluntary actions. Voluntary behaviors are sometimes readily predictable, sometimes less predictable, and sometimes quite unpredictable. In all cases, the reasons for the predictability can be identified (given sufficient knowledge), but the precise behaviors may still remain unpredictable. For example, under some circumstances, the response to "How are you?" can readily be predicted for a given acquaintance. Even when the situation warrants unpredictable responses, as when responders wish to conceal their feelings, some veridical predictions can be made: that the response will be verbal, that it will contain particular parts of speech, and so on. The functionality of variability implies a degree of predictability in the resulting behaviors that is related to the activated class of possibilities from which the response emerges. The class can often be predicted on the basis of knowledge of the organism and environmental conditions. However, the instance may be difficult or impossible to predict, especially for large response classes.

Unpredictability, real or potential, is emphasized in many discussions of volition. Indeed, the size of the active set can be exceedingly large—and functionally so—because if someone was attempting to prove that he or she is a free agent, the set of possibilities might consist of all responses in that person's repertoire (see Scriven, 1965). We return to the fact, though, that voluntary behaviors can be predictable as well as not predictable. The most important characteristic is functionality of variability, or ability to change levels of predictability in response to environmental demands. This is equally an identifying characteristic of operant behavior in which responses are functional and stochastically emitted. Thus, with Skinner, we combine voluntary and operant in a single phrase, but research has now shown why that is appropriate. Operant responses are voluntary because they combine functionality with (un)predictability.

Conclusion

Aristotle anticipated what many have referred to as the most influential law in psychology (Murray, 1988). When two events co-occur, presentation of one will cause recollection or generation of the other. Although he and many others were wrong in the details, laws of association have been the foundation of theories of mind and behavior throughout the history of Western thought, and the science of psychology and behavior has been well served by the search for them. From the British Associationists, to Pavlov, to Hebb and Rescorla, theoreticians and researchers have documented laws of the form "if A, then B" that help to explain thoughts and behaviors. Evolutionary theory offered a distinctly different type of behavioral law, involving selection from variations, laws that were developed by Skinner (1981) and others (Hull et al., 2001). In this chapter, we provided evidence of how selection interacts with variation: Parameters of variation are selected (via reinforcement of variability), and selections emerge from variations (via stochastic emission). This interaction, of equal importance to that of association, must be deciphered if researchers are to explain, at long last, voluntary behavior.
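That interaction can be stated as a simple loop. The toy simulation below is our illustration of the general idea, not a model of any particular experiment discussed in this chapter: a contingency on switching (or on repeating) selects the agent's level of (un)predictability, while every individual response is still emitted stochastically.

import random

RESPONSES = "LCR"

def run(trials=2000, contingency="vary", step=0.01):
    """Toy variation-selection loop: reinforcement adjusts a single
    parameter of variation (the probability of repeating the previous
    response); the responses themselves are stochastically emitted."""
    p_repeat = 0.5
    prev = random.choice(RESPONSES)
    for _ in range(trials):
        # Stochastic emission: repeat with probability p_repeat,
        # otherwise sample uniformly among the alternatives.
        resp = prev if random.random() < p_repeat else random.choice(RESPONSES)
        switched = resp != prev
        reinforced = switched if contingency == "vary" else not switched
        if reinforced:
            # Selection: reinforcement strengthens whichever tendency
            # (switching or repeating) just produced it.
            p_repeat += -step if switched else step
            p_repeat = min(max(p_repeat, 0.05), 0.95)
        prev = resp
    return p_repeat

print(run(contingency="vary"))    # drifts low: unpredictable responding pays
print(run(contingency="repeat"))  # drifts high: predictable responding pays

Under the "vary" contingency the repeat probability falls and responding becomes unpredictable; under the "repeat" contingency it rises, mirroring the observation that levels of (un)predictability track their consequences.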


References

Aase, H., & Sagvolden, T. (2006). Infrequent, but not frequent, reinforcers produce more variable responding and deficient sustained attention in young children with attention-deficit/hyperactivity disorder (ADHD). Journal of Child Psychology and Psychiatry, 47, 457–471. doi:10.1111/j.1469-7610.2005.01468.x

Akins, C. K., Domjan, M., & Gutierrez, G. (1994). Topography of sexually conditioned behavior in male Japanese quail (Coturnix japonica) depends on the CS–US interval. Journal of Experimental Psychology: Animal Behavior Processes, 20, 199–209. doi:10.1037/0097-7403.20.2.199

Amabile, T. M. (1983). The social psychology of creativity. New York, NY: Springer-Verlag.

Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273–281. doi:10.1037/h0060407

Arnesen, E. M. (2000). Reinforcement of object manipulation increases discovery. Unpublished undergraduate thesis, Reed College, Portland, OR.

Bandura, A. (1982). The psychology of chance encounters and life paths. American Psychologist, 37, 747–755. doi:10.1037/0003-066X.37.7.747

Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231–242. doi:10.1901/jeab.1974.22-231

Baum, W. M. (1994). Understanding behaviorism. New York, NY: HarperCollins.

Beck, A. T. (1976). Cognitive therapy and the emotional disorders. New York, NY: International Universities Press.

Bizo, L. A., & Doolan, K. (2008). Reinforced behavioural variability. Paper presented at the meeting of the Association for Behavior Analysis, Chicago, IL.

Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581–591. doi:10.1901/jeab.1966.9-581

Bouton, M. (1994). Context, ambiguity, and classical conditioning. Current Directions in Psychological Science, 3, 49–53. doi:10.1111/1467-8721.ep10769943

Bower, G. H., & Hilgard, E. R. (1981). Theories of learning (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.

Brener, J., & Mitchell, S. (1989). Changes in energy expenditure and work during response acquisition in rats. Journal of Experimental Psychology: Animal Behavior Processes, 15, 166–175. doi:10.1037/0097-7403.15.2.166

Brugger, P. (1997). Variables that influence the generation of random sequences: An update. Perceptual and Motor Skills, 84, 627–661. doi:10.2466/pms.1997.84.2.627

Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis. Review of Educational Research, 64, 363–423. doi:10.3102/00346543064003363

Campbell, D. T. (1960). Blind variation and selective retention in creative thought as in other knowledge processes. Psychological Review, 67, 380–400. doi:10.1037/h0040373

Castellanos, F. X., Sonuga-Barke, E. J. S., Scheres, A., Di Martino, A., Hyde, C., & Walters, J. R. (2005). Varieties of attention-deficit/hyperactivity disorder-related intra-individual variability. Biological Psychiatry, 57, 1416–1423. doi:10.1016/j.biopsych.2004.12.005

Catchpole, C. K., & Slater, P. J. (1995). Bird song: Biological themes and variations. Cambridge, England: Cambridge University Press.

Chaitin, G. J. (1975). Randomness and mathematical proof. Scientific American, 232, 47–52. doi:10.1038/scientificamerican0575-47

Channon, S., & Baker, J. E. (1996). Depression and problem-solving performance on a fault-diagnosis task. Applied Cognitive Psychology, 10, 327–336. doi:10.1002/(SICI)1099-0720(199608)10:4<327::AID-ACP384>3.0.CO;2-O

Cherot, C., Jones, A., & Neuringer, A. (1996). Reinforced variability decreases with approach to reinforcers. Journal of Experimental Psychology: Animal Behavior Processes, 22, 497–508. doi:10.1037/0097-7403.22.4.497

Clouzot, H.-G. (Producer & Director). (1956). Le mystère Picasso [The mystery of Picasso; Motion picture]. France: Filmsonor.

Cohen, L., Neuringer, A., & Rhodes, D. (1990). Effects of ethanol on reinforced variations and repetitions by rats under a multiple schedule. Journal of the Experimental Analysis of Behavior, 54, 1–12. doi:10.1901/jeab.1990.54-1

Craig, W. (1918). Appetites and aversions as constituents of instincts. Biological Bulletin, 34, 91–107. doi:10.2307/1536346

Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.

Deci, E. L., Koestner, R., & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125, 627–668. doi:10.1037/0033-2909.125.6.627

Dennett, D. (2003). Freedom evolves. New York, NY: Viking Adult.

Denney, J., & Neuringer, A. (1998). Behavioral variability is controlled by discriminative stimuli. Animal Learning and Behavior, 26, 154–162. doi:10.3758/BF03199208

Doughty, A. H., & Lattal, K. A. (2001). Resistance to change of operant variation and repetition. Journal of the Experimental Analysis of Behavior, 76, 195–215. doi:10.1901/jeab.2001.76-195

Driver, P. M., & Humphries, D. A. (1988). Protean behavior: The biology of unpredictability. Oxford, England: Oxford University Press.

Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement and extinction. Journal of the Experimental Analysis of Behavior, 12, 73–80. doi:10.1901/jeab.1969.12-73

Eisenberger, R., & Armeli, S. (1997). Can salient reward increase creative performance without reducing intrinsic creative interest? Journal of Personality and Social Psychology, 72, 652–663. doi:10.1037/0022-3514.72.3.652

Gharib, A., Gade, C., & Roberts, S. (2004). Control of variation by reward probability. Journal of Experimental Psychology: Animal Behavior Processes, 30, 271–282. doi:10.1037/0097-7403.30.4.271

Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Kruger, L. (1989). The empire of chance: How probability changed science and everyday life. Cambridge, England: Cambridge University Press.

Glimcher, P. W. (2003). Decisions, uncertainty, and the brain. Cambridge, MA: MIT Press.

Glimcher, P. W. (2005). Indeterminacy in brain and behavior. Annual Review of Psychology, 56, 25–56. doi:10.1146/annurev.psych.55.090902.141429

Goetz, E. M., & Baer, D. M. (1973). Social control of form diversity and emergence of new forms in children's blockbuilding. Journal of Applied Behavior Analysis, 6, 209–217. doi:10.1901/jaba.1973.6-209

Grunow, A., & Neuringer, A. (2002). Learning to vary and varying to learn. Psychonomic Bulletin and Review, 9, 250–258. doi:10.3758/BF03196279

Guthrie, E. R., & Horton, G. P. (1946). Cats in a puzzle box. New York, NY: Rinehart.

Hachiga, Y., & Sakagami, T. (2010). A runs-test algorithm: Contingent reinforcement and response run structures. Journal of the Experimental Analysis of Behavior, 93, 61–80. doi:10.1901/jeab.2010.93-61

Herrnstein, R. J. (1961). Stereotypy and intermittent reinforcement. Science, 133, 2067–2069. doi:10.1126/science.133.3470.2067-a

Holman, J., Goetz, E. M., & Baer, D. M. (1977). The training of creativity as an operant and an examination of its generalization characteristics. In B. Etzel, J. LeBlanc, & D. Baer (Eds.), New developments in behavioral research: Theory, method and application (pp. 441–471). Hillsdale, NJ: Erlbaum.

Hopkinson, J., & Neuringer, A. (2003). Modifying behavioral variability in moderately depressed students. Behavior Modification, 27, 251–264. doi:10.1177/0145445503251605

Hopson, J., Burt, D., & Neuringer, A. (2002). Variability and repetition under a multiple schedule. Unpublished manuscript.

Horne, R. L., Evans, F. J., & Orne, M. T. (1982). Random number generation, psychopathology, and therapeutic change. Archives of General Psychiatry, 39, 680–683. doi:10.1001/archpsyc.1982.04290060042008

Hoyert, M. S. (1992). Order and chaos in fixed-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 57, 339–363. doi:10.1901/jeab.1992.57-339

Hull, D. L., Langman, R. E., & Glenn, S. S. (2001). A general account of selection: Biology, immunology and behavior. Behavioral and Brain Sciences, 24, 511–528. doi:10.1017/S0140525X0156416X

Jensen, G., Miller, C., & Neuringer, A. (2006). Truly random operant responding: Results and reasons. In E. A. Wasserman & T. R. Zentall (Eds.), Comparative cognition: Experimental explorations of animal intelligence (pp. 459–480). Oxford, England: Oxford University Press.

Jensen, G., & Neuringer, A. (2008). Choice as a function of reinforcer "hold": From probability learning to concurrent reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 34, 437–460. doi:10.1037/0097-7403.34.4.437

Jensen, R. V. (1987). Classical chaos. American Scientist, 75, 168–181.

Kane, R. (Ed.). (2002). The Oxford handbook of free will. Oxford, England: Oxford University Press.

Lapp, J. E., Marinier, R., & Pihl, R. O. (1982). Correlates of psychotropic drug use in women: Interpersonal personal problem solving and depression. Women and Health, 7, 5–16. doi:10.1300/J013v07n02_02

Lee, R., McComas, J. J., & Jawor, J. (2002). The effects of differential reinforcement on varied verbal responding by individuals with autism to social questions. Journal of Applied Behavior Analysis, 35, 391–402. doi:10.1901/jaba.2002.35-391

Lee, R., & Sturmey, P. (2006). The effects of lag schedules and preferred materials on variable responding in students with autism. Journal of Autism and Developmental Disorders, 36, 421–428. doi:10.1007/s10803-006-0080-7

Lee, R., Sturmey, P., & Fields, L. (2007). Schedule-induced and operant mechanisms that influence response variability: A review and implications for future investigations. Psychological Record, 57, 429–455.

Lepper, M. R., & Henderlong, J. (2000). Turning "play" into "work" and "work" into "play": 25 years of research on intrinsic versus extrinsic motivation. In C. Sansone & J. M. Harackiewicz (Eds.), Intrinsic and extrinsic motivation: The search for optimal motivation and performance (pp. 257–307). San Diego, CA: Academic Press. doi:10.1016/B978-012619070-0/50032-5

Libet, B., Freeman, A., & Sutherland, K. (Eds.). (1999). The volitional brain: Towards a neuroscience of free will. Thorverton, England: Imprint Academic.

Lopes, L. L. (1982). Doing the impossible: A note on induction and the experience of randomness. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 626–636. doi:10.1037/0278-7393.8.6.626

Maasen, S., Prinz, W., & Roth, G. (Eds.). (2003). Voluntary action: Brains, minds, and sociality. New York, NY: Oxford University Press.

Machado, A. (1989). Operant conditioning of behavioral variability using a percentile reinforcement schedule. Journal of the Experimental Analysis of Behavior, 52, 155–166. doi:10.1901/jeab.1989.52-155

Machado, A. (1992). Behavioral variability and frequency-dependent selection. Journal of the Experimental Analysis of Behavior, 58, 241–263. doi:10.1901/jeab.1992.58-241

Machado, A. (1993). Learning variable and stereotypical sequences of responses: Some data and a new model. Behavioural Processes, 30, 103–129. doi:10.1016/0376-6357(93)90002-9

Machado, A. (1997). Increasing the variability of response sequences in pigeons by adjusting the frequency of switching between two keys. Journal of the Experimental Analysis of Behavior, 68, 1–25. doi:10.1901/jeab.1997.68-1

Maes, J. H. R., & van der Goot, M. (2006). Human operant learning under concurrent reinforcement of response variability. Learning and Motivation, 37, 79–92. doi:10.1016/j.lmot.2005.03.003

Manabe, K., Staddon, J. E. R., & Cleaveland, J. M. (1997). Control of vocal repertoire by reward in budgerigars (Melopsittacus undulatus). Journal of Comparative Psychology, 111, 50–62. doi:10.1037/0735-7036.111.1.50

McElroy, E., & Neuringer, A. (1990). Effects of alcohol on reinforced repetitions and reinforced variations in rats. Psychopharmacology, 102, 49–55. doi:10.1007/BF02245743

Mechner, F. (1958). Sequential dependencies of the lengths of consecutive response runs. Journal of the Experimental Analysis of Behavior, 1, 229–233. doi:10.1901/jeab.1958.1-229

Metzger, M. A. (1994). Have subjects been shown to generate chaotic numbers? Commentary on Neuringer and Voss. Psychological Science, 5, 111–114. doi:10.1111/j.1467-9280.1994.tb00641.x

Miller, N., & Neuringer, A. (2000). Reinforcing variability in adolescents with autism. Journal of Applied Behavior Analysis, 33, 151–165. doi:10.1901/jaba.2000.33-151

Mook, D. M., & Neuringer, A. (1994). Different effects of amphetamine on reinforced variations versus repetitions in spontaneously hypertensive rats (SHR). Physiology and Behavior, 56, 939–944. doi:10.1016/0031-9384(94)90327-1

Mosekilde, E., Larsen, E., & Sterman, J. (1991). Coping with complexity: Deterministic chaos in human decision making behavior. In J. L. Casti & A. Karlqvist (Eds.), Beyond belief: Randomness, prediction and explanation in science (pp. 199–229). Boca Raton, FL: CRC Press.

Moxley, R. A. (1997). Skinner: From determinism to random variation. Behavior and Philosophy, 25, 3–28.

Murray, D. J. (1988). A history of Western psychology (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.

Neuringer, A. (1986). Can people behave "randomly?": The role of feedback. Journal of Experimental Psychology: General, 115, 62–75. doi:10.1037/0096-3445.115.1.62

Neuringer, A. (1991). Operant variability and repetition as functions of interresponse time. Journal of Experimental Psychology: Animal Behavior Processes, 17, 3–12. doi:10.1037/0097-7403.17.1.3

Neuringer, A. (1992). Choosing to vary and repeat. Psychological Science, 3, 246–250. doi:10.1111/j.1467-9280.1992.tb00037.x

Neuringer, A. (1993). Reinforced variation and selection. Animal Learning and Behavior, 21, 83–91. doi:10.3758/BF03213386

Neuringer, A. (2002). Operant variability: Evidence, functions, and theory. Psychonomic Bulletin and Review, 9, 672–705. doi:10.3758/BF03196324

Neuringer, A. (2003). Reinforced variability and creativity. In K. A. Lattal & P. N. Chase (Eds.), Behavior theory and philosophy (pp. 323–338). New York, NY: Kluwer Academic/Plenum.

Neuringer, A. (2004). Reinforced variability in animals and people. American Psychologist, 59, 891–906. doi:10.1037/0003-066X.59.9.891

Neuringer, A. (2009). Operant variability and the power of reinforcement. Behavior Analyst Today, 10, 319–343. Retrieved from http://www.baojournal.com/BAT%20Journal/VOL-10/BAT%2010-2.pdf

Neuringer, A., Deiss, C., & Olson, G. (2000). Reinforced variability and operant learning. Journal of Experimental Psychology: Animal Behavior Processes, 26, 98–111. doi:10.1037/0097-7403.26.1.98

Neuringer, A., & Jensen, G. (2010). Operant variability and voluntary action. Psychological Review, 117, 972–993. doi:10.1037/a0019499

Neuringer, A., Jensen, G., & Piff, P. (2007). Stochastic matching and the voluntary nature of choice. Journal of the Experimental Analysis of Behavior, 88, 1–28. doi:10.1901/jeab.2007.65-06

Neuringer, A., Kornell, N., & Olufs, M. (2001). Stability and variability in extinction. Journal of Experimental Psychology: Animal Behavior Processes, 27, 79–94. doi:10.1037/0097-7403.27.1.79

Neuringer, A., & Voss, C. (1993). Approximating chaotic behavior. Psychological Science, 4, 113–119. doi:10.1111/j.1467-9280.1993.tb00471.x

Neuringer, A., & Voss, C. (2002). Approximations to chaotic responding depends on interresponse time. Unpublished manuscript.

Nevin, J. A. (1969). Interval reinforcement of choice behavior in discrete trials. Journal of the Experimental Analysis of Behavior, 12, 875–885. doi:10.1901/jeab.1969.12-875

Nickerson, R. S. (2002). The production and perception of randomness. Psychological Review, 109, 330–357. doi:10.1037/0033-295X.109.2.330

Nigg, J. T. (2001). Is ADHD a disinhibitory disorder? Psychological Bulletin, 127, 571–598. doi:10.1037/0033-2909.127.5.571

Notterman, J. M., & Mintz, D. E. (1965). Dynamics of response. New York, NY: Wiley.

Page, S., & Neuringer, A. (1985). Variability is an operant. Journal of Experimental Psychology: Animal Behavior Processes, 11, 429–452. doi:10.1037/0097-7403.11.3.429

Popper, K. R., & Eccles, J. C. (1977). The self and its brain. New York, NY: Springer-Verlag.

Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 12, 653–661. doi:10.1901/jeab.1969.12-653

Radloff, L. S. (1991). The use of the Center for Epidemiological Studies Depression Scale in adolescents and young adults. Journal of Youth and Adolescence, 20, 149–166. doi:10.1007/BF01537606

Rhinehart, L. (1998). The dice man. Woodstock, NY: Overlook Press.

Ross, C., & Neuringer, A. (2002). Reinforcement of variations and repetitions along three independent response dimensions. Behavioural Processes, 57, 199–209. doi:10.1016/S0376-6357(02)00014-1

Rubia, K., Smith, A. B., Brammer, M. J., & Taylor, E. (2007). Temporal lobe dysfunction in medication-naïve boys with attention-deficit/hyperactivity disorder during attention allocation and its relation to response variability. Biological Psychiatry, 62, 999–1006. doi:10.1016/j.biopsych.2007.02.024

Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153–166. doi:10.1901/jeab.1980.33-153

Schwartz, B. (1988). The experimental synthesis of behavior: Reinforcement, behavioral stereotypy and problem solving. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 93–138). New York, NY: Academic Press.

Scriven, M. (1965). An essential unpredictability in human behavior. In B. Wolman (Ed.), Scientific psychology (pp. 411–425). New York, NY: Basic Books.

Sebanz, N., & Prinz, W. (Eds.). (2006). Disorders of volition. Cambridge, MA: MIT Press.

Silberberg, A., Hamilton, B., Ziriax, J. M., & Casey, J. (1978). The structure of choice. Journal of Experimental Psychology: Animal Behavior Processes, 4, 368–398. doi:10.1037/0097-7403.4.4.368

Skinner, B. F. (1938). The behavior of organisms. New York, NY: Appleton-Century.

Skinner, B. F. (1959). Cumulative record. New York, NY: Appleton-Century-Crofts. (Original work published 1935)

Skinner, B. F. (1981). Selection by consequences. Science, 213, 501–504. doi:10.1126/science.7244649

Smith, J. M. (1982). Evolution and the theory of games. Cambridge, England: Cambridge University Press.

Staddon, J. E. R., & Simmelhag, V. L. (1971). The "superstition" experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78, 3–43. doi:10.1037/h0030305

Stokes, P. D. (1995). Learned variability. Animal Learning and Behavior, 23, 164–176. doi:10.3758/BF03199931

Stokes, P. D. (2001). Variability, constraints, and creativity: Shedding light on Claude Monet. American Psychologist, 56, 355–359. doi:10.1037/0003-066X.56.4.355

Stokes, P. D., & Harrison, H. M. (2002). Constraints have different concurrent effects and aftereffects on variability. Journal of Experimental Psychology: General, 131, 552–566. doi:10.1037/0096-3445.131.4.552

Taleb, N. N. (2007). The black swan. New York, NY: Random House.

Thorndike, E. L. (1911). Animal intelligence. New York, NY: Macmillan.

Townsend, J. T. (1992). Chaos theory: A brief tutorial and discussion. In A. F. Healy, S. M. Kosslyn, & R. M. Shiffrin (Eds.), From learning theory to connectionist theory: Essays in honor of William K. Estes (Vol. 1, pp. 65–96). Hillsdale, NJ: Erlbaum.

Vogel, R., & Annau, Z. (1973). An operant discrimination task allowing variability of reinforced response patterning. Journal of the Experimental Analysis of Behavior, 20, 1–6. doi:10.1901/jeab.1973.20-1

Wagner, K., & Neuringer, A. (2006). Operant variability when reinforcement is delayed. Learning and Behavior, 34, 111–123. doi:10.3758/BF03193187

Ward, L. M., & West, R. L. (1994). On chaotic behavior. Psychological Science, 5, 232–236. doi:10.1111/j.1467-9280.1994.tb00506.x

Wegner, D. M. (2002). The illusion of conscious will. Cambridge, MA: MIT Press.

Weiss, R. L. (1965). "Variables that influence random generation": An alternative hypothesis. Perceptual and Motor Skills, 20, 307–310. doi:10.2466/pms.1965.20.1.307