Journal of Experimental Psychology:
Animal Behavior Processes
1978, Vol. 4, No. 4, 368-398
The Structure of Choice
Alan Silberberg, Bruce Hamilton, John M. Ziriax, and Jay Casey
The American University
Both Nevin (1969) and Shimp (1966) found on different choice procedures
that pigeons equate (match) the proportion of their choices to the proportion of reinforcers each choice delivers. Their results differed in terms of
the order of successive choices: Shimp found pigeons ordered successive
choices so as to maximize the reinforcement rate, whereas Nevin found no
evidence of such an ordering. Experiment 1 replicated both studies and found
in both: (a) matching relations and (b) sequential dependencies of choice
that corresponded with Shimp's maximizing prediction. The next three experiments studied the order of choices in three other choice procedures:
(a) concurrent variable-interval schedules with a changeover delay, (b)
concurrent variable-interval schedules without a changeover delay, and (c)
concurrent-chains schedules. In all of these procedures, control of choice at
the level of the response sequence was evident. The major features of the
data from all four experiments were attributed to two molecular processes:
response perseveration and reinforcement maximization. This evidence for
a microstructure of choice suggests that the molar matching law is not isomorphic with the molecular processes governing concurrent performances.
In a study by Herrnstein (1961), pigeons
chose between two response keys, each associated with an independent variable-interval
(VI) schedule of food reinforcement. A
changeover delay (COD) that specified a
minimum interval during which reinforcement was unavailable following a switch of
choice between keys was used to minimize
interaction between these schedules. Herrnstein found with several pairs of VI schedules that the proportion of responses to a
key (responses to a key divided by response
total to both keys) equaled or "matched"
the proportion of reinforcements that that
key delivered (reinforcements to a key divided by reinforcement total to both keys).
This research was supported by National Institute of Mental Health grant MH22881 to The
American University. This report was written during A. Silberberg's sabbatical stay at the University of Sussex, England. He thanks them for
their courtesy and support.
Bruce Hamilton is now at the Walter Reed
Army Institute of Research, Washington, D.C.
Requests for reprints should be sent to A. Silberberg, Department of Psychology, 321 Asbury Building, The American University, Washington, D.C.
20016.
These matching relations have proven to
be of considerable species and procedural
generality (see de Villiers, 1977, for a review). To present just one example, Shull
and Pliskoff (1967) found matching using a
choice procedure different from Herrnstein's,
using rats instead of pigeons and brain
stimulation instead of grain as the reinforcer.
This generality, in conjunction with the elegant characterization of choice offered by
the matching law, accounts for much of the
current research interest in concurrent performances. There is, however, also an important theoretical reason for studying choice
behavior. As Baum and Rachlin (1969)
have argued, the generality of matching
relations makes the relative-rate measure a
useful metric for defining a reinforcer's relative value. In their study, pigeons were given
a choice between two behaviors—standing at
one or the other end of an elongated
chamber. Each of these behaviors was reinforced according to different schedules of
Copyright 1978 by the American Psychological Association, Inc. 0097-7403/78/0404-0368$00.75
VI reinforcement. They found that the ratio
of time spent standing on either side equaled
the ratio of reinforcements these behaviors
produced. The more general possibility is
that the time spent in one of several behaviors equals the relative value of that
behavior.
This equation of relative response rate or
time with relative reinforcing value is a bold
extension of the matching law and suggests
that choice theorists have answered a historically significant question in the psychology of learning—What is the appropriate
measure of the strength of a reflex (e.g., see
Herrnstein, 1970; Hull, 1943; Skinner,
1953)? The answer is relative response rate.
This progress notwithstanding, a possible
problem should be acknowledged: Relative
response rate is an average of an organism's
individual choices. It must be determined
whether the relative-response-rate statistic is
consistent with the choices defining it. For
the relative-response-rate measure to be fully
descriptive of the psychological process of
choice allocation, successive choices must not
show orderly sequential dependencies. If
they did, relative response rate could be a
theoretically misleading and empirically impoverished measure, with matching occurring
at the molar level of relative response rate,
yet without the more molecular response
sequences composing that measure conforming with the matching law (e.g., see Shimp,
1969b). For the same reason, sequential
dependencies among choices would do damage to Baum and Rachlin's notion of choice
as an assay of relative reinforcing value.
Were successive choices sequentially dependent, the molar relative-frequency measure would not index any relation between
the strengths of behavior that different reinforcement conditions maintain; rather, this
measure would only be an average of different choices controlled by a molecular
process based on the order of prior choices.
Two studies (Nevin, 1969; Shimp, 1966)
have tested whether matching, defined in
terms of relative rate, is a consequence of
discernible regularities in the order of successive choices. In the Shimp study (Experiment 3), a center-key response illuminated
two side keys and occasionally assigned
a reinforcement with unequal probability to
one of the two keys. Once an assignment
was made to a key, no further assignments
were possible until reinforcement was delivered. Each side-key response darkened
the choice keys and reilluminated the center
key. Shimp found that pigeons matched—
that is, they partitioned their choices between
keys so that the proportion of total responding to a key equaled the proportion of total
reinforcement that that key provided. Shimp
then carried this analysis one step further:
He recorded the order of all choices falling
between successive reinforcements. He found
that these interreinforcement response sequences showed several correspondences
with the changes each choice induced in the
relative probability of reinforcement. Based
on this finding, he argued that matching at
the level of relative response rate was a
consequence of a more molecular process—
control of successive choices by local reinforcement probabilities.
Shimp found pigeons' behavior was well
described by a reinforcement-optimizing
strategy he called momentary maximizing:
At each instance a choice is made, that
choice is allocated to whichever key is more
likely to provide reinforcement. In his study,
in which reinforcement for key-A responses
was three times as likely as for key-B
responses, the momentary-maximizing sequence was AAB.¹ Not only were sequential
dependencies in choice found that conformed
to this sequence, but a computer simulation
of choice governed by this rule simulated
matching on concurrent VI-VI schedules.
Thus, he argued, matching need not be a
consequence of the probabilistic allocation
of choices in accordance with their averaged
relative frequency of reinforcement. Rather,
choice may be governed by a molecular
process: that of optimizing the time rate
of reinforcement (see also Shimp, 1969b).
¹ Although the momentary-maximizing principle is conceptually simple (at each moment of
choice the pigeon selects whichever alternative is
more likely to be reinforced), the mathematical
derivation of the maximizing sequence is somewhat more complicated. The interested reader is
referred to Shimp (1966, Appendix A).
Data inconsistent with this interpretation
were presented by Nevin (1969). He studied pigeons' choice behavior on a discrete-trials procedure. Each trial began with the
illumination of both side keys. A response
to either key turned off these key lights
for an intertrial interval (ITI) of 6 sec.
Associated with one key was a VI 1-min
schedule and with the other key, a separate
and independent VI 3-min schedule. Nevin
found the conventional matching result—that
approximately three times as many choices
were made to the VI 1-min key as were
made to the VI 3-min key. Of greater
interest was his method for analyzing the
order in which choices were made: Instead
of recording all combinations of interreinforcement response sequences as was
done by Shimp, he recorded only some
sequences—those terminating in a changeover (i.e., a switch between keys). Nevin
found that the probability of a changeover
decreased slightly as a function of the number of successive choices to a given key.
Because, on concurrent VI-VI schedules,
the probability of reinforcement for a changeover increases with run length to a key, he
concluded that pigeons were insensitive to
this local dimension of differential reinforcement. More generally, his results suggest
that matching can occur at the molar level of
relative rate without evidencing, in terms
of his changeover measure, any tendency to
maximize the time rate of reinforcement at
the molecular level of the response sequence.
Clearly, the results of Shimp and Nevin
appear to be contradictory—the former
found evidence that matching was a consequence of control by molecular contingencies;
the latter did not. Herrnstein and Loveland
(1975) attempted to resolve this apparent
incompatibility. They argued that studies
that support a molecular interpretation of
choice (e.g., Shimp, 1966) use procedures
in which the conditions of concurrent reinforcement are "inhomogeneous." By inhomogeneous they meant that the loci of
prior choices dramatically influence the relative probability of reinforcement between
choice alternatives. Evidence for molecular
control obtains when the subject's choices
track these inhomogeneities. Hence, finding
sequential dependencies among choice is not
so much evidence against matching as a
demonstration of a subject's sensitivity to
local changes in the likelihood of reinforcement. In their view, the proper assay for
matching relations is on schedules where the
conditions of concurrent reinforcement are
more homogeneous (e.g., Nevin, 1969)—
that is, where each choice presumably does
not produce such large changes in local
reinforcement probabilities.
A test of this "relative-homogeneity" argument seems straightforward: Statistically
independent choice allocation that conforms
with matching should produce larger changes
in the local reinforcement probability in the
Shimp procedure than in the Nevin procedure. In order to test this premise, the
Shimp (1966, Experiment 3) and Nevin
(1969) choice contingencies were simulated
on a computer. In both procedures, key-A
responses were three times as likely to produce reinforcement as key-B responses. The
following two assumptions were made regarding simulated schedule performances:
(a) The relative response rate would equal
the relative frequency of reinforcement
(i.e., matching would obtain), and (b) all
choices would be statistically independent
of prior choices. These assumptions were
fulfilled by assigning 75% of the numbers
from a random number stream to key A.
A third assumption in these simulations was
that responding would occur on all choice
trials, immediately after the choice alternatives were presented.
After 100,000 trials of simulated choice
allocation, the relative frequency of key-A
reinforcement was calculated in two ways
for each procedure: (a) given the prior
choice had been to key A and (b) given the
prior choice had been to key B. When the
prior response was to key A, the relative
frequency of key-A reinforcement for the
Shimp and Nevin procedures, respectively,
was .43 and .40, and when the prior response
was to key B, these statistics were .90 and
.88, respectively. As these results clearly
show, the relative changes in local reinforcement frequencies appear no more inhomogeneous in the Shimp procedure than in the
Nevin procedure. Hence, any explanation of
the empirical differences between these two
procedures based on differences in schedule
homogeneity is suspect.
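The simulation logic described above can be sketched in code. The following is a minimal illustration for the Shimp-type procedure only, not the program used in the study. The function name, the seed, and the per-trial assignment probability of .25 (borrowed from the Shimp replication of Experiment 1) are assumptions, so the values it returns need not reproduce the figures reported above.

```python
import random


def simulate_shimp(n_trials=100_000, p_assign=0.25, p_a_given_assign=0.75,
                   p_choose_a=0.75, seed=0):
    """Statistically independent choices on a single-assignment procedure.

    Returns the relative frequency of key-A reinforcement, tabulated
    separately for trials whose prior choice was A and was B.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    held = None   # key currently holding an uncollected reinforcer, if any
    prev = None   # choice made on the previous trial
    # counts[prior choice][key that delivered the reinforcer]
    counts = {"A": {"A": 0, "B": 0}, "B": {"A": 0, "B": 0}}
    for _ in range(n_trials):
        # A new assignment is possible only when none is outstanding.
        if held is None and rng.random() < p_assign:
            held = "A" if rng.random() < p_a_given_assign else "B"
        # Matching with independence: 75% of choices go to key A.
        choice = "A" if rng.random() < p_choose_a else "B"
        if choice == held:
            if prev is not None:
                counts[prev][choice] += 1
            held = None
        prev = choice
    return {k: v["A"] / (v["A"] + v["B"]) for k, v in counts.items()}


if __name__ == "__main__":
    result = simulate_shimp()
    print("prior choice A:", round(result["A"], 2))
    print("prior choice B:", round(result["B"], 2))
```

The Nevin-type procedure could be handled in the same way by replacing the single assignment with two independent interval timers; that variant is omitted here because the interval lists are not given in the text.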
Silberberg and Williams (1974) attributed the differences in Shimp's and Nevin's
results to the differences in ITI duration in
these studies. In their experiment, they
showed that ITI duration can influence
whether control by local reinforcement contingencies is manifest. They hypothesized
that the 6-sec ITI of the Nevin study caused
subjects to make errors in emitting the
momentary-maximizing sequence and that
these errors obfuscated the appearance of
the underlying molecular control of choice.
Nevertheless, matching obtained in the
Nevin study because pigeons tended to distribute their errors randomly among the
members of the maximizing sequence. To
test this notion, Mohr (1976) replicated
Nevin's procedure with two different ITI
durations. She found little evidence that
ITI duration affected choice sequences with
his procedure. Thus, it does not appear
that ITI duration alone can account for the
absence of sequential dependencies in
Nevin's results.
A close reading of Shimp's and Nevin's
experiments suggests another possible way
of reconciling their results: Perhaps the
empirical differences in these studies are due
to differences in the way sequential dependencies were measured. In the Shimp study,
sequential statistics were recorded among all
combinations of choices defining interreinforcement sequences. For example, he recorded the probability of a left (L) given
prior choices had been L, right (R), LR,
RL, LRL, and so forth. Nevin's primary
measure, on the other hand, was the probability of a changeover between key colors
as a function of run length to a key color.
For example, he recorded the probability of
a red-key response given one green-key response, two green-key responses, and so
forth. Shimp's measures subsume Nevin's
in that Shimp recorded all choice combinations,
whereas Nevin only recorded changes
in changeover probability. Whether this
difference in measures translates into different interpretations of the process of choice
is best answered by replicating Shimp's and
Nevin's procedures and presenting both
types of measures. This was the purpose of
Experiment 1.
Experiment 1
In this experiment, choice procedures
similar to those of Shimp and Nevin were
used. Detailed sequential statistics were recorded and presented in two ways: (a) in
a table composed of the probability of an L
given different combinations of Ls and Rs
(see Shimp, 1966) and (b) in a plot of the
probability of a changeover as a function of
run length to a key (see Nevin, 1969). If
accounts that attribute the differences between the Shimp and Nevin results to procedural factors are correct (e.g., Herrnstein & Loveland, 1975; Silberberg &
Williams, 1974), differences should obtain
between procedures for both dependent variables. If our current hypothesis is correct,
however—that the empirical differences between these studies are largely a consequence of different methods of data presentation—each dependent variable should produce
roughly the same results across procedures.
Method
Subjects. Ten adult male White Carneaux
pigeons, deprived to 80% of their free-feeding
weights, were used. All birds were experimentally
naive at the beginning of the experiment.
Apparatus. Four identical sound-attenuated experimental chambers, electrically connected to a
PDP-8/e minicomputer, were used. Each chamber's
dimensions were 34.3 × 30.5 × 33 cm. With the
exception of the stainless-steel response panel and
the wire mesh floor, all surfaces were composed
of galvanized steel. The distances from the floor
of the chamber to the hopper aperture, to the
midpoint of the center key, and to the houselight
were, respectively, 9.5 cm, 25 cm, and 30.5 cm.
The midpoints of each of the two side keys were
displaced 7.6 cm from the midpoint of the center
key. Lehigh Valley Electronics response keys, requiring a minimum force of .1 N for operation
and transilluminated by Industrial Electronic Engineers multistimulus projectors, were used.
Procedure. After being trained to eat from the
food magazine when it was presented unpredictably in time, the pigeons were placed on an autoshaping schedule (Brown & Jenkins, 1968) in
which all three keys were transilluminated with
white light for 6 sec before the response-independent 4-sec presentation of grain. Successive
presentations of the lighted keys were separated by
a variable ITI of 30 sec, during which only the
houselight was illuminated. After two 50-trial sessions, during which reliable key pecking was induced, the birds were randomly assigned to two
five-subject groups, corresponding to the Shimp
(birds B1 through B5) and Nevin (birds B6
through B10) replications.
In the Shimp replication, a trial began with the
illumination of the center key with white light. A
center-key response darkened that key and illuminated the two side keys, one with green and
one with red light. Each choice-key response
re-illuminated the center key, either immediately
or after reinforcement. The probability of assigning reinforcement to a side key equaled .25. Assignments were three times as likely to one key as
to the other. Only one assignment could be made
on a choice trial, and that assignment remained
available on subsequent trials until obtained. Each
daily session ended after 200 trials.
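The momentary-maximizing sequence for these contingencies can be illustrated with a short calculation. The sketch below is one way to apply the rule, not Shimp's (1966, Appendix A) derivation: it tracks the probability that each key currently holds an assignment across a run of unreinforced trials and always picks the likelier key. The function and variable names are hypothetical, and ties are arbitrarily resolved toward the left key.

```python
def maximizing_sequence(n_choices=6, p_assign=0.25, p_left=0.75):
    """Choices of a momentary maximizer over consecutive unreinforced trials."""
    # Beliefs immediately after a reinforcer: nothing is assigned.
    held_left, held_right, none = 0.0, 0.0, 1.0
    choices = []
    for _ in range(n_choices):
        # A new assignment can occur only when nothing is currently assigned.
        held_left += none * p_assign * p_left
        held_right += none * p_assign * (1.0 - p_left)
        none *= 1.0 - p_assign
        # Momentary maximizing: choose the key more likely to pay off now.
        pick = "L" if held_left >= held_right else "R"
        choices.append(pick)
        # Condition on the pick going unreinforced: that key held nothing.
        if pick == "L":
            remaining = 1.0 - held_left
            held_left = 0.0
        else:
            remaining = 1.0 - held_right
            held_right = 0.0
        held_left /= remaining
        held_right /= remaining
        none /= remaining
    return "".join(choices)


if __name__ == "__main__":
    print(maximizing_sequence())
```

With an assignment probability of .25 and assignments three times as likely to the left key, the function returns LLRLLR, that is, a repeating LLR cycle.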
In the Nevin replication, a trial began with the
illumination of the side keys, one with red and
one with green light. Side-key illumination ended
with a response or after 2 sec, whichever occurred
first. Separating successive trials was an ITI of
6 sec, excluding reinforcement-cycle time. Associated with each choice key was a VI 1-min or a
VI 3-min schedule, the interreinforcement intervals
of which were defined according to the formulation
of Fleshler and Hoffman (1962). Because the VI
schedules were independent, reinforcement could
be assigned on any trial to neither, either, or
both choice alternatives. Each assignment remained
available until obtained. Each daily session ended
after 60 reinforcements.
In both replications, choice-key colors were
counterbalanced among birds insofar as was practicable; however, the richer reinforcement schedule was associated with the left key for all birds.
A houselight was continuously illuminated throughout a session except during the hopper cycle, which
was of 4-sec duration. Each replication lasted 70
sessions. All response-sequence combinations from
Sessions 31 through 60 were recorded up to a
length of seven without regard to the occurrence
of a reinforcement. For Sessions 61 through 70,
only response sequences between reinforcements
were recorded. The all-response-sequence and interreinforcement-response-sequence measures were not
recorded concurrently, due to computer memory
limitations. All data analysis is based on the
sum of Sessions 31 through 60 or 61 through 70.
Results
Table 1 presents the probability of a left-key response as a function of all possible
combinations of prior choices through a
length of four. These probabilities were calculated by dividing the frequency of occurrence of a particular response sequence
terminated by an L by the sum of that
sequence plus the same sequence terminated
by an R. For example, the probability of an
L given RLR equaled the frequency of the
sequence RLRL/(frequency of RLRL +
frequency of RLRR). Columns 1 through 5
of Table 1 present sequential statistics from
birds B1 through B5 of the Shimp replication, and columns 6 through 10 present sequential statistics from birds B6 through
B10 of the Nevin replication. These probabilities are based on all choice sequences
for Sessions 31 through 60 for each bird.
Probabilities were not calculated when fewer
than 25 instances of a response sequence
occurred. Columns 11 and 13 present these
probabilities averaged across the Shimp and
Nevin replications, respectively. Columns 12
and 14 present the group data for the Shimp
and Nevin replications averaged over Sessions 61-70 in which only interreinforcement response sequences were recorded.
Column 15 presents the sequential statistics
predicted by the momentary-maximizing
sequence, which was LLR for the Shimp
and Nevin replications. Asterisks refer to
the theoretically nonoccurring interreinforcement response sequences of a subject
following the momentary-maximizing rule.²
The relative response rate and relative reinforcement frequency for all individual and
group data are presented at the bottom of
Table 1.
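The conditional probabilities described above can be computed from a record of choices as follows. This is a minimal sketch under stated assumptions: choices are coded as a string of 'L' and 'R', the function name is hypothetical, and the 25-instance criterion is applied to the number of times the conditioning sequence was followed by a further choice, which is one reading of the criterion given in the text.

```python
from collections import Counter


def prob_left_given(choices, context, min_count=25):
    """P(L | context) from a string of 'L'/'R' choices.

    Returns None when the context occurred fewer than min_count times.
    """
    k = len(context)
    # Tally the choice that immediately follows each occurrence of context.
    followers = Counter(
        choices[i + k]
        for i in range(len(choices) - k)
        if choices[i:i + k] == context
    )
    total = followers["L"] + followers["R"]
    if total < min_count:
        return None
    # Equivalent to freq(context + L) / (freq(context + L) + freq(context + R)).
    return followers["L"] / total


if __name__ == "__main__":
    record = "LLR" * 100  # a perfect momentary maximizer, for illustration only
    for ctx in ("L", "R", "LL"):
        print(ctx, prob_left_given(record, ctx))
```

Applied to a record that repeats LLR without error, the function gives .5 for L, 1.0 for R, and .0 for LL, the all-or-none pattern that the momentary-maximizing rule implies.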
² These momentary-maximizing predictions are
correct for all data based on interreinforcement response sequences (columns 12 and 14 of Table 1);
however, they vary somewhat for data based on
all trials (e.g., columns 11 and 13 of Table 1).
This variance is due to the fact that response
sequences are not reset when reinforcement occurs
in the all-trials analysis.
If choices were statistically independent,
the sequential statistics for individual subjects
in Table 1 would all closely approximate the subjects' relative response rates.
This test for independence was clearly violated for all subjects. In fact, orderly
sequential dependencies did develop for all
birds. For example, the probability of an L
given L was lower for all birds than an L
given R. Moreover, the sequential statistics
for the Shimp and Nevin replications show
a high degree of correspondence: The Spearman rank-order correlation between their
group data (columns 11 and 13) is .92
(Siegel, 1956).
There is also some correspondence between
the predictions of the momentary-maximizing rule and the obtained sequential statistics
for each group. To illustrate this point,
compare the sequential statistics from the 7
sequences in which maximizing predicts
the subject will always make an L (a 1.0
sequence) with the 3 sequences in which the
subject will never make an L (a .0 sequence). In terms of the all-trials data from
the Shimp procedure (column 11), 5 of
the 7 1.0 sequences are higher than the highest .0 sequence, and 2 of the 3 .0 sequences
are the lowest of the 10 sequences under
consideration. This effect is even more clear-cut in the Nevin data (column 13): The
probability of an L for all 1.0 sequences is
higher than the probability of an L for all
.0 sequences. Although these correspondences show that momentary maximizing is
of some value in describing changes in
sequential statistics in both studies, it must
also be noted that the dependencies that
obtained were sometimes substantially different from the momentary-maximizing prediction. Particularly at variance with the
momentary-maximizing rule was the finding
that left-key responses were likely even when
a right-key response was predicted (e.g., see
probability of L given LL).
As regards relative response rate, the
quality of matching was roughly comparable
between birds in the Shimp and Nevin replications: The mean absolute deviation from
matching in the former group was 7.5%,
and in the latter group it was 6.4%. As can
be seen at the bottom of Table 1, birds in
the former group tended, on average, to
match, whereas birds in the latter group
tended to undermatch.
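The comparison of relative response rate with relative reinforcement frequency can be made concrete as follows. This is only an illustration: the text does not state exactly how the deviation from matching was computed, so the definition used here (relative response rate minus relative reinforcement frequency, averaged in absolute value across birds) is an assumption, as are the function names and the sample values.

```python
def relative_rate(left_count, right_count):
    """Proportion of events (responses or reinforcers) on the left key."""
    return left_count / (left_count + right_count)


def matching_deviation(left_resp, right_resp, left_reinf, right_reinf):
    """Signed deviation from matching; zero indicates perfect matching."""
    return relative_rate(left_resp, right_resp) - relative_rate(left_reinf, right_reinf)


def mean_absolute_deviation(birds):
    """Average unsigned deviation over (left_resp, right_resp, left_reinf, right_reinf) tuples."""
    return sum(abs(matching_deviation(*bird)) for bird in birds) / len(birds)


if __name__ == "__main__":
    # Hypothetical counts, not data from Table 1.
    birds = [(660, 340, 76, 24), (750, 250, 75, 25)]
    print(round(mean_absolute_deviation(birds), 3))
```

Under this definition a negative signed deviation on the richer key corresponds to undermatching, and a value of zero to matching.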
Figure 1 presents the probability of a
changeover to a key as a function of successive choices to the other key. Filled circles
refer to left-key runs; open circles to right-key runs. Runs occurring fewer than 25
times are not plotted. In the lower left-hand
corner of each panel is the bird number and
its relative left-key response rate. Birds from
the Shimp replication are in the left-hand
column, and birds from the Nevin replication
are in the right-hand column. With the exception of left-key runs for B5, all birds
showed a progressively greater tendency to
remain on a key the longer they had responded on it. Moreover, in some birds
this progressive perseverative tendency was
unequal between keys: For B1, B2, and B3
from the Shimp replication and B7, B8, and
B10 from the Nevin replication the negative
slopes for right-key runs were, in terms of
experimenter judgment, clearly greater than
for left-key runs; and none of the left-key
runs appeared to have greater negative
slopes than right-key runs.
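The changeover measure plotted in Figure 1 can be computed from a record of choices along the following lines. This is a sketch, not the analysis program used in the study: the function name and the 'L'/'R' string coding are assumptions, and the 25-instance criterion is applied to the number of times a run reached a given length with a further choice following it.

```python
from collections import defaultdict


def changeover_by_run_length(choices, min_count=25):
    """P(changeover) keyed by (key, run length) from a string of 'L'/'R' choices."""
    opportunities = defaultdict(int)  # times a run on a key reached this length
    switches = defaultdict(int)       # times the next choice was to the other key
    run = 0
    for i in range(len(choices) - 1):
        # Extend the current run or start a new one after a changeover.
        run = run + 1 if i > 0 and choices[i] == choices[i - 1] else 1
        key = (choices[i], run)
        opportunities[key] += 1
        if choices[i + 1] != choices[i]:
            switches[key] += 1
    return {
        key: switches[key] / n
        for key, n in opportunities.items()
        if n >= min_count
    }


if __name__ == "__main__":
    print(changeover_by_run_length("LLRLLLR", min_count=1))
```

A perseverative tendency of the kind described above would appear as values that decline with run length for a given key.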
Discussion
The results of Experiment 1 may be summarized as follows:
1. When analyzed in terms of Shimp's
(1966) primary measure—detailed sequential statistics—similar sequential dependencies of choice obtained for both replications (Table 1). These dependencies showed
some correspondence with the predictions of
the momentary-maximizing rule, a result
consistent with Shimp's earlier findings.
2. When analyzed in terms of Nevin's primary measure—the probability of a changeover as a function of run length to a key
(Figure 1)—the data were still similar for
both replications. Moreover, some of these
plots showed a progressive tendency toward
perseveration; that is, the likelihood of
switching keys decreased the longer a subject responded to a key, a result consistent
with Nevin's earlier work.
The rationale for replicating the choice
procedures of Shimp and Nevin was that
Table 1
Individual and Group Data in the Shimp and Nevin Procedures, and Momentary-Maximizing Prediction
Columns (1) through (5): Shimp procedure subjects B1, B2, B3, B4, B5
Columns (6) through (10): Nevin procedure subjects B6, B7, B8, B9, B10
Probability of a left given
L
R
LL
LR
RL
RR
LLL
LLR
LRL
LRR
RLL
RLR
RRL
RRR
LLLL
LLLR
LLRL
LLRR
LRLL
.68
.90
.67
.92
.69
.73
.69
.93
.69
.82
.63
.91
.77
.50
.71
.94
.67
.84
.63
.53
.89
.48
.90
.58
.87
.46
.90
.58
.87
.49
.89
.64
.84
.50
.90
.58
.86
.49
.64
.40
.97
.57
.97
.76
.84
.57
.98
.76
.96
.57
.93
.80
.30
.59
.98
.79
.91
.57
.84
.32
.84
.45
.84
.36
.84
.43
.85
.30
.84
.54
.80
.41
.81
.44
.86
.30
.89
1.00
.88
.99
.98
—
.88
.99
.98
.58
.83
.53
.83
.64
.84
.56
.82
.63
.85
.49
.84
.67
.79
.58
.83
.64
.86
.51
.63
.81
.58
.82
.71
.77
.60
.83
.69
.78
.55
.80
.76
.75
.62
.84
.69
.80
.56
.67
.83
.66
.85
.70
.76
.69
.86
.70
.77
.60
.82
.69
.71
.69
.86
.70
.81
.61
.68
.73
.71
.72
.60
.74
.72
.68
.64
.80
.69
.79
.51
.59
.73
.67
.64
.85
.68
.67
.78
.66
.79
.68
.72
.69
.81
.68
.78
.62
.77
.67
.56
.71
.83
.69
.78
.64
.94
.94
—
.87
.99
.98
—
.94
Columns (11) and (12): Shimp procedure mean
(11) All trials
(12) Trials between reinforcements
.66
.90
.67
.91
.64
.83
.72
.93
.64
.86
.56
.88
.62
.69
.77
.94
.68
.86
.57
.59
.93
.54
.95
.60
.76
.52
.97
.61
.78
.44
.92
.38
.67
.54
.98
.75
.88
.44
Columns (13) and (14): Nevin procedure mean
(13) All trials
(14) Trials between reinforcements
(15)
.64
.80
.66
.82
.63
.81
.66
.77
.66
.81
.67
.80
.58
.81
.65
.65
.68
.81
.68
.83
.59
.74
.82
.62
.74
.80
.84
.58
.76
.64
.79
.53
.70
.85
.84
.64
.74
.57
Momentary-maximizing prediction
.50
1.00
.00
1.00
1.00*
1.00
1.00
*
.00
*
*
1.00
*
.00
*
*
*
Table 1 (continued)
Columns (1) through (5): Shimp procedure subjects B1, B2, B3, B4, B5
Columns (6) through (10): Nevin procedure subjects B6, B7, B8, B9, B10
Columns (11) and (12): Shimp procedure mean, all trials and trials between reinforcements
Columns (13) and (14): Nevin procedure mean, all trials and trials between reinforcements
Probability of a left given
LRLR
LRRL
LRRR
RLLL
RLLR
RLRL
RLRR
RRLL
RRLR
RRRL
RRRR
.91
.76
.68
.65
.91
.72
.81
.61
.86
.78
.35
.90
.63
.81
.43
.90
.57
.88
.51
.76
.69
—
.94
.78
.50
.53
.98
.65
1.00
.72
—
—
.14
.84
.53
.80
.36
.85
.42
.83
.31
.84
.61
—
.94
—
—
.90
1.00
1.00
—
—
—
—
—
.84
.67
.79
.53
.81
.61
.84
.43
.82
.69
.81
.81
.75
.82
.57
.81
.69
.75
.54
.73
.78
.54
.84
.70
.75
.67
.85
.70
.71
.55
.71
.66
.65
.78
.53
.56
.70
.71
.63
.67
.71
.80
.43
.64
.79
.69
.65
.64
.78
.65
.80
.53
.69
.62
.45
.88
.61
.77
.60
.91
.55
.85
.46
.82
.68
.52
.93
.39
.57
.39
.96
.50
.63
.20
.74
.55
—
.82
.65
.70
.62
.80
.65
.76
.55
.77
.61
.58
.82
.52
.71
.68
.82
.53
.73
.53
.66
.53
.64
.74
.66
.73
.58
.90
Relative response rate
.66
.69
.72
.69
.70
.73
.74
.69
.72
.75
.72
.72
.74
.76
Relative reinforcement frequency
.76
.77
.74
.76
.75
.74
.75
.76
.77
Note. Numbers in parentheses represent column headings. B = bird; L = left-key response; R = right-key response.
* Theoretically nonoccurring sequence.
(15)
Momentary-maximizing prediction
1.00
*
*
[Figure 1. Individual data, with panels for the Shimp method (left column) and the Nevin method (right column). Horizontal axis: successive trials, 1 through 6. Filled circles: left-key runs; open circles: right-key runs.]
Figure 1. Probability of a changeover between keys as a function of successive choices to a key
for each bird (B) in the Shimp (left-hand panels) and Nevin (right-hand panels) procedures.
(Relative left-key response rates are in parentheses adjacent to each bird's identification number
within a panel.)
these studies appeared to support contradictory explanations of matching. Shimp's
procedure showed matching with sequential dependencies, presumptive evidence for
control of choice by a molecular process
(e.g., momentary maximizing). Nevin's procedure also showed matching but without
apparent sequential dependencies of choice,
a finding consistent with the view that
choice is best described at the molar level of
relative response rate. We hypothesized that
these differences were more apparent than
real, a consequence of using different measures in each study as assays for sequential
dependencies of choice. It now appears that
this hypothesis was correct. In both replications, analysis in terms of Shimp's measure
(sequential statistics) produced data comparable to Shimp's (see Shimp, 1966, Table
3), whereas analysis in terms of Nevin's
measure (changeover probabilities) reproduced Nevin's original findings (see Nevin,
1969, Figure 4). Moreover, it is important
to keep in mind that Shimp's measure subsumes Nevin's: Sequential statistics are composed of all choice sequences; changeover
probabilities are not. Hence, it is in order to
conclude that matching occurred with sequential dependencies in both the Nevin
and Shimp studies. The reason these dependencies were not manifest in Nevin's original
work was that his measure excluded those
sequences needed to demonstrate the underlying dependencies in his choice data.
Figure 1 shows that the probability of a
changeover often decreased as a function of
run length to a key. A similar tendency toward response perseveration was noted by
Nevin. He attributed this result to sequential
changes that occur in relative reinforcement
frequency as run length to a key increases.
Nevin supported this interpretation by showing that plots of obtained relative reinforcement frequency covaried with the probability-of-changeover curves shown in Figure 1
of the present experiment. While Nevin's
account appears adequate for choice procedures using independent schedules of reinforcement, it cannot explain why the probability-of-changeover curves from the Shimp
replication also decrease with increasing run
length. In the Shimp replication, only one
reinforcement was assigned at a time, a
consequence of which was that relative reinforcement frequency could not vary as a
function of run length. For Nevin's interpretation to be generally applicable, flat probability-of-changeover curves would be predicted for subjects in the Shimp procedure.
Clearly, this result did not obtain, a fact
that calls into question whether sequential
changes in relative reinforcement frequency
actually control sequential changes in the
probability of a changeover. An alternative
interpretation of these findings might be that
perseveration is simply a frequent concomitant of choice (e.g., see Morgan, 1974).
The sequential dependencies found in the
Shimp and Nevin replications bore an imperfect correspondence with the predictions
of the momentary-maximizing sequence.
This finding shows that momentary-maximizing is of some predictive value in describing the sequential characteristics of choice.
Nevertheless, inspection of Table 1 suggests
two serious problems with a momentary-maximizing account of the data:
1. Momentary maximizing tends to overstate key preference, the predicted conditional probability of an L or R often being
1.0. For example, an R should always follow
two Ls according to a momentary-maximizing account, an L should always follow an R,
and so forth. As is clear from Table 1, conditional probabilities of an L or an R only
infrequently approximated the predicted
total preference for a given alternative.
2. Momentary maximizing predicts that
some sequences should never occur (see
asterisks in Table 1). Yet these sequences
not only occurred, but sometimes they occurred with substantial frequency.
The problem with momentary maximizing
as an account of choice behavior is the all-or-none nature of its predictions. It does not
consider factors (e.g., inattention, forgetting) that might introduce variability
into the execution of the momentary-maximizing sequence. An alternative momentary-maximizing model, related to one originally
advanced by Silberberg and Williams
(1974), does consider these factors. It will
be shown that this new model can account
for the nonexclusive conditional preferences
seen in Table 1 and for the occurrence of
theoretically nonoccurring sequences. Moreover, it will produce sequential statistics and
probability-of-changeover curves similar to
those seen in Table 1 and Figure 1.
[Figure 2 plot: probability of a changeover (vertical axis) against left-key run length, 1 through 5 (horizontal axis); simulation results are shown separately for the Nevin method and the Shimp method.]
Figure 2. Probability of a changeover to the right key as a function of the number of left-key choices (left-key run length).
Momentary maximizing with errors. In this model it is hypothesized that choice behavior is controlled by a momentary-maximizing sequence; however, errors occasionally occur in sequence execution because
subjects fail to recall the loci of prior choices.
The likelihood of recalling past choices is
influenced by schedule factors, such as the
duration of the ITI, and behavioral factors,
such as inattention. The likelihood of "forgetting" the loci of prior choices is assumed
to be equiprobable after all responses. Hence,
on those occasions when past behavior is
correctly recalled, the momentary-maximizing sequences will have a higher likelihood
of reinforcement than all other response sequences. Occasional failure to recall prior
choices will therefore influence the rate at
which subjects learn the momentary-maximizing sequence but not whether it is learned
(see Silberberg & Williams, 1974).
After the momentary-maximizing sequence
is learned, this model predicts that strong
sequential dependencies will appear when
recollection of the loci of prior choices is
correct. When past choices are not recalled,
pigeons will "guess" what response would
have been next in the momentary-maximizing sequence. Choices governed by
guessing will be randomly distributed among
the response alternatives defining the momentary-maximizing sequence. These statistically independent pecks will tend to
minimize the underlying sequential dependencies that characterize choice when errors
do not occur. Matching can still occur, however, because each choice in the momentary-maximizing sequence will, over time, receive
an approximately equal number of
misappropriated responses. The degree to
which relative response rate appears to be a
consequence of a sequentially dependent
process will depend upon the blend of "remembered" and "forgotten" response sequences a given choice procedure supports.
Simulation assay of maximizing with
errors. The predictive adequacy of this
"error"-based version of Shimp's momentary-maximizing rule can be assessed via
computer simulation. Toward this end, the
experimental contingencies characterizing
the Shimp and Nevin replications (see
Method section) were programmed on a
computer. Also simulated was a stat bird
that was programmed to emit the LLR
sequence dictated by momentary maximizing
for both of these procedures. Errors were
simulated in the following manner: After
each choice there was some probability that
the stat bird would forget the locus of its
prior response, and when forgetting
occurred, the stat bird guessed which response would have been next in the maximizing sequence, each of the three elements
in the sequence (first L, second L, and R)
being equiprobable. The sole parameter in
this model was the probability of forgetting.
In the simulations presented here, the
forgetting parameter was set at p = .45, the
rationale for its selection being that it provided a reasonable empirical fit to the data
from the Shimp and Nevin replications.
Simulated sessions ended after 200 trials
for the Shimp stat bird and after 60 reinforcements for the Nevin stat bird. Data,
summed over 20 sessions, were recorded in
terms of the measures used in Experiment 1.
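The error process described above can be sketched in a few lines of Python. This is a simplified illustration rather than the program used in the study: it models only the stat bird's choice rule (the LLR sequence with a forgetting probability of .45) and omits the reinforcement contingencies, the session limits, and any influence of reinforcement on the sequence. The function names, the seed, and the number of simulated choices are hypothetical.

```python
import random
from collections import Counter

SEQUENCE = "LLR"  # momentary-maximizing order named in the text


def simulate_stat_bird(n_choices, p_forget=0.45, seed=0):
    """Emit choices from the LLR cycle, with occasional forgetting.

    Assumption: after a forgotten choice, the bird guesses its place in the
    cycle, each of the three elements (first L, second L, R) being equally
    likely. Reinforcement is not modeled here.
    """
    rng = random.Random(seed)
    position = 0  # index of the next element of SEQUENCE to emit
    choices = []
    for _ in range(n_choices):
        choices.append(SEQUENCE[position])
        if rng.random() < p_forget:
            position = rng.randrange(len(SEQUENCE))  # guess the next element
        else:
            position = (position + 1) % len(SEQUENCE)  # continue the cycle
    return "".join(choices)


def prob_left_given(choices, context):
    """Probability of an L immediately after `context` (e.g., "LL")."""
    k = len(context)
    followers = Counter(
        choices[i + k]
        for i in range(len(choices) - k)
        if choices[i:i + k] == context
    )
    total = followers["L"] + followers["R"]
    return followers["L"] / total if total else None  # None: never occurred


if __name__ == "__main__":
    data = simulate_stat_bird(60000)
    for context in ("L", "R", "LL", "LR", "RL", "RR"):
        print(context, prob_left_given(data, context))
```

With the forgetting parameter set to 0 the sketch reproduces strict momentary maximizing (an L always follows an R, and an R always follows two Ls), and sequences such as RR never occur. Under the stated assumptions, the probability of an L after an R at 45% forgetting is .55 + .45 × 2/3 = .85.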
Figure 2 presents the probability of a
changeover as a function of left-key run
length. The Shimp and Nevin stat birds
are represented, respectively, by solid lines
and closed circles. Several features of these
stat birds' data are of interest:
1. The data in Figure 2 establish that
essentially flat probability-of-changeover
curves are not incompatible with the idea
that choice is controlled by a maximizing-type response strategy. Consequently,
Nevin's (1969) failure to find positive slopes
in terms of this measure does not preclude
molecular control of choice in his study.
2. These simulated performances replicate
an important aspect of the changeover-probability curves from Experiment 1: The
simulated curves for the Shimp and Nevin
methods are similar in form, despite between-study procedural differences. This result is consistent with the thesis advanced
earlier that the different interpretations lent
Shimp's and Nevin's data were more a consequence of differences in measures than
differences in the contingencies of their concurrent schedules.
3. Except for the tendency toward progressive perseveration seen in several subjects' curves in Figure 1, these curves seem
qualitatively similar to the actual findings
from the Shimp and Nevin replications. This
result suggests (a) that maximizing with
errors is a reasonable first approximation to
modeling choice allocation in these two
studies and (b) that possibly the addition
of a perseveration element to the model may
improve the correspondence between simulated and actual performances.
The simplest model on which the matching
law might be based attributes matching relations to the independent selection among
choice alternatives (see Herrnstein, 1961, p.
270). Yet, in terms of a probability-of-changeover analysis, this response-independence interpretation offers no more effective a description of the Shimp and the Nevin replication data than does maximizing
with errors. The reason is clear-cut: With
a forgetting parameter of .45, both maximizing with errors and stochastic choice allocation generate essentially flat probability-of-changeover curves. Thus, even though
many curves from Figure 1 have near-zero
slopes, these data cannot be used in support
of the idea that successive choices are statistically independent.
In fact, a maximizing-with-errors account
of choice has significant predictive advantages over accounts based on either momentary maximizing or the statistical independence of choice, and these advantages become
apparent when these different models are
compared in terms of the sequential statistics they generate. Table 2 makes these
comparisons. It presents the probability of
an L following different choice combinations
for the Shimp (columns 2 through 5)
and Nevin (columns 6 through 9) procedures. Columns 2 and 6 present the sequential statistics that obtained for each procedure in Experiment 1. Columns 3 and 7 present the sequential statistics predicted by maximizing with 45% forgetting. Columns 4 and 8 present the momentary-maximizing predictions for the Shimp and Nevin procedures. These predictions were determined by setting the forgetting parameter in the maximizing-with-errors model at 0%. At 0% forgetting, simulated behavior corresponds exactly with momentary maximizing. The response-independence prediction (columns 5 and 9) states that the probability of an L is unaffected by prior choices. Hence, its prediction equals the relative left-key response rate for the Shimp and Nevin groups in Experiment 1. Asterisks refer to sequences that did not occur during simulations.
Table 2
Comparison of Sequential Statistics From Shimp and Nevin Procedures
(1)     (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)
L       .66    .62    .58    .73    .64    .57    .50    .69
R       .90    .85   1.00    .73    .80    .85   1.00    .69
LL      .67    .54    .28    .73    .63    .45   0.00    .69
LR      .91    .84   1.00    .73    .81    .85   1.00    .69
RL      .64    .77   1.00    .73    .66    .74   1.00    .69
RR      .83    .86    *      .73    .77    .87    *      .69
LLL     .72    .55    .37    .73    .66    .50    *      .69
LLR     .93    .85   1.00    .73    .81    .84   1.00    .69
LRL     .64    .77   1.00    .73    .67    .74   1.00    .69
LRR     .86    .85    *      .73    .80    .86    *      .69
RLL     .56    .52    .25    .73    .58    .41   0.00    .69
RLR     .88    .83    *      .73    .81    .88    *      .69
RRL     .62    .74    *      .73    .65    .73    *      .69
RRR     .69    .90    *      .73    .65    .87    *      .69
Note. Numbers in parentheses designate column headings: (1) represents the probability of a left given the
responses in the column; (2) represents the Shimp data from Table 1, column 11; (3) represents momentary
maximizing with 45% forgetting for the Shimp data; (4) represents the momentary maximizing prediction
for the Shimp data; (5) represents the response independence prediction for the Shimp data; (6) represents
the Nevin data from Table 1, column 13; (7) represents momentary maximizing with 45% forgetting for the
Nevin data; (8) represents the momentary maximizing prediction for the Nevin data; (9) represents the
response independence prediction for the Nevin data.
* Sequence that did not occur during simulations.
The inadequacy of a response-independence interpretation (columns 5 and 9) in accounting for the results of Experiment 1 is obvious. This account predicts no changes in sequential statistics; yet inspection of columns 2 and 6 shows the presence of strong sequential dependencies in the Shimp and Nevin replications. The momentary-maximizing account (columns 4 and 8) does a noticeably better job. It often predicts the direction of change in sequential statistics for the Shimp and Nevin data. Its major failings are that it tends to overstate left- or right-key preferences and that it predicts that some frequently occurring sequences will not occur at all (see asterisks). Of the three accounts considered, maximizing with errors comes closest to predicting the sequential statistics obtained in Experiment 1. It does an excellent job of predicting ordinal changes in sequential statistics, even among sequences that should be nonoccurring according to a momentary-maximizing interpretation. Moreover, except for one statistic (the probability of an L given an LLR from the Shimp replication), maximizing with errors corresponds more closely than does momentary maximizing with all the sequential statistics obtained in Experiment 1.
Table 3
Sequence-Frequency Comparison Between Maximizing-With-Errors and Response-Independence Predictions
(1)      (2)    (3)    (4)    (5)    (6)    (7)
RR       923   2501   1582   2675   4138   2114
LRR      758   1816   1359   2041   2842   1830
RLR     2986   1816   2030   3539   2842   3194
RRL      762   1816   1359   2051   2842   1830
RRR      161    681    223    623   1283    284
LLRR     385   1319   1014   1348   1972   1425
LRLR    2698   1319   1682   2812   1972   2702
LRRL     649   1319   1160   1636   1972   1582
LRRR     109    493    200    404    887    248
RLRL    2612   1319   1689   2851   1972   2796
RLRR     369    493    341    684    887    398
RRLL     471   1319   1011   1321   1972   1338
RRLR     287    493    348    725    887    492
RRRL     111    493    200    407    887    248
RRRR      50    188     23    216    396     36
Note. Figures in parentheses designate column headings: (1) represents the theoretically nonoccurring
response sequence; (2) represents the frequency from the Shimp replication; (3) represents the
response-independence prediction for the Shimp data; (4) represents momentary maximizing with
45% forgetting for the Shimp data; (5) represents the frequency from the Nevin replication; (6) represents
the response-independence prediction for the Nevin data; (7) represents momentary maximizing
with 45% forgetting for the Nevin data. L = left-key response; R = right-key response.
The predictive superiority of maximizing with errors in accounting for the molecular characteristics of choice can be illustrated in another manner: in terms of how well it predicts the frequency of sequences momentary maximizing defines as nonoccurring. Some of these sequences are listed in column 1 of Table 3. Columns 2 and 5 present, respectively, the absolute frequency of these sequences for the Shimp and Nevin subjects summed over Sessions 31-60 in Experiment 1. Columns 3 and 6 present the frequency predicted if successive choices were statistically independent for the Shimp and Nevin replications. Columns 4 and 7 present the frequency predicted if all subjects were following the maximizing-with-errors model with a forgetting parameter of 45%. The frequencies in columns 4 and 7 were adjusted so as to equal the absolute number of responses produced by subjects in the Shimp and Nevin replications. As is clear from Table 3, sequence frequencies predicted by maximizing with errors are closer than those predicted by a response-independence model for all sequences save RRLR and RRRR from the Nevin replication. In addition, maximizing with errors is obviously a more successful account of these sequence frequencies than is a momentary-maximizing model, because momentary maximizing predicts that none of these sequences
will occur.
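The response-independence benchmark used in this comparison can be illustrated with a short sketch. It assumes that the expected frequency of a sequence is the number of opportunities multiplied by the product of the individual choice probabilities. The article does not spell out the exact computation, so the function name and the example numbers below are hypothetical.

```python
def independence_frequency(sequence, p_left, n_opportunities):
    """Expected count of `sequence` if every choice is an independent draw.

    p_left is the relative left-key response rate; n_opportunities is the
    (assumed) number of places in the record where the sequence could start.
    """
    probability = 1.0
    for choice in sequence:
        probability *= p_left if choice == "L" else 1.0 - p_left
    return n_opportunities * probability


if __name__ == "__main__":
    # Hypothetical example: 1,000 opportunities, relative left-key rate of .75.
    for seq in ("RR", "RRR", "RRRR"):
        print(seq, independence_frequency(seq, 0.75, 1000))
```

Under independence, each additional R multiplies the expected count by the same factor (1 minus the relative left-key rate), which is why this account predicts no change in sequential statistics as the preceding sequence grows.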
Forgetting-mode behavior: molar or molecular? The maximizing-with-errors model
accommodates within microstructural principles many empirical features of the Shimp
and the Nevin replication data. Despite these
successes, one might question the necessity of labeling the model itself as an exclusively molecular account of choice. The problem is that, although this model attributes
choice allocation solely to molecular factors,
its constituent remembering and forgetting
processes can be respectively viewed as molecular and molar components of the model:
When the subject remembers prior choices,
sequential dependencies abound in a fashion
conforming with the maximizing sequence
and molecular control of choice; when, however, the subject forgets prior choices, subsequent responding is statistically independent and also conforms with the molar version of the matching law. In other words,
forgetting-mode behavior can be interpreted in two ways: (a) as following the maximizing sequence, but failing to evidence
this fact in sequential statistics due to randomly distributed errors in sequence execution or (b) as following the molar predictions of the matching law.
There is no absolute way of resolving
whether the molar or molecular interpretation is correct. Why, then, does the model
account for forgetting-mode behavior exclusively in molecular terms? The answer
can be found in the Silberberg and Williams
(1974) study, from whence this model came.
In their procedure only changeovers could
be reinforced, and the probability of reinforcement for a changeover to the left key
exceeded that for the right key. In this
case, momentary maximizing (always responding to that alternative more likely to
produce reinforcement) dictated strict alternation between keys, a strategy that translates into a relative left-key response rate
of .5. Matching relations, on the other hand,
predicted a left-key preference because the
majority of reinforcements occurred for left-key choices. Silberberg and Williams exposed birds to this choice procedure on a
trials basis; the ITI separating successive
trials was 1, 22, or 120 sec, depending on
the bird. For the 1-sec ITI birds the primacy of molecular control was established
in terms of both molecular and molar
measures: Changeover-probability curves
showed high frequencies of alternation between keys, and the relative left-key response
rates equaled .5. For the 22- and 120-sec
ITI groups, however, a different picture
emerged: In terms of their molecular measures, successive choices appeared to be
sequentially independent; nevertheless, the
relative left-key response rate, which still
approximated .5, consistently undermatched
the relative left-key reinforcement frequency.
This finding is obviously compatible with
the notion that the statistically independent
choice allocation is due to errors in executing the molecular LR maximizing sequence;
however, it is incompatible with the notion
that when the animal forgets the loci of
prior choices, its behavior is controlled by a
molar variable such as relative response
rate. Because only an exclusively molecular
interpretation of the maximizing-with-errors
model can account for the data from both the
Silberberg and Williams study and the
Shimp and Nevin replications, the only generally applicable and parsimonious interpretation of forgetting-mode behavior is that
it is molecularly controlled.
Experiment 2
The results of Experiment 1 are of some
theoretical interest, for they unify under
microstructural
principles two studies
(Nevin, 1969; Shimp, 1966) that have
given incompatible accounts of why matching often obtains on concurrent schedules.
As is now apparent, matching in both of these studies was likely due to a molecular process, a fact obfuscated in Nevin's experiment by the use of the probability-of-changeover measure as the major assay for molecular control. Moreover, an adaptation of Shimp's momentary-maximizing principle was presented that predicts errors in the execution of the momentary-maximizing sequence. This model, called maximizing with errors, meets two important criteria expected of any new account of concurrent performances: (a) It is superior to extant accounts in modeling the molecular characteristics of the choice data obtained, and (b) it produces matching when all choice sequences are averaged together. At least preliminarily, these results suggest that matching is a consequence of a molecular process that can be described in an orderly way in terms of mechanisms operative at the level of the response sequence.
Table 4
Relative Left-Key Reinforcement Frequencies and Relative Response Rates
          Relative left-key
          reinforcement frequency      Relative left-key
Bird      Scheduled     Obtained       response rate
Condition 1
21          .75           .72              .43
22          .25           .25              .32
23          .50           .46              .31
24          .50           .55              .53
Condition 2
21          .67           .68              .48
22          .33           .36              .43
23          .67           .71              .58
24          .33           .34              .26
This progress notwithstanding, it should
be kept in mind that neither Nevin's nor
Shimp's procedures are representative of
most choice studies because in these experiments successive choices were presented in
discrete trials. Thus, the replications just
presented still leave one question unanswered: What is the proper description of
choice on the more frequently used, free-operant paradigms? In order to answer this
question, Experiment 2 studied choice allocation on a free-operant choice procedure
similar to that used by Herrnstein (1961) in
his prototypic demonstration of matching
relations.
Method
Subjects. Four adult male White Carneaux
pigeons, deprived to 80% of their free-feeding
weights, were used. All birds were experimentally
naive at the beginning of the experiment.
Apparatus. The apparatus was the same as in
Experiment 1.
Procedure. After magazine training, all birds
were exposed to an autoshaping procedure in
which only the two side keys served as conditioned
stimuli (see Experiment 1). After reliable pecking obtained to both side keys, all subjects were
placed in the main experimental procedure. In
this procedure, both side keys were illuminated,
one with red and the other with green light.
Reinforcement was assigned for side-key responses
by a single VI 45-sec schedule, defined according
to the specifications of Fleshler and Hoffman
(1962). When this schedule made reinforcement
available, it was assigned by a probability generator to either the left or right key (see Stubbs &
Pliskoff, 1969). Once a reinforcement was assigned,
the VI schedule was inoperative until that reinforcement was delivered. Every changeover between keys began a 1.5-sec COD clock. The first
post-COD response accessed reinforcement if it
had been assigned to that key. All responses during
the COD were recorded but had no scheduled consequences. Each daily session ended after 50 4-sec
reinforcements.
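The scheduling logic described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions: intervals are drawn from an exponential distribution rather than from the Fleshler and Hoffman progression, the 4-sec reinforcer duration is ignored, and the class name, method name, and parameters are hypothetical rather than taken from the original apparatus program.

```python
import random


class SingleViWithCod:
    """One VI timer whose reinforcers are assigned to a key by probability."""

    def __init__(self, p_left, mean_interval=45.0, cod=1.5, rng=None):
        self.p_left = p_left                # probability of assigning to the left key
        self.mean_interval = mean_interval  # VI 45 sec in Experiment 2
        self.cod = cod                      # changeover delay, in seconds
        self.rng = rng or random.Random(0)
        self.assigned = None                # key holding an undelivered reinforcer
        self.arm_time = self._draw()        # time at which the VI next sets up
        self.last_key = None
        self.changeover_time = None

    def _draw(self):
        # Assumption: exponential intervals stand in for the VI progression.
        return self.rng.expovariate(1.0 / self.mean_interval)

    def respond(self, key, t):
        """Register a peck on `key` ("L" or "R") at time t; True if reinforced."""
        # The VI timer is inoperative while a reinforcer is already assigned.
        if self.assigned is None and t >= self.arm_time:
            self.assigned = "L" if self.rng.random() < self.p_left else "R"
        # Every changeover between keys starts the COD clock.
        if self.last_key is not None and key != self.last_key:
            self.changeover_time = t
        self.last_key = key
        in_cod = (self.changeover_time is not None
                  and t - self.changeover_time < self.cod)
        if in_cod or self.assigned != key:
            return False  # COD responses have no scheduled consequences
        self.assigned = None
        self.arm_time = t + self._draw()  # timer restarts after delivery
        return True
```

A session loop would call `respond` for each peck and stop after the 50th reinforcer; that loop is left out here because peck timing is a property of the subject, not of the schedule.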
Each subject was exposed to two different conditions, each condition distinguishable on the basis
of its relative frequency of reinforcement (see
Table 4). Detailed sequential statistics were recorded for all choices and for those choices
occurring during the post-COD period. All data
presented are based on the last 10 sessions of each
30-session condition.
Results
A comparison in Table 4 between relative response rate and relative reinforcement
frequency shows considerable intersubject
variability in the quality of matching obtained. Most discrepant with the matching
law were B21's data. This subject preferred
the right key during both experimental conditions despite the fact that the other key
provided a substantially greater frequency of
reinforcement. Although all other subjects
preferred the appropriate key when exposed
to unequal reinforcement frequencies, there
was not as close a correspondence between
relative rates and reinforcement frequencies
as is sometimes found in this procedure
(e.g., see Stubbs & Pliskoff, 1969).
Table 5 presents for each bird in each
condition the probability of a left-key response following all combinations through
lengths of three responses. Conditional probabilities on the left-hand side of the table are
based on all responses. Probabilities on the
right-hand side exclude all COD responding.
These data show powerful sequential dependence of choice whether based on all responses or only on post-COD responding.
Generally speaking, the probability of a left
response was high if the prior choice was
left and was low if the prior choice was right
—a finding equivalent to saying that the
probability of a changeover between keys was
low for all subjects. However, not all features of these response sequences can be
explained in terms of low changeover probabilities. For example, in 16 out of 16 observations, the probability of an L after one
L was higher than after two Ls, and in 14
of 16 observations, the probability of an L
after two Ls was higher than after three
Ls. Also, the probability of an L after an
RL was consistently higher than after an
LL. Hence, choice was controlled not only
by the locus of the prior response, but by the
structure of the preceding response sequence.
Table 5
Subjects' Left-Key Conditional Probabilities During All Choices and During Post-COD Choices
Probability                All choices                   Post-COD choices
of a left given     B21    B22    B23    B24        B21    B22    B23    B24
Condition 1
L                   .84    .91    .89    .85        .78    .74    .60    .72
R                   .12    .05    .05    .19        .20    .05    .10    .26
LL                  .82    .90    .88    .82        .76    .72    .54    .69
LR                  .02    .01    .00    .05        .18    .01    .06    .18
RL                  .94    .96    .99   1.00        .83    .83    .68    .79
RR                  .13    .05    .06    .23        .20    .05    .10    .29
LLL                 .80    .89    .86    .78        .75    .67    .55    .66
LLR                 .02    .01    .00    .05        .18    .02    .06    .16
LRL                 .90    .89   1.00    .99        .83    .92    .64    .76
LRR                 .03    .01    .00    .12        .16    .01    .08    .23
RLL                 .89    .99   1.00    .99        .79    .83    .53    .78
RLR                 .02    .02    .22    .13        .14    .00    .07    .23
RRL                 .94    .96    .99   1.00        .83    .83    .68    .79
RRR                 .15    .05    .06    .26        .21    .05    .10    .31
Condition 2
L                   .88    .85    .88    .85        .61    .56    .84    .53
R                   .12    .12    .16    .06        .16    .15    .51    .07
LL                  .87    .82    .86    .83        .60    .51    .82    .42
LR                  .01    .01    .04    .00        .06    .05    .45    .00
RL                  .99    .99    .97   1.00        .63    .63    .96    .65
RR                  .14    .14    .19    .06        .18    .17    .58    .08
LLL                 .84    .78    .84    .79        .62    .47    .81    .34
LLR                 .01    .00    .04    .00        .06    .04    .45    .00
LRL                1.00   1.00    .78   1.00        .73    .61    .96   1.00
LRR                 .02    .01    .06    .00        .12    .11    .57    .00
RLL                1.00    .98    .98   1.00        .56    .56    .87    .48
RLR                 .00    .07    .06    .75        .06    .07    .41    .00
RRL                 .99    .99    .98   1.00        .63    .63    .96    .65
RRR                 .15    .16    .22    .06        .19    .18    .58    .08
Note. B = bird; L = left-key response; R = right-key response; COD = changeover delay.
[Figure 3 plot: one panel per bird for the first condition (B21 (.43, .72), B22 (.32, .25), B23 (.31, .46), B24 (.53, .55)) and for the second condition (B21 (.48, .68), B22 (.43, .36), B23 (.58, .71), B24 (.26, .34)); vertical axis: probability of a changeover; horizontal axis: successive choices to a key; separate curves for left-key runs and right-key runs.]
Figure 3. Probability of a changeover as a function of run length to a key. (In parentheses
adjacent to bird identification numbers are, respectively, the left-key relative response rate
and relative reinforcement frequency.)
Figure 3 presents for each condition the probability of a switch between keys as a function of run length to a key. Left- and right-key curves, which are based on all choices, are represented by closed and open circles, respectively. Subject identification, relative response rate, and relative reinforcement frequency are presented in the upper left-hand corner of each panel. Generally speaking, there was a small but consistent tendency for birds to switch keys
the longer they had responded to a key.
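The changeover-probability measure plotted against run length can be computed from a record of successive choices roughly as follows. This is a minimal sketch that assumes the record is a string of "L" and "R" characters; the function name is hypothetical, and the original analysis programs are not described in the article.

```python
from collections import Counter


def changeover_by_run_length(choices, key):
    """Probability of switching away from `key` after runs of 1, 2, 3, ... pecks.

    For each run length n, the denominator counts every time a run on `key`
    reached n pecks and was followed by at least one more choice; the
    numerator counts how often that next choice was to the other key.
    """
    opportunities = Counter()
    switches = Counter()
    run = 0
    for i, choice in enumerate(choices[:-1]):  # last choice has no successor
        if choice == key:
            run += 1
            opportunities[run] += 1
            if choices[i + 1] != key:
                switches[run] += 1
        else:
            run = 0
    return {n: switches[n] / opportunities[n] for n in sorted(opportunities)}


if __name__ == "__main__":
    print(changeover_by_run_length("LLRLLRL", "L"))
```

A curve that rises with run length, as described for these birds, would appear here as values that increase with the dictionary key.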
Discussion
At the molar level of relative response
rate, the choice data of this experiment
showed considerable intersubject variability,
failing in some instances to conform with the
predictions of the matching law. Although
the mean deviation from the matching prediction across subjects and conditions was
.13—a result larger than sometimes obtains
in matching studies (e.g., see Herrnstein,
1961; Stubbs & Pliskoff, 1969)—this degree of deviation is by no means unprecedented (see Baum & Rachlin, 1969; Myers
& Myers, 1977). Although the relative-rate
data were variable, statistics based on response sequences seemed quite consistent
across subjects and conditions. The largest
effect at this molecular level was that the
probability of a changeover was low regardless of the composition of the prior response
sequence. This finding was no doubt a consequence of the COD, which punishes high
changeover rates (see Shull & Pliskoff,
1967). Nevertheless, a second-order effect,
demonstrating control by larger response sequences, was also discernible. For example,
although changeover probabilities were low
for all response sequences, they were lower
still if subjects had just switched keys.
This conclusion is corroborated by the
changeover-probability curves in Figure 3.
These curves show that the changeover probability is always well below chance levels
(.50 on Y axis) at the beginning of a response run to either choice alternative. The
consistent rise in these curves with run
length suggests that choice was governed
not simply by the locus of the last response,
but also by some other local process, such
as time since a changeover or the structure
of the prior response sequence.
Despite COD-dictated constraints on the
pattern of choice allocation, molecular control of choice was evidenced in both sequential statistics (Table 5) and in the changeover-probability plots (Figure 3). Nevertheless, the presence of the COD complicated
any attempt to identify the concurrent response rule for free-operant choice allocation. Because of this problem, the next experiment studied choice on concurrent VI-VI schedules without a COD. Although
undermatching is to be expected (e.g., see
Findley, 1958), the sequential properties of
free-operant concurrent performances can
be analyzed in this procedure unencumbered
by the effects of COD contingencies.
Experiment 3
Method
Subjects. Four adult male White Carneaux pigeons, deprived to 80% of their free-feeding
weights, were used. All birds were experimentally
naive at the beginning of training.
Apparatus. The apparatus was the same as in
Experiment 1.
Procedure. After magazine training, all birds
were trained to peck both side keys via the autoshaping procedure described in Experiment 2. The
birds were then exposed to the choice procedure
described in Experiment 2, except that no COD
was used. As in Experiment 2, reinforcement was
assigned by a single VI 45-sec schedule; however, in this experiment, the left key was three
times as likely to receive an assignment as the
right key for all subjects. All data presented are
based on the last 5 of 20 daily sessions. All other
features of the procedure are the same as in
Experiment 2.
Results and Discussion
Table 6 presents the conditional probability of a left-key peck given various sequences of prior responses for each subject and for the average of all subjects. When analyzed only in terms of the prior response, there is little evidence of sequential dependence of choice. However, when larger sequences are considered, particularly response runs to a key, large differences in conditional probabilities become apparent. For example, although there is no difference in the probability of an L given just one L or one R (mean data), large differences appear when comparing five Ls (p = .69) with five Rs (p = .39).
Table 6
Left-Key Conditional Probabilities (Experiment 3)
Probability                  Subjects
of a left given     B61    B62    B63    B64     M
L                   .59    .60    .75    .68    .69
R                   .60    .59    .71    .82    .69
LL                  .55    .61    .70    .68    .66
LR                  .59    .64    .67    .85    .71
RL                  .64    .59    .93    .70    .75
RR                  .61    .50    .81    .67    .65
LLL                 .55    .62    .67    .69    .66
LLR                 .60    .64    .68    .84    .71
LRL                 .64    .57    .93    .68    .74
LRR                 .61    .58    .84    .72    .71
RLL                 .55    .58    .75    .65    .67
RLR                 .57    .66    .63    .88    .71
RRL                 .63    .63    .93    .78    .77
RRR                 .60    .43    .68    .56    .54
LLLL                .57    .64    .67    .71    .67
RRRR                .54    .36    .60    .50    .46
LLLLL               .56    .66    .69    .73    .69
RRRRR               .51    .29    .62    .49    .39
LLLLLL              .51    .67    .71    .74    .70
RRRRRR              .32    .26    .55    .48    .31
Note. L = left-key response; R = right-key response.
These data suggest that choice probability
varies not on a response-by-response .basis,
but in terms of larger behavioral units, such
as run length to a key. If choice allocation
in the present experiments is governed by
the lengths of response runs to a key, the
goal of understanding the dynamics of concurrent performances is complicated considerably. On the one hand, choice on trials
procedures, such as those used by Shimp
(1966) and Nevin (1969), is well predicted
by maximizing rules; on the other hand, at
least one free-operant choice procedure—
that of concurrent VI-VI schedules—is controlled by a fundamentally different molecular rule, possibly run lengths to a key.
Nor is run length the only plausible variable accounting for the data in Table 6.
Possibly, choice was controlled by the time
since the last changeover. If time (and not
response number) in the presence of a key
controls choice, then run-length variations
in changeover probability are epiphenomenal
correlates of variations in the relative frequencies of interchangeover times (ICTs).
In order to evaluate this second possibility,
Figure 4 presents in .25-sec classes the distribution of times between changeovers for
each key on a "per opportunity" (see Anger,
1956) basis. Each subject's left- and right-key distributions are presented on the left- and right-hand sides of the figure. The bottom panels present these distributions averaged across all subjects. Each subject's
relative left-key response rate is presented in
parentheses beside its identification number.
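A "per opportunity" distribution of this kind can be computed along the following lines. The sketch assumes that the statistic for a class is the number of interchangeover times (ICTs) ending in that class divided by the number of ICTs that lasted at least to the start of the class; the function name and the example values are hypothetical, and the input list is assumed to be nonempty.

```python
def per_opportunity(icts, bin_width=0.25):
    """Conditional probability that an ICT ends in each time class.

    For each class, the count of ICTs falling in it is divided by the number
    of ICTs that were still in progress when the class began.
    """
    n_bins = int(max(icts) // bin_width) + 1
    counts = [0] * n_bins
    for t in icts:
        counts[int(t // bin_width)] += 1
    remaining = len(icts)  # ICTs with an opportunity to end in this class
    result = []
    for count in counts:
        result.append(count / remaining if remaining else None)
        remaining -= count
    return result


if __name__ == "__main__":
    # Hypothetical ICTs in seconds for one key.
    print(per_opportunity([0.1, 0.3, 0.3, 0.6]))
```

Dividing by the remaining opportunities rather than by the total number of ICTs keeps long classes from looking rare merely because few stays on a key last that long.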
As regards relative rates, it was anticipated that all subjects would undermatch
because no COD was used in this experiment. Given this expectation, the quality of
matching obtained was surprisingly good:
Two birds (B63 and B64) had relative
rates closely conforming with the matching
prediction, the other two birds (B61 and
B62) undermatched, and the mean deviation
from the matching prediction was only .06.
As regards the ICT distributions from
this experiment, the ICT curves from Figure 4 show that (a) there is a modal time
for staying on a key before switching, (b)
the mode is either the same for both keys
(B62, B64) or longer for the preferred key
(B61, B63), and (c) the variance in ICTs
is generally substantially less for the unpreferred key than for the preferred key. The
differences in distribution variances for each
key show that when subjects responded to
the unpreferred key, it was for an essentially
fixed period of time. For example, the large
majority of B64's ICTs in the presence of the
unpreferred key fell between .75 and 1.75
sec. When not making these brief samples of
the unpreferred key, the bird responded in
the presence of the preferred key, that key's
ICTs being more variable.
In Figure 4 we see that pigeons tend to
spend a particular period of time in the presence of a key before switching. This finding
raises the possibility that choice is governed
by the duration of successive time allocations to a key and not by run length to a
key. Because conditional probabilities of a
changeover vary in an orderly way whether
based on responses (Table 6) or time (Figure 4), it is impossible to discern whether
response number or time in the presence
of a key is the likely factor governing choice
allocation in this experiment. Moreover, the
data from Figure 4, like those from Table 6,
violate the predictions of the maximizing-with-errors account presented in Experiment
1. Because responses and time covary in discrete trials, one need only replace "response
number" (left-key run length) in Figure 2
with "time" in arbitrary units to assess the
adequacy of maximizing with errors in interpreting the data from Figure 4. As can be
seen in Figure 2, the tendency of Figure 4's
curves to have a modal ICT is not predicted by the curves in Figure 2; nor can it
be consistent with a simple optimizing
strategy because decreases in any portion of
an ICT curve are incompatible with maximizing the time rate of reinforcement.
[Figure 4 appears here. Panels: VI 60-sec key (left) and VI 180-sec key (right), with one row each for B61 (.59), B62 (.60), B63 (.74), B64 (.72), and the mean (.69); x-axis: time between changeovers (in sec).]
Figure 4. Conditional probability of a changeover as a function of time since the last changeover in .25-sec units for the variable-interval (VI) 60-sec key and the VI 180-sec key. (Adjacent to bird identification numbers and in parentheses is the relative response rate to the VI 60-sec key. The bottom pair of panels present the mean performances of all birds.)
The purpose of the present experiment was to assess the generality of the maximizing-with-errors account developed in Experiment 1. The findings show that the
generality of this account is limited. In particular, the results raise the possibility that
choice is largely due to a maximizing-with-errors process in some situations (i.e., discrete-trials procedures as used in Experiment 1) but not in others (e.g., free-operant
procedures as used in the present experiment). Moreover, a new question emerged
from the analysis of the ICT distributions of
Figure 4: Are local changes in the pattern
of choice allocation governed by the loci of
prior choices as was suggested in Experiment 1, or are they governed by the time
since a changeover?
Two-state choice model. Quite obviously,
a new model is needed, one that successfully
predicts the results of Experiment 1, as
well as the major features of the ICT distributions from Figure 4 and the changeover-probability functions from Table 6. Toward
this end, four models were evaluated by
computer simulation, each simulation based
on an alternative response rule. Two of the
simulations were based on the response-independence and maximizing-with-errors
models presented in Experiment 1. As in
Experiment 1, the stat bird was programmed
to emit the LLR maximizing sequence with
some probability of failing to "recall" the
prior element in the response sequence.
When forgetting occurred, the stat bird
guessed what the prior element had been,
these guesses being equally distributed
among the three elements defining the maximizing sequence. The response-independence
model was simulated by setting the forgetting parameter at p = 1.0; because this
stat bird always forgets its place in the maximizing sequence, the relative response rate
obtained should show no sequential dependence of choice. For the stat bird in the
maximizing-with-errors simulation, the forgetting parameter was arbitrarily set at p =
.45, the same value as in Experiment 1.
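The forgetting rule just described can be sketched as below. This is an illustrative reading of the verbal description and not the original simulation program: the function name, the starting position in the sequence, the seeding, and the assumption that a guess simply replaces the remembered position (so that the next response is the element following the guessed one) are all hypothetical.

```python
import random

SEQUENCE = "LLR"  # maximizing sequence named in the text


def simulate_maximizing_with_errors(n_responses, p_forget, seed=0):
    """Emit responses that follow SEQUENCE, with occasional forgetting.

    On each response, with probability p_forget the stat bird loses its
    place and guesses the prior element uniformly among the three
    positions of the sequence; it then emits the element that follows.
    """
    rng = random.Random(seed)
    prior = len(SEQUENCE) - 1       # assumed start: sequence just completed
    out = []
    for _ in range(n_responses):
        if rng.random() < p_forget:
            prior = rng.randrange(len(SEQUENCE))    # uniform guess
        prior = (prior + 1) % len(SEQUENCE)         # advance to next element
        out.append(SEQUENCE[prior])
    return "".join(out)


perfect = simulate_maximizing_with_errors(6, p_forget=0.0)
independent = simulate_maximizing_with_errors(30000, p_forget=1.0)
```

Under these assumptions, p = 0 yields strict LLR cycles, and p = 1.0 yields independent choices with a left-key probability of 2/3, which is the sense in which the response-independence model is a special case of the forgetting model.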
The other two simulations retained response independence and maximizing with
errors as the basic response rules; however,
a second element was added, that of response
perseveration. These last two models assume
pigeons' choice behavior can be described
by two states, one state being either response independence (Simulation 3) or
maximizing with errors (Simulation 4) and
the other state being response perseveration.
When in the response-independence or the
maximizing mode, choices (and hence,
ICTs) are allocated according to the basic
response rules discussed in Experiment 1.
However, with p = .05, each response-independence or maximizing-mode response
leads to entering the perseveration mode, in
which the same key is selected again and
again without regard to local reinforcement
probabilities. With p = .35, each perseveration-mode response returns program control
to the response-independence or maximizing
algorithms.
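A minimal sketch of the two-state logic follows, using response independence as the base rule (as in Simulation 3). It is not the original program: the function name, the left-key probability of 2/3 for the base rule, and the seeding are assumptions made for illustration. Only the entrance and exit probabilities (.05 and .35) come from the text.

```python
import random


def simulate_two_state(n_responses, p_left=2 / 3, p_enter=0.05,
                       p_exit=0.35, seed=0):
    """Response independence plus perseveration.

    Base mode: choose L with probability p_left (assumed value), then
    enter the perseveration mode with probability p_enter.
    Perseveration mode: repeat the last key, then return to the base
    mode with probability p_exit.
    """
    rng = random.Random(seed)
    persevering = False
    last = None
    out = []
    for _ in range(n_responses):
        if persevering:
            choice = last                       # same key again
            if rng.random() < p_exit:
                persevering = False             # control returns to base rule
        else:
            choice = "L" if rng.random() < p_left else "R"
            if rng.random() < p_enter:
                persevering = True
        out.append(choice)
        last = choice
    return "".join(out)


responses = simulate_two_state(10000)
```

Replacing the base-mode line with a maximizing-with-errors rule would give the fourth simulation; the state-switching logic is unchanged.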
In all four simulations, stat birds responded on a simulated concurrent schedule,
the contingencies of which were identical to
those of the present experiment (see Method
section). Since the present series of simulations is based on a free-operant concurrent-choice procedure, an algorithm was
developed for generating a free-operant response rate: After each response, a potential interresponse time (IRT) was selected
from a normal distribution with a mean value
of .35 sec and a standard deviation of .05
sec. Each potential IRT had a probability
of emission equal to .4667. After each peck/
no-peck "decision," another potential IRT
was chosen. This algorithm produces a
multimodal IRT distribution with a period
of .35 sec that has the appearance of a
damped sine-wave function (see Slough,
1963) and has a mean rate of 80 responses
per minute. Finally, all simulations had a
minimum ICT of .7 sec and lasted for 3,000
min of simulated session time.
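The response-rate algorithm can be illustrated with the sketch below. It is one reading of the description and not the original code: the function name and seeding are assumptions, and it is assumed that the time from a potential IRT that is not emitted carries over and accumulates until a peck occurs. On that reading the mean IRT is .35/.4667 = .75 sec, which corresponds to the stated 80 responses per minute.

```python
import random


def simulate_irts(n_responses, mean=0.35, sd=0.05, p_emit=0.4667, seed=0):
    """Generate interresponse times (in sec) for a free-operant stat bird."""
    rng = random.Random(seed)
    irts = []
    elapsed = 0.0
    while len(irts) < n_responses:
        elapsed += rng.gauss(mean, sd)      # draw a potential IRT
        if rng.random() < p_emit:           # "peck" decision
            irts.append(elapsed)            # accumulated time is the IRT
            elapsed = 0.0
        # on "no peck", time keeps accumulating into the next potential IRT
    return irts


irts = simulate_irts(50000)
rate_per_min = 60.0 / (sum(irts) / len(irts))
```

Because each emitted IRT is a sum of one or more draws centered on .35 sec, the resulting distribution has peaks near multiples of .35 sec, with each successive peak smaller than the last.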
In making these simulations, only one pair
of parameters was systematically varied:
the probabilities of entering and exiting the
perseveration mode. Some 50 simulations
were made of the last two models using a
wide range of values for each of these parameters. For both response independence plus
perseveration and maximizing plus perseveration, the entrance and exit probabilities selected (those of .05 and .35) provided the closest fit to the actual data, based
on experimenter judgment.
Figure 5 compares the mean data from
Experiment 3 (top two panels) with that
generated by the four models just described
(next four rows of panels) in terms of
three measures: (a) relative left-key response rates (in parentheses in the left-hand
panels), (b) ICT distributions (left-hand
panels), and (c) changeover-probability
curves as a function of run length (right-hand panels). Each of the four models by
which the simulated data were produced is
identified in the left-hand panels. Left- and
right-key runs are respectively signified by
closed circles joined by solid lines and by
open circles joined by dashed lines.
Both the response-independence and maximizing-with-errors models (second and
third rows of panels in Figure 5) fail to
account for the averaged data found in the
top row. Neither model shows the decreasing
postmodal time-allocation functions seen in
the ICT distributions; nor does either model
show the selective tendency of the right-key
changeover-probability curve to decrease
with increases in run length.
The addition of the perseveration element
appears to have a beneficial effect on the
predictive adequacy of both of those models
(see bottom two rows of panels). Both
models reproduce two important features of
the curves from the mean data: (a) decreasing postmodal ICT distributions and
changeover-probability curves and (b) larger
decreases on the right key than on the
left key. Beyond this, maximizing plus perseveration seems to approximate mean performances more closely than does response
independence plus perseveration: In terms
of the ICT functions, the maximizing-with-errors-and-perseveration model produces
modes corresponding with those found for
the left and right keys from the mean data;
the alternative model does not. In terms of
changeover-probability curves (see right-hand panels of Figure 5), maximizing plus
perseveration produces a fit qualitatively superior to response independence plus perseveration: Only the maximizing model produces a monotone-decreasing curve for the
right key and a nonmonotonic function for
the left key, a result corresponding in form
with mean performances. However, an argument more compelling than goodness of fit
can be mustered in support of the maximizing-plus-perseveration model: Only this account among those considered can accommodate within a unitary conceptual scheme
the findings of Experiments 1 and 3.
Viewed retrospectively, it is apparent that
perseveration was operative in Experiment 1,
albeit to a smaller degree than in the present experiment. The evidence for perseveration can be seen in the decreases in the
changeover-probability curves of Figure 1.
These decreases are not simulated by the
maximizing-with-errors model from Experiment 1 (see Figure 2) but are when a perseveration element is added (see Figure 5).
What distinguishes Experiment 1 from Experiment 3 is not differences in process,
only differences in the relative contribution
of perseveration to the results obtained.
But what of the results found in Experiment 2? One might argue that even if
maximizing plus perseveration is of some
value in accounting for choice on some procedures, it still fails when a COD is present.
Our thesis is that this failure is due not to
the absence of maximizing and perseveration
processes in Experiment 2, but to the COD
defining an alternate maximizing rule.
What distinguishes Experiment 2 from
the other experiments reported so far is
that in Experiment 2 perseveration is consistent with, rather than antithetical to,
maximizing: COD contingencies punish
nonperseverative pecking by imposing a delay between the first changeover response
and response-dependent reinforcement. Consequently, the low probabilities of a changeover seen in the curves from Figure 3, presumptive evidence of the operation of
perseveration, are not incompatible with
maximization. In fact, the positive slopes
these curves show, which are opposite in
value to those from Experiments 1 and 3,
suggest control by local reinforcement probabilities in a way consistent with maximizing
principles. As the end of a COD approaches
(an event necessarily correlated with increasing run length to a key), the probability
of a changeover should increase if maximizing is present, because the large majority of
reinforcements on choice procedures with a
COD occurs during the first few seconds
of the post-COD period (see Menlove,
1975). As a consequence, once the post-COD
period is sampled, maximizing predicts
higher changeover probabilities—the very
result obtained. Thus, these curves provide
some evidence for maximizing and perseveration. The failure of the maximizing-plus-perseveration model is due not to the absence
of these constituent processes, but to the
COD defining an alternate maximizing rule.
Response perseveration as found in Experiments 1, 2, and 3 has been noted many
times before in studies of choice (e.g., Morgan, 1974; Shimp, 1976). Its stronger operation in Experiment 3 might be explained
in the following way: When exposed to
trials procedures as in Experiment 1, perseverative tendencies were interrupted somewhat by the imposition of an ITI (Nevin
method) or the requirement of a center-key
peck (Shimp method). On a free-operant
choice procedure, such as that used in Experiment 3, however, there were no response-contingent interruptions of access to the
choice alternatives. Consequently, any perseverative tendencies naturally present were
unencumbered by the method of choice
presentation.
A dividend of selecting the maximizing-plus-perseveration model is that it answers,
admittedly by fiat, another question asked
earlier—Is choice governed by the loci of
prior choices or by time in the presence of
a key? Because this maximizing model is response based rather than time based, its adoption favors analysis in terms of response locus in predicting subsequent choice allocation.
Figure 5. Conditional probability of a changeover as a function of the time in .25-sec units since the last changeover and as a function of run length. (The top panel presents the mean performance for each of these measures for all birds. The four pairs of panels presented below present the simulated performances of stat birds following four alternative response rules. The sources for the data in each pair of panels, along with the relative left-key response rate in parentheses, are indicated in the left-hand panels. VI 1 = variable-interval 1-min schedule; VI 3 = variable-interval 3-min schedule.)
Experiment 4
In Experiment 3, maximization and perseveration were identified as likely concomitants of free-operant choice behavior.
The present experiment tests the generality
of this conclusion by seeing if these two
processes are evident in the major alternative
method for arranging free-operant concurrent schedules—that of concurrent chains
(Autor, 1969).
In the concurrent-chains procedure, each
side key is illuminated synchronously with a
stimulus signifying the initial links of each
chain schedule. Choices to a key occasionally
access the terminal-link stimulus for that
key. Responding in the presence of each
mutually exclusive terminal-link stimulus is
reinforced according to some schedule of
reinforcement. Autor found that when each
terminal-link schedule is a VI, the relative
response rate in the presence of the equal-valued VI schedules associated with the
initial links matches the relative time rate of
terminal-link reinforcement (see also Herrnstein, 1964).
Method
Subjects. Four male White Carneaux pigeons,
deprived to 80% of their free-feeding weights,
served as subjects. All subjects were experimentally
naive at the beginning of training.
Apparatus. The apparatus was the same as
in Experiment 1.
Procedure. Prior to exposure to the experimental procedure, birds were trained to eat from
the food magazine and to respond to both side keys
as described in Experiment 2. All birds were then
placed on concurrent chain VI 120-sec VI 80-sec
versus chain VI 120-sec VI 40-sec schedules. Daily
sessions began with the illumination of the houselight and side keys with white light; white key
lights signified the initial link of each chain schedule. Associated with each white key was an independent, constant-probability VI 120-sec schedule.
The first white-key peck following a schedule
assignment changed that key's color to either red
(left key) or green (right key) and darkened
the other key. Responding in the presence of either
of these mutually exclusive terminal-link stimuli resulted in 4 sec of access to grain according to either a VI 80-sec (left key) or VI 40-sec (right key) schedule. The houselight was constantly illuminated except during the hopper cycles. Each session terminated after 60 reinforcements. All data presented are based on the last 5 of 20 sessions.
Table 7
Left-Key Conditional Probabilities (Experiment 4)
Probability of                 Subjects
a left given      B71    B72    B73    B74     M
L                 .13    .35    .06    .27    .25
R                 .60    .63    .34    .28    .48
LL                .43    .44    .14    .33    .40
LR                .61    .62    .35    .38    .55
RL                .09    .30    .06    .25    .20
RR                .58    .65    .33    .24    .41
LLL               .73    .53    .10    .36    .53
LLR               .65    .62    .36    .41    .55
LRL               .10    .30    .04    .27    .20
LRR               .61    .67    .46    .38    .55
RLL               .19    .37    .15    .31    .31
RLR               .61    .63    .35    .36    .54
RRL               .08    .29    .07    .23    .19
RRR               .53    .60    .27    .19    .31
LLLL              .91    .60     -     .43    .65
RRRR              .51    .60    .26    .16    .24
LLLLL             .94    .66     -     .38    .73
RRRRR             .46    .54    .23    .14    .19
LLLLLL            .97    .72     -     .31    .80
RRRRRR            .43    .48    .21    .12    .15
Note. L = left-key response; R = right-key response.
Results
Molecular control of choice was evidenced
in terms of both sequential statistics (Table
7) and ICT distributions (Figure 6). As
regards the former measure, Table 7 presents
the probability of an L given various
sequences of prior responses for each subject
and for the average of all subjects. No
sequence probabilities are presented for left-key runs beyond a length of three for B73
because during the last five sessions none of
these sequences occurred. Sequential dependencies are evident in most of the choice
data in the table, including those based only
on one-member sequences. For example,
with the exception of B74, whose two probabilities were nearly equal (.27 vs. .28), all birds were
more likely to peck the left key if the prior
response had been an R than if it had been an
L. Also apparent in the data is a progressive
tendency to perseverate on a key as a function of run length—that is, the longer a bird
responded to a key, the less likely it was to
switch keys.
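Sequential statistics of the kind reported in Table 7 can be computed with a routine such as the one sketched below. It is an illustration only: the function name and the example response string are hypothetical, and the sketch treats the record as one continuous string, whereas an actual analysis would have to decide how to treat session boundaries and reinforcement periods.

```python
def p_left_given(responses, prior):
    """Probability of an L immediately after the sequence `prior`.

    responses: string of 'L' and 'R' in order of emission.
    prior: string such as 'R', 'LR', or 'RRRR'.
    Returns None if `prior` is never followed by another response.
    """
    n = len(prior)
    hits = 0
    total = 0
    for i in range(n, len(responses)):
        if responses[i - n:i] == prior:     # the n responses just before i
            total += 1
            hits += responses[i] == "L"
    return hits / total if total else None


# Hypothetical record: strict repetition of an LLR sequence.
record = "LLRLLRLLR"
after_r = p_left_given(record, "R")
after_ll = p_left_given(record, "LL")
```

Returning None for sequences that never occur corresponds to the empty cells in Table 7 for B73.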
Figure 6 presents the ICT distribution on
a "per-opportunity" basis during the initial
links of chain VI 120-sec VI 80-sec schedules (left side of figure) and chain VI 120-sec VI 40-sec schedules (right side of figure). Relative left-key response rates are
presented beside subject identification numbers. Averaged data are in the bottom pair of
panels. In the present experiment, matching
equaled a relative left-key response rate of
.33. Two birds (B73 and B74) approximated
this value; two others (B71 and B72) undermatched. Times in the presence of the
unpreferred key were of brief duration: For
all subjects, the modal ICT for the unpreferred key was between 1 and 1.5 sec. Except for B73, all birds also had modal ICTs
in the presence of the preferred key. These
modes had lower frequencies and were at
longer ICTs than the modes for the unpreferred key.
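The matching value of .33 cited above can be recovered from the terminal-link schedules. In the worked equation below, the symbols r_L and r_R are introduced here for illustration and denote the scheduled rates of terminal-link reinforcement (reinforcers per second) for the left (VI 80-sec) and right (VI 40-sec) keys; the calculation assumes each schedule delivers, on average, one reinforcer per mean interval.

```latex
\[
\frac{r_L}{r_L + r_R}
  = \frac{1/80}{1/80 + 1/40}
  = \frac{1}{3} \approx .33
\]
```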
Discussion
In Experiment 3, the microstructure of
choice was characterized by a two-process
model, one process being reinforcement optimization and the other being response perseveration. By varying the strength of each
process, this model accounted for the local
patterns of responding on two different concurrent procedures, those using discrete
trials (Experiment 1) and those using concurrent VI-VI schedules without a COD
(Experiment 3). The purpose of the present
experiment was to test the generality of this
account on yet another choice procedure—
that of concurrent chains. The issue at hand
is whether the constituent processes of this
two-state model are readily identifiable in the
sequential-statistics data from concurrent
chains.
[Figure 6 appears here. Panels: VI 80-sec terminal link (left) and VI 40-sec terminal link (right), with one row per bird and one for the mean; legible labels include B71 (.41), B74 (.28), and MEAN (.39); x-axis: time between changeovers (in sec).]
Figure 6. Conditional probability of a changeover as a function of time since the last changeover in .25-sec units during the initial link terminating in a variable-interval (VI) 80-sec schedule and in a VI 40-sec schedule. (Adjacent to bird identification numbers is the relative response rate during the initial link terminating in the VI 80-sec schedule. Mean performances for all birds are presented in the bottom pair of panels.)
The data from Table 7 provide strong evidence for the operation of both processes. Consider first some predictions of maximizing with errors. Because the choice procedure of this experiment has LRR as its maximizing sequence (assuming equal successive IRTs), this response rule predicts for all
values of the forgetting parameter save 1.0
that the probability of an L will be higher
following an R than an L, higher following
an LR than an RL, and higher following an
LLR or an RLR than an LRL or an RRL.
All of these ordinal predictions are supported
by the data in Table 7 (see Footnote 3).
3 Sequences terminating in multiples of the same response (e.g., LRR, RLL) have been excluded from these comparisons because their frequencies are likely to be affected by perseverative responding.
The operation of response perseveration, the second element in this model, is no less obvious in the data from Table 7. For all
subjects, the probability of an L decreases
with the number of successive Rs, a result
characteristic of the operation of perseveration (see Figure 5). In fact, despite differences in procedure, the changeover-probability data from the present experiment seem
quite similar to those from Experiment 3.
Time-based measures also seem similar between these experiments: In both studies,
ICT distributions showed that there were
modal ICTs, the mode for the less preferred
key being of shorter duration and more
peaked than for the preferred key. These
empirical correspondences suggest that the
microstructure of choice is invariant across
several procedural transformations. For all
experiments, this microstructure seems well
described as an outcome of two separable
processes: perseveration and maximization.
And with the exception of Experiment 2, in
which COD contingencies to some extent
specified the order of successive responses,
the manner in which these two processes
interacted seems consistent with the maximizing-plus-perseveration model.
General Discussion
The matching law states a relation between
molar variables: In its definition, choices
and reinforcements are each summed over
time with consideration given neither to the
order in which choices occur nor to the local
changes in the likelihood of reinforcement.
The apparent wisdom of stating this law in
molar terms is reflected in the simple
elegance of its definition and in the breadth
of its predictive power (e.g., see de Villiers,
1977).
Elegance and power are notable accomplishments for any behavioral law, and they
no doubt explain the strong emphasis on
using molar variables whenever choice has
been studied. Indeed, with the exception of
a very few studies (e.g., Nevin, 1969; Shimp,
1966), work on concurrent schedules has as
an often implicit assumption that whatever
the processes of choice allocation might be,
they should be described in molar terms—as
some type of average of responses and of
reinforcers. To be sure, many interesting
questions have been addressed by these
molar analyses (see de Villiers, 1977, for a
review), but a fundamental question remains: Are these matching relations, in all
their various forms and with all their alternative molar measures, just descriptive statistics, or do they also make contact with an
underlying psychological principle?
It has been Shimp's (1966, 1969b) contention that matching relations do not describe a psychological process; rather, they
are an unintended consequence of using
molar measures—an artifact of averaging
across many choices, none of which obeys a
matching principle. He and his co-workers
have supported this position by showing the
generality of molecular control not only in
choice procedures, but in certain single-schedule situations as well (e.g., Hawkes &
Shimp, 1974, 1975; Shimp, 1966, 1968,
1969a, 1973a, 1973b). Despite these findings,
molecular reinterpretations of matching relations have not gained favor, a fact evidenced
by the robust and continuing preference for
using molar measures in studying choice. In
our opinion, two factors have discouraged
widespread adoption of a more molecular approach: (a) Nevin's (1969) finding of
matching without apparent molecular control of choice, a result suggesting that matching need not be explained in terms of a
more molecular process and (b) an unfortunate tendency for assessing the contribution of molecular processes to matching on
choice procedures unrepresentative of those
usually used to show matching (e.g., Shimp,
1973a).
These two problems are remedied in the
present study. In Experiment 1 it was shown
that there was no incompatibility between
the results of Nevin (1969) and Shimp
(1966) and that the matching relations obtained in both studies were likely due to a
similar molecular process. As regards the
second problem, Experiments 2, 3, and 4
studied the molecular characteristics of
choice on concurrent VI-VI and concurrent-chains schedules, these procedures being the
conventional paradigms for studying choice
and matching relations. As in Experiment 1,
there was a discernible microstructure to
choice in each of these three experiments. In
sum, the data from these four experiments
support the view that matching relations are
due to molecular control of choice.
In Experiments 1 and 3, attempts were
made to identify the processes constituting
this molecular control. In Experiment 1 it
was found that the sequential statistics from
that experiment were well described by a
model that assumed that pigeons always
choose that alternative more likely to be
reinforced but that they often make errors
in the execution of this optimizing strategy.
In Experiment 3, a response-perseveration
element was added to this model to accommodate the finding in that experiment that
pigeons occasionally select the same key
again and again in violation of the expectations of maximizing. With these two elements in hand, the model in Experiment
3 was capable of synthesizing, at least qualitatively, major features of the microstructure of choice in all experiments save Experiment 2, in which COD contingencies
made perseveration actually part of the maximizing strategy; however, even though a
direct application of the maximizing-plus-perseveration model failed in Experiment 2,
its constituent elements of maximizing and
perseveration were identifiable in the choice
data. In fact, the microstructure of choice in
all four experiments seemed to be an outcome of differing blends of each of these two
processes.
Despite the successes of the two-state
model in synthesizing some of the sequential
properties of choice, it is not our assertion
that the definition of the processes controlling choice is necessarily at hand. To make
such an assertion, one would want (a) an
exhaustive list of all processes likely to be
controlling choice selection and (b) a quantitative assessment of the data variance accounted for by adding or differentially combining those factors under consideration. For
these reasons, we consider the present study
to be only a first step in delineating the
mechanisms of choice. Quite possibly, future
research will posit alternative processes that
will provide a better empirical fit to larger
sets of data than the present model. However, although some skepticism is in order
as regards the model proffered in this study,
a weaker claim can be made in its behalf
with confidence: Whatever the proper characterization of choice might be, its basis is
molecular. This conclusion follows from the
finding that in all studies of choice in which
sequential data have been recorded, ranging
from the "typical" choice procedures of the
present study to choice among IRTs (e.g.,
Shimp, 1973a), a discernible microstructure
in the form of statistical dependencies of
choice has been regularly uncovered.
This study's consistent evidence for a
microstructure of choice denies any isomorphism between the matching law and
the processes governing choice allocation.
The problem for the matching law is that it
describes behavior at a molar level, based on
large averages of responses and reinforcers;
yet, as is now apparent, individual choices
are controlled at a molecular level—by prior
choices and local reinforcement probabilities.
That the matching law is so often adequate
as a descriptive statistic now appears a happenstance: A small set of molecular response
rules, possibly based on maximization and
perseveration processes, has sufficient cross-procedural generality to be evident on several alternative choice paradigms. When
choice behavior governed by these rules is
integrated over time, a fortuitous concordance is uncovered: The relative response
rate equals the relative frequency of reinforcement. Nevertheless, these matching
relations are averaging artifacts. They appear only because matching is compatible
with the underlying structure of choice.
If this analysis is correct, it should be
possible to construct a choice procedure in
which these molecular rules are incompatible
with matching. Such a procedure has, in
fact, been arranged by Silberberg and Williams (1974). In their study, only changeovers between choice alternatives were reinforced; yet, the probability of a reinforcement for a changeover differed between keys.
If matching was primary in their study,
pigeons should have responded more to the
key with the higher reinforcement prob-
396
SILBERBERG, HAMILTON, ZIRIAX, AND CASEY
ability. If, on the other hand, choices were quential dependencies in a subject's choices
governed by a molecular process such as maximizing, pigeons should have strictly alternated their choices because only changeovers were reinforced. Silberberg and Williams found pigeons selected both keys equally often despite unequal reinforcement frequencies, a finding incompatible with matching (see, however, Herrnstein & Loveland, 1975) but compatible with molecular control in the form of maximizing. This result is consistent with the present study's conclusion that molecular processes control choice and that matching is an outcome of this molecular control.
A corollary of this study's attributions of matching to molecular control is that wherever matching has been found, a microstructure of choice should also be evident. This corollary forces consideration of de Villiers' (1977) recent demonstration that Herrnstein's (1970) broadened version of the matching law can account not only for standard choice data, but also for a wide range of single-schedule performances (see also de Villiers & Herrnstein, 1976). Certainly, no comparable generality has yet been demonstrated for molecular accounts of behavior; yet, such a demonstration should be possible if the primacy claimed for molecular control is to be maintained.
A key element in the broadened matching law's success is the assumption that even single-schedule performances involve choice: in this case, between an explicit, experimenter-arranged schedule and an implicit, omnipresent "environmental" schedule that reinforces other activities (e.g., grooming). Although this version of the matching law accounts for many findings about which extant molecular models make no statements, its superiority would disappear if molecular accounts were also to advocate that single-schedule procedures are implicitly choice paradigms. A microstructural reinterpretation of matching on single schedules would be that molecular processes govern a subject's choices between the experimenter-arranged schedule and the environmental reinforcement schedule. For example, a molecular analysis might demonstrate se[quential dependencies] between the activities of, say, pecking a key and grooming. In the absence of this molecular analysis, it is obviously premature to claim that the predictions of Herrnstein's (1970) broadened matching law can be attributed to response patterning between these explicit and implicit schedules. Our point here is simply that there is no incompatibility between the predictions of this law and the thesis that its predictive success is due to molecular processes.
Conclusions
In their discussion of molar and molecular analyses of choice, Herrnstein and Loveland (1975) note that any claim for the primacy of molecular control would be predicated on the regular demonstration that matching relations had a constituent microstructure. The purpose of the present series of experiments was to determine whether such a microstructure was present. In all four experiments, such a microstructure was uncovered, the form of which could be qualitatively described as the outcome of two molecular processes: maximization and perseveration.
Three implications of these findings come to mind:
1. The matching law is a descriptive statistic, not a psychological principle of choice allocation. Its successes appear due to its fortuitous compatibility with response rules that control choice at the molecular level of the response sequence.
2. The major theoretical accounts that use the matching law as the basis for measuring reinforcing value or response strength (Baum & Rachlin, 1969; Catania, 1973; Herrnstein, 1970) are consequentially weakened. Their dilemma is clear-cut: Choice is controlled not by value or strength, but by prior choices and local reinforcement probabilities.
3. The results of the present study accompany a host of other findings (e.g., see Hawkes & Shimp, 1975; Kuch & Platt, 1976; Weiss, 1970; Williams, 1968) in supporting a renewed emphasis on molecular
analysis in studying schedule effects. This conclusion is dictated by the frequent demonstration in the studies cited above that molar and molecular measures offer fundamentally different characterizations of behavior. Despite these demonstrations, the primacy of molar analysis in operant-oriented research persists. The reasons for preferring molar measures are easily guessed at, among them being their historical precedence, their surface simplicity, and their ease of recording. Yet, the results of the present study illustrate a danger in their exclusive use: The very real possibility exists that a molar measure obscures a qualitatively different, molecularly controlled pattern of behavior. For this reason, decomposing standard molar measures into plausible molecular units is a conservative strategy to adopt. If no structure is evident in behavior at a molecular level, the molar measure can be reconstituted from its molecular parts. If, on the other hand, the molecular measure suggests a structure of behavior incompatible with its molar counterpart, behavior can be characterized in molecular terms. Such an approach maximizes the likelihood that the level of analysis corresponds with the level at which behavioral processes exercise control.
References
Anger, D. The dependence of interresponse times upon the relative reinforcement of different interresponse times. Journal of Experimental Psychology, 1956, 52, 145-161.
Autor, S. M. The strength of conditioned reinforcers as a function of the frequency and probability of reinforcement. In D. P. Hendry (Ed.), Conditioned reinforcement. Homewood, Ill.: Dorsey Press, 1969.
Baum, W. M., & Rachlin, H. C. Choice as time allocation. Journal of the Experimental Analysis of Behavior, 1969, 12, 861-874.
Blough, D. S. Interresponse time as a function of continuous variables: A new method and some data. Journal of the Experimental Analysis of Behavior, 1963, 6, 237-246.
Brown, P. L., & Jenkins, H. M. Auto-shaping of the pigeon's key-peck. Journal of the Experimental Analysis of Behavior, 1968, 11, 1-8.
Catania, A. C. Self-inhibiting effects of reinforcement. Journal of the Experimental Analysis of Behavior, 1973, 19, 517-526.
de Villiers, P. Choice in concurrent schedules and a quantitative formulation of the law of effect. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior. Englewood Cliffs, N.J.: Prentice-Hall, 1977.
de Villiers, P. A., & Herrnstein, R. J. Toward a law of response strength. Psychological Bulletin, 1976, 83, 1131-1153.
Findley, J. D. Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1958, 1, 123-144.
Fleshler, M., & Hoffman, H. S. A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 1962, 5, 529-530.
Hawkes, L., & Shimp, C. P. Choice between response rates. Journal of the Experimental Analysis of Behavior, 1974, 21, 109-115.
Hawkes, L., & Shimp, C. P. Reinforcement of behavioral patterns: Shaping a scallop. Journal of the Experimental Analysis of Behavior, 1975, 23, 3-16.
Herrnstein, R. J. Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 1961, 4, 267-272.
Herrnstein, R. J. Secondary reinforcement and rate of primary reinforcement. Journal of the Experimental Analysis of Behavior, 1964, 7, 27-36.
Herrnstein, R. J. On the law of effect. Journal of the Experimental Analysis of Behavior, 1970, 13, 243-266.
Herrnstein, R. J., & Loveland, D. H. Maximizing and matching on concurrent ratio schedules. Journal of the Experimental Analysis of Behavior, 1975, 24, 107-116.
Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943.
Kuch, D. O., & Platt, J. R. Reinforcement rate and
interresponse time differentiation. Journal of the
Experimental Analysis of Behavior, 1976, 26,
471-486.
Menlove, R. L. Local patterns of responding maintained by concurrent and multiple schedules.
Journal of the Experimental Analysis of Behavior, 1975, 23, 309-337.
Mohr, S. E. An experimental investigation of two
models of choice behavior (Doctoral dissertation,
The American University, 1976). Dissertation
Abstracts International, 1976, 37, 1463B. (University Microfilms No. 76-19,461).
Morgan, M. J. Effects of random reinforcement
sequences. Journal of the Experimental Analysis
of Behavior, 1974, 22, 301-310.
Myers, D. L., & Myers, L. E. Undermatching: A
reappraisal of performance on concurrent variable-interval schedules of reinforcement. Journal
of the Experimental Analysis of Behavior, 1977,
27, 203-214.
Nevin, J. A. Interval reinforcement of choice behavior in discrete trials. Journal of the Experimental Analysis of Behavior, 1969, 12, 875-885.
Shimp, C. P. Probabilistically reinforced choice
behavior in pigeons. Journal of the Experimental
Analysis of Behavior, 1966, 9, 433-455.
Shimp, C. P. Magnitude and frequency of reinforcement and frequency of interresponse times.
Journal of the Experimental Analysis of Behavior, 1968, 11, 525-535.
Shimp, C. P. The concurrent reinforcement of two
interresponse times: The relative frequency of
an interresponse time equals its relative harmonic
length. Journal of the Experimental Analysis of
Behavior, 1969, 12, 403-411. (a)
Shimp, C. P. Optimum behavior in free-operant experiments. Psychological Review, 1969, 76, 97-112. (b)
Shimp, C. P. Sequential dependencies in free-responding. Journal of the Experimental Analysis
of Behavior, 1973, 19, 491-497. (a)
Shimp, C. P. Synthetic variable-interval schedules
of reinforcement. Journal of the Experimental
Analysis of Behavior, 1973, 19, 311-330. (b)
Shimp, C. P. Short-term memory in the pigeon:
The previously reinforced response. Journal of
the Experimental Analysis of Behavior, 1976, 26,
487-493.
Shull, R. L., & Pliskoff, S. S. Changeover delay
and concurrent schedules: Some effects on relative performance measures. Journal of the
Experimental Analysis of Behavior, 1967, 10, 517-527.
Siegel, S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill,
1956.
Silberberg, A., & Williams, D. R. Choice behavior
on discrete trials: A demonstration of the occurrence of a response strategy. Journal of the Experimental Analysis of Behavior, 1974, 21, 315-322.
Skinner, B. F. Science and human behavior. New
York: Macmillan, 1953.
Stubbs, D. A., & Pliskoff, S. S. Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of
Behavior, 1969, 12, 887-895.
Weiss, B. The fine structure of operant behavior.
In W. N. Schoenfeld (Ed.), The theory of reinforcement schedules. New York: Appleton-Century-Crofts, 1970.
Williams, D. R. The structure of response rate.
Journal of the Experimental Analysis of Behavior, 1968, 11, 251-258.
Received January 20, 1978
Revision received June 8, 1978