
The structure of choice

1978, Journal of Experimental Psychology: Animal Behavior Processes

Journal of Experimental Psychology: Animal Behavior Processes
1978, Vol. 4, No. 4, 368-398

The Structure of Choice

Alan Silberberg, Bruce Hamilton, John M. Ziriax, and Jay Casey
The American University

Both Nevin (1969) and Shimp (1966) found on different choice procedures that pigeons equate (match) the proportion of their choices to the proportion of reinforcers each choice delivers. Their results differed in terms of the order of successive choices: Shimp found pigeons ordered successive choices so as to maximize the reinforcement rate, whereas Nevin found no evidence of such an ordering. Experiment 1 replicated both studies and found in both: (a) matching relations and (b) sequential dependencies of choice that corresponded with Shimp's maximizing prediction. The next three experiments studied the order of choices in three other choice procedures: (a) concurrent variable-interval schedules with a changeover delay, (b) concurrent variable-interval schedules without a changeover delay, and (c) concurrent-chains schedules. In all of these procedures, control of choice at the level of the response sequence was evident. The major features of the data from all four experiments were attributed to two molecular processes: response perseveration and reinforcement maximization. This evidence for a microstructure of choice suggests that the molar matching law is not isomorphic with the molecular processes governing concurrent performances.

In a study by Herrnstein (1961), pigeons chose between two response keys, each associated with an independent variable-interval (VI) schedule of food reinforcement. A changeover delay (COD) that specified a minimum interval during which reinforcement was unavailable following a switch of choice between keys was used to minimize interaction between these schedules.
Herrnstein found with several pairs of VI schedules that the proportion of responses to a key (responses to a key divided by response total to both keys) equaled or "matched" the proportion of reinforcements that that key delivered (reinforcements to a key divided by reinforcement total to both keys). These matching relations have proven to be of considerable species and procedural generality (see de Villiers, 1977, for a review). To present just one example, Shull and Pliskoff (1967) found matching using a choice procedure different from Herrnstein's, using rats instead of pigeons and brain stimulation instead of grain as the reinforcer. This generality, in conjunction with the elegant characterization of choice offered by the matching law, accounts for much of the current research interest in concurrent performances.

This research was supported by National Institute of Mental Health grant MH22881 to The American University. This report was written during A. Silberberg's sabbatical stay at the University of Sussex, England. He thanks them for their courtesy and support. Bruce Hamilton is now at the Walter Reed Army Institute of Research, Washington, D.C. Requests for reprints should be sent to A. Silberberg, Department of Psychology, 321 Asbury Building, The American University, Washington, D.C. 20016.

Copyright 1978 by the American Psychological Association, Inc. 0097-7403/78/0404-0368$00.75

There is, however, also an important theoretical reason for studying choice behavior. As Baum and Rachlin (1969) have argued, the generality of matching relations makes the relative-rate measure a useful metric for defining a reinforcer's relative value. In their study, pigeons were given a choice between two behaviors: standing at one or the other end of an elongated chamber. Each of these behaviors was reinforced according to different schedules of VI reinforcement.
They found that the ratio of time spent standing on either side equaled the ratio of reinforcements these behaviors produced. The more general possibility is that the time spent in one of several behaviors equals the relative value of that behavior. This equation of relative response rate or time with relative reinforcing value is a bold extension of the matching law and suggests that choice theorists have answered a historically significant question in the psychology of learning: What is the appropriate measure of the strength of a reflex (e.g., see Herrnstein, 1970; Hull, 1943; Skinner, 1953)? The answer is relative response rate.

This progress notwithstanding, a possible problem should be acknowledged: Relative response rate is an average of an organism's individual choices. It must be determined whether the relative-response-rate statistic is consistent with the choices defining it. For the relative-response-rate measure to be fully descriptive of the psychological process of choice allocation, successive choices must not show orderly sequential dependencies. If they did, relative response rate could be a theoretically misleading and empirically impoverished measure, with matching occurring at the molar level of relative response rate, yet without the more molecular response sequences composing that measure conforming with the matching law (e.g., see Shimp, 1969b). For the same reason, sequential dependencies among choices would do damage to Baum and Rachlin's notion of choice as an assay of relative reinforcing value. Were successive choices sequentially dependent, the molar relative-frequency measure would not index any relation between the strengths of behavior that different reinforcement conditions maintain; rather, this measure would only be an average of different choices controlled by a molecular process based on the order of prior choices.
Two studies (Nevin, 1969; Shimp, 1966) have tested whether matching, defined in terms of relative rate, is a consequence of discernible regularities in the order of successive choices. In the Shimp study (Experiment 3), a center-key response illuminated two side keys and occasionally assigned a reinforcement with unequal probability to one of the two keys. Once an assignment was made to a key, no further assignments were possible until reinforcement was delivered. Each side-key response darkened the choice keys and reilluminated the center key. Shimp found that pigeons matched; that is, they partitioned their choices between keys so that the proportion of total responding to a key equaled the proportion of total reinforcement that that key provided. Shimp then carried this analysis one step further: He recorded the order of all choices falling between successive reinforcements. He found that these interreinforcement response sequences showed several correspondences with the changes each choice induced in the relative probability of reinforcement. Based on this finding, he argued that matching at the level of relative response rate was a consequence of a more molecular process: control of successive choices by local reinforcement probabilities. Shimp found pigeons' behavior was well described by a reinforcement-optimizing strategy he called momentary maximizing: At each instance a choice is made, that choice is allocated to whichever key is more likely to provide reinforcement. In his study, in which reinforcement for key-A responses was three times as likely as for key-B responses, the momentary-maximizing sequence was AAB (see Footnote 1). Not only were sequential dependencies in choice found that conformed to this sequence, but a computer simulation of choice governed by this rule simulated matching on concurrent VI-VI schedules.
Thus, he argued, matching need not be a consequence of the probabilistic allocation of choices in accordance with their averaged relative frequency of reinforcement. Rather, choice may be governed by a molecular process: that of optimizing the time rate of reinforcement (see also Shimp, 1969b).

Footnote 1. Although the momentary-maximizing principle is conceptually simple (at each moment of choice the pigeon selects whichever alternative is more likely to be reinforced), the mathematical derivation of the maximizing sequence is somewhat more complicated. The interested reader is referred to Shimp (1966, Appendix A).

Data inconsistent with this interpretation were presented by Nevin (1969). He studied pigeons' choice behavior on a discrete-trials procedure. Each trial began with the illumination of both side keys. A response to either key turned off these key lights for an intertrial interval (ITI) of 6 sec. Associated with one key was a VI 1-min schedule and with the other key, a separate and independent VI 3-min schedule. Nevin found the conventional matching result: Approximately three times as many choices were made to the VI 1-min key as were made to the VI 3-min key. Of greater interest was his method for analyzing the order in which choices were made: Instead of recording all combinations of interreinforcement response sequences as was done by Shimp, he recorded only some sequences, namely those terminating in a changeover (i.e., a switch between keys). Nevin found that the probability of a changeover decreased slightly as a function of the number of successive choices to a given key. Because, on concurrent VI-VI schedules, the probability of reinforcement for a changeover increases with run length to a key, he concluded that pigeons were insensitive to this local dimension of differential reinforcement.
More generally, his results suggest that matching can occur at the molar level of relative rate without evidencing, in terms of his changeover measure, any tendency to maximize the time rate of reinforcement at the molecular level of the response sequence. Clearly, the results of Shimp and Nevin appear to be contradictory—the former found evidence that matching was a consequence of control by molecular contingencies; the latter did not. Herrnstein and Loveland (1975) attempted to resolve this apparent incompatibility. They argued that studies that support a molecular interpretation of choice (e.g., Shimp, 1966) use procedures in which the conditions of concurrent reinforcement are "inhomogeneous." By inhomogeneous they meant that the loci of prior choices dramatically influence the relative probability of reinforcement between choice alternatives. Evidence for molecular control obtains when the subject's choices track these inhomogeneities. Hence, finding sequential dependencies among choice is not so much evidence against matching as a demonstration of a subject's sensitivity to local changes in the likelihood of reinforcement. In their view, the proper assay for matching relations is on schedules where the conditions of concurrent reinforcement are more homogeneous (e.g., Nevin, 1969)— that is, where each choice presumably does not produce such large changes in local reinforcement probabilities. A test of this "relative-homogeneity" argument seems straightforward: Statistically independent choice allocation that conforms with matching should produce larger changes in the local reinforcement probability in the Shimp procedure than in the Nevin procedure. In order to test this premise, the Shimp (1966, Experiment 3) and Nevin (1969) choice contingencies were simulated on a computer. In both procedures, key-A responses were three times as likely to produce reinforcement as key-B responses. 
The following two assumptions were made regarding simulated schedule performances: (a) The relative response rate would equal the relative frequency of reinforcement (i.e., matching would obtain), and (b) all choices would be statistically independent of prior choices. These assumptions were fulfilled by assigning 75% of the numbers from a random number stream to key A. A third assumption in these simulations was that responding would occur on all choice trials, immediately after the choice alternatives were presented. After 100,000 trials of simulated choice allocation, the relative frequency of key-A reinforcement was calculated in two ways for each procedure: (a) given the prior choice had been to key A and (b) given the prior choice had been to key B. When the prior response was to key A, the relative frequency of key-A reinforcement for the Shimp and Nevin procedures, respectively, was .43 and .40, and when the prior response was to key B, these statistics were .90 and .88, respectively. As these results clearly show, the relative changes in local reinforcement frequencies appear no more inhomogeneous in the Shimp procedure than in the Nevin procedure. Hence, any explanation of the empirical differences between these two procedures based on differences in schedule homogeneity is suspect.

Silberberg and Williams (1974) attributed the differences in Shimp's and Nevin's results to the differences in ITI duration in these studies. In their experiment, they showed that ITI duration can influence whether control by local reinforcement contingencies is manifest. They hypothesized that the 6-sec ITI of the Nevin study caused subjects to make errors in emitting the momentary-maximizing sequence and that these errors obfuscated the appearance of the underlying molecular control of choice. Nevertheless, matching obtained in the Nevin study because pigeons tended to distribute their errors randomly among the members of the maximizing sequence.
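The independent-choice simulation reported above (75% of choices to key A, 100,000 trials) can be sketched roughly as follows for a Shimp-style contingency. This is a minimal sketch, not the authors' program: the function name, the assignment probability of .25 per trial, the 3:1 split of assignments, and the decision to measure "relative frequency of key-A reinforcement" as the share of obtained reinforcers that came from key A are all assumptions, so the output should not be expected to reproduce the reported values.

```python
import random


def simulate_independent_choice(trials=100_000, p_assign=0.25,
                                p_assign_a=0.75, p_choose_a=0.75, seed=0):
    """Simulate statistically independent choices on a Shimp-style schedule.

    Returns, for each prior choice ("A" or "B"), the share of obtained
    reinforcers on the following trial that were delivered on key A.
    All parameter values are assumptions made for this illustration.
    """
    rng = random.Random(seed)
    held = None   # key currently holding an uncollected reinforcer, if any
    prev = None   # choice made on the previous trial
    # counts[prior choice][key that delivered the reinforcer]
    counts = {"A": {"A": 0, "B": 0}, "B": {"A": 0, "B": 0}}
    for _ in range(trials):
        # A new assignment is possible only when none is being held.
        if held is None and rng.random() < p_assign:
            held = "A" if rng.random() < p_assign_a else "B"
        # Each choice ignores all prior choices (statistical independence).
        choice = "A" if rng.random() < p_choose_a else "B"
        if choice == held:
            if prev is not None:
                counts[prev][choice] += 1
            held = None
        prev = choice
    return {prior: c["A"] / (c["A"] + c["B"]) for prior, c in counts.items()}


result = simulate_independent_choice()
print(result["A"], result["B"])
```

Under these assumptions the key-A share is larger after a key-B choice than after a key-A choice, which is the kind of local change in reinforcement frequency that the relative-homogeneity argument concerns.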
To test this notion, Mohr (1976) replicated Nevin's procedure with two different ITI durations. She found little evidence that ITI duration affected choice sequences with his procedure. Thus, it does not appear that ITI duration alone can account for the absence of sequential dependencies in Nevin's results.

A close reading of Shimp's and Nevin's experiments suggests another possible way of reconciling their results: Perhaps the empirical differences in these studies are due to differences in the way sequential dependencies were measured. In the Shimp study, sequential statistics were recorded among all combinations of choices defining interreinforcement sequences. For example, he recorded the probability of a left (L) given prior choices had been L, right (R), LR, RL, LRL, and so forth. Nevin's primary measure, on the other hand, was the probability of a changeover between key colors as a function of run length to a key color. For example, he recorded the probability of a red-key response given one green-key response, two green-key responses, and so forth. Shimp's measures subsume Nevin's in that Shimp recorded all choice combinations, whereas Nevin only recorded changes in changeover probability. Whether this difference in measures translates into different interpretations of the process of choice is best answered by replicating Shimp's and Nevin's procedures and presenting both types of measures. This was the purpose of Experiment 1.

Experiment 1

In this experiment, choice procedures similar to those of Shimp and Nevin were used. Detailed sequential statistics were recorded and presented in two ways: (a) in a table composed of the probability of an L given different combinations of Ls and Rs (see Shimp, 1966) and (b) in a plot of the probability of a changeover as a function of run length to a key (see Nevin, 1969).
If accounts that attribute the differences between the Shimp and Nevin results to procedural factors are correct (e.g., Herrnstein & Loveland, 1975; Silberberg & Williams, 1974), differences should obtain between procedures for both dependent variables. If our current hypothesis is correct, however (that the empirical differences between these studies are largely a consequence of different methods of data presentation), each dependent variable should produce roughly the same results across procedures.

Method

Subjects. Ten adult male White Carneaux pigeons, deprived to 80% of their free-feeding weights, were used. All birds were experimentally naive at the beginning of the experiment.

Apparatus. Four identical sound-attenuated experimental chambers, electrically connected to a PDP-8/e minicomputer, were used. Each chamber's dimensions were 34.3 × 30.5 × 33 cm. With the exception of the stainless-steel response panel and the wire mesh floor, all surfaces were composed of galvanized steel. The distances from the floor of the chamber to the hopper aperture, to the midpoint of the center key, and to the houselight were, respectively, 9.5 cm, 25 cm, and 30.5 cm. The midpoints of each of the two side keys were displaced 7.6 cm from the midpoint of the center key. Lehigh Valley Electronics response keys, requiring a minimum force of .1 N for operation and transilluminated by Industrial Electronic Engineers multistimulus projectors, were used.

Procedure. After being trained to eat from the food magazine when it was presented unpredictably in time, the pigeons were placed on an autoshaping schedule (Brown & Jenkins, 1968) in which all three keys were transilluminated with white light for 6 sec before the response-independent 4-sec presentation of grain. Successive presentations of the lighted keys were separated by a variable ITI of 30 sec, during which only the houselight was illuminated.
After two 50-trial sessions, during which reliable key pecking was induced, the birds were randomly assigned to two five-subject groups, corresponding to the Shimp (birds B1 through B5) and Nevin (birds B6 through B10) replications.

In the Shimp replication, a trial began with the illumination of the center key with white light. A center-key response darkened that key and illuminated the two side keys, one with green and one with red light. Each choice-key response re-illuminated the center key, either immediately or after reinforcement. The probability of assigning reinforcement to a side key equaled .25. Assignments were three times as likely to one key as to the other. Only one assignment could be made on a choice trial, and that assignment remained available on subsequent trials until obtained. Each daily session ended after 200 trials.

In the Nevin replication, a trial began with the illumination of the side keys, one with red and one with green light. Side-key illumination ended with a response or after 2 sec, whichever occurred first. Separating successive trials was an ITI of 6 sec, excluding reinforcement-cycle time. Associated with each choice key was a VI 1-min or a VI 3-min schedule, the interreinforcement intervals of which were defined according to the formulation of Fleshler and Hoffman (1962). Because the VI schedules were independent, reinforcement could be assigned on any trial to neither, either, or both choice alternatives. Each assignment remained available until obtained. Each daily session ended after 60 reinforcements.

In both replications, choice-key colors were counterbalanced among birds insofar as was practicable; however, the richer reinforcement schedule was associated with the left key for all birds. A houselight was continuously illuminated throughout a session except during the hopper cycle, which was of 4-sec duration. Each replication lasted 70 sessions.
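To illustrate the momentary-maximizing rule under the Shimp-replication contingencies just described (assignment probability .25 per trial, assignments three times as likely to the left key, a single assignment held until collected), the following sketch tracks the probability that a reinforcer is waiting on each key and always picks the likelier key. It is an illustrative Bayesian bookkeeping written for this passage, not the derivation in Shimp (1966, Appendix A); the function name and the assumption that no new assignment occurs while one is held are hypothetical modeling choices.

```python
def momentary_maximizing_sequence(n_trials, p_assign=0.25, p_left=0.75):
    """Choices made after a reinforcer, assuming every choice goes unreinforced.

    State: probability that nothing is held, that a reinforcer is held on
    the left key, or that one is held on the right key.
    """
    none, left, right = 1.0, 0.0, 0.0
    sequence = []
    for _ in range(n_trials):
        # An assignment can be made only while no reinforcer is held.
        left += none * p_assign * p_left
        right += none * p_assign * (1.0 - p_left)
        none *= 1.0 - p_assign
        # Momentary maximizing: choose the key more likely to pay off now.
        choice = "L" if left >= right else "R"
        sequence.append(choice)
        # No reinforcer was delivered, so the chosen key held nothing;
        # renormalize the remaining possibilities.
        if choice == "L":
            total = none + right
            none, left, right = none / total, 0.0, right / total
        else:
            total = none + left
            none, left, right = none / total, left / total, 0.0
    return "".join(sequence)


print(momentary_maximizing_sequence(6))  # prints "LLRLLR"
```

Under these assumptions the state after the fourth unreinforced choice equals the state after the first, so the choices cycle through L, L, R, which agrees with the LLR (AAB) sequence cited in the text.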
All response-sequence combinations from Sessions 31 through 60 were recorded up to a length of seven without regard to the occurrence of a reinforcement. For Sessions 61 through 70, only response sequences between reinforcements were recorded. The all-response-sequence and interreinforcement-response-sequence measures were not recorded concurrently, due to computer memory limitations. All data analysis is based on the sum of Sessions 31 through 60 or 61 through 70.

Results

Table 1 presents the probability of a left-key response as a function of all possible combinations of prior choices through a length of four. These probabilities were calculated by dividing the frequency of occurrence of a particular response sequence terminated by an L by the sum of that sequence plus the same sequence terminated by an R. For example, the probability of an L given RLR equaled the frequency of the sequence RLRL/(frequency of RLRL + frequency of RLRR). Columns 1 through 5 of Table 1 present sequential statistics from birds B1 through B5 of the Shimp replication, and columns 6 through 10 present sequential statistics from birds B6 through B10 of the Nevin replication. These probabilities are based on all choice sequences for Sessions 31 through 60 for each bird. Probabilities were not calculated when fewer than 25 instances of a response sequence occurred. Columns 11 and 13 present these probabilities averaged across the Shimp and Nevin replications, respectively. Columns 12 and 14 present the group data for the Shimp and Nevin replications averaged over Sessions 61-70 in which only interreinforcement response sequences were recorded. Column 15 presents the sequential statistics predicted by the momentary-maximizing sequence, which was LLR for the Shimp and Nevin replications.
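The calculation behind each Table 1 entry can be sketched as below for the all-trials analysis. The function names and the string encoding of choices are hypothetical, and applying the 25-instance cutoff to the summed count of the two terminated sequences is one reading of the text rather than a confirmed detail.

```python
from collections import Counter


def sequence_counts(choices, max_len=5):
    """Count every run of consecutive choices of length 1..max_len.

    `choices` is a string of "L" and "R"; sequences are not reset at
    reinforcement, as in the all-trials analysis.
    """
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(choices) - n + 1):
            counts[choices[i:i + n]] += 1
    return counts


def prob_left_given(counts, prior, min_count=25):
    """P(L | prior) = f(prior + L) / (f(prior + L) + f(prior + R))."""
    n_left = counts[prior + "L"]
    n_right = counts[prior + "R"]
    if n_left + n_right < min_count:
        return None  # too few instances to report (assumed cutoff rule)
    return n_left / (n_left + n_right)


# Usage with an artificial record of perfect LLR cycling.
counts = sequence_counts("LLR" * 100)
print(prob_left_given(counts, "RLR"))  # None: RLR never occurs here
print(prob_left_given(counts, "LL"))   # 0.0: an R always follows two Ls
```

A max_len of 5 covers priors of up to four choices plus the terminating response, which is the range Table 1 reports.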
Asterisks refer to the theoretically nonoccurring interreinforcement response sequences of a subject following the momentary-maximizing rule (see Footnote 2). The relative response rate and relative reinforcement frequency for all individual and group data are presented at the bottom of Table 1.

Footnote 2. These momentary-maximizing predictions are correct for all data based on interreinforcement response sequences (columns 12 and 14 of Table 1); however, they vary somewhat for data based on all trials (e.g., columns 11 and 13 of Table 1). This variance is due to the fact that response sequences are not reset when reinforcement occurs in the all-trials analysis.

If choices were statistically independent, the sequential statistics for individual subjects in Table 1 would all closely approximate the subjects' relative response rates. This test for independence was clearly violated for all subjects. In fact, orderly sequential dependencies did develop for all birds. For example, the probability of an L given L was lower for all birds than an L given R. Moreover, the sequential statistics for the Shimp and Nevin replications show a high degree of correspondence: The Spearman rank-order correlation between their group data (columns 11 and 13) is .92 (Siegel, 1956). There is also some correspondence between the predictions of the momentary-maximizing rule and the obtained sequential statistics for each group. To illustrate this point, compare the sequential statistics from the 7 sequences in which maximizing predicts the subject will always make an L (a 1.0 sequence) with the 3 sequences in which the subject will never make an L (a .0 sequence). In terms of the all-trials data from the Shimp procedure (column 11), 5 of the 7 1.0 sequences are higher than the highest .0 sequence, and 2 of the 3 .0 sequences are the lowest of the 10 sequences under consideration.
This effect is even more clear-cut in the Nevin data (column 13): The probability of an L for all 1.0 sequences is higher than the probability of an L for all .0 sequences. Although these correspondences show that momentary maximizing is of some value in describing changes in sequential statistics in both studies, it must also be noted that the dependencies that obtained were sometimes substantially different from the momentary-maximizing prediction. Particularly at variance with the momentary-maximizing rule was the finding that left-key responses were likely even when a right-key response was predicted (e.g., see probability of L given LL).

As regards relative response rate, the quality of matching was roughly comparable between birds in the Shimp and Nevin replications: The mean absolute deviation from matching in the former group was 7.5%, and in the latter group it was 6.4%. As can be seen at the bottom of Table 1, birds in the former group tended, on average, to match, whereas birds in the latter group tended to undermatch.

Figure 1 presents the probability of a changeover to a key as a function of successive choices to the other key. Filled circles refer to left-key runs; open circles to right-key runs. Runs occurring fewer than 25 times are not plotted. In the lower left-hand corner of each panel is the bird number and its relative left-key response rate. Birds from the Shimp replication are in the left-hand column, and birds from the Nevin replication are in the right-hand column. With the exception of left-key runs for B5, all birds showed a progressively greater tendency to remain on a key the longer they had responded on it.
Moreover, in some birds this progressive perseverative tendency was unequal between keys: For B1, B2, and B3 from the Shimp replication and B7, B8, and B10 from the Nevin replication the negative slopes for right-key runs were, in terms of experimenter judgment, clearly greater than for left-key runs; and none of the left-key runs appeared to have greater negative slopes than right-key runs.

Discussion

The results of Experiment 1 may be summarized as follows:

1. When analyzed in terms of Shimp's (1966) primary measure (detailed sequential statistics), similar sequential dependencies of choice obtained for both replications (Table 1). These dependencies showed some correspondence with the predictions of the momentary-maximizing rule, a result consistent with Shimp's earlier findings.

2. When analyzed in terms of Nevin's primary measure, the probability of a changeover as a function of run length to a key (Figure 1), the data were still similar for both replications. Moreover, some of these plots showed a progressive tendency toward perseveration; that is, the likelihood of switching keys decreased the longer a subject responded to a key, a result consistent with Nevin's earlier work.
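Nevin's changeover measure, as summarized in the second point above, can be sketched as follows. The function name and the string encoding of choices are hypothetical, and the rule of dropping run lengths reached fewer than 25 times follows the plotting criterion stated for Figure 1; the sketch assumes a non-empty record.

```python
from collections import defaultdict


def changeover_by_run_length(choices, min_count=25):
    """P(switch keys | current run of n choices to a key), per key and n.

    `choices` is a non-empty string of "L" and "R". Returns a dict mapping
    (key, run_length) to the proportion of such runs that ended with a
    changeover on the next choice.
    """
    opportunities = defaultdict(int)  # times a run of this length was followed by a choice
    switches = defaultdict(int)       # times that next choice was to the other key
    run_key, run_len = choices[0], 1
    for nxt in choices[1:]:
        opportunities[(run_key, run_len)] += 1
        if nxt != run_key:
            switches[(run_key, run_len)] += 1
            run_key, run_len = nxt, 1
        else:
            run_len += 1
    return {k: switches[k] / n for k, n in opportunities.items() if n >= min_count}


# Usage with an artificial record of perfect LLR cycling.
result = changeover_by_run_length("LLR" * 100)
print(result[("L", 1)], result[("L", 2)], result[("R", 1)])  # 0.0 1.0 1.0
```

For a strict LLR cycle the left-key changeover probability rises from 0 to 1 with run length, the opposite of the perseveration pattern the birds showed.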
Table 1
Individual and Group Data in the Shimp and Nevin Procedures, and Momentary-Maximizing Prediction

Column headings: (1)-(5) Shimp procedure subjects B1-B5; (6)-(10) Nevin procedure subjects B6-B10; (11) Shimp procedure mean, all trials; (12) Shimp procedure mean, trials between reinforcements; (13) Nevin procedure mean, all trials; (14) Nevin procedure mean, trials between reinforcements; (15) momentary-maximizing prediction.

Probability of a left given: L R LL LR RL RR LLL LLR LRL LRR RLL RLR RRL RRR LLLL LLLR LLRL LLRR LRLL

Columns (1)-(10):
.68 .90 .67 .92 .69 .73 .69 .93 .69 .82 .63 .91 .77 .50 .71 .94 .67 .84 .63 .53 .89 .48 .90 .58 .87 .46 .90 .58 .87 .49 .89 .64 .84 .50 .90 .58 .86 .49 .64 .40 .97 .57 .97 .76 .84 .57 .98 .76 .96 .57 .93 .80 .30 .59 .98 .79 .91 .57 .84 .32 .84 .45 .84 .36 .84 .43 .85 .30 .84 .54 .80 .41 .81 .44 .86 .30 .89 1.00 .88 .99 .98 - .88 .99 .98 .58 .83 .53 .83 .64 .84 .56 .82 .63 .85 .49 .84 .67 .79 .58 .83 .64 .86 .51 .63 .81 .58 .82 .71 .77 .60 .83 .69 .78 .55 .80 .76 .75 .62 .84 .69 .80 .56 .67 .83 .66 .85 .70 .76 .69 .86 .70 .77 .60 .82 .69 .71 .69 .86 .70 .81 .61 .68 .73 .71 .72 .60 .74 .72 .68 .64 .80 .69 .79 .51 .59 .73 .67 .64 .85 .68 .67 .78 .66 .79 .68 .72 .69 .81 .68 .78 .62 .77 .67 .56 .71 .83 .69 .78 .64 .94 .94 - .87 .99 .98 - .94

Columns (11)-(12):
.66 .90 .67 .91 .64 .83 .72 .93 .64 .86 .56 .88 .62 .69 .77 .94 .68 .86 .57 .59 .93 .54 .95 .60 .76 .52 .97 .61 .78 .44 .92 .38 .67 .54 .98 .75 .88 .44

Columns (13)-(14):
.64 .80 .66 .82 .63 .81 .66 .77 .66 .81 .67 .80 .58 .81 .65 .65 .68 .81 .68 .83 .59 .74 .82 .62 .74 .80 .84 .58 .76 .64 .79 .53 .70 .85 .84 .64 .74 .57

Column (15):
.50 1.00 .00 1.00 1.00 * 1.00 1.00 * .00 * * 1.00 * .00 * * *

Table 1 (continued)

Probability of
a left given    B1    B2    B3    B4    B5    B6    B7    B8    B9    B10   (11)  (12)  (13)  (14)
LRLR           .91   .90   .94   .84   .94   .84   .81   .84   .78   .79   .88   .93   .82   .82
LRRL           .76   .63   .78   .53   -     .67   .75   .70   .53   .69   .61   .39   .65   .52
LRRR           .68   .81   .50   .80   -     .79   .82   .75   .56   .65   .77   .57   .70   .71
RLLL           .65   .43   .53   .36   .90   .53   .57   .67   .70   .64   .60   .39   .62   .68
RLLR           .91   .90   .98   .85   1.00  .81   .81   .85   .71   .78   .91   .96   .80   .82
RLRL           .72   .57   .65   .42   1.00  .61   .69   .70   .63   .65   .55   .50   .65   .53
RLRR           .81   .88   1.00  .83   -     .84   .75   .71   .67   .80   .85   .63   .76   .73
RRLL           .61   .51   .72   .31   -     .43   .54   .55   .71   .53   .46   .20   .55   .53
RRLR           .86   .76   -     .84   -     .82   .73   .71   .80   .69   .82   .74   .77   .66
RRRL           .78   .69   -     .61   -     .69   .78   .66   .43   .62   .68   .55   .61   .53
RRRR           .35   -     .14   -     -     .81   .54   .65   .64   .45   .52   -     .58   .64

.74 .66 .73 .58 .90

Column (15): 1.00 * *

Relative response rate: .66 .69 .72 .69 .70 .73 .74 .69 .72 .75 .72 .72 .74 .76
Relative reinforcement frequency: .76 .77 .74 .76 .75 .74 .75 .76 .77

Note. Numbers in parentheses represent column headings. B = bird; L = left-key response; R = right-key response.
* Theoretically nonoccurring sequence.

Figure 1. Probability of a changeover between keys as a function of successive choices to a key for each bird (B) in the Shimp (left-hand panels) and Nevin (right-hand panels) procedures. (Relative left-key response rates are in parentheses adjacent to each bird's identification number within a panel.) [Panels: individual data, Shimp method and Nevin method; x-axis: successive trials, 1 through 6; legend: left-key run, right-key run.]

The rationale for replicating the choice procedures of Shimp and Nevin was that these studies appeared to support contradictory explanations of matching. Shimp's procedure showed matching with sequential dependencies, presumptive evidence for control of choice by a molecular process (e.g., momentary maximizing). Nevin's procedure also showed matching but without apparent sequential dependencies of choice, a finding consistent with the view that choice is best described at the molar level of relative response rate. We hypothesized that these differences were more apparent than real, a consequence of using different measures in each study as assays for sequential dependencies of choice. It now appears that this hypothesis was correct. In both replications, analysis in terms of Shimp's measure (sequential statistics) produced data comparable to Shimp's (see Shimp, 1966, Table 3), whereas analysis in terms of Nevin's measure (changeover probabilities) reproduced Nevin's original findings (see Nevin, 1969, Figure 4). Moreover, it is important to keep in mind that Shimp's measure subsumes Nevin's: Sequential statistics are composed of all choice sequences; changeover probabilities are not. Hence, it is in order to conclude that matching occurred with sequential dependencies in both the Nevin and Shimp studies. The reason these dependencies were not manifest in Nevin's original work was that his measure excluded those sequences needed to demonstrate the underlying dependencies in his choice data.

Figure 1 shows that the probability of a changeover often decreased as a function of run length to a key. A similar tendency toward response perseveration was noted by Nevin. He attributed this result to sequential changes that occur in relative reinforcement frequency as run length to a key increases.
Nevin supported this interpretation by showing that plots of obtained relative reinforcement frequency covaried with the probability-of-changeover curves shown in Figure 1 of the present experiment. While Nevin's account appears adequate for choice procedures using independent schedules of reinforcement, it cannot explain why the probability-of-changeover curves from the Shimp replication also decrease with increasing run length. In the Shimp replication, only one reinforcement was assigned at a time, a consequence of which was that relative reinforcement frequency could not vary as a function of run length. For Nevin's interpretation to be generally applicable, flat probability-of-changeover curves would be predicted for subjects in the Shimp procedure. Clearly, this result did not obtain, a fact that calls into question whether sequential changes in relative reinforcement frequency actually control sequential changes in the probability of a changeover. An alternative interpretation of these findings might be that perseveration is simply a frequent concomitant of choice (e.g., see Morgan, 1974).

The sequential dependencies found in the Shimp and Nevin replications bore an imperfect correspondence with the predictions of the momentary-maximizing sequence. This finding shows that momentary maximizing is of some predictive value in describing the sequential characteristics of choice. Nevertheless, inspection of Table 1 suggests two serious problems with a momentary-maximizing account of the data:

1. Momentary maximizing tends to overstate key preference, the predicted conditional probability of an L or R often being 1.0. For example, an R should always follow two Ls according to a momentary-maximizing account, an L should always follow an R, and so forth. As is clear from Table 1, conditional probabilities of an L or an R only infrequently approximated the predicted total preference for a given alternative.

2. Momentary maximizing predicts that some sequences should never occur (see asterisks in Table 1). Yet these sequences not only occurred, but sometimes they occurred with substantial frequency.

The problem with momentary maximizing as an account of choice behavior is the all-or-none nature of its predictions. It does not consider factors (e.g., inattention, forgetting) that might introduce variability into the execution of the momentary-maximizing sequence. An alternative momentary-maximizing model, related to one originally advanced by Silberberg and Williams (1974), does consider these factors. It will be shown that this new model can account for the nonexclusive conditional preferences seen in Table 1 and for the occurrence of theoretically nonoccurring sequences. Moreover, it will produce sequential statistics and probability-of-changeover curves similar to those seen in Table 1 and Figure 1.

Figure 2. Probability of a changeover to the right key as a function of the number of left-key choices (left-key run length). [Curves: simulation, Nevin method, Shimp method; x-axis: left-key run length.]

Momentary maximizing with errors. In this model it is hypothesized that choice behavior is controlled by a momentary-maximizing sequence; however, errors occasionally occur in sequence execution because subjects fail to recall the loci of prior choices. The likelihood of recalling past choices is influenced by schedule factors, such as the duration of the ITI, and behavioral factors, such as inattention. The likelihood of "forgetting" the loci of prior choices is assumed to be equiprobable after all responses. Hence, on those occasions when past behavior is correctly recalled, the momentary-maximizing sequences will have a higher likelihood of reinforcement than all other response sequences.
Occasional failure to recall prior choices will therefore influence the rate at which subjects learn the momentary-maximizing sequence but not whether it is learned (see Silberberg & Williams, 1974). After the momentary-maximizing sequence is learned, this model predicts that strong sequential dependencies will appear when recollection of the loci of prior choices is correct. When past choices are not recalled, pigeons will "guess" what response would have been next in the momentary-maximizing sequence. Choices governed by guessing will be randomly distributed among the response alternatives defining the momentary-maximizing sequence. These statistically independent pecks will tend to minimize the underlying sequential dependencies that characterize choice when errors do not occur. Matching can still occur, however, because each choice in the momentary-maximizing sequence will, over time, receive an approximately equal number of misappropriated responses. The degree to which relative response rate appears to be a consequence of a sequentially dependent process will depend upon the blend of "remembered" and "forgotten" response sequences a given choice procedure supports.

Simulation assay of maximizing with errors. The predictive adequacy of this "error"-based version of Shimp's momentary-maximizing rule can be assessed via computer simulation. Toward this end, the experimental contingencies characterizing the Shimp and Nevin replications (see Method section) were programmed on a computer. Also simulated was a stat bird that was programmed to emit the LLR sequence dictated by momentary maximizing for both of these procedures.
Errors were simulated in the following manner: After each choice there was some probability that the stat bird would forget the locus of its prior response, and when forgetting occurred, the stat bird guessed which response would have been next in the maximizing sequence, each of the three elements in the sequence (first L, second L, and R) being equiprobable. The sole parameter in this model was the probability of forgetting. In the simulations presented here, the forgetting parameter was set at p = .45, the rationale for its selection being that it provided a reasonable empirical fit to the data from the Shimp and Nevin replications. Simulated sessions ended after 200 trials for the Shimp stat bird and after 60 reinforcements for the Nevin stat bird. Data, summed over 20 sessions, were recorded in terms of the measures used in Experiment 1.

Figure 2 presents the probability of a changeover as a function of left-key run length. The Shimp and Nevin stat birds are represented, respectively, by solid lines and closed circles. Several features of these stat birds' data are of interest:

1. The data in Figure 2 establish that essentially flat probability-of-changeover curves are not incompatible with the idea that choice is controlled by a maximizing-type response strategy. Consequently, Nevin's (1969) failure to find positive slopes in terms of this measure does not preclude molecular control of choice in his study.

2. These simulated performances replicate an important aspect of the changeover-probability curves from Experiment 1: The simulated curves for the Shimp and Nevin methods are similar in form, despite between-study procedural differences. This result is consistent with the thesis advanced earlier that the different interpretations lent Shimp's and Nevin's data were more a consequence of differences in measures than differences in the contingencies of their concurrent schedules.

3. Except for the tendency toward progressive perseveration seen in several subjects' curves in Figure 1, these curves seem qualitatively similar to the actual findings from the Shimp and Nevin replications. This result suggests (a) that maximizing with errors is a reasonable first approximation to modeling choice allocation in these two studies and (b) that possibly the addition of a perseveration element to the model may improve the correspondence between simulated and actual performances.

The simplest model on which the matching law might be based attributes matching relations to the independent selection among choice alternatives (see Herrnstein, 1961, p. 270). Yet, in terms of a probability-of-changeover analysis, this response-independence interpretation offers no more effective a description of the Shimp and the Nevin replication data than does maximizing with errors. The reason is clear-cut: With a forgetting parameter of .45, both maximizing with errors and stochastic choice allocation generate essentially flat probability-of-changeover curves. Thus, even though many curves from Figure 1 have near-zero slopes, these data cannot be used in support of the idea that successive choices are statistically independent. In fact, a maximizing-with-errors account of choice has significant predictive advantages over accounts based on either momentary maximizing or the statistical independence of choice, and these advantages become apparent when these different models are compared in terms of the sequential statistics they generate. Table 2 makes these comparisons.
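The error rule used for the stat bird can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: it generates only the stream of choices and omits the reinforcement contingencies and session-termination rules of the Shimp and Nevin procedures. The function names, the random seed, and the 60,000-trial length are hypothetical choices made for the example.

```python
import random
from collections import Counter

SEQUENCE = "LLR"  # momentary-maximizing sequence emitted by the stat bird


def simulate_choices(n_trials, p_forget, rng):
    """Emit n_trials choices from the LLR cycle with occasional forgetting.

    Assumption: forgetting is tested once before every choice; a forgetful
    bird resumes at a uniformly chosen element of the sequence.
    """
    position = 0
    choices = []
    for _ in range(n_trials):
        if rng.random() < p_forget:
            position = rng.randrange(len(SEQUENCE))  # guess: elements equiprobable
        choices.append(SEQUENCE[position])
        position = (position + 1) % len(SEQUENCE)
    return "".join(choices)


def prob_left_given(choices, context):
    """Sequential statistic: P(L | the immediately preceding choices == context)."""
    k = len(context)
    followers = Counter(
        choices[i] for i in range(k, len(choices)) if choices[i - k:i] == context
    )
    total = followers["L"] + followers["R"]
    return followers["L"] / total if total else None  # None: context never occurred


def changeover_prob_by_run_length(choices, key="L", max_len=5):
    """P(changeover after exactly n successive choices of `key`), per opportunity."""
    runs, run = [], 0
    for c in choices:
        if c == key:
            run += 1
        elif run:
            runs.append(run)  # a run ended by a changeover
            run = 0
    result = {}
    for n in range(1, max_len + 1):
        reached = sum(1 for r in runs if r >= n)
        ended = sum(1 for r in runs if r == n)
        result[n] = ended / reached if reached else None
    return result


rng = random.Random(0)
strict = simulate_choices(9, 0.0, rng)       # no forgetting: pure LLR
noisy = simulate_choices(60_000, 0.45, rng)  # forgetting parameter from the text
```

With the forgetting parameter at 0 the sketch reproduces strict momentary maximizing: an L never follows LL, and RR never occurs. Under the assumptions above, the expected probability of an L after an R is (1 - p) + 2p/3 and after a single L is (1 - p)/2 + 2p/3, that is, .85 and .575 at p = .45.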
Table 2 presents the probability of an L following different choice combinations for the Shimp (columns 2 through 5) and Nevin (columns 6 through 9) procedures. Columns 2 and 6 present the sequential statistics that obtained for each procedure in Experiment 1. Columns 3 and 7 present the sequential statistics predicted by maximizing with 45% forgetting. Columns 4 and 8 present the momentary-maximizing predictions for the Shimp and Nevin procedures. These predictions were determined by setting the forgetting parameter in the maximizing-with-errors model at 0%. At 0% forgetting, simulated behavior corresponds exactly with momentary maximizing. The response-independence prediction (columns 5 and 9) states that the probability of an L is unaffected by prior choices. Hence, its prediction equals the relative left-key response rate for the Shimp and Nevin groups in Experiment 1. Asterisks refer to sequences that did not occur during simulations.

Table 2
Comparison of Sequential Statistics From Shimp and Nevin Procedures

(2) (3) (4) (5) (6) (7) (8) (9) L R .66 .90 .58 1.00 .67 .91 .73 .73 .73 .73 .73 .73 .73 .73 .73 .73 .73 .73 .73 .73 .64 .80 .63 .81 .66 .77 .57 .85 LL LR RL .62 .85 .54 .84 .50 1.00 0.00 1.00 1.00 * * 1.00 1.00 .69 .69 .69 .69 .69 .69 .69 .69 .69 .69 .69 .69 .69 .69 (1) RR LLL LLR LRL LRR RLL RLR RRL RRR .64 .83 .72 .93 .64 .86 .56 .88 .62 .69 .77 .86 .55 .85 .77 .85 .52 .83 .74 .90 .28 1.00 1.00 * . .37 1.00 1.00 * .25 * * * .66 .81 .67 .80 .58 .81 .65 .65 .45 .85 .74 .87 .50 .84 .74 .86 .41 .88 .73 .87 * 0.00 * * *

Note. Numbers in parentheses designate column headings: (1) represents the probability of a left given the responses in the column; (2) represents the Shimp data from Table 1, column 11; (3) represents momentary maximizing with 45% forgetting for the Shimp data; (4) represents the momentary maximizing prediction for the Shimp data; (5) represents the response independence prediction for the Shimp data; (6) represents the Nevin data from Table 1, column 13; (7) represents momentary maximizing with 45% forgetting for the Nevin data; (8) represents the momentary maximizing prediction for the Nevin data; (9) represents the response independence prediction for the Nevin data.
* Sequence that did not occur during simulations.

The inadequacy of a response-independence interpretation (columns 5 and 9) in accounting for the results of Experiment 1 is obvious. This account predicts no changes in sequential statistics; yet inspection of columns 2 and 6 shows the presence of strong sequential dependencies in the Shimp and Nevin replications. The momentary-maximizing account (columns 4 and 8) does a noticeably better job. It often predicts the direction of change in sequential statistics for the Shimp and Nevin data. Its major failings are that it tends to overstate left- or right-key preferences and that it predicts that some frequently occurring sequences will not occur at all (see asterisks). Of the three accounts considered, maximizing with errors comes closest to predicting the sequential statistics obtained in Experiment 1. It does an excellent job of predicting ordinal changes in sequential statistics, even among sequences that should be nonoccurring according to a momentary-maximizing interpretation. Moreover, except for one statistic (the probability of an L given an LLR from the Shimp replication), maximizing with errors corresponds more closely than does momentary maximizing with all the sequential statistics obtained in Experiment 1.

The predictive superiority of maximizing with errors in accounting for the molecular characteristics of choice can be illustrated in another manner: in terms of how well it predicts the frequency of sequences momentary maximizing defines as nonoccurring. Some of these sequences are listed in column 1 of Table 3. Columns 2 and 5 present, respectively, the absolute frequency of these sequences for the Shimp and Nevin subjects summed over Sessions 31-60 in Experiment 1. Columns 3 and 6 present the frequency predicted if successive choices were statistically independent for the Shimp and Nevin replications. Columns 4 and 7 present the frequency predicted if all subjects were following the maximizing-with-errors model with a forgetting parameter of 45%. The frequencies in columns 4 and 7 were adjusted so as to equal the absolute number of responses produced by subjects in the Shimp and Nevin replications. As is clear from Table 3, sequence frequencies predicted by maximizing with errors are closer than those predicted by a response-independence model for all sequences save RRLR and RRRR from the Nevin replication. In addition, maximizing with errors is obviously a more successful account of these sequence frequencies than is a momentary-maximizing model, because momentary maximizing predicts that none of these sequences will occur.

Table 3
Sequence-Frequency Comparison Between Maximizing-With-Errors and Response-Independence Predictions

(1)      (2)    (3)    (4)    (5)    (6)    (7)
RR       923   2501   1582   2675   4138   2114
LRR      758   1816   1359   2041   2842   1830
RLR     2986   1816   2030   3539   2842   3194
RRL      762   1816   1359   2051   2842   1830
RRR      161    681    223    623   1283    284
LLRR     385   1319   1014   1348   1972   1425
LRLR    2698   1319   1682   2812   1972   2702
LRRL     649   1319   1160   1636   1972   1582
LRRR     109    493    200    404    887    248
RLRL    2612   1319   1689   2851   1972   2796
RLRR     369    493    341    684    887    398
RRLL     471   1319   1011   1321   1972   1338
RRLR     287    493    348    725    887    492
RRRL     111    493    200    407    887    248
RRRR      50    188     36     23    216    396

Note. Figures in parentheses designate column headings: (1) represents the theoretically nonoccurring response sequence; (2) represents the frequency from the Shimp replication; (3) represents the response-independence prediction for the Shimp data; (4) represents momentary maximizing with 45% forgetting for the Shimp data; (5) represents the frequency from the Nevin replication; (6) represents the response-independence prediction for the Nevin data; (7) represents momentary maximizing with 45% forgetting for the Nevin data. L = left-key response; R = right-key response.

Forgetting-mode behavior: molar or molecular? The maximizing-with-errors model accommodates within microstructural principles many empirical features of the Shimp and the Nevin replication data. Despite these successes, one might question the necessity of labeling the model itself as an exclusively molecular account of choice. The problem is that, although this model attributes choice allocation solely to molecular factors, its constituent remembering and forgetting processes can be respectively viewed as molecular and molar components of the model: When the subject remembers prior choices, sequential dependencies abound in a fashion conforming with the maximizing sequence and molecular control of choice; when, however, the subject forgets prior choices, subsequent responding is statistically independent and also conforms with the molar version of the matching law. In other words, forgetting-mode behavior can be interpreted in two ways: (a) as following the maximizing sequence, but failing to evidence this fact in sequential statistics due to randomly distributed errors in sequence execution or (b) as following the molar predictions of the matching law. There is no absolute way of resolving whether the molar or molecular interpretation is correct. Why, then, does the model account for forgetting-mode behavior exclusively in molecular terms?
The answer can be found in the Silberberg and Williams (1974) study, from which this model came. In their procedure only changeovers could be reinforced, and the probability of reinforcement for a changeover to the left key exceeded that for the right key. In this case, momentary maximizing (always responding to that alternative more likely to produce reinforcement) dictated strict alternation between keys, a strategy that translates into a relative left-key response rate of .5. Matching relations, on the other hand, predicted a left-key preference because the majority of reinforcements occurred for left-key choices. Silberberg and Williams exposed birds to this choice procedure on a trials basis; the ITI separating successive trials was 1, 22, or 120 sec, depending on the bird. For the 1-sec ITI birds the primacy of molecular control was established in terms of both molecular and molar measures: Changeover-probability curves showed high frequencies of alternation between keys, and the relative left-key response rates equaled .5. For the 22- and 120-sec ITI groups, however, a different picture emerged: In terms of their molecular measures, successive choices appeared to be sequentially independent; nevertheless, the relative left-key response rate, which still approximated .5, consistently undermatched the relative left-key reinforcement frequency. This finding is obviously compatible with the notion that the statistically independent choice allocation is due to errors in executing the molecular LR maximizing sequence; however, it is incompatible with the notion that when the animal forgets the loci of prior choices, its behavior is controlled by a molar variable such as relative response rate.
Because only an exclusively molecular interpretation of the maximizing-with-errors model can account for the data from both the Silberberg and Williams study and the Shimp and Nevin replications, the only generally applicable and parsimonious interpretation of forgetting-mode behavior is that it is molecularly controlled.

Experiment 2

The results of Experiment 1 are of some theoretical interest, for they unify under microstructural principles two studies (Nevin, 1969; Shimp, 1966) that have given incompatible accounts of why matching often obtains on concurrent schedules. As is now apparent, matching in both of these studies was likely due to a molecular process, a fact obfuscated in Nevin's experiment by the use of the probability-of-changeover measure as the major assay for molecular control. Moreover, an adaptation of Shimp's momentary-maximizing principle was presented that predicts errors in the execution of the momentary-maximizing sequence. This model, called maximizing with errors, meets two important criteria expected of any new account of concurrent performances: (a) It is superior to extant accounts in modeling the molecular characteristics of the choice data obtained, and (b) it produces matching when all choice sequences are averaged together. At least preliminarily, these results suggest that matching is a consequence of a molecular process that can be described in an orderly way in terms of mechanisms operative at the level of the response sequence.

Table 4
Relative Left-Key Reinforcement Frequencies and Relative Response Rates

          Relative left-key
          reinforcement frequency    Relative left-key
Bird      Scheduled    Obtained      response rate

Condition 1
21        .75          .72           .43
22        .25          .25           .32
23        .50          .46           .31
24        .50          .55           .53

Condition 2
21        .67          .68           .48
22        .33          .36           .43
23        .67          .71           .58
24        .33          .34           .26
This progress notwithstanding, it should be kept in mind that neither Nevin's nor Shimp's procedures are representative of most choice studies because in these experiments successive choices were presented in discrete trials. Thus, the replications just presented still leave one question unanswered: What is the proper description of choice on the more frequently used, free-operant paradigms? In order to answer this question, Experiment 2 studied choice allocation on a free-operant choice procedure similar to that used by Herrnstein (1961) in his prototypic demonstration of matching relations.

Method

Subjects. Four adult male White Carneaux pigeons, deprived to 80% of their free-feeding weights, were used. All birds were experimentally naive at the beginning of the experiment.

Apparatus. The apparatus was the same as in Experiment 1.

Procedure. After magazine training, all birds were exposed to an autoshaping procedure in which only the two side keys served as conditioned stimuli (see Experiment 1). After reliable pecking obtained to both side keys, all subjects were placed in the main experimental procedure. In this procedure, both side keys were illuminated, one with red and the other with green light. Reinforcement was assigned for side-key responses by a single VI 45-sec schedule, defined according to the specifications of Fleshler and Hoffman (1962). When this schedule made reinforcement available, it was assigned by a probability generator to either the left or right key (see Stubbs & Pliskoff, 1969). Once a reinforcement was assigned, the VI schedule was inoperative until that reinforcement was delivered. Every changeover between keys began a 1.5-sec COD clock. The first post-COD response accessed reinforcement if it had been assigned to that key. All responses during the COD were recorded but had no scheduled consequences. Each daily session ended after 50 4-sec reinforcements.
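The contingencies just described can be expressed as a small state machine. The sketch below is illustrative only and rests on assumptions not taken from the Method: intervals are drawn from an exponential distribution rather than from a Fleshler and Hoffman series, the 4-sec reinforcement duration is ignored, and the class name, method name, and `draw_interval` hook are hypothetical.

```python
import random


class SingleVIConcurrent:
    """Two-key choice procedure driven by one VI timer with a changeover delay."""

    def __init__(self, p_left, mean_interval=45.0, cod=1.5, rng=None, draw_interval=None):
        self.rng = rng or random.Random()
        self.p_left = p_left  # probability that a reinforcer is assigned to the left key
        self.cod = cod
        # Assumption: exponential intervals stand in for the Fleshler-Hoffman series.
        self.draw_interval = draw_interval or (
            lambda: self.rng.expovariate(1.0 / mean_interval)
        )
        self.assigned_key = None                # key holding the pending reinforcer
        self.next_setup = self.draw_interval()  # time at which the VI timer elapses
        self.last_key = None
        self.cod_ends = 0.0

    def respond(self, key, now):
        """Register a peck on `key` ("L" or "R") at time `now`; True if reinforced."""
        # The VI timer runs only while no reinforcer is waiting to be collected.
        if self.assigned_key is None and now >= self.next_setup:
            self.assigned_key = "L" if self.rng.random() < self.p_left else "R"
        # Every changeover restarts the COD clock.
        if self.last_key is not None and key != self.last_key:
            self.cod_ends = now + self.cod
        self.last_key = key
        if now < self.cod_ends:
            return False  # responses during the COD have no scheduled consequence
        if self.assigned_key == key:
            self.assigned_key = None
            self.next_setup = now + self.draw_interval()
            return True
        return False
```

Because the timer is stopped while a reinforcer is pending, testing the timer only at response times is equivalent to testing it continuously in this sketch.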
Each subject was exposed to two different conditions, each condition distinguishable on the basis of its relative frequency of reinforcement (see Table 4). Detailed sequential statistics were recorded for all choices and for those choices occurring during the post-COD period. All data presented are based on the last 10 sessions of each 30-session condition.

Results

A comparison in Table 4 between relative response rate and relative reinforcement frequency shows considerable intersubject variability in the quality of matching obtained. Most discrepant with the matching law were B21's data. This subject preferred the right key during both experimental conditions despite the fact that the other key provided a substantially greater frequency of reinforcement. Although all other subjects preferred the appropriate key when exposed to unequal reinforcement frequencies, there was not as close a correspondence between relative rates and reinforcement frequencies as is sometimes found in this procedure (e.g., see Stubbs & Pliskoff, 1969).

Table 5 presents for each bird in each condition the probability of a left-key response following all combinations through lengths of three responses. Conditional probabilities on the left-hand side of the table are based on all responses. Probabilities on the right-hand side exclude all COD responding. These data show powerful sequential dependence of choice whether based on all responses or only on post-COD responding. Generally speaking, the probability of a left response was high if the prior choice was left and was low if the prior choice was right, a finding equivalent to saying that the probability of a changeover between keys was low for all subjects. However, not all features of these response sequences can be explained in terms of low changeover probabilities.
For example, in 16 out of 16 observations, the probability of an L after one L was higher than after two Ls, and in 14 of 16 observations, the probability of an L after two Ls was higher than after three Ls. Also, the probability of an L after an RL was consistently higher than after an LL. Hence, choice was controlled not only by the locus of the prior response, but by the structure of the preceding response sequence.

Figure 3 presents for each condition the probability of a switch between keys as a function of run length to a key. Left- and right-key curves, which are based on all choices, are represented by closed and open circles, respectively. Subject identification, relative response rate, and relative reinforcement frequency are presented in the upper left-hand corner of each panel. Generally speaking, there was a small but consistent tendency for birds to switch keys the longer they had responded to a key.

Table 5
Subjects' Left-Key Conditional Probabilities During All Choices and During Post-COD Choices

All choices Post-COD choices Probability of a left given B21 B22 L R LL LR RL RR LLL LLR LRL LRR RLL RLR RRL RRR .84 .12 .82 .02 .94 .13 .80 .02 .90 .03 .89 .02 .94 .15 .91 .05 .90 .01 .96 .05 .89 .01 .89 .01 .99 .02 .96 .05 Condition 1 .89 .85 .05 .19 .88 .82 .00 .05 .99 1.00 .06 .23 .86 .78 .00 .05 1.00 .99 .00 .12 1.00 .99 .22 .13 .99 1.00 .06 .26 L R LL LR RL .88 .12 .87 .01 .99 RR .14 .84 .01 1.00 .02 1.00 .00 .99 .15 .85 .12 .82 .01 .99 .14 .78 .00 1.00 .01 .98 .07 .99 .16 Condition 2 .88 .85 .16 .06 .86 .83 .04 .00 .97 1.00 .19 .06 .84 .79 .04 .00 .78 1.00 .06 .00 .98 1.00 .06 .75 .98 1.00 .22 .06 LLL LLR LRL LRR RLL RLR RRL RRR B23 B24 B21 B22 B23 B24 .78 .20 .76 .18 .83 .20 .75 .18 .83 .16 .79 .14 .83 .21 .74 .05 .72 .01 .83 .05 .67 .02 .92 .01 .83 .00 .83 .05 .60 .10 .54 .06 .68 .10 .55 .06 .64 .08 .53 .07 .68 .10 .72 .26 .69 .18 .79 .29 .66 .16 .76 .23 .78 .23 .79 .31 .61 .16 .60 .56 .84 .15 .51 .05 .63 .17 .47 .51 .53 .07 .06 .63 .18 .62 .06 .73 .12 .56 .06 .63 .19 .04 .61 .11 .56 .07 .63 .18 .82 .45 .96 .58 .81 .45 .96 .57 .87 .41 .96 .58 .42 .00 .65 .08 .34 .00 1.00 .00 .48 .00 .65 .08

Note. B = bird; L = left-key response; R = right-key response; COD = changeover delay.

Figure 3. Probability of a changeover as a function of run length to a key. (In parentheses adjacent to bird identification numbers are, respectively, the left-key relative response rate and relative reinforcement frequency.) [Panels, first condition: B21 (.43, .72), B22 (.32, .25), B23 (.31, .46), B24 (.53, .55). Panels, second condition: B21 (.48, .68), B22 (.43, .36), B23 (.58, .71), B24 (.26, .34). Horizontal axis: successive choices to a key. Separate curves: left-key run, right-key run.]

Discussion

At the molar level of relative response rate, the choice data of this experiment showed considerable intersubject variability, failing in some instances to conform with the predictions of the matching law. Although the mean deviation from the matching prediction across subjects and conditions was .13, a result larger than sometimes obtains in matching studies (e.g., see Herrnstein, 1961; Stubbs & Pliskoff, 1969), this degree of deviation is by no means unprecedented (see Baum & Rachlin, 1969; Myers & Myers, 1977). Although the relative-rate data were variable, statistics based on response sequences seemed quite consistent across subjects and conditions. The largest effect at this molecular level was that the probability of a changeover was low regardless of the composition of the prior response sequence.
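The reported mean deviation of .13 can be checked against the relative response rates and obtained relative reinforcement frequencies printed for each bird. The short calculation below assumes, since the text does not say so explicitly, that the deviation is the mean absolute difference between those two values across the four birds and two conditions; the variable names are illustrative.

```python
# (relative left-key response rate, obtained relative left-key reinforcement frequency)
# for B21-B24 in each condition, as printed beside each bird in Figure 3
condition_1 = [(0.43, 0.72), (0.32, 0.25), (0.31, 0.46), (0.53, 0.55)]
condition_2 = [(0.48, 0.68), (0.43, 0.36), (0.58, 0.71), (0.26, 0.34)]

# Absolute deviation of each bird's relative response rate from perfect matching
deviations = [abs(rate - reinf) for rate, reinf in condition_1 + condition_2]
mean_deviation = sum(deviations) / len(deviations)
print(round(mean_deviation, 2))  # 0.13
```

On this reading the eight deviations sum to 1.01, and B21 alone contributes .29 and .20 of that total.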
This finding was no doubt a consequence of the COD, which punishes high changeover rates (see Shull & Pliskoff, 1967). Nevertheless, a second-order effect, demonstrating control by larger response sequences, was also discernible. For example, although changeover probabilities were low for all response sequences, they were lower still if subjects had just switched keys. This conclusion is corroborated by the changeover-probability curves in Figure 3. These curves show that the changeover probability is always well below chance levels (.50 on Y axis) at the beginning of a response run to either choice alternative. The consistent rise in these curves with run length suggests that choice was governed not simply by the locus of the last response, but also by some other local process, such as time since a changeover or the structure of the prior response sequence.

Despite COD-dictated constraints on the pattern of choice allocation, molecular control of choice was evidenced in both sequential statistics (Table 5) and in the changeover-probability plots (Figure 3). Nevertheless, the presence of the COD complicated any attempt to identify the concurrent response rule for free-operant choice allocation. Because of this problem, the next experiment studied choice on concurrent VI-VI schedules without a COD. Although undermatching is to be expected (e.g., see Findley, 1958), the sequential properties of free-operant concurrent performances can be analyzed in this procedure unencumbered by the effects of COD contingencies.

Experiment 3

Method

Subjects. Four adult male White Carneaux pigeons, deprived to 80% of their free-feeding weights, were used. All birds were experimentally naive at the beginning of training.

Apparatus. The apparatus was the same as in Experiment 1.

Procedure. After magazine training, all birds were trained to peck both side keys via the autoshaping procedure described in Experiment 2.
The birds were then exposed to the choice procedure described in Experiment 2, except that no COD was used. As in Experiment 2, reinforcement was assigned by a single VI 45-sec schedule; however, in this experiment, the left key was three times as likely to receive an assignment as the right key for all subjects. All data presented are based on the last 5 of 20 daily sessions. All other features of the procedure are the same as in Experiment 2.

Results and Discussion

Table 6 presents the conditional probability of a left-key peck given various sequences of prior responses for each subject and for the average of all subjects. When analyzed only in terms of the prior response, there is little evidence of sequential dependence of choice. However, when larger sequences are considered, particularly response runs to a key, large differences in conditional probabilities become apparent. For example, although there is no difference in the probability of an L given just one L or one R (mean data), large differences appear when comparing five Ls (p = .69) with five Rs (p = .39). These data suggest that choice probability varies not on a response-by-response basis, but in terms of larger behavioral units, such as run length to a key.

Table 6
Left-Key Conditional Probabilities (Experiment 3)

Probability of              Subjects
a left given      B61    B62    B63    B64    M
L                 .59    .60    .75    .68    .69
R                 .60    .59    .71    .82    .69
LL                .55    .61    .70    .68    .66
LR                .59    .64    .67    .85    .71
RL                .64    .59    .93    .70    .75
RR                .61    .50    .81    .67    .65
LLL               .55    .62    .67    .69    .66
LLR               .60    .64    .68    .84    .71
LRL               .64    .57    .93    .68    .74
LRR               .61    .58    .84    .72    .71
RLL               .55    .58    .75    .65    .67
RLR               .57    .66    .63    .88    .71
RRL               .63    .63    .93    .78    .77
RRR               .60    .43    .68    .56    .54
LLLL              .57    .64    .67    .71    .67
RRRR              .54    .36    .60    .50    .46
LLLLL             .56    .66    .69    .73    .69
RRRRR             .51    .29    .62    .49    .39
LLLLLL            .51    .67    .71    .74    .70
RRRRRR            .32    .26    .55    .48    .31

Note. L = left-key response; R = right-key response.
If choice allocation in the present experiments is governed by the lengths of response runs to a key, the goal of understanding the dynamics of concurrent performances is complicated considerably. On the one hand, choice on trials procedures, such as those used by Shimp (1966) and Nevin (1969), is well predicted by maximizing rules; on the other hand, at least one free-operant choice procedure, that of concurrent VI-VI schedules, is controlled by a fundamentally different molecular rule, possibly run lengths to a key. Nor is run length the only plausible variable accounting for the data in Table 6. Possibly, choice was controlled by the time since the last changeover. If time (and not response number) in the presence of a key controls choice, then run-length variations in changeover probability are epiphenomenal correlates of variations in the relative frequencies of interchangeover times (ICTs). In order to evaluate this second possibility, Figure 4 presents in .25-sec classes the distribution of times between changeovers for each key on a "per opportunity" (see Anger, 1956) basis. Each subject's left- and right-key distributions are presented on the left- and right-hand sides of the figure. The bottom panels present these distributions averaged across all subjects. Each subject's relative left-key response rate is presented in parentheses beside its identification number.

As regards relative rates, it was anticipated that all subjects would undermatch because no COD was used in this experiment. Given this expectation, the quality of matching obtained was surprisingly good: Two birds (B63 and B64) had relative rates closely conforming with the matching prediction, the other two birds (B61 and B62) undermatched, and the mean deviation from the matching prediction was only .06.
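A "per opportunity" distribution expresses, for each time class, the probability that a changeover occurs in that class given that the bird has stayed on the key at least that long. The function below is a minimal sketch of that calculation, assuming the ICTs for one key are available as a list of durations in seconds; the function name and the `n_bins` default are hypothetical.

```python
def per_opportunity(icts, bin_width=0.25, n_bins=40):
    """Conditional probability of a changeover in each time class.

    For each class, the numerator counts interchangeover times (ICTs) that
    ended in the class; the denominator counts ICTs that lasted at least
    until the start of the class (the opportunities to change over there).
    """
    ended = [0] * n_bins
    for ict in icts:
        b = int(ict // bin_width)
        if b < n_bins:
            ended[b] += 1  # ICTs longer than the last class stay in the denominators
    probs = []
    at_risk = len(icts)
    for count in ended:
        probs.append(count / at_risk if at_risk else None)  # None: no opportunities left
        at_risk -= count
    return probs
```

For example, the four ICTs 0.1, 0.3, 0.3, and 0.6 sec give probabilities of 1/4, 2/3, and 1 for the first three .25-sec classes. A flat curve on this measure would indicate that the tendency to switch does not depend on time since the last changeover.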
As regards the ICT distributions from this experiment, the ICT curves from Figure 4 show that (a) there is a modal time for staying on a key before switching, (b) the mode is either the same for both keys (B62, B64) or longer for the preferred key (B61, B63), and (c) the variance in ICTs is generally substantially less for the unpreferred key than for the preferred key. The differences in distribution variances for each key show that when subjects responded to the unpreferred key, it was for an essentially fixed period of time. For example, the large majority of B64's ICTs in the presence of the unpreferred key fell between .75 and 1.75 sec. When not making these brief samples of the unpreferred key, the bird responded in the presence of the preferred key, that key's ICTs being more variable.

In Figure 4 we see that pigeons tend to spend a particular period of time in the presence of a key before switching. This finding raises the possibility that choice is governed by the duration of successive time allocations to a key and not by run length to a key. Because conditional probabilities of a changeover vary in an orderly way whether based on responses (Table 6) or time (Figure 4), it is impossible to discern whether response number or time in the presence of a key is the likely factor governing choice allocation in this experiment. Moreover, the data from Figure 4, like those from Table 6, violate the predictions of the maximizing-with-errors account presented in Experiment 1. Because responses and time covary in discrete trials, one need only replace "response number" (left-key run length) in Figure 2 with "time" in arbitrary units to assess the adequacy of maximizing with errors in interpreting the data from Figure 4.
The tendency of Figure 4's curves to have a modal ICT is not predicted by the curves in Figure 2; nor can it be consistent with a simple optimizing strategy because decreases in any portion of an ICT curve are incompatible with maximizing the time rate of reinforcement.

The purpose of the present experiment was to assess the generality of the maximizing-with-errors account developed in Experiment 1. The findings show that the generality of this account is limited. In particular, the results raise the possibility that choice is largely due to a maximizing-with-errors process in some situations (i.e., discrete-trials procedures as used in Experiment 1) but not in others (e.g., free-operant procedures as used in the present experiment). Moreover, a new question emerged from the analysis of the ICT distributions of Figure 4: Are local changes in the pattern of choice allocation governed by the loci of prior choices as was suggested in Experiment 1, or are they governed by the time since a changeover?

Figure 4. Conditional probability of a changeover as a function of time since the last changeover in .25-sec units for the variable-interval (VI) 60-sec key and the VI 180-sec key. (Adjacent to bird identification numbers and in parentheses is the relative response rate to the VI 60-sec key. The bottom pair of panels present the mean performances of all birds.) Panels: B61 (.59), B62 (.60), B63 (.74), B64 (.72), and the mean of all birds. Abscissa: time between changeovers (in sec).

Two-state choice model. Quite obviously, a new model is needed, one that successfully predicts the results of Experiment 1, as well as the major features of the ICT distributions from Figure 4 and the changeover-probability functions from Table 6.
Toward this end, four models were evaluated by computer simulation, each simulation based on an alternative response rule. Two of the simulations were based on the response-independence and maximizing-with-errors models presented in Experiment 1. As in Experiment 1, the stat bird was programmed to emit the LLR maximizing sequence with some probability of failing to "recall" the prior element in the response sequence. When forgetting occurred, the stat bird guessed what the prior element had been, these guesses being equally distributed among the three elements defining the maximizing sequence. The response-independence model was simulated by setting the forgetting parameter at p = 1.0; because this stat bird always forgets its place in the maximizing sequence, the relative response rate obtained should show no sequential dependence of choice. For the stat bird in the maximizing-with-errors simulation, the forgetting parameter was arbitrarily set at p = .45, the same value as in Experiment 1.

The other two simulations retained response independence and maximizing with errors as the basic response rules; however, a second element was added, that of response perseveration. These last two models assume pigeons' choice behavior can be described by two states, one state being either response independence (Simulation 3) or maximizing with errors (Simulation 4) and the other state being response perseveration. When in the response-independence or the maximizing mode, choices (and hence, ICTs) are allocated according to the basic response rules discussed in Experiment 1. However, with p = .05, each response-independence or maximizing-mode response leads to entering the perseveration mode, in which the same key is selected again and again without regard to local reinforcement probabilities. With p = .35, each perseveration-mode response returns program control to the response-independence or maximizing algorithms.
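The response rules just described can be sketched in a few lines. This is a minimal illustration of one reading of the text and not the authors' program: it generates only the order of choices, and it leaves out response timing, the concurrent schedule, and reinforcement. All function and parameter names are hypothetical. Two details are assumptions because the text does not specify them: a forgotten place in the sequence is replaced by a uniformly guessed place, and the place in the sequence is left unchanged while the stat bird perseverates.

```python
import random

MAXIMIZING_SEQUENCE = "LLR"  # the maximizing sequence named in the text


def simulate_choices(n_responses, p_forget, p_enter=0.05, p_exit=0.35, seed=0):
    """Return a string of 'L'/'R' choices from a hypothetical two-state stat bird."""
    rng = random.Random(seed)
    position = 0            # index of the next element of the maximizing sequence
    perseverating = False   # True while in the perseveration mode
    choices = []
    for _ in range(n_responses):
        if perseverating:
            # Perseveration mode: repeat the previous key, then possibly exit.
            choice = choices[-1]
            if rng.random() < p_exit:
                perseverating = False
        else:
            # Basic rule: with probability p_forget the place in the sequence
            # is lost and guessed uniformly among its three elements.
            if rng.random() < p_forget:
                position = rng.randrange(len(MAXIMIZING_SEQUENCE))
            choice = MAXIMIZING_SEQUENCE[position]
            position = (position + 1) % len(MAXIMIZING_SEQUENCE)
            if rng.random() < p_enter:
                perseverating = True
        choices.append(choice)
    return "".join(choices)


def relative_left_rate(choices):
    """Proportion of responses made to the left key."""
    return choices.count("L") / len(choices)


if __name__ == "__main__":
    for label, p_forget, p_enter in [
        ("response independence", 1.0, 0.0),
        ("maximizing with errors", 0.45, 0.0),
        ("independence plus perseveration", 1.0, 0.05),
        ("maximizing plus perseveration", 0.45, 0.05),
    ]:
        seq = simulate_choices(20000, p_forget, p_enter=p_enter)
        print(label, round(relative_left_rate(seq), 3))
```

Under these assumptions, p_forget = 1.0 makes each choice an independent draw with a left-key probability of 2/3, and p_enter = 0 removes the perseveration state, so the four simulations differ only in these two arguments.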
In all four simulations, stat birds responded on a simulated concurrent schedule, the contingencies of which were identical to those of the present experiment (see Method section). Since the present series of simulations is based on a free-operant concurrent-choice procedure, an algorithm was developed for generating a free-operant response rate: After each response, a potential interresponse time (IRT) was selected from a normal distribution with a mean value of .35 sec and a standard deviation of .05 sec. Each potential IRT had a probability of emission equal to .4667. After each peck/no-peck "decision," another potential IRT was chosen. This algorithm produces a multimodal IRT distribution with a period of .35 sec that has the appearance of a damped sine-wave function (see Blough, 1963) and has a mean rate of 80 responses per minute. Finally, all simulations had a minimum ICT of .7 sec and lasted for 3,000 min of simulated session time.

In making these simulations, only one pair of parameters was systematically varied: the probabilities of entering and exiting the perseveration mode. Some 50 simulations were made of the last two models using a wide range of values for each of these parameters. For both response independence plus perseveration and maximizing plus perseveration, the entrance and exit probabilities selected (.05 and .35) provided the closest fit to the actual data, based on experimenter judgment.

Figure 5 compares the mean data from Experiment 3 (top two panels) with that generated by the four models just described (next four rows of panels) in terms of three measures: (a) relative left-key response rates (in parentheses in the left-hand panels), (b) ICT distributions (left-hand panels), and (c) changeover-probability curves as a function of run length (right-hand panels). Each of the four models by which the simulated data were produced is identified in the left-hand panels.
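The stated mean rate of 80 responses per minute follows from the two parameters of the response-rate algorithm described above. If each potential IRT of .35 sec (on average) is emitted with probability .4667, the mean time between responses is .35/.4667, or about .75 sec, which gives 60/.75 = 80 responses per minute. The sketch below checks this arithmetic and simulates the process under one reading of the algorithm, namely that time advances by each potential IRT whether or not a peck is emitted. The function names are hypothetical, and the .7-sec minimum ICT is not modeled.

```python
import random


def expected_rate_per_min(mean_irt=0.35, p_emit=0.4667):
    """Analytic mean response rate: one response per (mean_irt / p_emit) sec."""
    return 60.0 / (mean_irt / p_emit)


def simulate_response_times(session_sec, mean_irt=0.35, sd_irt=0.05,
                            p_emit=0.4667, seed=0):
    """Return the times (in sec) of emitted responses in one simulated session."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    while True:
        # Draw a potential IRT; the guard against negative values is a
        # precaution (they are about 7 SD from the mean and are not
        # mentioned in the text).
        t += max(0.0, rng.gauss(mean_irt, sd_irt))
        if t >= session_sec:
            break
        # Peck/no-peck decision for this potential IRT.
        if rng.random() < p_emit:
            times.append(t)
    return times


if __name__ == "__main__":
    print(round(expected_rate_per_min(), 2))
    pecks = simulate_response_times(session_sec=6000.0)
    print(len(pecks) / 100.0)  # responses per minute over 100 simulated minutes
```

Because a skipped potential IRT simply adds another period of about .35 sec before the next opportunity, obtained IRTs cluster near .35, .70, 1.05 sec, and so on, with each later cluster smaller than the one before it, which is the multimodal form the text describes.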
Left- and right-key runs are respectively signified by closed circles joined by solid lines and by open circles joined by dashed lines.

Both the response-independence and maximizing-with-errors models (second and third rows of panels in Figure 5) fail to account for the averaged data found in the top row. Neither model shows the decreasing postmodal time-allocation functions seen in the ICT distributions; nor does either model show the selective tendency of the right-key changeover-probability curve to decrease with increases in run length.

The addition of the perseveration element appears to have a beneficial effect on the predictive adequacy of both of those models (see bottom two rows of panels). Both models reproduce two important features of the curves from the mean data: (a) decreasing postmodal ICT distributions and changeover-probability curves and (b) larger decreases on the right key than on the left key. Beyond this, maximizing plus perseveration seems to approximate mean performances more closely than does response independence plus perseveration: In terms of the ICT functions, the maximizing-with-errors-and-perseveration model produces modes corresponding with those found for the left and right keys from the mean data; the alternative model does not. In terms of changeover-probability curves (see right-hand panels of Figure 5), maximizing plus perseveration produces a fit qualitatively superior to response independence plus perseveration: Only the maximizing model produces a monotone-decreasing curve for the right key and a nonmonotonic function for the left key, a result corresponding in form with mean performances.

However, an argument more compelling than goodness of fit can be mustered in support of the maximizing-plus-perseveration model: Only this account among those considered can accommodate within a unitary conceptual scheme the findings of Experiments 1 and 3.
Viewed retrospectively, it is apparent that perseveration was operative in Experiment 1, albeit to a smaller degree than in the present experiment. The evidence for perseveration can be seen in the decreases in the changeover-probability curves of Figure 1. These decreases are not simulated by the maximizing-with-errors model from Experiment 1 (see Figure 2) but are when a perseveration element is added (see Figure 5). What distinguishes Experiment 1 from Experiment 3 is not differences in process, only differences in the relative contribution of perseveration to the results obtained.

But what of the results found in Experiment 2? One might argue that even if maximizing plus perseveration is of some value in accounting for choice on some procedures, it still fails when a COD is present. Our thesis is that this failure is due not to the absence of maximizing and perseveration processes in Experiment 2, but to the COD defining an alternate maximizing rule. What distinguishes Experiment 2 from the other experiments reported so far is that in Experiment 2 perseveration is consistent with, rather than antithetical to, maximizing: COD contingencies punish nonperseverative pecking by imposing a delay between the first changeover response and response-dependent reinforcement. Consequently, the low probabilities of a changeover seen in the curves from Figure 3, presumptive evidence of the operation of perseveration, are not incompatible with maximization. In fact, the positive slopes these curves show, which are opposite in value to those from Experiments 1 and 3, suggest control by local reinforcement probabilities in a way consistent with maximizing principles. As the end of a COD approaches (an event necessarily correlated with increasing run length to a key), the probability of a changeover should increase if maximizing is present, because the large majority of reinforcements on choice procedures with a COD occurs during the first few seconds of the post-COD period (see Menlove, 1975). As a consequence, once the post-COD period is sampled, maximizing predicts higher changeover probabilities, the very result obtained. Thus, these curves provide some evidence for maximizing and perseveration. The failure of the maximizing-plus-perseveration model is due not to the absence of these constituent processes, but to the COD defining an alternate maximizing rule.

Response perseveration as found in Experiments 1, 2, and 3 has been noted many times before in studies of choice (e.g., Morgan, 1974; Shimp, 1976). Its stronger operation in Experiment 3 might be explained in the following way: When exposed to trials procedures as in Experiment 1, perseverative tendencies were interrupted somewhat by the imposition of an ITI (Nevin method) or the requirement of a center-key peck (Shimp method). On a free-operant choice procedure, such as that used in Experiment 3, however, there were no response-contingent interruptions of access to the choice alternatives. Consequently, any perseverative tendencies naturally present were unencumbered by the method of choice presentation.

A dividend of selecting the maximizing-plus-perseveration model is that it answers, admittedly by fiat, another question asked earlier: Is choice governed by the loci of prior choices or by time in the presence of a key? Because this maximizing model is response based rather than time based, its adoption favors analysis in terms of response locus in predicting subsequent choice allocation.

Figure 5. Conditional probability of a changeover as a function of the time in .25-sec units since the last changeover and as a function of run length. (The top panel presents the mean performance for each of these measures for all birds. The four pairs of panels presented below present the simulated performances of stat birds following four alternative response rules. The sources for the data in each pair of panels, along with the relative left-key response rate in parentheses, are indicated in the left-hand panels. VI 1 = variable-interval 1-min schedule; VI 3 = variable-interval 3-min schedule.) Legend: left key (VI 1), right key (VI 3). Abscissas: interchangeover time (in 1/4 sec) and run length to a key.

Experiment 4

In Experiment 3, maximization and perseveration were identified as likely concomitants of free-operant choice behavior. The present experiment tests the generality of this conclusion by seeing if these two processes are evident in the major alternative method for arranging free-operant concurrent schedules, that of concurrent chains (Autor, 1969). In the concurrent-chains procedure, each side key is illuminated synchronously with a stimulus signifying the initial links of each chain schedule. Choices to a key occasionally access the terminal-link stimulus for that key. Responding in the presence of each mutually exclusive terminal-link stimulus is reinforced according to some schedule of reinforcement. Autor found that when each terminal-link schedule is a VI, the relative response rate in the presence of the equal-valued VI schedules associated with the initial links matches the relative time rate of terminal-link reinforcement (see also Herrnstein, 1964).

Method

Subjects. Four male White Carneaux pigeons, deprived to 80% of their free-feeding weights, served as subjects. All subjects were experimentally naive at the beginning of training.

Apparatus. The apparatus was the same as in Experiment 1.

Procedure. Prior to exposure to the experimental procedure, birds were trained to eat from the food magazine and to respond to both side keys as described in Experiment 2.
All birds were then placed on concurrent chain VI 120-sec VI 80-sec versus chain VI 120-sec VI 40-sec schedules. Daily sessions began with the illumination of the houselight and side keys with white light; white key lights signified the initial link of each chain schedule. Associated with each white key was an independent, constant-probability VI 120-sec schedule. The first white-key peck following a schedule assignment changed that key's color to either red (left key) or green (right key) and darkened the other key. Responding in the presence of either of these mutually exclusive terminal-link stimuli resulted in 4 sec of access to grain according to either a VI 80-sec (left key) or VI 40-sec (right key) schedule. The houselight was constantly illuminated except during the hopper cycles. Each session terminated after 60 reinforcements. All data presented are based on the last 5 of 20 sessions.

Table 7
Left-Key Conditional Probabilities (Experiment 4)

Probability of                Subjects
a left given       B71    B72    B73    B74     M
L                  .13    .35    .06    .27    .25
R                  .60    .63    .34    .28    .48
LL                 .43    .44    .14    .33    .40
LR                 .61    .62    .35    .38    .55
RL                 .09    .30    .06    .25    .20
RR                 .58    .65    .33    .24    .41
LLL                .73    .53    .10    .36    .53
LLR                .65    .62    .36    .41    .55
LRL                .10    .30    .04    .27    .20
LRR                .61    .67    .46    .38    .55
RLL                .19    .37    .15    .31    .31
RLR                .61    .63    .35    .36    .54
RRL                .08    .29    .07    .23    .19
RRR                .53    .60    .27    .19    .31
LLLL               .91    .60     --    .43    .65
RRRR               .51    .60    .26    .16    .24
LLLLL              .94    .66     --    .38    .73
RRRRR              .46    .54    .23    .14    .19
LLLLLL             .97    .72     --    .31    .80
RRRRRR             .43    .48    .21    .12    .15

Note. L = left-key response; R = right-key response.

Results

Molecular control of choice was evidenced in terms of both sequential statistics (Table 7) and ICT distributions (Figure 6). As regards the former measure, Table 7 presents the probability of an L given various sequences of prior responses for each subject and for the average of all subjects.
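Sequential statistics of the kind reported in Table 7 can be tabulated from a record of successive choices as sketched below. The function name and the toy record are hypothetical, and the sketch assumes that each sequence is written in order of occurrence, so that "LR" means a left-key response followed by a right-key response, with the probability referring to the next response.

```python
from collections import defaultdict


def conditional_left_probs(responses, max_len=3):
    """P(next response is 'L') given each observed prior sequence.

    responses -- string of 'L'/'R' choices in order of occurrence
    max_len   -- longest prior sequence to tabulate (an arbitrary choice)
    """
    counts = defaultdict(lambda: [0, 0])  # prior -> [lefts that followed, total]
    for i in range(1, len(responses)):
        for k in range(1, max_len + 1):
            if i - k < 0:
                break                      # not enough history for this length
            prior = responses[i - k:i]
            counts[prior][1] += 1
            if responses[i] == "L":
                counts[prior][0] += 1
    return {prior: lefts / total for prior, (lefts, total) in counts.items()}


if __name__ == "__main__":
    record = "LLRLLRLLR"  # toy record: a stat bird that never leaves LLR
    probs = conditional_left_probs(record, max_len=2)
    for prior in sorted(probs, key=lambda s: (len(s), s)):
        print(prior, probs[prior])
```

A sequence that never occurs in the record has no entry in the result, which corresponds to the blank cells in the table.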
No sequence probabilities are presented for left-key runs beyond a length of three for B73 because during the last five sessions none of these sequences occurred. Sequential dependencies are evident in most of the choice data in the table, including those based only on one-member sequences. For example, with the exception of B74, all birds were more likely to peck the left key if the prior response had been an L than if it had been an R. Also apparent in the data is a progressive tendency to perseverate on a key as a function of run length; that is, the longer a bird responded to a key, the less likely it was to switch keys.

Figure 6 presents the ICT distribution on a "per-opportunity" basis during the initial links of chain VI 120-sec VI 80-sec schedules (left side of figure) and chain VI 120-sec VI 40-sec schedules (right side of figure). Relative left-key response rates are presented beside subject identification numbers. Averaged data are in the bottom pair of panels. In the present experiment, matching equaled a relative left-key response rate of .33. Two birds (B73 and B74) approximated this value; two others (B71 and B72) undermatched. Times in the presence of the unpreferred key were of brief duration: For all subjects, the modal ICT for the unpreferred key was between 1 and 1.5 sec. Except for B73, all birds also had modal ICTs in the presence of the preferred key. These modes had lower frequencies and were at longer ICTs than the modes for the unpreferred key.

Discussion

In Experiment 3, the microstructure of choice was characterized by a two-process model, one process being reinforcement optimization and the other being response perseveration. By varying the strength of each process, this model accounted for the local patterns of responding on two different concurrent procedures, those using discrete trials (Experiment 1) and those using concurrent VI-VI schedules without a COD (Experiment 3).
The purpose of the present experiment was to test the generality of this account on yet another choice procedure, that of concurrent chains. The issue at hand is whether the constituent processes of this two-state model are readily identifiable in the sequential-statistics data from concurrent chains.

The data from Table 7 provide strong evidence for the operation of both processes. Consider first some predictions of maximizing with errors. Because the choice procedure of this experiment has LRR as its maximizing sequence (assuming equal successive IRTs), this response rule predicts for all values of the forgetting parameter save 1.0 that the probability of an L will be higher following an R than an L, higher following an LR than an RL, and higher following an LLR or an RLR than an LRL or an RRL. All of these ordinal predictions are supported by the data in Table 7 (see Footnote 3).

Footnote 3. Sequences terminating in multiples of the same response (e.g., LRR, RLL) have been excluded from these comparisons because their frequencies are likely to be affected by perseverative responding.

Figure 6. Conditional probability of a changeover as a function of time since the last changeover in .25-sec units during the initial link terminating in a variable-interval (VI) 80-sec schedule and in a VI 40-sec schedule. (Adjacent to bird identification numbers is the relative response rate during the initial link terminating in the VI 80-sec schedule. Mean performances for all birds are presented in the bottom pair of panels.) Panel columns: VI 80-sec terminal link and VI 40-sec terminal link. Abscissa: time between changeovers (in sec).

The operation of response perseveration, the second element in this model, is no less obvious in the data from Table 7.
For all subjects, the probability of an L decreases with the number of successive Rs, a result characteristic of the operation of perseveration (see Figure 5). In fact, despite differences in procedure, the changeover-probability data from the present experiment seem quite similar to those from Experiment 3. Time-based measures also seem similar between these experiments: In both studies, ICT distributions showed that there were modal ICTs, the mode for the less preferred key being of shorter duration and more peaked than for the preferred key.

These empirical correspondences suggest that the microstructure of choice is invariant across several procedural transformations. For all experiments, this microstructure seems well described as an outcome of two separable processes: perseveration and maximization. And with the exception of Experiment 2, in which COD contingencies to some extent specified the order of successive responses, the manner in which these two processes interacted seems consistent with the maximizing-plus-perseveration model.

General Discussion

The matching law states a relation between molar variables: In its definition, choices and reinforcements are each summed over time with consideration given neither to the order in which choices occur nor to the local changes in the likelihood of reinforcement. The apparent wisdom of stating this law in molar terms is reflected in the simple elegance of its definition and in the breadth of its predictive power (e.g., see de Villiers, 1977). Elegance and power are notable accomplishments for any behavioral law, and they no doubt explain the strong emphasis on using molar variables whenever choice has been studied.
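As a worked illustration of the molar relation just stated, the matching prediction for a pair of VI schedules can be computed from their mean intervals. The sketch takes scheduled reinforcement rates as stand-ins for obtained rates, which is an assumption: the matching law is defined over the reinforcers actually obtained, and these can depart from the scheduled values. The function name is hypothetical.

```python
def matching_prediction(interval_a_sec, interval_b_sec):
    """Predicted relative response rate to alternative A under matching.

    Each VI schedule is summarized by its mean interval in seconds, so its
    scheduled reinforcement rate is the reciprocal of that interval.
    """
    rate_a = 1.0 / interval_a_sec
    rate_b = 1.0 / interval_b_sec
    return rate_a / (rate_a + rate_b)


if __name__ == "__main__":
    # Terminal links of Experiment 4: VI 80 sec (left) versus VI 40 sec (right).
    print(round(matching_prediction(80, 40), 2))
    # Concurrent VI 60-sec versus VI 180-sec keys, as in Figure 4.
    print(round(matching_prediction(60, 180), 2))
```

For the Experiment 4 terminal links this reproduces the value of .33 given in the Results. Nothing in the calculation refers to the order of individual choices, which is the sense in which the relation is molar.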
Indeed, with the exception of a very few studies (e.g., Nevin, 1969; Shimp, 1966), work on concurrent schedules has as an often implicit assumption that whatever the processes of choice allocation might be, they should be described in molar terms, as some type of average of responses and of reinforcers. To be sure, many interesting questions have been addressed by these molar analyses (see de Villiers, 1977, for a review), but a fundamental question remains: Are these matching relations, in all their various forms and with all their alternative molar measures, just descriptive statistics, or do they also make contact with an underlying psychological principle?

It has been Shimp's (1966, 1969b) contention that matching relations do not describe a psychological process; rather, they are an unintended consequence of using molar measures, an artifact of averaging across many choices, none of which obeys a matching principle. He and his co-workers have supported this position by showing the generality of molecular control not only in choice procedures, but in certain single-schedule situations as well (e.g., Hawkes & Shimp, 1974, 1975; Shimp, 1966, 1968, 1969a, 1973a, 1973b). Despite these findings, molecular reinterpretations of matching relations have not gained favor, a fact evidenced by the robust and continuing preference for using molar measures in studying choice. In our opinion, two factors have discouraged widespread adoption of a more molecular approach: (a) Nevin's (1969) finding of matching without apparent molecular control of choice, a result suggesting that matching need not be explained in terms of a more molecular process, and (b) an unfortunate tendency for assessing the contribution of molecular processes to matching on choice procedures unrepresentative of those usually used to show matching (e.g., Shimp, 1973a). These two problems are remedied in the present study.
In Experiment 1 it was shown that there was no incompatibility between the results of Nevin (1969) and Shimp (1966) and that the matching relations obtained in both studies were likely due to a similar molecular process. As regards the second problem, Experiments 2, 3, and 4 studied the molecular characteristics of choice on concurrent VI-VI and concurrent-chains schedules, these procedures being the conventional paradigms for studying choice and matching relations. As in Experiment 1, there was a discernible microstructure to choice in each of these three experiments. In sum, the data from these four experiments support the view that matching relations are due to molecular control of choice.

In Experiments 1 and 3, attempts were made to identify the processes constituting this molecular control. In Experiment 1 it was found that the sequential statistics from that experiment were well described by a model that assumed that pigeons always choose that alternative more likely to be reinforced but that they often make errors in the execution of this optimizing strategy. In Experiment 3, a response-perseveration element was added to this model to accommodate the finding in that experiment that pigeons occasionally select the same key again and again in violation of the expectations of maximizing. With these two elements in hand, the model in Experiment 3 was capable of synthesizing, at least qualitatively, major features of the microstructure of choice in all experiments save Experiment 2, in which COD contingencies made perseveration actually part of the maximizing strategy; however, even though a direct application of the maximizing-plus-perseveration model failed in Experiment 2, its constituent elements of maximizing and perseveration were identifiable in the choice data. In fact, the microstructure of choice in all four experiments seemed to be an outcome of differing blends of each of these two processes.
Despite the successes of the two-state model in synthesizing some of the sequential properties of choice, it is not our assertion that the definition of the processes controlling choice is necessarily at hand. To make such an assertion, one would want (a) an exhaustive list of all processes likely to be controlling choice selection and (b) a quantitative assessment of the data variance accounted for by adding or differentially combining those factors under consideration. For these reasons, we consider the present study to be only a first step in delineating the mechanisms of choice. Quite possibly, future research will posit alternative processes that will provide a better empirical fit to larger sets of data than the present model.

However, although some skepticism is in order as regards the model proffered in this study, a weaker claim can be made in its behalf with confidence: Whatever the proper characterization of choice might be, its basis is molecular. This conclusion follows from the finding that in all studies of choice in which sequential data have been recorded, ranging from the "typical" choice procedures of the present study to choice among IRTs (e.g., Shimp, 1973a), a discernible microstructure in the form of statistical dependencies of choice has been regularly uncovered.

This study's consistent evidence for a microstructure of choice denies any isomorphism between the matching law and the processes governing choice allocation. The problem for the matching law is that it describes behavior at a molar level, based on large averages of responses and reinforcers; yet, as is now apparent, individual choices are controlled at a molecular level, by prior choices and local reinforcement probabilities.
That the matching law is so often adequate as a descriptive statistic now appears a happenstance: A small set of molecular response rules, possibly based on maximization and perseveration processes, has sufficient cross-procedural generality to be evident on several alternative choice paradigms. When choice behavior governed by these rules is integrated over time, a fortuitous concordance is uncovered: The relative response rate equals the relative frequency of reinforcement. Nevertheless, these matching relations are averaging artifacts. They appear only because matching is compatible with the underlying structure of choice.

If this analysis is correct, it should be possible to construct a choice procedure in which these molecular rules are incompatible with matching. Such a procedure has, in fact, been arranged by Silberberg and Williams (1974). In their study, only changeovers between choice alternatives were reinforced; yet, the probability of a reinforcement for a changeover differed between keys. If matching was primary in their study, pigeons should have responded more to the key with the higher reinforcement probability. If, on the other hand, choices were governed by a molecular process such as maximizing, pigeons should have strictly alternated their choices because only changeovers were reinforced. Silberberg and Williams found pigeons selected both keys equally often despite unequal reinforcement frequencies, a finding incompatible with matching (see, however, Herrnstein & Loveland, 1975) but compatible with molecular control in the form of maximizing. This result is consistent with the present study's conclusion that molecular processes control choice and that matching is an outcome of this molecular control.

A corollary of this study's attributions of matching to molecular control is that wherever matching has been found, a microstructure of choice should also be evident. This corollary forces consideration of de Villiers' (1977) recent demonstration that Herrnstein's (1970) broadened version of the matching law can account not only for standard choice data, but also for a wide range of single-schedule performances (see also de Villiers & Herrnstein, 1976). Certainly, no comparable generality has yet been demonstrated for molecular accounts of behavior; yet, such a demonstration should be possible if the primacy claimed for molecular control is to be maintained.

A key element in the broadened matching law's success is the assumption that even single-schedule performances involve choice, in this case between an explicit, experimenter-arranged schedule and an implicit, omnipresent "environmental" schedule that reinforces other activities (e.g., grooming). Although this version of the matching law accounts for many findings about which extant molecular models make no statements, its superiority would disappear if molecular accounts were also to advocate that single-schedule procedures are implicitly choice paradigms. A microstructural reinterpretation of matching on single schedules would be that molecular processes govern a subject's choices between the experimenter-arranged schedule and the environmental reinforcement schedule. For example, a molecular analysis might demonstrate sequential dependencies in a subject's choices between the activities of, say, pecking a key and grooming. In the absence of this molecular analysis, it is obviously premature to claim that the predictions of Herrnstein's (1970) broadened matching law can be attributed to response patterning between these explicit and implicit schedules. Our point here is simply that there is no incompatibility between the predictions of this law and the thesis that its predictive success is due to molecular processes.

Conclusions

In their discussion of molar and molecular analyses of choice, Herrnstein and Loveland (1975) note that any claim for the primacy of molecular control would be predicated on the regular demonstration that matching relations had a constituent microstructure. The purpose of the present series of experiments was to determine whether such a microstructure was present. In all four experiments, such a microstructure was uncovered, the form of which could be qualitatively described as the outcome of two molecular processes: maximization and perseveration.

Three implications of these findings come to mind:

1. The matching law is a descriptive statistic, not a psychological principle of choice allocation. Its successes appear due to its fortuitous compatibility with response rules that control choice at the molecular level of the response sequence.

2. The major theoretical accounts that use the matching law as the basis for measuring reinforcing value or response strength (Baum & Rachlin, 1969; Catania, 1973; Herrnstein, 1970) are consequentially weakened. Their dilemma is clear-cut: Choice is controlled not by value or strength, but by prior choices and local reinforcement probabilities.

3. The results of the present study accompany a host of other findings (e.g., see Hawkes & Shimp, 1975; Kuch & Platt, 1976; Weiss, 1970; Williams, 1968) in supporting a renewed emphasis on molecular analysis in studying schedule effects. This conclusion is dictated by the frequent demonstration in the studies cited above that molar and molecular measures offer fundamentally different characterizations of behavior. Despite these demonstrations, the primacy of molar analysis in operant-oriented research persists. The reasons for preferring molar measures are easily guessed at, among them being their historical precedence, their surface simplicity, and their ease of recording. Yet, the results of the present study illustrate a danger in their exclusive use: The very real possibility exists that a molar measure obscures a qualitatively different, molecularly controlled pattern of behavior. For this reason, decomposing standard molar measures into plausible molecular units is a conservative strategy to adopt. If no structure is evident in behavior at a molecular level, the molar measure can be reconstituted from its molecular parts. If, on the other hand, the molecular measure suggests a structure of behavior incompatible with its molar counterpart, behavior can be characterized in molecular terms. Such an approach maximizes the likelihood that the level of analysis corresponds with the level at which behavioral processes exercise control.

References

Anger, D. The dependence of interresponse times upon the relative reinforcement of different interresponse times. Journal of Experimental Psychology, 1956, 52, 145-161.
Autor, S. M. The strength of conditioned reinforcers as a function of the frequency and probability of reinforcement. In D. P. Hendry (Ed.), Conditioned reinforcement. Homewood, Ill.: Dorsey Press, 1969.
Baum, W. M., & Rachlin, H. C. Choice as time allocation. Journal of the Experimental Analysis of Behavior, 1969, 12, 861-874.
Blough, D. S. Interresponse time as a function of continuous variables: A new method and some data. Journal of the Experimental Analysis of Behavior, 1963, 6, 237-246.
Brown, P. L., & Jenkins, H. M. Auto-shaping of the pigeon's key-peck. Journal of the Experimental Analysis of Behavior, 1968, 11, 1-8.
Catania, A. C. Self-inhibiting effects of reinforcement. Journal of the Experimental Analysis of Behavior, 1973, 19, 517-526.
de Villiers, P. Choice in concurrent schedules and a quantitative formulation of the law of effect. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior. Englewood Cliffs, N.J.: Prentice-Hall, 1977.
de Villiers, P. A., & Herrnstein, R. J. Toward a law of response strength. Psychological Bulletin, 1976, 83, 1131-1153.
Findley, J. D. Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1958, 1, 123-144.
Fleshler, M., & Hoffman, H. S. A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 1962, 5, 529-530.
Hawkes, L., & Shimp, C. P. Choice between response rates. Journal of the Experimental Analysis of Behavior, 1974, 21, 109-115.
Hawkes, L., & Shimp, C. P. Reinforcement of behavioral patterns: Shaping a scallop. Journal of the Experimental Analysis of Behavior, 1975, 23, 3-16.
Herrnstein, R. J. Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 1961, 4, 267-272.
Herrnstein, R. J. Secondary reinforcement and rate of primary reinforcement. Journal of the Experimental Analysis of Behavior, 1964, 7, 27-36.
Herrnstein, R. J. On the law of effect. Journal of the Experimental Analysis of Behavior, 1970, 13, 243-266.
Herrnstein, R. J., & Loveland, D. H. Maximizing and matching on concurrent ratio schedules. Journal of the Experimental Analysis of Behavior, 1975, 24, 107-116.
Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943.
Kuch, D. O., & Platt, J. R. Reinforcement rate and interresponse time differentiation. Journal of the Experimental Analysis of Behavior, 1976, 26, 471-486.
Menlove, R. L. Local patterns of responding maintained by concurrent and multiple schedules. Journal of the Experimental Analysis of Behavior, 1975, 23, 309-337.
Mohr, S. E. An experimental investigation of two models of choice behavior (Doctoral dissertation, The American University, 1976). Dissertation Abstracts International, 1976, 37, 1463B. (University Microfilms No. 76-19,461).
Morgan, M. J. Effects of random reinforcement sequences. Journal of the Experimental Analysis of Behavior, 1974, 22, 301-310.
Myers, D.
L., & Myers, L. E. Undermatching: A reappraisal of performance on concurrent variable-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1977, 25, 203-214.

Nevin, J. A. Interval reinforcement of choice behavior in discrete trials. Journal of the Experimental Analysis of Behavior, 1969, 12, 875-885.

Shimp, C. P. Probabilistically reinforced choice behavior in pigeons. Journal of the Experimental Analysis of Behavior, 1966, 9, 433-455.

Shimp, C. P. Magnitude and frequency of reinforcement and frequency of interresponse times. Journal of the Experimental Analysis of Behavior, 1968, 11, 525-535.

Shimp, C. P. The concurrent reinforcement of two interresponse times: The relative frequency of an interresponse time equals its relative harmonic length. Journal of the Experimental Analysis of Behavior, 1969, 12, 403-411. (a)

Shimp, C. P. Optimum behavior in free-operant experiments. Psychological Review, 1969, 76, 97-112. (b)

Shimp, C. P. Sequential dependencies in free-responding. Journal of the Experimental Analysis of Behavior, 1973, 19, 491-497. (a)

Shimp, C. P. Synthetic variable-interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1973, 19, 311-330. (b)

Shimp, C. P. Short-term memory in the pigeon: The previously reinforced response. Journal of the Experimental Analysis of Behavior, 1976, 26, 487-493.

Shull, R. L., & Pliskoff, S. S. Changeover delay and concurrent schedules: Some effects on relative performance measures. Journal of the Experimental Analysis of Behavior, 1967, 10, 517-527.

Siegel, S. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill, 1956.

Silberberg, A., & Williams, D. R. Choice behavior on discrete trials: A demonstration of the occurrence of a response strategy. Journal of the Experimental Analysis of Behavior, 1974, 21, 315-322.

Skinner, B. F. Science and human behavior. New York: Macmillan, 1953.

Stubbs, D.
A., & Pliskoff, S. S. Concurrent responding with fixed relative rate of reinforcement. Journal of the Experimental Analysis of Behavior, 1969, 12, 887-895.

Weiss, B. The fine structure of operant behavior. In W. N. Schoenfeld (Ed.), The theory of reinforcement schedules. New York: Appleton-Century-Crofts, 1970.

Williams, D. R. The structure of response rate. Journal of the Experimental Analysis of Behavior, 1968, 11, 251-258.

Received January 20, 1978
Revision received June 8, 1978