


JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR

ON THE LAW OF EFFECT'

R. J. HERRNSTEIN

HARVARD UNIVERSITY

Experiments on single, multiple, and concurrent schedules of reinforcement find various correlations between the rate of responding and the rate or magnitude of reinforcement. For concurrent schedules (i.e., simultaneous choice procedures), there is matching between the relative frequencies of responding and reinforcement; for multiple schedules (i.e., successive discrimination procedures), there are contrast effects between responding in each component and reinforcement in the others; and for single schedules, there are a host of increasing monotonic relations between the rate of responding and the rate of reinforcement. All these results, plus several others, can be accounted for by a coherent system of equations, the most general of which states that the absolute rate of any response is proportional to its associated relative reinforcement.

A review of the evidence for the law of effect would quickly reveal that the simple notion of "stamping-in" (Thorndike, 1911, e.g., p. 283) does not suffice. Animals do not just repeat the first successful act; they are likely to improve upon it until they find something like the optimal performance. In Thorndike's puzzle box, in the maze, or in Skinner's operant conditioning chamber, animals tend toward faster, easier, and more congenial movements, unless the performances are virtually optimal to begin with. Although some theorists find enough stereotypy to suggest a quasi-mechanical process of stamping-in (e.g., Guthrie and Horton, 1946), others have remained unconvinced (e.g., Tolman, 1948). Something more than the static form of the law of effect is needed for a really persuasive theory. The temptation to fall back on common sense and conclude that animals are adaptive, i.e., doing what profits them most, had best be resisted, for adaptation is at best a question, not an answer. And it is not hard to find evidence that violates both the Thorndikian principle of stamping-in and common-sense notions of adaptation, as the following two examples show.

'The preparation of this paper and the original work described herein were supported by grants to Harvard University from the National Science Foundation, the National Institute of Mental Health (NIMH-15494), and the National Institutes of Health (NIH-GM-15258). This paper is dedicated to B. F. Skinner in his sixty-fifth year, in partial payment for an intellectual debt incurred over the past decade and a half. There are also more recent debts to be acknowledged, which I do gratefully. J. A. Nevin and H. S. Terrace have generously made available unpublished data, some of which have been included. W. M. Baum and J. R. Schneider have criticized and undoubtedly improved the paper, in substance and in style. Reprints may be obtained from the author, Psychological Laboratories, William James Hall, Harvard University, Cambridge, Massachusetts 02138.

Ferster and Skinner (1957) reported that an animal, when shifted from an interval to a ratio schedule, typically showed a change in its rate of responding. As regards stamping-in, the rate should remain unchanged, for ratio schedules reinforce all rates of responding with equal probability (Morse, 1966). Although the deviation from the theory is large and reproducible, its direction is somewhat unpredictable. For example, in an experiment with pigeons (Ferster and Skinner, 1957, pp. 399-407), one subject's rate of responding increased while the other's virtually ceased, when the schedule changed from a variable interval to a variable ratio matched for numbers of responses per reinforcement. While both of these findings (the increase and the decrease in the rate of responding in the shift from interval to ratio schedule) violate the Thorndikian law of effect, only the increase is plausibly seen as adaptive. By responding faster on the ratio schedule, one animal increased its reinforcements per unit time, but, by the same token, the other one reduced its rate of reinforcement by responding more slowly. If the acceleration is adaptive, then the


1970, 13, 243-266 NUMBER 2 (MARCH)



deceleration is not, and both findings are well substantiated.

A related finding, also violating both the Thorndikian law of effect and adaptiveness, has been obtained with the conjunctive schedule, schematically shown in Fig. 1. This graph plots, on the coordinates of a cumulative record, the region over which responses are unreinforced. For the conjunctive schedule, it is the entire plane minus the shaded area within the right angle. In other words, the conjunctive schedule reinforces the first response after the occurrence of a certain number of responses (n on the figure) and the passage of a certain period of time (t). The schedule is specified by its component members: fixed interval and fixed ratio in the present instance. The conjunctive schedule in Fig. 1 would be called a "conjunctive fixed-interval t, fixed-ratio n + 1." In the simple fixed-interval schedule, rapid responding is implicitly penalized, for faster responding increases the work per reinforcement. In contrast, ratio schedules exact no such penalty for responding quickly, for the amount of work per reinforcement is held constant. In fact, ratio schedules may favor rapid responding by arranging a direct proportionality between the rate of responding and the rate of reinforcement. The conjunctive schedule concatenates these features of ratio


and interval schedules, since the rate of reinforcement is directly proportional to the rate of responding only for rates of responding no larger than n/t, beyond which the rate of responding covaries with the responses per reinforcement.

The relevant finding was obtained in an experiment (Herrnstein and Morse, 1958) that held the interval component constant at 15 min, but varied the ratio component from zero (which is a simple fixed-interval schedule) to 240. Figure 2 shows the relation between rate of responding and the number requirement imposed by the ratio component. Although the pigeons were responding more than an average of 300 times per reinforcement on the fixed-interval schedule, a number requirement as small as 10 (for one of the pigeons) or 40 (for either) caused a detectable slowing down of responding. The range of rates of responding within individual fixed intervals is large enough that number requirements even this small are likely to make contact with the behavior. Larger requirements caused progressively larger decrements in responding. This falling rate of responding reduced the rate of reinforcement, as Fig. 3 shows. Here, for the two pigeons, are the average interreinforcement times as the number requirement was increased. For the fixed-interval schedule, the interreinforcement time was as small as the procedure permits, which is to say, 15 min. Even the smaller requirements produced some reduction in the rate of reinforcement. For one pigeon, the rate of reinforcement fell very sharply as the number requirement was increased; for the other, the decline was more gradual, but in either case the requirement took its toll throughout.

Fig. 1. The shaded area shows the region of reinforced responding on a conjunctive schedule of reinforcement. The ordinate is cumulated responding; the abscissa is elapsed time.

Fig. 2. For two subjects, the rate of responding as a function of the size of the number requirement in a conjunctive schedule with a time requirement of 15 min throughout.

Fig. 3. For two subjects, the interreinforcement time as a function of the size of the number requirement in a conjunctive schedule with a time requirement of 15 min throughout.

The conjunctive schedule lends itself poorly

to a Thorndikian analysis, for what could be getting stamped-in when responding and response requirement vary inversely? Moreover, the conjunctive schedule makes poor sense as

regards the animal's best interests, for the animal may be emitting substantially more (in Fig. 2, it was 30-fold more) behavior on the fixed interval than the number requirement demands, and yet the behavior is nevertheless depressed.

These problem cases are troublesome only within the confines of theory. In the broader sphere of common sense, reinforcement is affecting what might be termed the "strength" of behavior, as reflected in its rate. For example, consider the change from interval to ratio schedules. If the first effect is a higher rate of reinforcement, the rate of responding might increase. But this would further increase the rate of reinforcement, further "strengthening" responding, causing it to rise again, which again pushes the rate of reinforcement upwards, and so on. If, on the other hand, the first effect is a lower rate of reinforcement, then the rate of responding should fall. The rate of reinforcement would then also fall, further weakening the responding, and so on again. This dynamic process occurs

with ratio, and not interval, schedules because only the ratio schedule arranges a proportionality between the rate of reinforcement and the rate of responding. The proportionality is the basis for an instability in ratio schedules that should produce either maximal responding or none at all, an implication that is confirmed by the tendency of ratio responding to be "two-valued" (Ferster and Skinner, 1957, chap. 4).

The conjunctive schedule similarly exemplifies the notion of strength. However slightly the number requirement increases the interreinforcement interval, it should reduce the strength, and therefore the output, of responding. If the strength of responding is sufficiently reduced so that the rate of responding is less than n/t (see Fig. 1), then the conjunctive schedule becomes identical to a ratio schedule, and responding may reasonably be expected to vanish altogether, as it does with too-large ratios. One of the pigeons in Fig. 2 had, in fact, virtually stopped responding at the largest ratio studied, even though the number requirement was still less than the responses per reinforcement freely emitted on the simple fixed-interval schedule.

Those two examples, and others like them,

show that neither stamping-in, nor adaptation, nor the two of them together, can account for what is here being called the strength of behavior. This paper specifies more formally than heretofore the shape of this intuitively obvious concept, while staying within the general outlines of the law of effect.
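The conjunctive contingency in the second example (the first response after at least n responses and at least t seconds is reinforced) can be restated as a short Python sketch. The function name and the numbers below are illustrative only, not part of the original procedure.

```python
# Sketch of a "conjunctive fixed-interval t, fixed-ratio n + 1" contingency:
# a response is reinforced only if at least n responses have already occurred
# AND at least t seconds have elapsed since the last reinforcement.
# (Illustrative; names and values are not taken from the published study.)

def conjunctive_reinforced(responses_so_far: int, elapsed_s: float,
                           n: int, t: float) -> bool:
    """True if the next response would be reinforced."""
    return responses_so_far >= n and elapsed_s >= t

# A subject responding faster than n/t satisfies the count requirement first,
# so its reinforcement rate is capped at one per t seconds (interval-like);
# a subject responding slower than n/t is bottlenecked by the count (ratio-like).
print(conjunctive_reinforced(240, 900.0, n=240, t=900.0))  # both met -> True
print(conjunctive_reinforced(239, 900.0, n=240, t=900.0))  # count unmet -> False
print(conjunctive_reinforced(240, 899.0, n=240, t=900.0))  # time unmet -> False
```

The conjunction of the two requirements is what makes the schedule behave like an interval schedule at high response rates and like a ratio schedule at low ones.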

REINFORCEMENT AS STRENGTHENING

Reinforcement as strengthening is not being offered as a new idea, for "to reinforce" means to strengthen, and only by metaphor, to strengthen behavior. The earliest psychological usage concerned Pavlovian conditioning, where "reinforcement" of a reflex in a physiological sense was already familiar in classical work on facilitation and inhibition. The use of "reinforcement" in the vocabulary of instrumental learning was promoted in the mid-1930s, particularly by Skinner and primarily as a substitute for the traditional term "reward", whose very age tainted it with the suspicion of mentalism. Mentalism notwithstanding, "reward" was more neutral than




"reinforce", for while reward simply names a class of events that have some effect on the organism, "reinforcement" implies what the effect is, namely a strengthening. The extra connotation was tolerable only so long as it was not contrary to fact, which it was not. The leading advocates of the law of effect (Thorndike, Skinner, and others) had from the beginning spoken in terms of a "strengthening" of behavior.

What, though, does it mean to strengthen

behavior? Thorndike's answer was the notion of stamping-in, which may, in fact, be adequate for the acquisition of new behavior. But for behavior already learned, stamping-in seems inappropriate. The response form as such is then no longer changing, and yet, as the examples in the previous section show, reinforcement is still affecting what might be considered the strength of the behavior. The answers of others, like Skinner and Hull, addressed themselves sensibly if not successfully to the underlying problem, which is one of measurement. To say that behavior is strengthened is to imply some dimension of behavior along which it changes when its strength changes.

The measurement problem is empirical, not

conceptual, which is not to deny the virtue of clear and original thinking. It is, rather, to point out that the only persuasive argument for any measure of response strength is to show orderly relations between the parameters of reinforcement (its frequency, quantity, quality, and so on) and the designated parameter of behavior. The traditional measures of response, namely probability, rate, amplitude (i.e., work or effort), latency, and resistance to extinction, have all failed to gain unequivocal support simply because orderly data with quantitative and general significance have not been forthcoming. Although there is no doubt that behavior is affected by its consequences, the law of effect is still expressed qualitatively, rather than as a relation between measurable variables, which it clearly must be at some level of analysis.

The notion of response probability comes

closest to being a generally accepted measure of strength, cutting, as it does, across theories as diverse as those of Tolman (1938) and Hull (1943), Brunswik (1955) and Skinner (1953). But the agreement is more apparent than real, for the abstractness of "probability" masks the

diversity of methods used for its extraction. For example, in some experiments, particularly those concerned with acquisition, the changing probabilities of response are estimated by the proportion of subjects doing something at successive points in training. In other experiments, single subjects are the basis for estimation of the probability by integrating over successive trials. In still others, the probability is estimated by the proportion of trials, or proportion of subjects, showing the choice of one response alternative out of a known set of alternatives. Not even the use of relative frequencies, the measure in modern probability theory, is common to all theorists, for according to Skinner, the rate of responding is the proper estimate of the organism's probability of responding. This is not an estimator in the formal sense (a mathematical probability is a dimensionless quantity between 0 and 1.0, and response rate is neither dimensionless nor bounded in principle) but rather an index of the animal's disposition to respond over some interval of time. Given the present state of knowledge, this abundance of measures is more likely to confuse than to enrich.

To reduce the confusion, and hopefully to

advance the state of knowledge, the present approach focuses initially on a single relative-frequency measure as its index of strength. No "probability" will be inferred, simply because to do so might suggest an equivalence with other empirical measures for which there is no evidence. The measure is exemplified by an experiment in which pigeons had two keys to peck (Herrnstein, 1961). The keys were available continuously during experimental sessions and pecking was reinforced with two variable-interval schedules, mutually independent and running simultaneously. The relative frequency is obtained by dividing the number of pecks on one key by the sum for both. In the context of operant conditioning this is a concurrent schedule, but it is clearly a version of the familiar "choice" experiment. It is, however, different in two significant respects. First, it uses continuous exposure to the alternatives instead of discrete trials. Second, reinforcements come on interval, instead of ratio, schedules. In the typical choice experiment, as well as in gambling casinos, the over-all probability of winning is constant for a given play (or response alternative), so




that the number of wins is proportional to the number of plays. With interval schedules, there is no such proportionality, as noted earlier. Instead, the probability of winning on any given play is inversely related to the rate of play (response), and the number of wins is virtually independent of the number of plays, given a high enough rate of play.

The pigeons, then, had a pair of keys to

peck, and the experiment rewarded their efforts with brief access to food at irregular intervals. The schedules set a maximum rate of reinforcement throughout the experiment at 40 per hour, but the number allocated to one key or the other was systematically varied, to see how the distribution of responses was affected. The question was whether, to use the vocabulary of this paper, response strength as relative frequency was some plausible function of reinforcement frequency. The answer was both plausible and attractively simple, as shown in Fig. 4. The ordinate is the proportion of responses on the left key; the abscissa is the proportion of reinforcements delivered thereon. The points fall near the diagonal, which is the locus of perfect matching between the distribution of responses and of reinforcements. A simple equation summarizes the finding (P is number of pecks, R is number of reinforcements, and the subscripts denote the two alternatives).

PL/(PL + PR) = RL/(RL + RR)     (1)
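The matching relation in equation (1) is easily checked on a set of counts. The numbers below are hypothetical, chosen only to satisfy the relation exactly; they are not data from the 1961 experiment.

```python
# Matching relation (equation 1): the relative frequency of pecking one key
# equals the relative frequency of reinforcement on that key.
# P_L, P_R: pecks on the left and right keys; R_L, R_R: reinforcements thereon.
# These counts are hypothetical, used only to illustrate the arithmetic.

def relative_frequency(left: float, right: float) -> float:
    return left / (left + right)

P_L, P_R = 1500, 500   # pecks (illustrative)
R_L, R_R = 30, 10      # reinforcements (illustrative)

resp = relative_frequency(P_L, P_R)   # 1500 / 2000 = 0.75
reinf = relative_frequency(R_L, R_R)  # 30 / 40     = 0.75
print(resp, reinf)  # equal: the counts conform to matching
```

Note that matching constrains only the relative frequencies; it says nothing yet about the absolute number of pecks, which here exceeds the reinforcements roughly 50-fold.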

Unless the number of pecks far exceeds the number of reinforcements, the matching function (equation 1) is trivially true. For example, if the variable-interval schedules were assigning reinforcements to the two keys more quickly than the pigeons pecked, then every peck would be reinforced. The data point would necessarily fall on the diagonal, but the result would have little empirical content. If reinforcements were being assigned at the rate of one for every other response, or one for every third response, the possible range of variation for the points would still be narrowly constrained around the diagonal, and the findings would still be essentially vacuous. In fact, the possible range of variation is exactly fixed by the actual numbers of reinforcements and responses. Since there are at least as many responses on each key as there are reinforcements, the smallest relative frequency of response for a given relative frequency of reinforcement RL/(RL + RR) is

RL/(PL + PR)     (2)

This fraction approaches RL/(RL + RR) as the total number of responses approaches the total number of reinforcements. The largest relative frequency of response for a given relative frequency of reinforcement is also dependent upon the fact that there can be no fewer responses than reinforcements. Thus, the ratio of responses for a given RL/(RL + RR) can go no higher than 1.0 minus the minimum value for the other key, RR/(PL + PR), which may be written as

(PL + PR - RR)/(PL + PR)     (3)

Fig. 4. The relative frequency of responding to one alternative in a two-choice procedure as a function of the relative frequency of reinforcement thereon. Variable-interval schedules governed reinforcements for both alternatives. The diagonal line shows matching between the relative frequencies. From Herrnstein (1961).

This fraction, too, will approach RL/(RL + RR) as the total number of responses approaches the total number of reinforcements, which is to say when the responses on the separate keys each equal the reinforcements thereon. The result, then, is that as the number of responses approaches the number of reinforcements, the possible range of variation for the ratio of responses converges on the matching relation in Fig. 4. In the case of the experiment summarized here, however, the ratio of responses to reinforcements was approximately 100. At 0.5 on the abscissa, therefore, the possible range of variation of responding was from 0.005 to 0.995. At other values of the abscissa, the possible range was comparably broad. The close agreement between the matching relation and the distribution of responding says something, therefore, about the animal, not just about the procedure itself. What this is, and how it operates in a variety of situations, will occupy the remainder of the paper.
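The bounds in equations (2) and (3), and the 0.005 to 0.995 range quoted above for a response-to-reinforcement ratio of about 100, can be verified in a few lines of Python. The totals below are illustrative, chosen to mirror the 40-reinforcement, roughly 100-responses-per-reinforcement situation described in the text.

```python
# Possible range of the relative response frequency PL/(PL + PR)
# for given reinforcement counts, from equations (2) and (3):
#   minimum = RL / (PL + PR)
#   maximum = (PL + PR - RR) / (PL + PR)

def response_freq_bounds(R_L: int, R_R: int, total_responses: int):
    return (R_L / total_responses,
            (total_responses - R_R) / total_responses)

# Illustrative totals: 40 reinforcements split evenly between the keys,
# with about 100 responses per reinforcement (4000 responses in all).
lo, hi = response_freq_bounds(R_L=20, R_R=20, total_responses=4000)
print(lo, hi)  # 0.005 0.995
```

The breadth of this interval is what gives the observed matching its empirical force: almost any distribution of responding was possible, yet the obtained point fell on the diagonal.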

RESPONSE STRENGTH AND CHOICE

The experiment summarized in Fig. 4 had an additional procedural wrinkle. Each time the pigeon shifted from one key to the other, there was a brief period of time during which any reinforcement called for by the variable-interval schedule was delayed. This "changeover delay" (COD) lasted for 1.5 sec, and was imposed in order to prevent the pigeons from switching after virtually every response. It was found, without the COD, that the distribution of responses tended to stay around 50-50, without regard to the distribution of reinforcements. If the matching relation were an accident of the duration of the COD, it would hardly be a principle of either response strength or choice. The most direct test of the COD is contained in an experiment in which it was varied systematically to see whether the matching relation was a peculiarity of only a certain range of durations.

The experiment, by Shull and Pliskoff

(1967), varied a number of conditions besides the duration of the COD. Instead of pigeons, their experiment used albino rats. The reinforcer, instead of brief access to food for a hungry animal, was the opportunity for the rat to get electric current passed through its brain in the region of the posterior hypothalamus. A reinforcement consisted of the lighting of a small light in the presence of which the rat could press the lever 20 times, each time obtaining 125 milliseconds of a 100-Hz train of sine waves of 150 to 300 microamps across the electrodes implanted in its head. The variable-interval schedule resumed at the end of the last burst of current. The schedule itself was a variation of the simple concurrent procedure, one that had been originally described

by Findley (1958). Instead of a pair of response alternatives associated with a pair of variable-interval schedules, the Findley procedure had the two variable intervals associated with a pair of stimuli, but responses on only one of the two levers could produce reinforcement. At any one time, only one of the two stimuli was present, and while it was present, reinforcements from only its associated variable interval were forthcoming. The second lever in the chamber switched from one stimulus to the other, along with its associated variable-interval schedule. Actually, the two variable-interval programmers ran concurrently and continuously, just as they do in a conventional concurrent procedure. Shull and Pliskoff varied the COD, which is here the minimum possible interval between a response to the switching lever and the first reinforced response. Their finding was that matching occurred as long as the COD was greater than a certain minimal duration, as was found in the earlier study, but that beyond that value, matching was maintained whatever the duration of the COD in the range examined (0 to 20 sec). As the COD is made larger, however, it begins to affect measurably the obtained rates of reinforcement by interacting with the schedules themselves, as might be expected. Matching is always with respect to obtained, rather than pre-specified, reinforcement rates.

The experiment by Shull and Pliskoff extended the generality of the matching relation more than merely by showing that the COD is not the controlling variable. It extended the finding to rats, to Findley's procedure, and to intracranial stimulation as the reinforcer. Other studies have extended it further. Reynolds (1963a) showed matching with three response alternatives instead of two. Holz (1968) found matching even when each of the responses was punished with electric shock, in addition to being reinforced by the usual variable-interval schedules. Holz varied the intensity of the punishment until it was so severe that the pigeons stopped responding altogether. However, as long as they were responding, and as long as the punishment for the two responses was equally intense, the distribution of responses matched the distribution of reinforcements. Catania (1963a) and Neuringer (1967b) found matching with respect to total amount of food when the two reinforcers differed not in their rate of occurrence, but in the grams of food per reinforcement. In another study, Catania (1963b), using the Findley procedure, found matching both for the proportion of responses and for the proportion of time spent with each of the two stimuli. Baum and Rachlin (1969) showed matching (correcting for a position bias) when the "responses" consisted of standing on one side of a chamber or the other. The proportion of time spent in each location was found to be distributed as the associated proportion of reinforcements. Along these lines, Brownstein and Pliskoff (1968) found that the Findley procedure can be further modified so that when the animal selects one stimulus condition or the other, reinforcement comes along independently of any response. The pigeons here are simply choosing between one rate of reinforcement or the other. Their finding, too, is described by the matching relation, suitably adapted. The proportion of time spent in each stimulus condition is equal to the proportion of reinforcement received therefrom. Nevin (1969) noted that matching is found in human psychophysical studies when the proportion of "yes" responses is plotted against either the proportion of trials containing a signal or the relative size of payoff (i.e., the frequency or magnitude of reinforcement). Shimp (1966), and the present author in unpublished work, have found matching in discrete-trial procedures of various sorts.

The list of confirmatory studies could be

extended, but without profit at this point. It has been consistently found that responding is distributed in proportion to the distribution of reinforcement, as long as the responding and the reinforcements across the alternatives do not differ qualitatively. Thus, the matching relation would not be expected if the reinforcer for one response were a preferred food and that for the other were a non-preferred food, unless the scale values for the reinforcers expressed the difference quantitatively. Nor would it be expected if the two responses differed in some important way, e.g., if one involved considerably more work than the other. In fact, the matching relation may be used to construct equivalences between qualitatively different responses or reinforcers, although no such undertaking has come to the author's attention. It should, however, be possible to scale reinforcers against each other, or responses against each other, by

assuming that the subject must be conformingto the matching relation whenever it is in achoice situation of the general type employedin these experiments, and by adjusting themeasures of response or reinforcement accord-ingly.The main opposition to the matching rela-

tion is found in the literature on so-called"probability learning". If an experiment ar-ranges a certain probability (excluding 1.0,0.5, and 0) of reinforcement for each of a pairof response alternatives, and if the subject dis-tributes its responses in proportion to thesepre-assigned probabilities, then the matchingrelation, as defined here, is violated. Imaginethat the two probabilities are 0.4 and 0.1. In asequence of 100 responses, probability learn-ing requires 80 responses to the better alter-native and 20 responses to the poorer one.The number of reinforcements would be80 x 0.4 = 32 for the one, and 20 x 0.1 = 2 forthe other. With respect to the matching for-mula, this is a violation, for

80/(80 + 20) ≠ 32/(32 + 2)    (4)

The literature does not, however, claim strict conformity to probability learning. Instead, responding is often confined exclusively to one or the other of the two alternatives, typically toward the alternative with the higher probability of reinforcement. But even when the two reinforcement probabilities are equal, responding tends to become exclusive for one of the choices. These deviations from probability learning are said to be instances of "optimizing" or "maximizing", since exclusive preference is the optimal strategy in the sense that the subject will, on the average, maximize its winnings if it stays with the better bet. Even when the two probabilities are equal, nothing is lost by exclusive preference, and perhaps something is gained, for the subject is thereby spared the effort of switching from one alternative to the other.
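The arithmetic of the 0.4 versus 0.1 example above can be made explicit in a short numerical sketch (Python; the function name is illustrative, not anything from the experimental literature):

```python
# Under probability learning, responses are split in the ratio of the
# pre-assigned reinforcement probabilities; matching (equation 1) would
# instead require the split to equal the ratio of obtained reinforcements.

def probability_learning_split(p_better, p_poorer, n_responses=100):
    """Allocate responses in proportion to the reinforcement probabilities."""
    share = p_better / (p_better + p_poorer)
    better = n_responses * share
    return better, n_responses - better

better, poorer = probability_learning_split(0.4, 0.1)       # 80 and 20 responses
r_better, r_poorer = better * 0.4, poorer * 0.1             # 32 and 2 reinforcements

relative_responses = better / (better + poorer)             # 0.80
relative_reinforcements = r_better / (r_better + r_poorer)  # 32/34, about 0.94
# The inequality of equation 4: 0.80 differs from 0.94, so matching fails.
```

The gap between 0.80 and 0.94 is exactly the violation that equation 4 records.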

Maximization in the probability-type experiment actually conforms to the present matching function. Equation (1) is satisfied when all or none of the responses and all or none of the reinforcements occur for one of the alternatives, and is therefore consistent with all of the experiments that deviate from probability learning and find maximization instead. Not all the experiments, however, find


exclusive preference, and it is not yet clear whether apparently minor procedural factors are inadvertently affecting outcomes, as some have suggested (Bush and Mosteller, 1955), or whether there is, in addition, a phyletic factor. Bitterman (1965) argued that the higher organisms tend to maximize, while the lower ones, like fish and pigeons, tend to "probability learn". However, the experiments by Bitterman and his associates often use procedural features, such as forced trials to the alternatives, that complicate the calculation of the obtained frequencies of reinforcement, if not the psychological processes at work. In any event, pigeons do not invariably show probability learning; Herrnstein (1958) showed maximizing in pigeons given a choice between differing probabilities of reinforcement.

It is, in other words, not clear how much of the literature of probability learning actually violates equation 1, since it is not clear how much of this literature can be taken as firm evidence for probability learning. Nevertheless, suppose for the sake of argument that there is some valid evidence for probability learning, which is to say, that the responses are in the same ratio as the probabilities of reinforcement. How does this differ from the findings with rate of reinforcement, according to which the responses are in the same ratio as the numbers of reinforcement? The two findings turn out to be closely related mathematically, as the following set of equations shows, starting with the equation for probability learning:

PL/PR = (RL/PL)/(RR/PR)    (5a)

PL²RR = PR²RL    (5b)

PL√RR = PR√RL    (5c)

PL√RR + PL√RL = PR√RL + PL√RL    (5d)

PL/(PL + PR) = √RL/(√RL + √RR)    (6)
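The derivation running from equation 5a to equation 6 can be verified numerically. The following sketch (written for this summary, not taken from the paper) imposes the probability-learning rule and confirms that the resulting reinforcements satisfy the square-root form of matching:

```python
import math

# Under probability learning, responses are allocated in the ratio of the
# reinforcement probabilities; the obtained reinforcements then stand in the
# squared ratio, so responses match the square roots of the reinforcements
# (equation 6).

def square_root_matching(p_left, p_right, n_responses=1000):
    responses_left = n_responses * p_left / (p_left + p_right)
    responses_right = n_responses - responses_left
    r_left = p_left * responses_left     # expected obtained reinforcements
    r_right = p_right * responses_right
    observed = responses_left / n_responses
    predicted = math.sqrt(r_left) / (math.sqrt(r_left) + math.sqrt(r_right))
    return observed, predicted

obs, pred = square_root_matching(0.4, 0.1)
# obs and pred agree to floating-point precision
```

Any pair of probabilities gives the same agreement, since the square root of the squared ratio restores the original ratio.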

Natapoff (in press) has shown that the matching of response frequencies to reinforcement frequencies (equation 1) is just one, and in some sense the simplest, of a family of functions that may relate these two variables under the assumptions of symmetry of choice, which is to assume that the factors that affect choice operate invariantly across the alternatives.

Another, and closely related, function, according to Natapoff, would be equation 6 above, according to which the match is between response frequencies and the square root of reinforcement frequencies. This latter finding is, as the mathematical argument demonstrates, merely another version of probability learning. Although we do not know with certainty if, and when, probability learning actually occurs in simple choice procedures, it may eventually be useful to be able to relate the two kinds of matching to each other mathematically.

The only other significant departure from

matching known to the present author is an experiment with pigeons (Herrnstein, 1958) in which reinforcement was dependent upon a required number of responses on the two keys. The sum of the two requirements was kept constant at 40, but the two components varied: 2, 38; 5, 35; 10, 30; and 20, 20. No further restrictions were imposed on the order or manner in which the requirements were reached. The pigeon could work in any sequence of alternation between the two keys, and could emit any number in excess of the requirement on one of the keys, but, having met one requirement, it was rewarded as soon as it met the other. Figure 5 shows the proportion of responses on the left key as a function of the proportion of the total requirement on that key. It also shows the proportion of reinforcements on that key. The procedure clearly produced a kind of matching, but not to the distribution of reinforcements, since the proportion of reinforcements was so much more variable than the proportion of responses. Instead, the pigeons responded so as to minimize the number of responses (and also probably the time) per reinforcement.

Fig. 5. The filled circles and solid brackets show the average and range of the relative frequency of responding to one alternative in a two-choice procedure. Reinforcement was for the occurrence of at least a minimum number of responses to each of the two alternatives. The abscissa shows what fraction of the total response requirement was allocated to the alternative being plotted. The open circles and broken brackets are for the relative frequency of reinforcement against the same abscissa. The diagonal line shows matching between the abscissa and the ordinate. From Herrnstein (1958).

There are undoubtedly many other procedures that would give comparable departures from matching. For example, if reinforcement were made conditional upon a pair of rates of responding on the two keys (instead of upon a pair of numbers of responses thereon), there probably would be some accommodation in responding, and matching in the present sense would probably be violated. Such procedures need not be particularly complicated, the two examples so far notwithstanding. For example, consider a procedure that simply reinforced alternation of left and right responses, with reinforcement occurring only for right responses. The distribution of responses would here approach 50-50, but the distribution of reinforcements would be 0-100. This last example is instructive, for it suggests why matching neither occurs nor should be expected in such cases. Reinforcement for alternation is likely to give not two separate responses, but rather a single, albeit biphasic, response (respond-left, respond-right) that is reinforced every time it occurs. In the more complex case summarized in Fig. 5, there is a similar issue of response definition, as there is in any procedure in which reinforcement is conditional upon combinations of responses across the alternatives.

A procedure that reinforces multiphasic responses across the alternatives is thus not properly described by tallies of the individual response frequencies. Instead of so many left responses and so many right responses, there should be so many left-rights or the like. The point is worth noting for it is probably why matching depends on the COD in some (but not all) procedures. If the response alternatives are situated near each other, the subject's

left-right or right-left sequences may be adventitiously reinforced. If so, the usual tally of left and right responses cuts across the response classes actually being maintained by the reinforcement, and matching in the ordinary sense is no longer an appropriate expectation. By interposing a delay between a response to one alternative and reinforcement for the other, the COD discourages these adventitious response clusters.

In virtually all of the published experiments showing matching, the alternative responses were reinforced by schedules that differed only in the reinforcement parameter, whether frequency or amount. This may not, however, be necessary. For example, it remains to be shown whether matching would be found if the concurrent procedure used a variable-interval schedule pitted against a variable ratio. The procedure is peculiar not only because of its asymmetry, but also because it is intermediate as regards the degree to which the subject's behavior governs the distribution of reinforcement, and is therefore of some general interest. With two variable-interval schedules, the distribution of reinforcements is not likely to be affected by the distribution of responses, given the usual rates of responding. On the other hand, with two variable-ratio schedules, the distribution is virtually entirely under the subject's control. In the combination procedure, the animal is in control of reinforcement frequency for one alternative, but not for the other, again assuming that response rate is high enough. Finally, the experiment is peculiar because matching, while well established for interval schedules and their analogues, cannot occur for ratio schedules except trivially. For example, suppose ratios of 50 and 200 are scheduled for two alternatives. Equation 1, the matching function, can be satisfied only if the animal ceases responding to one of the alternatives, thereby obtaining all of its reinforcements thereon. The ratio schedules assure that the distribution of responses will follow the relation:

PL/(PL + PR) = 50RL/(50RL + 200RR)    (7)

The coefficients on the reinforcement frequencies express the fact that a ratio schedule fixes a proportionality between numbers of responses and numbers of reinforcements. In the present example, equation 7 agrees with equation 1 only when either RL or RR goes to zero. In general, agreement can be obtained only at these limiting values, plus the 50% value when the two ratios are equal to each other. In contrast, the present experiment, combining interval and ratio schedules, permits matching at all values, as follows:

PL/(PL + PR) = rRL/(rRL + xRR)    (8)

Here, r is the required proportionality for the alternative with the ratio schedule. For the other alternative, reinforced on a variable-interval schedule, there is no required proportionality. By responding faster or slower, the subject can here emit more or fewer responses per reinforcement (x). In order to obtain agreement between equation 8 and equation 1, the subject must adjust its rate of responding on the variable-interval schedule such that r = x.

An experiment combining the two schedules was performed with four pigeons. As in the earlier study, there was a change-over delay (2 sec) between the two alternatives. Responding to one alternative was reinforced on variable-interval schedules that were changed from time to time (in the range of 15 sec to 2 min). Responding to the other was reinforced on variable-ratio schedules, also varied from time to time (in the range of 20 to 160). Since the distribution of reinforcements is here affected by the distribution of responses at all rates of responding, the values of the two schedules do not fix a particular distribution of reinforcement. The relative frequency of reinforcement, the nominal independent variable, was largely up to each subject. For most of the pairs of values, the findings trivially confirmed matching, by giving responding exclusively to one alternative or the other. This has also been the outcome when both alternatives were variable-ratio schedules (Herrnstein, 1958). However, for some pairs of values, the responding was distributed between the two responses with something between exclusive preference and total abstinence. Figure 6 shows the proportion of responses to one of the two alternatives (the variable-interval key) as a function of the proportion of reinforcements actually obtained from that key. The data for four pigeons have been averaged here, but the individual points were not systematically different. The finding is that even when

the two alternatives reinforce according to different schedules, the distribution of responding still obeys the matching rule, or something close to it. The bunching of the points on the lower part of the function appears to be a reliable outcome of the procedure. It says that preferences for the variable-interval alternative (i.e., above 0.5 on the ordinate) are exclusive, while for the variable-ratio alternative (i.e., below 0.5 on the ordinate) they may be exclusive or non-exclusive.
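Both peculiarities of ratio schedules discussed above can be sketched numerically: equation 7's restriction of matching to exclusive preference, and equation 8's requirement that the subject equate its interval-schedule responses per reinforcement (x) with the ratio requirement (r). The reinforcement rates below (30 and 10 per hour) are arbitrary illustrative values, not data from the experiment:

```python
# Ratio-ratio case (equation 7): reinforcements are forced to R = P/ratio,
# so relative reinforcement can equal relative responding only at exclusive
# preference (or when the two ratios happen to be equal).

def matching_discrepancy(p_left, p_right, ratio_left=50.0, ratio_right=200.0):
    r_left, r_right = p_left / ratio_left, p_right / ratio_right
    return p_left / (p_left + p_right) - r_left / (r_left + r_right)

mixed = matching_discrepancy(1000.0, 1000.0)   # about -0.3: matching violated
exclusive = matching_discrepancy(1000.0, 0.0)  # 0.0: matching holds trivially

# Interval-ratio case (equation 8): the ratio key forces r responses per
# reinforcement; on the interval key the subject chooses x. Relative
# responding coincides with matching (equation 1) only when x = r.

def relative_responses(r, x, r_ratio_key, r_interval_key):
    p_ratio = r * r_ratio_key           # responses forced by the ratio schedule
    p_interval = x * r_interval_key     # responses chosen on the interval key
    return p_ratio / (p_ratio + p_interval)

target = 30.0 / (30.0 + 10.0)                            # matching target, 0.75
matched = relative_responses(50.0, 50.0, 30.0, 10.0)     # 0.75, since x = r
unmatched = relative_responses(50.0, 80.0, 30.0, 10.0)   # about 0.65
```

The first function captures why the ratio-ratio procedure confirms matching only trivially; the second captures the invariance the pigeons actually maintained.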


Fig. 6. The relative frequency of responding to one alternative in a two-choice procedure as a function of the relative frequency of reinforcement thereon. For one alternative, reinforcement was on a variable-interval schedule, while, for the other, it was on a variable-ratio schedule. The ordinate plots the variable-interval alternative. The diagonal shows matching between ordinate and abscissa.

The matching in Fig. 6 is surprising not only because of the peculiar procedure, but also in light of the well-known effects of the two kinds of schedules on responding in isolation. Variable-ratio responding is typically a good deal faster than variable-interval responding, at equal frequencies of reinforcement (Herrnstein, 1964). In the present experiment, the momentary rates of responding were about twice as high for the ratio schedules as for the interval schedules. Nevertheless, each of the subjects maintained approximately the required invariance of responses per reinforcement across the ratio and interval schedules (see equation 8) even while the parameter values of both schedules were being changed.


CHOICE AS BEHAVIOR AND VICE VERSA

Intuitively, response strength should covary in some reasonably orderly way with the parameters of reinforcement. The foregoing section presents the case for replacing intuition with the relative frequency of response. The argument is simply that since relative responding is so directly controlled by relative reinforcement, the former is the proper measure of the effects of the latter. The investigation of strength need not, however, stop here. Several theorists (Shimp, 1966; Neuringer, 1967a) have argued that matching is not fundamental, but is to be explained as the outcome of a more molecular interaction between behavior and its consequences, somehow based on "maximizing" in the probability-learning sense. The developments along this line will not be further examined here, for a persuasive case one way or the other is yet to be made. Suffice it to note that there is no logical assurance that this sort of reductionism will ever explain anything. The issue is empirical, and it is possible that behavior is more (or more simply) orderly at the level of the matching relation than at the level of interresponse times or sequences of choices, which is where the molecular theories operate. In contrast, logic demands that if the relative frequency of response is governed by the frequency of reinforcement, it must also govern somehow the absolute rate of responding. Since relative frequencies in the free-operant procedure are merely ratios of rates of responding, the one could hardly be affected without the other being affected as well.

The link between sheer output of behavior

and what is usually called "choice" is peculiarly apparent with the free-operant method. In conventional choice experiments using discrete trials, the total number of responses is a trivial by-product of the procedure, depending simply on the number of trials per unit time. When the experimenter stipulates what the sheer output of behavior is going to be, he is unlikely to find it interesting. In the operant paradigm, however, where the output is as free to vary as anything else, if not more so, it becomes an interesting, and inescapable, dependent variable. Yet the matching relation holds for both continuous-responding and discrete-trial procedures. In the former, the

question of sheer output cannot honestly be disregarded; in the latter, the question is precluded by the procedure. Since it seems unlikely that the two situations can be fundamentally different when they both produce matching, "choice" in the discrete-trial procedure should be explained the same way as the output of behavior for the free-operant procedure. It is hard to see choice as anything more than a way of interrelating one's observations of behavior, and not a psychological process or a special kind of behavior in its own right. For continuous-responding procedures, the correspondence between choice and behavior is clear, for the measure of choice is just the ratio of the simple outputs for the alternative responses.

What, then, can be said about these simple

rates of responding? In the first concurrent procedure considered above (see Fig. 4), the two responses were freely occurring operants, reinforced on variable-interval schedules, and the matching relation was the ratio of their rates of occurrence. Figure 7 shows the absolute rates of responding of which the matching relation was the ratio. For absolute rates of responding there are twice as many degrees of freedom as for relative rates, since the two keys give values separately in the former instance, but complementarily in the latter.

Fig. 7. The rate of responding for each of two alternatives in a two-choice procedure as a function of the rate of reinforcement for each. Variable-interval schedules were used for both. From Herrnstein (1961).

Even though the functions in Fig. 7 are, therefore, not as firmly demonstrated as in Fig. 4, the trend of the data is clear. (One point from the original experiment for one subject has been omitted since there are reasons to believe that it was aberrant.) The absolute rate of responding on each key is directly proportional to the absolute rate of reinforcement. This elegantly simple relation may be expressed in the equation:

P = kR    (9)

Note that this relation is consistent with the basic matching relation, as it would have to be unless something were amiss. If each response alternative were obeying this rule, then any combination of responses with the same proportionality to reinforcement (i.e., the same k) should show matching, as simple algebra proves:

PL/(PL + PR) = kRL/(kRL + kRR) = RL/(RL + RR)    (10)

Equation 9 makes a plausible and impressively simple fundamental law of response strength. Its main trouble is that it has been tried before and failed, for, in another form, it merely restates Skinner's first published guess about response strength (1938, p. 130f), when he asserted the constancy of what he called the extinction ratio. From his early work, he surmised that the number of responses per reinforcement with interval schedules might be a constant, which is equation 9 solved for k. Thus, if the animal's response is reinforced once every 5 min, it should respond twice as fast as when the response is reinforced every 10 min, and so on. Skinner's own data failed to support this simple principle, and later (1940) he revised the general principle of the "reflex reserve", so that the constancy of the extinction ratio was no longer of any theoretical importance in his system. The mass of data collected since shows no such relation as equation 9 in single-response procedures, as Catania and Reynolds (1968) demonstrated most exhaustively.

In the experiment summarized in Fig. 7, the over-all frequency of reinforcement was held constant at 40 per hour while the proportion allocated to one alternative or the other was varied. As long as the total number of reinforcements (RL + RR) is constant, there is

no way to distinguish in a single experiment between the prediction of equation 9 and the following:

PL = kRL/(RL + RR)    (11)

When the sum RL + RR is itself a constant, equation 11 is equivalent to equation 9, the only difference being the value of k, which is an arbitrary constant in either case. As regards Fig. 7, then, equation 11 is as good as equation 9. The two equations make divergent predictions only when RL + RR is not a constant, which is actually more typical in concurrent procedures. The difference is that equation 11 predicts a "contrast" effect, which is to say a reciprocal relation between the reinforcements for one alternative and the responses to the other, whereas equation 9 predicts independence of responding to one alternative and reinforcements for the other. The data unequivocally support equation 11, for contrast effects are reliably found in concurrent procedures (Catania, 1966).

Although equation 11 conforms at least

qualitatively to the literature on concurrent procedures, it, like equation 9, seems to run into trouble with single-response situations, for when there is only one response alternative being reinforced the equation degenerates into the following constancy:

P = k    (12)

Responding in single-response situations is notoriously insensitive to variations in the parameters of reinforcement, but it is not totally insensitive. Equation 12, however, is the result of a gratuitous assumption, albeit one that is easily overlooked. It assumes that because the experimenter has provided just one response alternative, there is, in fact, just one. A more defensible assumption is that at every moment of possible action, a set of alternatives confronts the animal, so that each action may be said to be the outcome of a choice. Even in a simple environment like a single-response operant-conditioning chamber, the occurrence of the response is interwoven with other, albeit unknown, responses, the relative frequencies of which must conform to the same general laws that are at work whenever there are multiple alternatives. In fact, it seems safe to assume that all environments continually demand choices in this sense, even though in


many cases the problem of identifying and measuring the alternatives may be insoluble. That problem is, however, the experimenter's, not the subject's. No matter how impoverished the environment, the subject will always have distractions available, other things to engage its activity and attention, even if these are no more than its own body, with its itches, irritations, and other calls for service.

The notion that started this section, that

choice is not a psychological mechanism so much as a certain measure extracted from observations of behavior, has a complement. For while choice is nothing but behavior set into the context of other behavior, there is no way to avoid some such context for any response. An absolute rate of responding is occurring in such a context, whether or not the experimenter happens to know what the other alternatives and their reinforcers are. With this in mind, equation 11 may be rewritten for a nominal single-response situation, but recognizing the possibility of other sources of reinforcement, as follows,

P = kR/(R + Ro)    (13)

in which Ro is the unknown, aggregate reinforcement for other alternatives.2 In practical terms, Ro is a second free parameter to be extracted from the data, but it is one that has a definite empirical interpretation. The question is whether equation 13 fits the data.
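Equation 13 describes a hyperbola that rises from the origin and saturates at k. A minimal sketch (Python; the parameter values are assumptions for illustration, not fits to any subject's data):

```python
# Herrnstein's hyperbola (equation 13): absolute response rate as a function
# of reinforcement rate, with k the asymptotic rate and Ro the aggregate
# reinforcement from the unmeasured alternatives.

def herrnstein_hyperbola(r, k=75.0, ro=20.0):
    """P = k*R/(R + Ro): rate of responding given rate of reinforcement."""
    return k * r / (r + ro)

half = herrnstein_hyperbola(20.0)        # 37.5: half of k, since R equals Ro
near_max = herrnstein_hyperbola(2000.0)  # close to the asymptote k = 75
```

Note the empirical interpretation built into the formula: when the scheduled reinforcement rate R equals the background rate Ro, responding runs at exactly half its asymptotic value.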

Figure 8 is a test of equation 13 with data obtained by Catania and Reynolds (1968). The six pigeons in the experiment were submitted to variable-interval schedules with rates of reinforcement ranging from about 10 to about 300 per hour. The effects on performance were varied, as the points in Fig. 8 show. However, with few exceptions, the points fall on or near the smooth curves, which are plots of equation 13. The parameter values for the individual subjects are shown in each panel, k first and Ro second, with k in responses per

2This same equation for absolute rate of responding in single-response situations can be found in Norman (1966), except that Norman offers the equation as an approximation, rather than as an exact relationship between the variables. Norman's analysis does not appear to be applicable to multiple-response situations, and his interpretation of the parameters is entirely different from the present one. The two accounts are, therefore, readily distinguishable, notwithstanding the convergence at this point.


Fig. 8. The rate of responding as a function of the rate of reinforcement for each of six subjects on variable-interval schedules. The first number in each panel is k in responses per minute and the second is Ro in reinforcements per hour for the various smooth curves. From Catania and Reynolds (1968).

minute and Ro in reinforcements per hour. The fit of points to function for each animal shows that the mathematical form of equation 13 is suitable for the single-response situation, notwithstanding its origin within the context of choice experiments. The variety of parameter values further shows that inter-subject variability is also amenable to description in these formal terms. There appears to be no other comparably simple mathematical expression for the relation between absolute input and output for simple, repetitive responding, let alone one that also predicts the distribution of choices over multiple alternatives.
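One convenient way to extract k and Ro from rate data, offered here as a sketch rather than as the method behind the fits in Fig. 8: equation 13 linearizes as 1/P = 1/k + (Ro/k)(1/R), so a least-squares line through the reciprocals recovers both parameters. The data below are synthetic, not Catania and Reynolds' measurements:

```python
# Fit P = k*R/(R + Ro) by the reciprocal transform: 1/P is linear in 1/R,
# with intercept 1/k and slope Ro/k.

def fit_hyperbola(reinforcement_rates, response_rates):
    xs = [1.0 / r for r in reinforcement_rates]
    ys = [1.0 / p for p in response_rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    k = 1.0 / intercept
    ro = slope * k
    return k, ro

# Synthetic points generated from k = 80, Ro = 15; the fit recovers them.
rs = [10.0, 30.0, 60.0, 120.0, 300.0]
ps = [80.0 * r / (r + 15.0) for r in rs]
k_hat, ro_hat = fit_hyperbola(rs, ps)
```

With real data the reciprocal transform weights low-rate points heavily, so a direct nonlinear fit may be preferable; the sketch only shows that the two parameters are identifiable from rate pairs.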

Figure 9 is another test of equation 13, showing the results of an unusual procedure for varying reinforcement frequency. The parameter values are again for k in responses per minute and Ro in reinforcements per hour, in that order. Chung (1966) reinforced his pigeons' responses after a given duration had elapsed since the first response after the last reinforcement (in the terminology of reinforcement schedules, a tandem fixed ratio 1, fixed interval of various durations). From time to time, the duration was changed, giving a new determination of rate of responding. The actual rate of reinforcement was substantially controlled by the subjects, for the sooner after reinforcement the animals responded, the sooner the required time interval would begin. Chung's experiment is worth noting here not only because the procedure is unusual, but because it achieved reinforcement rates of about 2000 per hour, seven times Catania and Reynolds' (1968) maximum. Nevertheless,


Fig. 9. The rate of responding as a function of the rate of reinforcement for subjects on a tandem fixed-ratio, fixed-interval schedule. The first number is k in responses per minute and the second is Ro in reinforcements per hour for the smooth curve. From Chung (1966).

Chung's results are well described by equation 13, although the parameter value for Ro (180) is substantially higher here than for five of Catania and Reynolds' six subjects.

Equation 13 is readily expanded into a general principle of response output, whether the response is presumably in isolation or not. The matching relation can be derived by taking into account the reinforcement from both alternatives:

PL/(PL + PR) = [kRL/(RL + RR + Ro)] / [kRL/(RL + RR + Ro) + kRR/(RL + RR + Ro)]
             = RL/(RL + RR)    (14)

This assumes that k and Ro are the same for the responses under observation, a reasonable supposition as long as the responses are equivalent in form and effort. If choice were, however, asymmetric (for example, if one response was to depress a lever while the other was to pull a chain), the equivalence of neither k nor

perhaps Ro could be assumed. Matching would not be predicted, but the present formulation would apply in principle nevertheless.

In recent experiments on choice, the symmetry assumption has been supportable, which is to say, the matching relation has been found. The next question is whether the absolute rates of responding also conform to the

present formalization. Equation 14 contains the expressions for the absolute rates of responding, restated here for convenience (for the left key):

PL = kRL/(RL + RR + Ro)    (15)

For the first study of matching (Herrnstein, 1961), the agreement between fact and theory is clear, since that is where the discussion began (see Fig. 7). When the total reinforcement is held constant, as was the case in that experiment, the absolute rate of responding should be directly proportional to the absolute rate of reinforcement. When the over-all rates of reinforcement are not constant, there should be "contrast" effects, with the rate of responding on each alternative varying directly with the associated rate of reinforcement, but inversely with the rates of reinforcement elsewhere.
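Both predictions, proportionality under a constant total and contrast otherwise, follow directly from equation 15. A numerical sketch (Python; k and Ro here are arbitrary illustrative values, not estimates from any experiment):

```python
# Equation 15: PL = k*RL/(RL + RR + Ro). Holding RL + RR constant makes PL
# proportional to RL; raising RR alone depresses PL (a contrast effect).

def absolute_rate(r_this, r_other, k=75.0, ro=9.0):
    return k * r_this / (r_this + r_other + ro)

# Matching (equation 14): the common denominator cancels in the ratio.
p_l, p_r = absolute_rate(30.0, 10.0), absolute_rate(10.0, 30.0)
relative = p_l / (p_l + p_r)              # equals 30/(30 + 10) = 0.75

# Constant total (RL + RR = 40): PL lies on a line through the origin.
slope = absolute_rate(20.0, 20.0) / 20.0
proportional = absolute_rate(30.0, 10.0)  # equals slope * 30

# Contrast: more reinforcement elsewhere, less responding here.
contrast = absolute_rate(20.0, 30.0) < absolute_rate(20.0, 10.0)
```

The same function thus reproduces, in miniature, the three regularities the text has assembled: matching, proportionality under a constant total, and contrast.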

Catania's work (1963b) provides a quantitative evaluation of equation 15. Catania used pigeons on the Findley procedure. The two alternatives were the usual variable-interval schedules, each one signalled by a particular color on the response key. The variable intervals ran concurrently, but, in accordance with the Findley procedure, there was only one color (and, therefore, but one variable interval) making direct contact with the subject at any one time. A peck on the second key changed the color and brought the second variable interval into contact with the subject. The difference between this sort of concurrent schedule and the more familiar type is in the "switching" response: for the Findley procedure it is the pecking of a key; for the other, it is actually moving back and forth between the keys. Figure 10 shows that the difference does not affect the matching relation. Both the proportion of responses and the proportion of time equal the proportion of reinforcement for each alternative.

Equation 15, however, calls for absolute rates of responding, not proportions, and these are plotted in Fig. 11. There were two series of reinforcement rates in the experiment. In one, the sum of the reinforcement rates for the two alternatives was always 40 per hour, just as in Fig. 7 (Herrnstein, 1961). In the other, the sum varied, but was held at 20 per hour for one of the two schedules. Equation 15 predicts a direct proportionality


Fig. 10. The relative number of responses emitted (filled circles) and the relative amount of time spent (open circles) on one alternative in a two-choice procedure as a function of the relative frequency of reinforcement thereon. The diagonal shows matching between ordinate and abscissa. From Catania (1963b).

for the first condition and a curvilinear relation for the second.

Fig. 11. The rate of responding as a function of the rate of reinforcement for each alternative in a two-choice procedure. For the left panel, the over-all frequency of reinforcement was held constant at 40 reinforcements per hour, while varying complementarily for the two alternatives. Each point is here plotted above the reinforcement rate at which it was obtained. For the right panel, the frequency of reinforcement for Key 2 was held constant at 20 reinforcements per hour, while varying between 0 and 40 for Key 1. The points here are plotted above the reinforcement rate on Key 1 at the time it was obtained. The values of k and R_0 were used for the smooth curves in both panels. From Catania (1963b).

The left portion of Fig. 11 shows responding as a function of reinforcement when the total reinforcement rate was held constant at 40 per hour. For each alternative, the points approximate a straight line passing through the origin, as theory predicts. The selected line has the parameters k = 75 (responses per minute) and R_0 = 9 (reinforcements per hour). Note that these values are easily within the range obtained in Fig. 8 for a single-response situation involving variable-interval schedules. The right-hand panel shows responding as a function of reinforcement rate on Key 1³ for the second series, when the reinforcement for one key (Key 2) was held constant at 20 per hour while Key 1 reinforcements ranged from 0 to 40 per hour. The contrast effect is the decreasing rate on Key 2 as reinforcement for Key 1 increases. The smooth curves again plot equation 15, with the parameters again at k = 75 and R_0 = 9. The crosses on the right panel show the results of a second procedure with the same three pigeons. Responding to Key 2 was still reinforced 20 times per hour, while that to Key 1 was again varied from 0 to 40 per hour. The change was that the pigeons switched to Key 1 only when a light signalled that reinforcement was due there, so that the number of responses to Key 1 was virtually equal to the number of reinforcements. Otherwise, the procedure was unchanged. As the decreasing rate of Key 2 responding shows, and as equation 15 implies, contrast depends upon the reinforcement for the other alternative, not on the responding there. The crosses may actually deviate slightly from the filled circles, for reasons that will be considered in the next section, but aside from this, Catania's data provide substantial confirmation for equation 15 and the system from which it emerges.
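With the fitted parameters quoted above (k = 75 responses per minute, R_0 = 9 reinforcements per hour), equation 15 can be evaluated directly for both of Catania's series; a sketch:

```python
# Equation 15 with the parameters fitted to Catania's (1963b) data:
k, R0 = 75.0, 9.0

def p(r_this, r_other):
    """Absolute response rate (equation 15)."""
    return k * r_this / (r_this + r_other + R0)

# Series 1: total held at 40/hr, so the predicted function is a
# straight line through the origin with slope k / (40 + R0).
slope = k / (40.0 + R0)
for r in (10.0, 20.0, 30.0):
    assert abs(p(r, 40.0 - r) - slope * r) < 1e-9

# Series 2: Key 2 fixed at 20/hr while Key 1 varies from 0 to 40/hr.
key2_rates = [p(20.0, r1) for r1 in (0.0, 10.0, 20.0, 30.0, 40.0)]
# Contrast: Key 2 responding falls as Key 1 reinforcement rises,
# even though Key 2's own reinforcement is unchanged.
assert all(a > b for a, b in zip(key2_rates, key2_rates[1:]))
print([round(r, 1) for r in key2_rates])
```

The declining sequence printed for Key 2 is the contrast effect traced by the right-hand curve of Fig. 11.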

Catania accounted for his results with a power function between absolute responding and the relative number of reinforcements, as follows:

P_1 = kR_1 / (R_1 + R_2)^(5/6)   (16)

Like equation 15, this one, too, predicts matching when k and the denominators can be cancelled out, which is to say, when choice

³With the Findley procedure, the designations "Key 1" and "Key 2" do not identify two separate response keys. Rather, they represent the two possible states of the key to which responses were reinforced.



is between otherwise equivalent alternatives. Given the variability in the data, it would be hard to choose between equation 16 and the present formulation as contained in equation 15. The most easily testable difference between Catania's hypothesis and the present one is in regard to responding in single-response situations, for which equation 16 implies a power function with an exponent of 1/6, while equation 15 implies equation 13. The data, the best of which we owe to Catania and Reynolds (1968), distinctly favor the latter alternative, since no power function handles the data from individual pigeons in single-response procedures, while equation 13 makes a plausible fit, as shown in Fig. 8 and 9.

Equation 13 is for single-response procedures; equation 15, for two-response procedures. The general case, for a situation containing n alternative sources of reinforcement, is:

P_i = kR_i / Σ_{j=0}^{n} R_j   (17)

This mathematical form is not used throughout because it hides R_0, the parameter in the denominator. It does, however, reveal the most general implications of the theory. Matching will take place over any set of alternative responses for which k and ΣR are fixed, the typical choice procedures, for example. Inversely, when matching is not obtained, the alternatives either have different asymptotic rates (k) or are operating within different contexts of reinforcement (ΣR), or both. It is not hard to conceive of situations that exhibit either or both of these asymmetries, even though they are somewhat unfamiliar in experiments on choice. The next section summarizes research in a field that shows the latter asymmetry, the absence of a fixed context of reinforcement.

The idea of a "context" of reinforcement is fundamental to the present analysis, for that is what the denominator is best called. Equation 17 says that the absolute rate of responding is a function of its reinforcement, but only in the context of the total reinforcements occurring in the given situation. When the total is not much more than the reinforcement for the response being studied, then responding will be relatively insensitive, for the ratio R_1/ΣR will tend toward 1.0. This explains, at least partly, why for over three decades, interesting functional relations between reinforcement and responding have been so scarce. Experimenters have naturally tried to make the behavior they were observing important to the subject, and they have probably succeeded. If the response was to be reinforced with food or water, then the state of deprivation, the size of the reinforcer, and the experimental isolation of the subject have probably guaranteed that ΣR will be only negligibly larger than R_1. To do otherwise, which is to make ΣR substantially larger than R_1, is to risk unexplained variability, for then P_1 will be at the mercy of fluctuations in the other (usually unknown) reinforcers in the situation. Investigators have made a tacit decision in favor of stability, but at the cost of sensitivity to the independent variable.

Equation 17 explains the unique advantage of concurrent procedures. Because ΣR and k are eliminated when comparable responses are being measured as relative frequencies, the stability-sensitivity dilemma is avoided. The peculiar discrepancy between single and multiple response situations has been noted, with the one stubbornly insensitive and the other gratifyingly orderly and interesting (Catania, 1963a; Chung and Herrnstein, 1967). For magnitude, frequency, and delay of reinforcement, it has been shown that matching in multiple-response procedures and virtual insensitivity in single-response procedures are the rule. The discrepancy is a corollary of equation 17, which implies that responses at comparable levels of strength, as in symmetric choice experiments, will not only be sensitive to shifts in reinforcement between them, but equally affected by uncontrolled changes in extraneous reinforcers and therefore relatively orderly with respect to each other.
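The stability-sensitivity trade-off can be made concrete with the single-response case of equation 17, P = kR_1/(R_1 + R_0); the numbers below are illustrative only:

```python
# Sensitivity of the single-response case, P = k * R / (R + R0),
# for small versus large extraneous reinforcement R0.
# k and the rates are illustrative values.
k = 60.0

def p(r, r0):
    return k * r / (r + r0)

# Small R0: R / (R + R0) is already near 1, so doubling the
# scheduled rate barely changes responding.
gain_small = p(120.0, 1.0) / p(60.0, 1.0)
# Large R0: the same doubling produces a much larger change ...
gain_large = p(120.0, 60.0) / p(60.0, 60.0)
assert gain_small < 1.01 and gain_large > 1.3
# ... but responding is now also more sensitive to drift in the
# uncontrolled extraneous reinforcement itself.
drift_small = p(60.0, 1.0) - p(60.0, 2.0)
drift_large = p(60.0, 60.0) - p(60.0, 70.0)
assert drift_large > drift_small
print(round(gain_small, 3), round(gain_large, 3))
```

The first pair of assertions is the insensitivity that has frustrated experimenters; the second pair is the unexplained variability that a large ΣR invites.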

INTERACTION AT A DISTANCE

Equation 17 says that the frequency of responding is proportional to its relative reinforcement. The interval over which these responses and reinforcements are counted has so far been taken for granted. In simple choice, the interval is the experimental session itself, which is usually homogeneous with respect to rates of reinforcement and responding. The literature of operant conditioning contains, however, many experiments in which neither


behavior nor its consequences are evenly distributed over sessions, and many of these experiments show interactions between successive (as well as simultaneous) conditions of reinforcement. An early and clear instance was Reynolds' (1961a) experiment, in which the rate of responding by pigeons on a variable-interval schedule increased when the schedule alternated with a period of non-reinforcement. The procedure started with 3-min alternation between red and green light on the response key, but responding was reinforced on the variable-interval schedule throughout. As expected, the rate of responding was approximately steady. When reinforcement was discontinued during, let us say, the red periods, responding during the green periods rose by about 50%, even though the rate of reinforcement was unchanged therein. In another part of the experiment, a variable-ratio schedule was used instead of a variable-interval schedule, with comparable results.

This "contrast" effect, as Reynolds termed it, has been widely confirmed, under many circumstances. For present purposes, the question is whether or not such changes in rate of responding belong with the present account of response strength. To anticipate the answer, the conclusion will be affirmative: the contrast effect is largely, if not entirely, derivable from an adaptation of equation 17.

If Reynolds had used a concurrent procedure, instead of a multiple schedule, a contrast effect would have been in order. With a variable interval of 3 min (VI 3-min) for responding on each key, the rate of responding is governed by the equation (measuring in reinforcements per hour),

P = 20k / (40 + R_0)   (18)

With extinction on the other key, the rate on this key should rise, since it would now be governed by,

P = 20k / (20 + R_0)   (19)
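Since k and the scheduled rate cancel, the predicted size of the contrast effect reduces to a ratio of the two denominators; a quick numerical check of equations 18 and 19:

```python
# Contrast predicted by equations 18 and 19: the ratio of the rate
# after extinction on the other key to the baseline rate is
# (40 + R0) / (20 + R0); k cancels out of the ratio entirely.
def contrast_ratio(r0):
    return (40.0 + r0) / (20.0 + r0)

assert abs(contrast_ratio(0.0) - 2.0) < 1e-9    # R0 negligible: a doubling, i.e., +100%
assert abs(contrast_ratio(20.0) - 1.5) < 1e-9   # R0 = 20: a +50% rise, Reynolds' result
# The larger R0 is, the smaller the predicted contrast effect.
assert contrast_ratio(5.0) > contrast_ratio(10.0) > contrast_ratio(20.0)
```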

If R_0, the unscheduled, extraneous reinforcement, is assumed to be vanishingly small, the contrast effect is a doubling, i.e., 100%. The larger is R_0, the smaller is the contrast effect. In Reynolds' experiment, the contrast effect was only about 50%, which implies R_0 = 20 reinforcements per hour, assuming, for the

moment, that equations 18 and 19 are appropriate. Although R_0 may have been this large in his experiments, the more typical range for pigeons in the standard chamber is 1 to 10, with only infrequent instances falling outside. This, and other considerations, suggest that the multiple schedule, in which the alternatives succeed each other, differs in some way from simultaneous choice as regards the analysis of response strength.

For simultaneous choice, each source of reinforcement is assumed to exert a full effect on every response alternative. However plausible this assumption is for concurrent procedures, it is less so for multiple schedules, in which the various sources of reinforcement are not simultaneously operative. As the components of a multiple schedule become more separate, the interaction across components is likely to diminish; in the limiting case, diminish to no interaction at all. Thus, if the components alternate slowly and are denoted by distinctive stimuli, the interaction might be smaller than if the alternation is rapid and the stimuli marginal.

There are many ways to translate this continuum of interactions into a formal expression, but one of the simplest is to assume that the reinforcement in one component affects the responding in the other by some constant fraction of its full effect. For the two-component multiple schedule, then, the rate of responding in one component would be given by:⁴

P_1 = kR_1 / (R_1 + mR_2 + R_0)   (20)

This is identical to the equation for one of the alternatives in a two-choice procedure

⁴Note that the time base for calculating the rate of responding in multiple schedules is the duration of the component during which the responding may occur. For concurrent schedules, the time base is the entire experimental session (excluding feeding cycles, timeouts, etc.). In general, the time base for calculating a rate of responding should be the time during which responding may occur, which differs as indicated in multiple and concurrent procedures. Also, note that the quantity R_0 in equation 20 represents a composite of the extraneous reinforcement during one component (component #1 in the example in the text) plus the extraneous reinforcement in the other component, with the latter suitably decremented by the multiplicative factor m. In a fuller account, R_0 should be written out as R_01 + mR_02. Such detail is clearly premature at this point.


(equation 15), except for the additional parameter, m, which should vary between 0 and 1.0, depending upon the degree of interaction across components. Equation 15 may, in fact, be considered a special case of equation 20, where m = 1.

The remainder of this section is a consideration of multiple schedules along the lines implied by equation 20. It should be noted at the outset that this equation is just one way to extend the present formulation to situations in which interactions are sub-maximal. As the following discussion shows, the data are not quite adequate either to confirm equation 20 or to demand an alternative. For present purposes, however, it suffices to demonstrate that many of the effects of multiple and concurrent procedures follow from a single conception of response strength.

Equation 20 assumes not only a simple mechanism to govern the degree of interaction, but it tacitly describes the most symmetrical sort of multiple schedule, as shown by the absence of subscripts on the parameters, k, m, and R_0. This is tantamount to assuming that the response alternatives have equal asymptotic rates (k), that the interaction one way is the same as the interaction the other (m), and that the unscheduled reinforcement is the same during the two components (R_0). This is for a multiple schedule, in other words, with the same response-form in both components (k), with the same rule for the alternation from one component to the other as for the alternation back (m), and with the extra-experimental factors held constant (R_0). It is possible to violate any of these conditions, but that should only complicate the analysis, not change it. Reynolds' experiment appears to satisfy these restrictions, and his results would follow from equation 20 if m was between 0.55 and 0.75, assuming R_0 between 1 and 10 reinforcements per hour.

Equation 20 says that the contrast effect depends upon the reinforcement in the other component. Reynolds addressed himself to this issue at an early stage (1961b). He showed contrast when a "timeout" period was substituted for extinction. During a timeout, pigeons do not peck the key, and, in Reynolds' experiment, they received no reinforcement. Timeout and extinction should have the same effects on responding in the other component if they both reduce reinforcement in the interacting component to zero, and if equation 20 is correct. Reynolds then went on to show that as long as reinforcement in the interacting component is sustained, with or without responding, contrast is prevented. In one procedure, reinforcement was for 50 sec of non-responding, a technique that suppresses responding, but does not eliminate reinforcement. Contrast was not observed. As Reynolds concluded, and as equation 20 implies, "The frequency of reinforcement in the presence of a given stimulus, relative to the frequency during all of the stimuli that successively control an organism's behavior, in part determines the rate of responding that the given stimulus controls." (p. 70, his italics)

The importance of reinforcement rate is

clearly shown in Bloomfield's study of two multiple schedules (1967a). One multiple schedule consisted of 2-min alternations of VI 1-min with a DRL, the latter a procedure in which the response is reinforced only after specified periods (varied from 5 to 15 sec) of non-responding. In the other schedule, VI 1-min alternated with a fixed-ratio schedule (varied from 10 to 500). The two halves of the experiment shared a standard variable-interval schedule alternating with a schedule that yielded various different rates of reinforcement. Moreover, for both fixed-ratio and DRL schedules, the rate of reinforcement is a continuous joint function of the emitted rate of responding and the experimental parameter. However, the difference is that the functions are opposite, for on a fixed-ratio schedule, the rate of reinforcement is proportional to the rate of responding, while on the DRL, the rate of reinforcement is inversely related to the rate of responding, given the levels of responding obtained by Bloomfield. Nevertheless, both of Bloomfield's multiple schedules support the implication of equation 20, that reinforcement rate determines contrast. The rate of responding during the variable-interval component was an inverse function of the rate of reinforcement in the other component, without regard to whether the other schedule was a fixed ratio or a DRL. Neither the direction nor the magnitude of the contrast effect was apparently affected by the rate of responding or the schedule per se in the interacting component.

The question now is whether the dependence of contrast on reinforcement is in quantitative, as well as qualitative, agreement with equation 20. Reynolds (1961c), using multiple VI FR schedules, concluded not only that the relative rate of responding was a function of the relative rate of reinforcement, but that the function was linear, with a positive intercept and slope less than 1.0. This was to be distinguished from concurrent procedures, where the matching relation holds (slope = 1.0; intercept = 0).

Reynolds' linear relation is at variance with equation 20. The predicted relation between the relative frequency of responding and the relative frequency of reinforcement is considerably more complex than linear, as follows:⁵

P_1/(P_1 + P_2) = [kR_1/(R_1 + mR_2 + R_0)] / [kR_1/(R_1 + mR_2 + R_0) + kR_2/(R_2 + mR_1 + R_0)]

              = 1 / [1 + (R_2(R_1 + mR_2 + R_0)) / (R_1(R_2 + mR_1 + R_0))]   (21)

As a function of the relative frequency of reinforcement, R_1/(R_1 + R_2), this expression plots as a family of reverse-S-shaped functions, the curvature of which is dependent upon the relative magnitudes of R_1, R_2, R_0, and m. Complex as this is, it would be even worse without the assumptions of symmetry that qualify equation 20. The more general formulation will not, however, be explicated here, since there appear to be virtually no data to test it. Figure 12 shows a representative sampling from the family expressed in equation 21. The ordinate is the relative rate of responding in one component of a multiple schedule and the abscissa is the relative rate of reinforcement therein. When the interaction between components is absent, which is to say, when there is no contrast effect, then the parameter m = 0. When interaction is maximal, as in concurrent procedures, m = 1.0. The shape of the function depends jointly on the degree of interaction (m) and on R_0, which is the reinforcement from sources other than the multiple schedule itself. If both this quantity and the interaction term, m, go to zero, then the relative rate of responding during a component will be insensitive to changes in the relative rate of reinforcement. For any given value of m, the function is made steeper by larger values of R_0. As long as m is less than 1.0, the function approaches the matching line (7) asymptotically as R_0 increases. On the other hand, with m = 1.0, the matching line is obtained, notwithstanding the value of R_0. The effect of R_0 at m < 1.0 depends upon the relative magnitudes of R_0 and the over-all scheduled rates of reinforcement, which is 12 reinforcements per unit time for Fig. 12. When R_0 is relatively small, it exerts a smaller effect than when it is relatively large.

⁵This assumes that the quantities P_i and R_i are either absolute numbers taken over equal periods of time or are themselves rates.

Fig. 12. Hypothetical curves relating the relative frequency of responding and the relative frequency of reinforcement in a two-component multiple schedule. The rate of reinforcement summed across the two components is assumed to be held constant at 12 reinforcements per unit time, while the other parameters vary as indicated on the figure. See text for discussion of equation 21.
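The family of curves expressed by equation 21 is easy to generate numerically; the sketch below uses the 12-reinforcements-per-unit-time total assumed for Fig. 12 and checks the two limiting cases, m = 1 (matching, regardless of R_0) and m = 0 with R_0 = 0 (complete insensitivity):

```python
# Relative responding in a two-component multiple schedule (equation 21).
def rel_response(r1, r2, m, r0):
    p1 = r1 / (r1 + m * r2 + r0)   # k cancels from the proportion
    p2 = r2 / (r2 + m * r1 + r0)
    return p1 / (p1 + p2)

total = 12.0  # summed reinforcement rate, as assumed for Fig. 12
for r1 in (2.0, 4.0, 6.0, 8.0, 10.0):
    r2 = total - r1
    # m = 1 recovers the matching line whatever the value of R0.
    assert abs(rel_response(r1, r2, 1.0, 7.0) - r1 / total) < 1e-9
    # m = 0 with R0 = 0: relative responding is flat at 0.5, i.e.,
    # insensitive to the relative rate of reinforcement.
    assert abs(rel_response(r1, r2, 0.0, 0.0) - 0.5) < 1e-9
```

Intermediate values of m and R_0 trace the reverse-S curves between these two extremes.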

Points on the functions in Fig. 12 may easily be taken for linear, particularly if they fall in the middle of the range, as Reynolds' (1961c) did. The most complete set of data to test equation 21 is Reynolds' experiment (1963b) on multiple VI VI. Three pigeons were each run through three sequences of values. In the first, one of the schedules was held constant at VI 3-min while the other varied; in the second, both varied; in the last, one was again held constant, but at VI 1.58-min. Figure 13 shows for each pigeon how well equation 21 handles the data for all three series. The smooth curves plot equation 21, with different parameters for the three pigeons. For 37, m = 0.2; for 52, m = 0.3; and for 53, m = 0.1. R_0 was arbitrarily set at 0 for all pigeons for convenience in calculation and presentation. If non-zero values of R_0 had been used, then no single predicted curve could have been drawn for each pigeon, since the over-all frequency of reinforcement was not held constant, and the effect of R_0 depends upon the over-all frequency of reinforcement. In any case, the typical value of R_0 can exert only negligible effects on the relative rate of responding, at these rates of reinforcement. The fit of the points to the curves justifies both the omission of R_0 and the general formulation underlying equation 21.

Fig. 13. The relative frequency of responding as a function of the relative frequency of reinforcement in one component of a multiple variable-interval, variable-interval schedule, for each of three subjects. The smooth curves plot the theoretical functions, with R_0 = 0 and m set at the values indicated. From Reynolds (1963b).

Although the fit in Fig. 13 supports the present account, reverse-S curves are not necessarily evidence for contrast. Figure 12 shows that curves of this general shape may be obtained even when m = 0, which is to say, when the reinforcement in each component affects the responding in that component only. To show contrast effects directly, absolute, not relative, rates of responding are in order. Although Reynolds' absolute rate data contain evidence for contrast, they were too variable, both within and between subjects, to permit a useful test of the present formulation. The most comprehensive data for this purpose appear to be in an unpublished experiment by J. A. Nevin.

Nevin used pigeons on a three-component multiple schedule, of which two were conventional VI 1-min and VI 3-min, signalled by the color of the illumination on the response key. Each lasted for 1 min, after which the chamber was darkened for 30 sec. The independent variable was the rate of reinforcement during the timeout, which was set at five nominal values, from zero to one every 10 sec. The absolute rate of responding during a component should follow from equation 20, modified to take into account the presence of three, instead of two, components. For the component with VI 1-min (60 reinforcements per hour), the equation is, to a first approximation,

P_1 = 60k / {60 + m[(20 + x)/2] + R_0}   (22)

for the VI 3-min (20 reinforcements per hour) component it is

P_2 = 20k / {20 + m[(60 + x)/2] + R_0}   (23)


The quantity, x, is the independent variable, which is to say, the reinforcement rate during timeout. The interaction term, m( ), is the average rate of reinforcement during the two components that might cause contrast. This formulation disregards at least two factors that are likely to have some effect. First of all, it tacitly assumes equivalent interactions between responding in a given component and reinforcement in either of the two other components. A more realistic assumption would be that the component closer in time exerts a greater effect, as, in fact, some work has indicated (Boneau and Axelrod, 1962; Pliskoff, 1963). Secondly, it tacitly assumes that the durations of the components (1 min as opposed to 30 sec for the timeout) are immaterial as regards contrast, whereas a component's effect probably increases with its duration. Placing values on these two complicating factors would be totally arbitrary at present, as well as somewhat unnecessary, since they may at least be presumed to operate in opposite directions, the nearer component being the briefer.

Figure 14 shows the absolute rate of responding during the two variable-interval components, averaged over the four subjects. The two curves are plots of equations 22 and 23. Narrow bands, rather than lines, are drawn for the predicted values because of variation in the obtained rates of reinforcement on the


Fig. 14. The rate of responding in one component as a function of the rate of reinforcement summed across the other two components in a three-component multiple schedule. The filled circles and the upper curve are for the component held constant at a variable interval of 1 min; the lower curve is for the component held constant at a variable interval of 3 min. Both curves are drawn with the parameters as indicated in the box. From Nevin (unpublished).

nominal VI 1-min and VI 3-min. The abscissa is the rate of reinforcement summed across two of the three components, omitting the component from which responding is plotted (i.e., two times the term in parentheses in equations 22 and 23). The numbers alongside the points identify the two values from each of the five multiple schedules studied. Generally, the point on the lower curve is shifted 40 reinforcements per hour to the right of the upper point, which is the difference between the VI 1-min and the VI 3-min. These decreasing functions are pure contrast, since the rates of reinforcement were close to constant in each of the variable-interval components (except for the last point on the lower curve). Equations 22 and 23, with parameters set at R_0 = 3 reinforcements per hour, m = 0.50, and k = 80 responses per minute, appear to be satisfactory except for the first point for VI 3-min.
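Equations 22 and 23, with the parameters quoted above (k = 80 responses per minute, m = 0.50, R_0 = 3 reinforcements per hour), can be evaluated for increasing timeout reinforcement x; the particular x values sampled below are illustrative points within the 0 to 360 per hour range:

```python
# Equations 22 and 23 with the parameters fitted to Nevin's data:
k, m, r0 = 80.0, 0.50, 3.0

def p1(x):
    """VI 1-min component (60 reinforcements per hour), equation 22."""
    return k * 60.0 / (60.0 + m * (20.0 + x) / 2.0 + r0)

def p2(x):
    """VI 3-min component (20 reinforcements per hour), equation 23."""
    return k * 20.0 / (20.0 + m * (60.0 + x) / 2.0 + r0)

ratios = [p1(x) / p2(x) for x in (0.0, 90.0, 180.0, 270.0, 360.0)]
# The ratio of the two response rates grows with the timeout
# reinforcement x, approaching (but never reaching) the 3:1 ratio
# of the scheduled reinforcement rates.
assert all(a < b for a, b in zip(ratios, ratios[1:]))
assert all(r < 3.0 for r in ratios)
print([round(r, 2) for r in ratios])
```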

Contrast is generally (e.g., Reynolds, 1963b) more pronounced with lower constant rates of reinforcement. Thus, the upper function in Fig. 14 is shallower than the lower. In fact, as the reinforcement rate during the timeout increases, the proportion between the rates of responding in the two components should, according to equations 22 and 23, approach asymptotically the ratio between the rates of reinforcement, in this instance 3:1. The ratios of responding are, moving to the right along the abscissa, 1.19:1, 1.6:1, 2.24:1, 2.54:1, 2.69:1. The greater susceptibility to contrast of less frequently reinforced behavior has been noted by Bernheim and Williams (1967), as well as in various experiments on pigeons in multiple schedules. Their finding, based as it was on rats in running wheels, serves particularly to extend the applicability of the present analysis.

Although it is premature to argue for the

exact function by which the interaction between components is diminished, the simple multiplicative factor of equation 20 does well with most of the published literature. Parameters have been calculated for three other multiple schedules in which reinforcement varied over a range large enough to allow some assessment. The fit of the data to theory was no worse than in Fig. 14, with no deviation greater than six responses per minute, and 15 (out of 18 independent data points) less than three responses per minute. Table 1 shows the parameter values for the three experiments,


all of which used pigeons as subjects. Lander and Irwin's (1968) was a conventional two-component multiple schedule. Nevin's (1968) had as its independent variable the duration of reinforced non-responding in one of its two components. The shorter the duration, the higher the rate of reinforcement. The parameters predict the rate of responding in the other component, which was a conventional variable interval. Rachlin and Baum's (1969) experiment used two keys, of which one always provided reinforcement on VI 3-min. The reinforcer was 4-sec access to food. The light on the other key was extinguished except when a second VI 3-min programmer completed an interval, at which time it was lit until the next response collected the reinforcement.⁶ The amount of these reinforcements was varied 16-fold during the course of the experiment. Responding on the other key varied inversely with magnitude of reinforcement, as if amount and frequency of reinforcement could be traded off against each other in equation 20.

Equation 20 appears to run into trouble mainly in regard to transitory effects. For example, Terrace (1966a, b) found a transient contrast effect going from a simple variable-interval schedule to the same variable interval alternating with extinction. Equation 20 does not predict any change in rate here, since the interaction term is zero before and after the change. Terrace's procedure differs from Reynolds', which goes from a multiple schedule

⁶Catania's findings (1963b) with just this procedure are plotted as x's on Fig. 11. It was noted before (see p. 257) that the x's are slightly above the predicted line, which is where they should be if m < 1.0; in fact, m < 1.0 (m = 0.45 in the Rachlin-Baum study) because the alternating conditions of reinforcement and stimulation make this more properly a multiple schedule than a concurrent, notwithstanding the use of two response keys.

consisting of a pair of matched variable intervals to one consisting of the variable interval alternating with extinction, and for which equation 20 predicts contrast. Terrace's contrast effect vanishes with mere exposure to the conditions or with repeated alternations between the variable interval in isolation and the multiple schedule consisting of the variable interval and extinction. On the other hand, Bloomfield (1967b), using Reynolds' paradigm and addressing himself specifically to the persistence of contrast, found the effect undiminished even with several alternations back and forth between mult VI VI and mult VI EXT. Equation 20, then, is apparently consistent with the long-term effects, but not with the temporary contrast in the Terrace paradigm.

Terrace (1966b) concluded that the temporary contrast effect is emotional, based on the aversiveness of extinction. He noted the one-to-one correspondence between contrast and peak shift in discrimination learning, and also that the peak shift is often taken as evidence for an inhibitory or aversive effect of non-reinforcement. Any procedure that produces a peak shift also produces contrast; any procedure that prevents or diminishes the peak shift prevents or diminishes contrast, according to Terrace. Aversiveness as negative reinforcement solves the problem, for adding a negative quantity to the denominator of kRA/(RA + RO) increases the value of the expression, given that it remains positive. And there is other evidence that extinction is temporarily aversive as, for example, summarized by Terrace in his defense of the view. There is also evidence for representing aversiveness as a negative quantity. Brethower and Reynolds (1962) used a mult VI VI with equal reinforcement schedules in the two components, and added punishment with electric shock to one

Table 1

                              k (Responses        RO (Reinforcements
                              per Minute)    m    per Hour)            Procedure
Lander and Irwin (1968)            77       0.2         4              VI 3-min VI x-min
Nevin (1968)                       85       0.35        2              VI 3-min DRO x
Rachlin and Baum (1969)            70       0.45       10              VI 3-min CRF (varying amount of reinf)



of the components. They found a contrast effect whose magnitude was an increasing function of the shock intensity.

A corollary of the foregoing is that equation 20 predicts a difference between the rate on a simple variable-interval schedule and the rate on a multiple schedule in which each component uses this same variable interval, as follows:

kRA/(RA + RO) ≥ kRA/(RA + mRA + RO)        (24)

Only if m = 0 will the two expressions be equal; otherwise the variable interval in isolation will produce the higher rate. This surprising prediction appears, in fact, to be correct. In Bloomfield's experiment (1967b), responding on a variable interval alone was faster than responding on a multiple comprising two components of the same variable interval, although the experiment was not specifically designed for this comparison. A direct test of the inequality above (24) is in an unpublished study by Terrace. Four pigeons were trained initially with a simple VI 1-min in the presence of a given stimulus. Then, when responding was judged to be stable, this stimulus alternated every 1.5 min with a different stimulus, but reinforcement was continued on VI 1-min. For each pigeon, and for at least several weeks of daily sessions, the rate of responding dropped during the stimulus common to the first and second procedures. The average decrement was about 30%, which for these reinforcement rates implies 0.31 < m < 0.36 (setting RO = 0-10). At this point, however, the discussion begins to go beyond the boundaries set by well-established data.
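Inequality (24) and the negative-denominator argument can both be checked numerically. A minimal sketch; the symbols follow the text, but the function names and the particular parameter values are mine:

```python
def single_vi(k, r_a, r_o):
    """Rate on a variable interval in isolation: k*Ra/(Ra + Ro)."""
    return k * r_a / (r_a + r_o)

def mult_vi(k, r_a, m, r_b, r_o):
    """Rate in one component of a multiple schedule: k*Ra/(Ra + m*Rb + Ro)."""
    return k * r_a / (r_a + m * r_b + r_o)

k, r_a, r_o = 80, 60, 10   # e.g., VI 1-min gives 60 reinforcements per hour

# Inequality (24): with matched components (Rb = Ra), the multiple-schedule
# rate never exceeds the single-schedule rate, with equality only at m = 0.
for m in (0.0, 0.2, 0.45):
    assert single_vi(k, r_a, r_o) >= mult_vi(k, r_a, m, r_a, r_o)
assert single_vi(k, r_a, r_o) == mult_vi(k, r_a, 0.0, r_a, r_o)

# Aversiveness as a negative quantity: a negative term in the denominator
# raises the expression (contrast), provided the denominator stays positive.
punished_other = mult_vi(k, r_a, 0.2, -30, r_o)  # Rb < 0 stands in for shock
assert punished_other > single_vi(k, r_a, r_o)
```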

ENVOI

Temporary changes in responding clearly require further study, as the foregoing section shows. There are other, even more short-term, changes in responding (e.g., Williams, 1965; Nevin and Shettleworth, 1966; Bernheim and Williams, 1967) in multiple schedules that must for now also remain on the boundary of the present formulation. Another boundary, touched on earlier, is the definition of response classes (see p. 251). The key pecking of pigeons, for example, is relatively unambiguous, but even its properties, as some experiments show (Herrnstein, 1958), can get entangled with the measurement of response strength. Nevertheless, within these boundaries, the present formulation is a quantitative law of effect that cuts across traditional distinctions between choice and sheer output of behavior, as well as between simultaneous and successive conditions of work. The territory circumscribed is sizeable, expandable, and susceptible to precise measurement.
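The formulation summarized here, with the absolute rate of any response proportional to its associated relative reinforcement, can be stated compactly in code. A sketch under the usual reading Bi = k*Ri/(R1 + ... + Rn + RO); the function name and the sample values are mine:

```python
def absolute_rate(k, r_i, r_all, r_o):
    """Bi = k*Ri / (sum of all reinforcement rates + Ro)."""
    return k * r_i / (sum(r_all) + r_o)

k, r_o = 80, 15
rates = [40, 20]                  # reinforcements per hour on two operants
b1 = absolute_rate(k, rates[0], rates, r_o)
b2 = absolute_rate(k, rates[1], rates, r_o)

# Matching drops out as a corollary: the shared k and denominator cancel,
# so relative responding equals relative reinforcement exactly, whatever
# the values of k and Ro.
print(b1 / (b1 + b2), rates[0] / sum(rates))  # both approximately 0.667
```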

REFERENCES

Baum, W. M. and Rachlin, H. C. Choice as time allocation. Journal of the Experimental Analysis of Behavior, 1969, 12, 861-874.

Bernheim, J. W. and Williams, D. R. Time-dependent contrast effects in a multiple schedule of food reinforcement. Journal of the Experimental Analysis of Behavior, 1967, 10, 243-249.

Bitterman, M. E. Phyletic differences in learning. American Psychologist, 1965, 20, 396-410.

Bloomfield, T. M. Behavioral contrast and relative reinforcement in two multiple schedules. Journal of the Experimental Analysis of Behavior, 1967, 10, 151-158. (a)

Bloomfield, T. M. Some temporal properties of behavioral contrast. Journal of the Experimental Analysis of Behavior, 1967, 10, 159-164. (b)

Boneau, C. A. and Axelrod, S. Work decrement and reminiscence in pigeon operant responding. Journal of Experimental Psychology, 1962, 64, 352-354.

Brethower, D. M. and Reynolds, G. S. A facilitative effect of punishment on unpunished behavior. Journal of the Experimental Analysis of Behavior, 1962, 5, 191-199.

Brownstein, A. J. and Pliskoff, S. S. Some effects of relative reinforcement rate and changeover delay in response-independent concurrent schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1968, 11, 683-688.

Brunswik, E. Representative design and probabilistic theory in a functional psychology. Psychological Review, 1955, 62, 193-217.

Bush, R. R. and Mosteller, F. Stochastic models for learning. New York: Wiley, 1955.

Catania, A. C. Concurrent performances: a baseline for the study of reinforcement magnitude. Journal of the Experimental Analysis of Behavior, 1963, 6, 299-300. (a)

Catania, A. C. Concurrent performances: reinforcement interaction and response independence. Journal of the Experimental Analysis of Behavior, 1963, 6, 253-263. (b)

Catania, A. C. Concurrent operants. In W. K. Honig (Ed.), Operant behavior: areas of research and application. New York: Appleton-Century-Crofts, 1966. Pp. 213-270.

Catania, A. C. and Reynolds, G. S. A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 1968, 11, 327-383.

Chung, S.-H. Some quantitative laws of operant behavior. Unpublished doctoral dissertation, Harvard University, 1966.

Chung, S.-H. and Herrnstein, R. J. Choice and delay of reinforcement. Journal of the Experimental Analysis of Behavior, 1967, 10, 67-74.

Ferster, C. B. and Skinner, B. F. Schedules of reinforcement. New York: Appleton-Century-Crofts, 1957.

Findley, J. D. Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1958, 1, 123-144.

Guthrie, E. R. and Horton, G. P. Cats in a puzzle box. New York: Rinehart, 1946.

Herrnstein, R. J. Some factors influencing behavior in a two-response situation. Transactions of the New York Academy of Sciences, 1958, 21, 35-45.

Herrnstein, R. J. Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 1961, 4, 267-272.

Herrnstein, R. J. Secondary reinforcement and rate of primary reinforcement. Journal of the Experimental Analysis of Behavior, 1964, 7, 27-36.

Herrnstein, R. J. and Morse, W. H. A conjunctive schedule of reinforcement. Journal of the Experimental Analysis of Behavior, 1958, 1, 15-24.

Holz, W. C. Punishment and rate of positive reinforcement. Journal of the Experimental Analysis of Behavior, 1968, 11, 285-292.

Hull, C. L. Principles of behavior. New York: Appleton-Century, 1943.

Lander, D. G. and Irwin, R. J. Multiple schedules: effects of the distribution of reinforcements between components on the distribution of responses between components. Journal of the Experimental Analysis of Behavior, 1968, 11, 517-524.

Morse, W. H. Intermittent reinforcement. In W. K. Honig (Ed.), Operant behavior: areas of research and application. New York: Appleton-Century-Crofts, 1966. Pp. 52-108.

Natapoff, A. How symmetry restricts symmetric choice. Journal of Mathematical Psychology, in press.

Neuringer, A. J. Choice and rate of responding in the pigeon. Unpublished doctoral dissertation, Harvard University, 1967. (a)

Neuringer, A. J. Effects of reinforcement magnitude on choice and rate of responding. Journal of the Experimental Analysis of Behavior, 1967, 10, 417-424. (b)

Nevin, J. A. Differential reinforcement and stimulus control of not responding. Journal of the Experimental Analysis of Behavior, 1968, 11, 715-726.

Nevin, J. A. Signal detection theory and operant behavior. Review of D. M. Green and J. A. Swets' Signal detection theory and psychophysics. Journal of the Experimental Analysis of Behavior, 1969, 12, 475-480.

Nevin, J. A. and Shettleworth, S. J. An analysis of contrast effects in multiple schedules. Journal of the Experimental Analysis of Behavior, 1966, 9, 305-315.

Norman, M. F. An approach to free-responding on schedules that prescribe reinforcement probability as a function of interresponse time. Journal of Mathematical Psychology, 1966, 3, 235-268.

Pliskoff, S. S. Rate-change effects with equal potential reinforcements during the "warning" stimulus. Journal of the Experimental Analysis of Behavior, 1963, 6, 557-562.

Rachlin, H. and Baum, W. M. Response rate as a function of amount of reinforcement for a signalled concurrent response. Journal of the Experimental Analysis of Behavior, 1969, 12, 11-16.

Reynolds, G. S. An analysis of interactions in a multiple schedule. Journal of the Experimental Analysis of Behavior, 1961, 4, 107-117. (a)

Reynolds, G. S. Behavioral contrast. Journal of the Experimental Analysis of Behavior, 1961, 4, 57-71. (b)

Reynolds, G. S. Relativity of response rate and reinforcement frequency in a multiple schedule. Journal of the Experimental Analysis of Behavior, 1961, 4, 179-184. (c)

Reynolds, G. S. On some determinants of choice in pigeons. Journal of the Experimental Analysis of Behavior, 1963, 6, 53-59. (a)

Reynolds, G. S. Some limitations on behavioral contrast and induction during successive discrimination. Journal of the Experimental Analysis of Behavior, 1963, 6, 131-139. (b)

Shimp, C. P. Probabilistically reinforced choice behavior in pigeons. Journal of the Experimental Analysis of Behavior, 1966, 9, 443-455.

Shull, R. L. and Pliskoff, S. S. Changeover delay and concurrent performances: some effects on relative performance measures. Journal of the Experimental Analysis of Behavior, 1967, 10, 517-527.

Skinner, B. F. The behavior of organisms. New York: Appleton-Century, 1938.

Skinner, B. F. The nature of the operant reserve. Psychological Bulletin, 1940, 37, 423 (abstract).

Skinner, B. F. Science and human behavior. New York: Macmillan, 1953.

Terrace, H. S. Behavioral contrast and the peak shift: effects of extended discrimination training. Journal of the Experimental Analysis of Behavior, 1966, 9, 613-617. (a)

Terrace, H. S. Stimulus control. In W. K. Honig (Ed.), Operant behavior: areas of research and application. New York: Appleton-Century-Crofts, 1966. Pp. 271-344. (b)

Thorndike, E. L. Animal intelligence. New York: Macmillan, 1911.

Tolman, E. C. The determiners of behavior at a choice point. Psychological Review, 1938, 45, 1-41.

Tolman, E. C. Cognitive maps in rats and men. Psychological Review, 1948, 55, 189-208.

Williams, D. R. Negative induction in instrumental behavior reinforced by central stimulation. Psychonomic Science, 1965, 2, 341-342.

Received 3 December 1969.