
Behavioural Processes 54 (2001) 33–52

Modeling games from the 20th century

P.R. Killeen *
Department of Psychology, Arizona State University, Tempe, AZ 85287-1104, USA

Received 11 July 2000; accepted 5 January 2001

Abstract

A scientific framework is described in which scientists are cast as problem-solvers, and problems as solved when data are mapped to models. This endeavor is limited by finite attentional capacity, which keeps depth of understanding complementary to breadth of vision, and which distinguishes the process of science from its products, scientists from scholars. All four aspects of explanation described by Aristotle – trigger, function, substrate, and model – are required for comprehension. Various modeling languages are described, ranging from set theory to the calculus of variations, along with exemplary applications in behavior analysis. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Causality; Complementarity; Models; Theories; Truth

www.elsevier.com/locate/behavproc

1. Introduction

It was an ideal moment for an aspiring young man to enter the field. Half a century of laboratory research had generated an unparalleled backlog of data that demanded understanding. Very recent experiments had brought to light entirely new kinds of phenomena. The great twenty-[first] century upheavals that were to rock [psychology] to its foundations had barely begun. The era of classical [psychology] had just come to an end. – Abraham Pais, Niels Bohr's Times, 52–53.

Society supports science because it is in society's interest to do so. Every grant application has scientists underline the redeeming social qualities of their work; most students are most interested in applications; and scientists often describe their work to their neighbors in terms of its implications for every man. Outstanding discoveries with practical consequences, such as that of the electron, have 'coat-tails' that support generations of more esoteric inquiries. But application is not the goal of science; it is the goal of its sibling, technology. Technology uses scientific structures to change the world, whereas science uses technology to change its structures. This is an essay on the interplay between scientific structures – the theories and models that constitute knowledge – and their map to the empirical world.

* Tel.: +1-602-9652555; fax: +1-602-9658544. E-mail address: [email protected] (P.R. Killeen).



Science does not cumulate; science evolves. Just as telling students is less than teaching them, telling them what we know is less than teaching them to know. Science is the crest of the wave of knowledge: frothy, dangerous, and contemporary. Without an accumulated mass of water beneath a crest, it would be mere foam; without an accumulated mass of knowledge beneath a dissertation, it would be mere foam. But, however important, that mass is not science, but its product. It is not the thing that makes science addictive.

The history of science is cumulative, but its practice is evolutionary. Memory is finite: students cannot learn all that their mentors know and have time for creative work. Those students docile to intense library-work are often refractory to intense laboratory-work. Good scientists are problem-solvers, not pedants. Theirs is not the comprehension of the scientific structure of a discipline in toto, but rather the mastery of a small part that they can perfect. The gift we give our students is not what we have seen, but a better way of looking; not solutions, but problems; not laws, but tools to discover them.

Great tools create problems; lesser tools solve them. There are many uncertainties in the world that are not considered problematic. Great tools transform such nescience into ignorance, reconstruing them as important gaps in knowledge. Method then recasts the ignorance as a series of problems and initiates a complementary research program. The bubble-chamber created problems for generations of physicists. The double-helix was important as a fact but more important as a cornucopia of problems that fed the careers of molecular biologists. Salivating dogs were only a mess until Pavlov recognized the significance of their 'psychic secretions'; his conditioning paradigm unleashed 100 years of problems and associated research programs. Experimental choices were made primarily by humans, until the 2-key experimental chamber made it easy to study the choices of pigeons and rats, which then dominated the operant literature for a generation. Contrast was primarily a confound – until conditions of reinforcement were systematically alternated with techniques such as multiple schedules, yielding an embarrassment of problems largely unsolved today. Constraints on learning were not part of a research program – until Garcia sickened his rats and found that they learned despite response-punishment delays of hours. Fabricating these problem-originating tools is a creative art; the original experiments in which they were deployed, however flawed, were seminal. Such paradigm creation eludes the present discussion, which focuses on the nature of the scientific problems they create, and the quantitative techniques that have been deployed to solve them. Discussion starts with the intellectual context of science – its framework. It then reviews the role of theories and some of their products – models – that may be useful in the analysis of behavior.

2. The complementarity, distribution, relativization, and truthfulness of explanations

2.1. The complementarity of explanation

Attention limits our ability to comprehend an explanation/theory/model. The limits can be extended by graphical and mathematical techniques, and by chunking – constructing macros that act as shorthand – but not indefinitely. Niels Bohr promulgated complementarity theory as an expression of such constraints. The name comes from the complementary angles created when lines intersect.

Complementarity occurs whenever some quantity is conserved. When lines intersect, the 180° measure of a line is conserved, as the angles on either side of the intersection must sum to that value. Bohr noted many scientific complements, such that the more one knows about one aspect, the less one can know about the other. Position and momentum are his classic complements. Precision and clarity, or intelligibility, are others. These are complementary because our ability to comprehend – to hold facts or lines of argument together – is limited. Detailed and precise exposition is a sine qua non of science; but if the details do not concern a problem of personal interest, they quickly overwhelm. Conversely, the large picture without the detailed substrate is a gloss. Both are necessary, but the more of one, the less of the other.

The more parameters in an equation, the more precisely it describes a phenomenon. Hundreds of parameters are used to describe the orbit of a satellite around the earth. But the more parameters, the less certain we can be what each is doing, and the more likely it is that one is doing the work of some of the others. The more parameters, the greater the likelihood that their interactions will generate emergent phenomena. Precision is complementary to comprehension; and both are necessary.

Understanding the principle of complementarity is essential so that students do not discredit models for their complexity, or discredit glosses on them for their superficiality. Complementarity arises from a constraint on our processing abilities, not a shortcoming of a particular theoretical treatment. In a microscope, field of view is conserved; precise visualization of detail must sacrifice a larger view of the structure of the object. In a scientist's life, time is conserved, so that effort spent understanding the relation of one's problem to the larger whole is time away from perfecting technique. One can survey the landscape or drill deeper, but one cannot do both at the same time.

Scientists have yet to develop a set of techniques for changing the field of view of a theory while guaranteeing connectedness through the process: theoretical depth of focus is discrete, not continuous. Ideally, all models should demonstrate that they preserve phenomena one level up and one level down. Non-linear interactions, however, give rise to 'emergent phenomena' not well-handled by tools at a different level. One might show, for instance, that verbal behavior is consistent with conditioning principles. But those principles by themselves are inadequate to describe most of the phenomena of speech.

Constraints on resources exacerbate theoretical distinctions. To provide lebensraum for new approaches, protagonists may deny any relevance to understanding at a different level, much as eucalyptus trees stunt the growth of competing flora. Complementarity of resources – light and moisture in the case of trees, money and student placements in the case of scientists – thus accelerates the differentiation of levels and helps create the universities of divergent inquiries so common today.

2.2. Distribution of explanation

A different complementarity governs what we accept as explanation for a phenomenon. It is often the case that a single kind of explanation satisfies our curiosity, leaving us impatient with attempts at other explanations that then seem redundant. But there are many types of valid explanation, and no one kind by itself can provide comprehension of a phenomenon. Belief that one type suffices creates unrealistic expectations and intellectual chauvinism. Comprehension requires a distribution of explanations, and in particular, the distribution given by Aristotle's four (be)causes:

1. Efficient causes. These are events that occur before a change of state and trigger it (sufficient causes). Or they do not occur before an expected change of state, and their absence prevents it (necessary causes). These are what most scholars think of as cause. They include Skinner's 'variables of which behavior is a function'.

2. Material causes. These are the substrates, the underlying mechanisms. Schematics of underlying mechanisms contribute to our understanding: the schematic of an electronic circuit helps to troubleshoot it. Neuroscientific explanations of behavior exemplify such material causes. The assertion that they are the best or only kind of explanation is reductionism.

3. Final causes. The final cause of an entity or process is the reason it exists – what it does that has justified its existence. Final causes are the consequences that Skinner spoke of when he described selection by consequences. The assertion that final causes are time-reversed efficient causes is teleology: results cannot bring about their efficient causes. But final causes are a different matter. A history of results, for instance, may be an agent. A history of conditioning vests in the CS a link to the US; the CS is empowered as an efficient cause by virtue of its (historical) link to a final cause that is important to the organism. Explanations in terms of reinforcement are explanations in terms of final causes. Whenever individuals seek to understand a strange machine and ask 'What does that do?', they are asking for a final cause. Given the schematic of a device (a description of mechanism), we can utilize it best if we are also told the purpose of the device. There are many final causes for a behavior; ultimate causes have to do with evolutionary pressures; more proximate ones may involve a history of reinforcement or intentions.

4. Formal causes. These are analogs, metaphors and models. They are the structures with which we represent phenomena, and which permit us to predict and control them. Aristotle's favorite formal cause was the syllogism. The physicist's favorite formal cause is a differential equation. The chemist's is a molecular model. The Skinnerian's is the three-term contingency. All understanding involves finding an appropriate formal cause – that is, mapping phenomena to explanations having a similar structure to the thing explained. Our sense of familiarity with the structure of the model/explanation is transferred to the phenomenon with which it is put in correspondence. This is what we call understanding.

Why did Aristotle confuse posterity by calling all four of these different kinds of explanation causes? He did not. Posterity confused itself (Santayana characterized those translators/interpreters as 'learned babblers'). To remain consistent with contemporary usage, these may be called causal, reductive, functional and formal explanations, respectively. No one type of explanation can satisfy: comprehension involves getting a handle on all four types. To understand a pigeon's key-peck, we should know something about the immediate stimulus (Type 1 explanation), the biomechanics of pecking (Type 2), and the history of reinforcement and ecological niche (Type 3). A Type 4 explanation completes our understanding with a theory of conditioning. Type 4 explanations are the focus of this article.

2.3. Relativization of explanation

A formal explanation proceeds by apprehending the event to be explained and placing it in correspondence with a model. The model identifies necessary or sufficient antecedents for the event. If those are found in the empirical realm, the phenomenon is said to be explained. An observer may wonder why a child misbehaves, and suspect that it is due to a history of reinforcement for misbehavior. If she then notices that a parent or peer attends to the child contingent on those behaviors, she may be satisfied with an explanation in terms of conditioning. Effect (misbehavior) + model (law of effect) + map between model and data (reinforcement is observed) = explanation. Explanation is a relation between the models deployed and the phenomena mapped to them.

The above scenario is only the beginning of a scientific explanation. Confounds must be eliminated: although the misbehavior appears to have been reinforced, that may have been coincidence. Even if attention was the reinforcer which maintains the response, we may wish to know what variables established the response, and what variables brought the parents or peers to reinforce it. To understand why a sibling treated the same way does not also misbehave, we must determine the history and moderator variables that would explain the difference. All of this necessary detail work validates the map between the model and the data; but it does not change the intrinsic nature of explanation, which is bringing a model into alignment with data.

Prediction and control also involve the alignment of models and data. In the case of prediction a causal variable is observed in the environment, and a model is engaged to foretell an outcome. A falling barometer along with a manual, or model, for how to read it, enables the sailor to predict storm weather. Observation that students are on a periodic schedule of assignments enables the teacher to predict post-reinforcement pausing. The demonstration of conformity between model and pre-existing data is often called prediction. That is not prediction, however, but rather postdiction. Such alignment signifies important progress and is often the best the field can do; but it is less than prediction. This is because, with outcome in hand, various implicit stimuli other than the ones touted by the scientist may control the alignment; as may various ad hoc responses, such as those involved in aggregation or statistical evaluation of the data. Those stimuli and responses may not be understood or replicable when other scientists attempt to employ the model. Journal editors should, therefore, require that such mappings be spoken of as 'the model is consistent with/conforms to/gives an accurate account of the data'.

In the case of control, the user of a model introduces a variable known to bring about a certain effect. A model stating that water vapor is more likely to condense in the presence of a nucleus may lead a community to seed the passing clouds to make rain. A model stating that conditioned reinforcers can bridge otherwise disruptive delays of reinforcement may lead a pet owner to institute clicker training to control the behavior of her dog. The operation of a model by instantiating the sufficient conditions for its engagement constitutes control. Incomplete specification or manipulation of the causal variables may make the result probabilistic.

Explanation is thus the provision of a model with (a) the events to be explained as a consequent, and (b) events noticeable in the environment as antecedents; successful explanations make the consequents predictable from the antecedents, given that model. Prediction is the use of a model with antecedents being events in the environment and a consequent to be sought in the environment. If the consequents are already known, the relation is more properly called 'accounting for' rather than 'prediction of'. Control is the arrangement of antecedents in the context of a model that increases the probability of desired consequences.

2.4. The truth of models

Truth is a state of correspondence between models and data. Models are neither true nor false per se; truth is a relative predicate, one that requires specification of both the model and the data it attempts to map. 'He is 40 years old' has no truth value until it is ascertained to whom the 'he' refers. '2 + 2 = 4' has no truth value. It is an instance of a formal structure that is well-formed. '2 apples + 2 peaches = 4 pieces of fruit' is true. To make it true, the descriptors/dimensions of the things added had to be changed as we passed the plus sign, to find a common set within which addition could be aligned. Sometimes this is difficult. What is: 2 apples + 2 artichokes? Notice the latency in our search for a superset that would embrace both entities? Finding ways to make models applicable to apparently diverse phenomena is part of the creative action of science. Constraining or reconstruing the data space is as common a tool for improving alignment as is modification of the model.

Not only is it necessary to map the variables carefully to their empirical instantiations, it is equally important to map the operators. The symbol '+' usually stands for some kind of physical concatenation, such as putting things on the same scale of a balance, or putting them into the same vessel. If it is the latter, then '2 gallons of water + 2 gallons of alcohol = 4 gallons of liquid' is a false statement, because those liquids mix in such a way that they yield less than 4 gallons. Reinforcement increases the frequency of a response. This model aligns with many data, but not with all data. It holds for some hamster responses, but not others. Even though you enthusiastically thanked me for giving you a book, I will not give you another copy of the same book. That's obvious. But why? Finding a formal structure that keeps us from trying to apply the model where it does not work is not always so easy. Presumably here it is 'A good doesn't act as a reinforcer if the individual is satiated for it, and having one copy of a book provides indefinite satiation'. Alternatively, one may define reinforcement in terms of effects rather than operations, so that reinforcement must always work, or it is not called reinforcement. But that merely shifts the question to why a proven reinforcer (the book) has ceased to be reinforcing. Information is the reduction of uncertainty. If uncertainty appears to be dispelled without information, one can be certain that it has merely been shifted to other, possibly less obvious, maps. Absent information, uncertainty is conserved.

The truth of models is relative. A model is true (or holds) within the realm where it accurately aligns with data, for those data. A false model may be made true by revising it, or by restricting the domain to which it applies. Just as all probabilities are conditional (upon their universe of discourse), the truth of all models is conditional upon the data set to which they are applied. Life is sacred, except in war; war is bad, except when fought for justice; justice is good, except when untempered by humanity. Assignment of truth value, like the assignment of moral value, or of any label to a phenomenon, is itself thus a modeling enterprise, not a discovery of absolutes.

Truth is the imposition of a binary predicate on a nature that is usually graded; it is relative to the level of precision with which one needs to know, and to competing models. 'The earth is a sphere' is in good enough alignment with measurement to be considered true. It accounts for over 99.9% of the variance in the shape of the earth. 'Oblate spheroid' is better (truer), and when that model became available, it lessened the truthfulness of 'sphere'. 'Oblate spheroid with a bump in Nepal and a wrinkle down the western Americas' is better yet (truer), and so on. Holding a correspondent to a higher level of accuracy than is necessary for the purposes of the discussion is called nitpicking. Think of the truth operator as truth(m, x, p, a) → {T, F, U}; it measures the alignment between a model (m) and a data set (x) in the context of a required level of precision (p) and alternative models (a) to yield a decision from the set True, False, Undecided.
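The truth operator can be made concrete in code. The Python sketch below is a toy rendering only; the worst-case-error criterion, the example data, and the candidate models are invented here for illustration, not taken from the text.

    def truth(model, data, precision, alternatives=()):
        """Toy truth(m, x, p, a) -> {T, F, U}: compare a model's worst-case
        misfit to the required precision and to rival models."""
        def err(m):
            return max(abs(m(x) - y) for x, y in data)
        if err(model) > precision:
            return 'F'   # misfit exceeds the precision the discussion requires
        if any(err(alt) < err(model) for alt in alternatives):
            return 'U'   # a rival aligns better: judgment deferred
        return 'T'

    # Made-up 'radius ratio by latitude' data and two candidate models:
    earth = [(0.0, 1.000), (0.5, 0.998), (1.0, 0.997)]
    sphere = lambda lat: 1.0
    spheroid = lambda lat: 1.0 - 0.003 * lat
    print(truth(sphere, earth, precision=0.01))        # 'T' at coarse precision
    print(truth(sphere, earth, 0.01, [spheroid]))      # 'U' once a truer rival appears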

A model shown to be false may be more useful than truer ones. False models need not, pace Popper, be rejected. Newtonian mechanics is used every day by physicists and engineers; if they had to choose one tool, the vast majority would choose it over relativity theory. They would rather reject Popper and his falsificationism than reject Newton and his force diagrams. It is trivial to show a model false; restricting the domain of the model or modifying the form of the model to make it truer is the real accomplishment.

3. Tools of the trade

A distinction must be made between modeling tools, which are sets of formal structures (e.g. the rules of addition, or the calculus of probabilities), and models, which are such tools applied to a data domain. Mechanics is a modeling tool. A description of the forces and resultants on a baseball when it is hit is a model. The value of tools derives from being flexible and general, and, therefore, capable of providing models for many different domains. They should be evaluated not on their strength in accounting for a particular phenomenon, but on their ability to generate models of phenomena of interest to the reader. We do not reject mechanics because it cannot deal with combustion, but rather we find different tools.

3.1. Set theory

Behavioral science is a search in the empirical domain for the variables of which behavior is a function; and a search in the theoretical domain for the functions according to which behavior varies. Neither can be done without the other. The functions according to which behavior varies are models. Models may be as complex as quantum mechanics. They may be as simple as rules for classification of data into sets: classification of blood types, of entities as reinforcers, of positive versus negative reinforcement. Such categorization proceeds according to lists of criteria, and often entails panels of experts. Consider the criteria of the various juries who decide whether to categorize a movie as jejune, a death as willful, or a nation as favored. Or those juries who categorize dissertations as passing, manuscripts as accepted, grants as funded.

A model for classifying events as reinforcers is: 'If, upon repeated presentation of an event, the immediately prior response increases in frequency, then call that event a positive reinforcer'. That is not bad; but it is not without problems. If a parent tells a child: 'That was very responsible behavior you demonstrated last week at school, and we are very proud of you', and we find an increase in the kind of behavior the parents referred to, can we credit the parents' commendation as a reinforcer? It was delayed several days from the event. Their description 'reminded' the child of the event. Is that as good as contiguity? Even a model as simple as the law of effect requires qualifications. In this case, it requires either demonstrating that such commendations do not act as reinforcers; or generating a different kind of model than reinforcement to deal with them; or discarding the qualifier 'immediate'; or permitting re-presentations, or memories, of events to be reinforced and to have that strengthening conferred on the things represented or remembered (and not just on the behavior of remembering). These theoretical steps have yet to be taken.
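The classification rule above is easily rendered as a decision procedure. The Python sketch below is a literal reading of it; the rate measure and criterion are invented for illustration, and the sketch inherits the model's problems (it is silent about delayed commendations).

    def is_positive_reinforcer(rate_before, rate_after, criterion=0.0):
        """Call an event a positive reinforcer if, over repeated presentations,
        the immediately prior response rose in frequency (rates in resp/min)."""
        return rate_after - rate_before > criterion

    print(is_positive_reinforcer(rate_before=4.0, rate_after=9.5))   # True
    print(is_positive_reinforcer(rate_before=4.0, rate_after=3.2))   # False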

If our core categorical model requires more work, it is little surprise that stronger models also need elaboration, restriction, refinement, or redeployment. Such development and circumscription is the everyday business of science. It is not the case that the only thing we can do with models is disprove them, as argued by some philosophers. We can improve them. That recognition is an important difference between philosophy and science.

3.1.1. The 'generic' nature of the response

Skinner's early and profound insight was that the reflex must be viewed in set-theoretic terms. Each movement is unique. Reinforcement acts to strengthen movements of a similar kind. A set is a collection of objects that have something in common. A thing-in-common that movements have that makes them responses could be that they look alike to our eyes. Or it could be that they look alike to a microswitch. Or it could be that they occur in response to reinforcement. Skinner's definition of the operant emphasized the last, functional definition. This is represented in Fig. 1. Operant responses are those movements whose occurrence is correlated with prior (discriminative) stimuli, and subsequent (reinforcing) stimuli. After conditioning, much of the behavior has come under control of the stimulus.

Whereas Skinner spoke in terms of operant movements as those selected by reinforcement, he primarily used the second, pragmatic definition in most of his research: responses are movements that trip a switch. Call this subset of movements 'target' responses, because they hit a target the experimenter is monitoring. Target responses are a proper subset of the class of functionally-defined operant responses. Other operant responses that the experimenter is less interested in are called superstitious responses or collateral responses or style. When Pavlov conditioned salivation, he reported a host of other emotional and operant responses, such as restlessness, tail-wagging, and straining toward the food dish, that were largely ignored. The target response was salivation. The collateral responses fell outside the sphere of interest of the experimenter, and are not represented in these diagrams.

Many of the analytic tools of set theory are more abstruse than needed for its applications in the analysis of behavior. But it is important to be mindful of the contingent nature of categorization into sets, and the often-shifting criteria according to which it is accomplished. Representation in these diagrams helps remind us not only of the properties of the responses that are measured, but of their relation to other salient events in the environment, as shown in Fig. 2.

The top diagram depicts a discriminated partial-reinforcement schedule: reinforcement is only available when both an appropriate stimulus and movement have occurred, but does not always occur then. In Pavlovian conditioning, the target movement is uncorrelated with reinforcement (even though other movements may be necessary for reinforcement), just as in the free-operant paradigm no particular stimulus is correlated with the delivery of reinforcement (other than the contextual ones of experimental chamber, and so on). Under some arrangements, a stimulus is a much better predictor of reinforcement than a movement, and tends to block conditioning of the movement.

Fig. 1. The process of conditioning increases the frequency of target movements (M) that are emitted in the presence of a discriminative stimulus (S). The correlation of these with reinforcement (R) is determined by the contingencies of reinforcement.


Fig. 2. Different arrangements of sets of stimuli, responses and reinforcers correspond to traditional conditioning paradigms.

3.2. Probability theory

Probabilities, it has been said, are measures of ignorance. As models of phenomena are improved, probabilities are moved closer to certainties. But when many variables interact in the construction of a phenomenon, a probabilistic, or stochastic, account may be the best we can ever do. This is the case for statistical thermodynamics, and may always be the case for meteorology. A stochastic account that provides accurate estimates of probabilities and other statistics may be preferred over a deterministic account that is only sometimes accurate.

Probabilities are long-run relative frequencies. They may be inferred from samples, deduced from parameters, deduced from physical arrangement, or estimated subjectively. Counting 47 heads in 100 tosses of a coin may lead you to infer that the probability of a head is 'about 0.5'. This means that the inferred population of an indefinitely large number of tosses would show about half heads. I may have performed the same experiment with the same coin 1000 times already, with the result of 492 heads. From this knowledge I can deduce that your sample is likely to show 49 ± 5 heads. I deduce your statistic from my inference of the population parameter. The probabilities may be deduced from principles of fabrication or mixing: extreme care was taken in milling this almost perfectly symmetric coin; I deduce that there is no reason for it to land on one side more often than another, and that the probability of heads is therefore 1/2. Finally, I may estimate the probability of rain as 0.5. This may be interpreted as 'in my experience, half the time the clouds and wind have been this way, it soon rained'. Obviously, our confidence in these estimates varies with the method of estimation. Associated with the methods are models: models for physically counting, or mixing red and green balls in an urn, or categorizing clouds.
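The '49 ± 5' figure is just the sampling variability of a binomial count, which a short simulation makes visible. The Python sketch below assumes the coin's long-run rate is the 0.492 inferred in the text; the seed and sample counts are arbitrary.

    import random

    random.seed(1)
    p = 0.492                      # rate inferred from 492 heads in 1000 tosses
    samples = [sum(random.random() < p for _ in range(100)) for _ in range(10_000)]
    mean = sum(samples) / len(samples)
    sd = (sum((s - mean) ** 2 for s in samples) / len(samples)) ** 0.5
    print(round(mean, 1), round(sd, 1))   # about 49.2 and 5.0: hence '49 +/- 5 heads'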

Of course, the model may not hold; I may forget to replace the items I have sampled, or the urn in which the balls are contained may not have been thoroughly mixed, or I may prefer the way red balls feel, and select them differentially, and so on. Like all models, probability models stipulate many of the conditions under which they are useful. They may be useful beyond those conditions, but then caveat emptor. If a statistic is not normally distributed, conventional statistical tests may still be useful; but they may not be telling the user exactly what he thinks, and may lead him to false conclusions.

Appreciation of recherché statistical models may be deferred until confrontation by a problem that demands those tools. But basic probability theory is important in all behavioral analyses. It is thumbnailed here.

All probabilities are conditional on a universe of discourse. The rectangle in Fig. 3 represents the universe of discourse: all probabilities are measured given this universe. Other universes hold other probability relations. The probability of the set A is properly the probability of set A given the universe, or p(A|U). It is the measure of A divided by the measure of the universe. Here those measures are symbolized as areas. Consider the disk A to be laid atop the universe, and throw darts at the universe from a distance such that they land randomly. If they hit A, they will also go through to hit U. Then we may estimate the probability of A as the number of darts that pierce A divided by the number that pierce U. This is a sample. As the number of darts thrown increases, the estimates become increasingly accurate. That is, they converge on a limiting value.

Fig. 3. Probability as a dart game.

The term given redefines the universe that is operative. The probability of A given B is zero in the universe of Fig. 3, because if B is given, it means that it becomes the universe, and none of the area outside it is relevant to this redefined universe. No darts that hit B can also hit A. The probability of A given C = 1: all darts that go through C must also go through A, and because C is given, none that go elsewhere are counted.

The area of D is about 1/50 that of the universe, and about 1/20 that of A. Therefore, the probability of D given A is greater than the probability of D (given U). A and D are said to be positively correlated. When probabilities are stipulated without a conditional given, then they are called base rates, and the given is an implicit universe of discourse. It is usually helpful to make that universe explicit. The probability that what goes up will come down is 1.0 (given that it is heavier than air, is not thrown too hard, is thrown from a massive body such as the earth, has no source of propulsion, etc.).

The probability of E given A, p(E|A), is less than the base rate for E, p(E|U), so E and A are said to be negatively correlated. The p(F|A) equals the p(F|U), so the events F and A are said to be independent.
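The dart game can be played literally in a few lines of Monte Carlo. In the Python sketch below, the unit square stands in for the universe and the regions are invented shapes that reproduce the set relations of Fig. 3 (C inside A, B disjoint from A); only those relations, not the geometry, come from the text.

    import random

    random.seed(2)
    in_A = lambda x, y: (x - 0.35) ** 2 + (y - 0.5) ** 2 < 0.30 ** 2   # a disk A
    in_C = lambda x, y: (x - 0.35) ** 2 + (y - 0.5) ** 2 < 0.10 ** 2   # C, inside A
    in_B = lambda x, y: x > 0.80                                       # B, disjoint from A

    darts = [(random.random(), random.random()) for _ in range(200_000)]
    hits_B = [d for d in darts if in_B(*d)]
    hits_C = [d for d in darts if in_C(*d)]

    print(sum(in_A(*d) for d in darts) / len(darts))    # p(A|U): about the area of A
    print(sum(in_A(*d) for d in hits_B) / len(hits_B))  # p(A|B) = 0: B became the universe
    print(sum(in_A(*d) for d in hits_C) / len(hits_C))  # p(A|C) = 1: every dart through C pierces A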

Instrumental conditioning occurs when the probability of a reinforcer given a response is greater than the base rate for reinforcement; behavior is suppressed if the probability is less than the base rate. What is the universe for measuring the base rate? Time in the chamber? When a trial paradigm is used, stipulation of the universe is relatively straightforward. When behavior occurs in real-time, the given could entail a universe of the last 5, 50 or 500 s. But a response 5 min remote from a reinforcer will not be conditioned as well as one occurring 5 s before it. Our goal is to define the relevant universe (context) the same way as the animal does, so that our model predicts its behavior. Exactly how time should be partitioned to support a probability-based (i.e. correlation-based) law of effect is as yet an unsolved problem. Just as all probabilities are conditional (on some universe), all conditioning is, as its name implies, conditional: if no stimuli are supplied by the experimenter, some will nonetheless be found by the organism.

Table 1 gives the names of experimental paradigms or the resulting behavior that is associated with various correlations. The second row assumes a (positive) reinforcer.


Table 1
Conditions associated with correlations among parts of an act

Correlation   +                        0                            −
S–R           Classical conditioning   Rescorla control             Inhibitory conditioning
M–R           Positive reinforcement   Superstitious conditioning   Negative punishment
S–S           Compound                 Noise                        Contrast
M–M           Operant/style            Orthogonal/parallel          Competition: Ford effect

The bottom row indicates that movements that co-occur tend to become part of the same operant class; this may be because they are physically constrained to do so (topographic effects), because they are strongly co-selected by reinforcement (operant movements), or because many different constellations of movements suffice, and whatever eventuates is selected by reinforcement (style). Responses that are independent can occur in parallel with no loss of control. Responses that compete for resources such as time or energy often encourage an alternation or exclusive choice. Responses that appear independent may be shown to be competing when resources are restricted, a phenomenon known in politics as the Ford effect.

3.2.1. Bayes

One stochastic tool of general importance is the Bayesian paradigm. Consider the probability spaces shown in Fig. 3. Whereas p(A|C) = 1, p(C|A) < 1. Here, the presence of C implies A, but the presence of A does not imply C. These kinds of conditional probabilities are, however, often erroneously thought to be the same. If we know the base rates, we can calculate one given the other, with the help of a chain rule for probabilities. Consider the area that is enclosed by both A and D. It is called the intersection of A and D, and can be calculated in two ways: first, p(A∩D) = p(A|D)p(D). The probability of both A and D occurring (given the universe) is the probability of A given D, times the probability of D (given the universe). Second, p(A∩D) = p(D|A)p(A). From this we can conclude p(A|D)p(D) = p(D|A)p(A), or p(A|D) = p(D|A)p(A)/p(D). This last relation is Bayes' rule. The probability of A given D equals the probability of D given A times the ratio of the base rates of A and D. If and only if the base rates are equal will the conditional probabilities be equal. The probability that you are a behaviorist given that you read The Behavior Analyst (TBA) equals the probability that you read TBA given that you are a behaviorist, multiplied by the number of behaviorists and divided by the number of TBA readers (in the relevant universe). Because there are many more behaviorists than readers of TBA, these conditional probabilities are not equal. In the context of Bayesian inference, the base rates are often called prior probabilities (prior to evaluating the conditionals), or simply priors.
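Bayes' rule with the TBA example in numbers; all of the counts below are invented for illustration (the text gives only the direction of the inequality).

    # Hypothetical counts for the relevant universe:
    n_universe     = 100_000
    n_behaviorists = 5_000        # p(B) = 0.05
    n_tba_readers  = 1_000        # p(T) = 0.01
    p_t_given_b    = 0.15         # assumed: 15% of behaviorists read TBA

    p_b = n_behaviorists / n_universe
    p_t = n_tba_readers / n_universe
    p_b_given_t = p_t_given_b * p_b / p_t    # Bayes' rule: p(B|T) = p(T|B)p(B)/p(T)
    print(p_b_given_t)   # 0.75: most TBA readers are behaviorists, though few behaviorists read TBA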

3.2.2. Foraging problems involve Bayesian inference

The probability that a patch still contains food given that none has been found during the last minute equals the probability that none would be found in the last minute given that it still contains food, times the relevant base rates. If the operative schedule is a depleting VI 300 s schedule, the conditional probability is good; if it is a depleting VI 15 s schedule, then it is bad. This is because 1 min of dearth on a VI 300 is typical and gives us little information; the same epoch on a VI 15 is atypical, and informs us that it is likely that the source has gone dry. This is the logic; Bayes gives the predictions.
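A minimal sketch of that calculation, assuming (beyond what the text states) that food arrivals on a VI T-s schedule are approximately exponential with rate 1/T and that the prior probability the patch is still good is 0.5:

    import math

    def p_food_remains(t_empty, T, prior=0.5):
        like_good  = math.exp(-t_empty / T)   # p(no food for t_empty s | patch still good)
        like_empty = 1.0                      # p(no food | patch depleted)
        return like_good * prior / (like_good * prior + like_empty * (1 - prior))

    print(round(p_food_remains(60, 300), 2))  # 0.45: a dry minute on VI 300 s is typical
    print(round(p_food_remains(60, 15), 3))   # 0.018: on VI 15 s it signals depletion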

Learning in the context of probabilistic reinforcement involves Bayesian inference. It is sometimes called the credit allocation problem. A pellet (R) is delivered after a lever-press (M) with a probability of p(R|M) = 1/5. It is not enough to know that to predict the probability of M given R, which is p(M|R) = p(R|M) × p(M)/p(R). If the base rate for reinforcers, p(R), is high, conditioning is unlikely. In terms of mechanisms, it is said that the background is being conditioned, and little credit is left to be allocated to M. Thus, p(M|R) is a better model for the conditional most relevant to conditioning than is p(R|M).

Reinforcers or threats may elicit certain behavioral states, in the context of which specific responses (e.g. prepared or species-specific appetitive or defensive responses, such as pecking or flight for pigeons), may be vested by evolution with high priors, and other responses (e.g. contraprepared responses) with low priors.

Scientific inference involves Bayesian inference. What is the probability that a model, or hypothesis, is correct, given that it predicted some data, p(H|D)? Experiments and their statistics give us the probability that those data would have been observed, given the model, p(D|H). These are not the same conditionals. Inferential statistics as commonly used tell us little about the probability of a model being true or false, the very question we usually invoke statistics to help us answer. Null hypothesis statistical testing confounds the problem by setting up a know-nothing model that we hope to reject. But we cannot reject models, unless we do so by using Bayes' theorem. A p-level of less than 0.05 means that the probability of the data being observed given the null hypothesis is less than 5%. It says nothing about the probability of the null hypothesis given those data, and in particular it does not mean that the probability of the null hypothesis is less than 5%. To calculate the probability of the hypothesis, we need to multiply the p-level by the prior probability of the model, and divide by the prior probability of the data: p(H|D) = p(D|H)p(H)/p(D). If the priors for the data are high – say, if my model predicts that the sun will rise tomorrow [p(S|H) = 1] – the probability of the model given a sunrise is not enhanced. This is because p(S) ≈ 1, so that p(H|S) ≈ 1 × p(H)/1 = p(H). The experiment of rising to catch the dawn garners data that do little or nothing to improve the credibility of the model. If the priors are low, however – if the model predicts that at 81° ambient temperature, the sun will flash green as it rises, and it does – the model gains appreciably.
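The asymmetry is easy to see numerically. In the sketch below, the prior on the hypothesis and the priors on the two kinds of data are invented; only the logic, p(H|D) = p(D|H)p(H)/p(D), comes from the text.

    def posterior(p_d_given_h, p_h, p_d):
        return p_d_given_h * p_h / p_d        # Bayes' rule

    # Sunrise: predicted perfectly, but nearly certain anyway -> no gain.
    print(posterior(1.0, 0.5, 0.999))         # ~0.50
    # Green flash at 81 degrees: improbable data confirmed -> large gain.
    print(posterior(0.9, 0.5, 0.5))           # 0.90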

The difficulty in applying Bayesian inference is the difficulty in specifying the priors for the model that one is testing. Indeed, the very notion of assigning a probability to a model is alien to some, who insist that models must be either true or false. But all models are approximations, and thus their truth value must be graded and never 1.0; burdening models with the expectation that some must be absolutely true is to undermine the utility of all models. So the question is, how to construe models so that the probability calculus applies to them. One way to do this is to think of a universe of models – let us say, all well-formed statements in the modeling language we are using. Then to rephrase the question as: 'What is the probability that the model in question accounts for more of the variance in the empirical data than does some other model of equal complexity?' Techniques for measuring computational complexity are now evolving.

Some models have greater priors than others, given the current state of scientific knowledge, and all we may need to know is order-of-magnitude likelihood. It is more likely that a green worm has put holes in our tomatoes than that an alien has landed to suck out their vital fluids. Finding holes in our tomatoes, you go for the bug spray, not the bazooka. Are the priors higher that a behavioral theory of timing or a dynamic theory of timing provides the correct model of fixed-interval performance? This is more difficult to say, a priori. That we may not be able to say does not entail that we should not take the Bayesian perspective; for the problem is not a by-product of Bayesian inference. Bayes' theorem merely gives voice to the problem of inverse inference most clearly. There are ways of dealing with these imponderables that are superior to ignoring them. Calculating the ratio of conditional probabilities for two competing hypotheses based on a common data set eliminates the need to estimate the priors on the data. Other techniques (e.g. entropy maximization) help determine the priors on the models.

Another approach is simply to feign indifference to Bayesian and all other statistical inference. This is more of a retreat than an approach; yet there remain corners of the scientific world where it is still the mode. But science inherently concerns making and testing general statements in light of data, and is thus intrinsically Bayesian. Whether or not one employs statistical-inference models, it is important to understand that the probability that a general statement is true, based on a particular data set, is p(H|D) = p(D|H)p(H)/p(D). This provides a qualitative guide to inference, even if no numbers are assigned.

3.3. Algebra

3.3.1. A question of balance

We use algebra regularly in calculating the predictions of simple models of behavior. Think of the equals-sign in an equation as the fulcrum of a balance. Algebra is a technique for keeping the beam horizontal when quantities are moved from one side to another. Angles of the beam other than 0° constitute error. A model is fit to data by putting empirical measurements on the left-hand side of the beam, and the equation without the y-value on the right-hand side of the beam. If the left side moves higher or lower than the right, the model mispredicts the data. The speed with which the beam deviates from horizontal as we repeatedly place pairs of x and y measurements in the right and left sides indicates the residual error in the model. Descartes' analytic geometry gives us a means to plot these quantities as graphs.
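In code, the 'tilt of the beam' for each data pair is just the residual y − f(x), and fitting a model means choosing parameters to minimize the accumulated imbalance. The sketch below fits a line by ordinary least squares; the data are invented.

    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.1, 3.9, 6.2, 7.8]

    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx                                        # model: y = a + b*x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # the beam's tilt, pair by pair
    print(round(a, 2), round(b, 2), [round(r, 2) for r in residuals])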

3.3.2. Inducing algebraic models

Assume that the preference for a reinforcer increases as some function of its magnitude, and a different function of its delay. Imagine a procedure that lets us pair many values of amount with delay, and contrast them with an alternative of amount 1 at delay 1. The subjects generate a data set (Table 2), where the entries are the ratios of responses for the variable alternative to those for the constant alternative. Can we reconstitute the simple laws governing delay and amount from this matrix? In this case, there is no noise, and so the solution is straightforward (hint: focus on the first row and first column; graph their relation to the independent variables; generate hypothetical relations, and see if they hold for other cells). When there is noise in the data, or the functions that generate them are more complex, other techniques, called conjoint measurement, or functional analysis, are available. They constitute the essence of Baconian induction.

For more than two dimensions, or where the dimensions cannot be identified beforehand, or in cases where there is an additional transformation before an underlying construct – preference, strength, momentum – is manifest as behavior, other techniques may be used. In one of the most creative methodological inventions of the 20th century – non-metric multidimensional scaling – Roger Shepard showed how the mutual constraints in a larger version of such a table could be automatically parsed into relevant dimensions and transformations on them.

3.3.3. A new twist

Traditional algebras are so well-ingrained that we take their reality for granted. Yet other algebras provide better modeling tools for some operations. Pick up this journal and rotate it counterclockwise 90°, and then flip it 180° from left to right. Note its position. Restore it to normal. Now perform those operations in reverse order. Its terminal position/orientation is different. These spatial operations do not commute: their order matters. This is also the case with some operations in matrix algebra. Conduct another experiment with two conditions: present a stimulus that can serve as a reinforcer to an organism (US), and then present a discriminative stimulus (CS). Repeat several times. Next find another pair of stimuli, and present them in the order CS, US. The behavior of the organism to these two sets of operations will be quite different, showing associative learning in the second case, but not in the first; or possibly inhibitory conditioning in the first. Do other algebras offer superior models for the operations of conditioning?
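The journal experiment in matrix form (a sketch; the coordinate conventions are mine): a 90° counterclockwise rotation R and a left-right flip F acting on page coordinates. Their matrix products differ, so the composed motions do not commute.

    import numpy as np

    R = np.array([[0, -1],
                  [1,  0]])    # rotate 90 degrees counterclockwise
    F = np.array([[-1, 0],
                  [0,  1]])    # flip 180 degrees from left to right (negate x)

    print(F @ R)    # flip after rotating:   [[0, 1], [1, 0]]
    print(R @ F)    # rotate after flipping: [[0, -1], [-1, 0]] -- a different operation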

Table 2
A hypothetical set of preferences based on simple non-interacting functions of amount and delay of reinforcement

Amount   Delay: 1     2      3      4      5
1               1.00   0.67   0.50   0.40   0.33
2               1.41   0.94   0.71   0.57   0.47
4               2.00   1.33   1.00   0.80   0.67
6               2.45   1.63   1.22   0.98   0.82
8               2.83   1.89   1.41   1.13   0.94
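Following the hint in Section 3.3.2, the first column grows as the square root of amount and the first row falls as 2/(1 + delay); the sketch below checks that their product regenerates the other cells. These recovered functions are inferred here from the table's entries, not stated in the text.

    f = lambda a: a ** 0.5        # candidate amount function, read off column 1
    g = lambda d: 2 / (1 + d)     # candidate delay function, read off row 1

    # Hypothesis: each cell is f(amount) * g(delay). Spot-check against the table:
    for a, d, tabled in [(2, 2, 0.94), (6, 4, 0.98), (8, 5, 0.94)]:
        print(a, d, round(f(a) * g(d), 2), tabled)   # predicted vs tabled: they agree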


3.4. Calculus

Calculus is a generic term, meaning any modeling system. The set of laws governing probabilities is the probability calculus. But by itself, the term is most closely associated with techniques for finding the tangent to a curve (differential calculus) or the area under a curve (integral calculus). It was Newton who recognized that these systems were inverses: that the inverse of the differentiation operation – anti-differentiation – is equivalent to integration.

3.4.1. Differential

Why should such esoterica as tangents to curves and areas under them have proved such a powerful tool for science? Because change is fundamental, as noted by Heraclitus, and calculus is the language of change. Behavior is change in stance over time. In Skinner's words, 'it is the movement of an organism within a frame of reference'. If we plot position on the y-axis and time on the x-axis, then the slope of the curve at any point measures the speed of motion at that point in time. This slope is the derivative of the function with respect to time. The slope of a cumulative record is the rate of responding at that point.

If we plot the proportion of choices of a sucrose solution as a function of its sweetness, the curve will go through a maximum and then decline as the solution becomes too sweet. At the point of the maximum the slope goes from positive – increases in sucrose increase preference – to negative – further increases in sucrose decrease preference. At the balance point the sweetness is optimum. We can define that optimum as the point at which the derivative of choice with respect to sucrose concentration goes from positive to zero. If the function governing preference were p = ax − bx², then this point would occur when 0 = a − 2bx; that is, at x = a/(2b). All talk of optimal foraging, or of behavior in a concurrent experiment as optimizing rate of reinforcement, or optimal arrangements of contingencies, can be thought of in terms of derivatives. Think of a multidimensional space, with one dimension being the dependent variable, and all the other dimensions being the various experimental conditions. The point at which the derivative of the dependent variable with respect to all the independent variables goes to zero (if one exists) is the optimum. Finding the right balance in life – the optimal trade-off between work, study, leisure, exercise, and community activities – is finding the point at which life-satisfaction has zero derivatives. Be careful though: a slope of zero also occurs at the minimum, where the derivatives change from negative to zero. If you want to determine if you are as bad off as you could get, once again, check the derivative!
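The x = a/(2b) optimum, checked symbolically (a sketch using the sympy library; the second derivative confirms that this stationary point is the maximum rather than the minimum the text warns about):

    from sympy import symbols, diff, solve

    a, b, x = symbols('a b x', positive=True)
    p = a * x - b * x ** 2          # preference as a function of sweetness x
    print(solve(diff(p, x), x))     # [a/(2*b)]: where the slope passes through zero
    print(diff(p, x, 2))            # -2*b, negative: the stationary point is a maximum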

3.4.2. Integral

Behaviorists are seldom interested in calculating the area under a curve; yet it is something we do all the time, at least implicitly. The total number of responses in extinction is the area under the rate versus time graph. If reinforcers can increase the probability of more than one prior response, then the strengthening effect of reinforcement is the sum of the decaying trace of reinforcement multiplied by the probability of a response at all distances from the reinforcer.

To show how integration may be employed, consider the following model for the strength of a continuous CS: associative strength is the average association of a stimulus with a reinforcer. This may be calculated as the sum of each instant of the CS weighted by an exponentially decaying trace of the primary reinforcer, divided by the duration of the CS: S(CS) = (1/t) ∫₀ᵗ e^(−kx) dx. The solution of this equation is S(CS) = (1 − e^(−kt))/(kt), which provides a plausible model of associative conditioning as a function of CS duration (see Fig. 4). The curve is similar to a hyperbolic function over the range of interest. If choice behavior is under the control of conditioned reinforcers bridging the delay until a primary reinforcer, this model predicts an almost-hyperbolic decay in preference as a function of delay.

Fig. 4. A model of response strength: an exponentially decaying trace with rate constant k = 1/8 averaged over CS durations given by the abscissae (symbols). The curve is the hyperbolic y = 1/(1 + 0.08t).
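The near-hyperbolic shape is easy to verify numerically with the k = 1/8 of Fig. 4; the grid of CS durations below is arbitrary.

    import math

    k = 1 / 8
    for t in [1, 2, 5, 10, 20, 40]:
        trace = (1 - math.exp(-k * t)) / (k * t)    # S(CS) = (1 - e^(-kt))/(kt)
        hyper = 1 / (1 + 0.08 * t)                  # the fitted hyperbola of Fig. 4
        print(t, round(trace, 3), round(hyper, 3))  # the two stay close over this range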

3.4.3. The calculus of variations

One of the most elegant generalizations of the calculus is to use it not to find a point on a function that minimizes some value, but rather to find a function that minimizes some value. The most famous problem in the calculus of variations, and one of the oldest, is the brachistochrone problem: given a starting point S, x ft above the floor, and an ending point E, on the floor y ft distant, what is the shape of a surface that will cause a ball rolling down the surface to arrive at E in the minimum time? There are many candidates, an inclined plane being the simplest – but not the fastest. What is needed is a statement of constraints, and a statement of the thing to be optimized. Here, the constraints are the coordinates of S and E, the force of gravity, and the system of mechanics that tells us how that force accelerates bodies. The thing to be optimized is transit time. There is an infinity of surfaces that will give long or maximum times; only one, the cycloid, gives a minimum time. Another problem requiring the calculus of variations is that of determining the shape of a hanging clothesline, which distributes the force of gravity evenly over its length.

The brachistochrone problem was posed by Johann Bernoulli as an open problem; he received five solutions – an anonymous one within several days. Looking at the elegant style, he recognized the author, revealed 'as is the lion by the claw print'; the print was Newton's. (The other solutions came from Johann himself, his brother Jakob, one of their students, L'Hospital, and Leibniz.) The calculus of variations is one step beyond differential calculus. As with any skill, some time spent working problems will confer ability. In the analysis of behavior, we might ask questions of the following sort (a numerical sketch of the original brachistochrone follows the list):

• Brachistochrone/shaping: given a continuum of responses, a hyperbolic delay-of-reinforcement gradient, and a logarithmic satiation function, what is the allocation of reinforcers to behavior that will move an organism from an initial state S to a target state E as quickly as possible? What else needs to be specified?

• Clothesline/foraging: given that an organism must transit from point S to point E under a threat of predation that decreases as an inverse square function from some third point, what trajectory will minimize exposure to the threat?

• Surprise/conditioning: assume that an organism is conditioned by the appearance of unexpected events, and that conditioning acts to minimize future surprise. It does this because responses or stimuli that predict the US come to act as CSs. We may operationalize uncertainty as the temporal variance in predicting the US. Assume that temporal epochs (CSs) occur continuously before a US, that the organism has a limited amount of attention that it can deploy among the temporal stimuli, and that the standard deviation of prediction increases linearly with the time between a CS and the US. What is the form of the delay of reinforcement gradient for conditioning that minimizes surprise over a standard interval? Will, for instance, it be the exponential decay function used in the example above? Hyperbolic? Something else?
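As promised above, here is a numerical check of the classical brachistochrone claim. It idealizes the ball as a frictionless sliding bead (rotational inertia ignored) and picks endpoints S = (0, 0) and E = (π, 2) (in meters, with y measured downward) so that the cycloid solution is simple; all of those choices are mine, not the text's.

    import math

    G = 9.8  # m/s^2

    def transit_time(path, n=100_000):
        """Integrate dt = ds / v along a path u -> (x, y), u in [0, 1],
        with y downward from the start, so v = sqrt(2*G*y)."""
        t, (x0, y0) = 0.0, path(0.0)
        for i in range(1, n + 1):
            x1, y1 = path(i / n)
            ds = math.hypot(x1 - x0, y1 - y0)
            y_mid = max(0.5 * (y0 + y1), 1e-12)    # avoid v = 0 at the very start
            t += ds / math.sqrt(2 * G * y_mid)
            x0, y0 = x1, y1
        return t

    def incline(u):                     # the straight ramp from S to E
        return math.pi * u, 2 * u

    def cycloid(u):                     # cycloid of radius 1, reaching E at theta = pi
        theta = math.pi * u
        return theta - math.sin(theta), 1 - math.cos(theta)

    print(f"incline: {transit_time(incline):.3f} s")   # about 1.19 s
    print(f"cycloid: {transit_time(cycloid):.3f} s")   # about 1.00 s: the cycloid wins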

3.5. Decision theory

To be, or not to be; that is a question that decision theory can help answer. Decision theory is a way of combining measures of stimuli and reinforcers to predict which response will be most strongly reinforced. In the avatar of signal detection theory (SDT), decision theory focused on stimulus presence and predictiveness, and asked how to maximize reinforcement given payoffs of stipulated value for errors and corrects. More recently it has been generalized to incorporate a behavioral theory of reinforcement value, and also responses that are not perfectly discriminable. Although possibly strange to our ears, the notion of response discriminability is a central issue in our field: whenever a new skill, such as golf, is being learned, it is often unclear what was done that resulted in a great/terrible shot. Response discrimination is a natural complement to stimulus discrimination.

Decision theory builds on probability theory, on Bayesian inference, and on optimality. Its icon is that of a t-test: overlapping Gaussian densities (Fig. 5), one representing the distribution of percepts (P2) that might have arisen from a stimulus (S2), the other the distribution of percepts arising from an alternate stimulus (S1) or irrelevant stimuli (noise). It is the organism’s challenge to determine the probability of S2 given the percept P2; but all he has a priori is a history that tells him the probabilities of the percepts given the stimuli, depicted by the densities. Bayes’ theorem is required: take the ratio of the evidence for S2 over that for S1 to determine a likelihood ratio:

\frac{p(S_2 \mid P_2)}{p(S_1 \mid P_2)} = \frac{p(P_2 \mid S_2)\, p(S_2)}{p(P_2 \mid S_1)\, p(S_1)}

The optimal strategy is to select a criterial value for this ratio, determined by the payoffs for correctly identifying S1 and S2 and the penalties for mistakes, and to respond ‘2’ if the percept is above the criterion, and ‘1’ otherwise.

Fig. 5. Distributions of effects arising from presentation of stimuli.
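A hedged sketch of this decision rule follows: with equal-variance Gaussian percepts and stipulated payoffs (all values below are illustrative assumptions, not from the article), the criterion on the likelihood ratio follows from the standard SDT result, and a simulated observer using it produces the familiar hit and false-alarm rates.

```python
# Ideal-observer sketch of the SDT rule: respond '2' when the likelihood
# ratio p(P|S2)/p(P|S1) exceeds a payoff-derived criterion.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

mu1, mu2, sigma = 0.0, 1.0, 1.0       # percept distributions for S1 and S2
p_s2 = 0.5                            # prior probability of S2 (assumed)
value = {"hit": 1.0, "cr": 1.0}       # payoffs for correct '2' and correct '1'
cost = {"fa": 1.0, "miss": 1.0}       # penalties for the two kinds of error

# Standard SDT result for the optimal likelihood-ratio criterion:
beta = ((1 - p_s2) * (value["cr"] + cost["fa"])) / \
       (p_s2 * (value["hit"] + cost["miss"]))

def respond(percept):
    """Respond 2 if the likelihood ratio for S2 over S1 exceeds beta."""
    lr = norm.pdf(percept, mu2, sigma) / norm.pdf(percept, mu1, sigma)
    return 2 if lr > beta else 1

# Simulate a session and summarize performance.
stimuli = rng.choice([1, 2], size=10_000, p=[1 - p_s2, p_s2])
percepts = rng.normal(np.where(stimuli == 2, mu2, mu1), sigma)
responses = np.array([respond(p) for p in percepts])
print(f"hit rate {np.mean(responses[stimuli == 2] == 2):.3f}, "
      f"false-alarm rate {np.mean(responses[stimuli == 1] == 2):.3f}")
```

With symmetric payoffs and equal priors, beta is 1 and the criterion falls midway between the two densities; biasing the payoffs slides it, as the theory requires.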

Fig. 6. The multidimensional signal-detection/optimization problem faced by real organisms: how to access a particular reinforcer.

There are many variant models of SDT available. The core idea of the theory is to take into account the probabilistic nature of perception, to assume that organisms act to optimize reward, and to create models that combine inferred knowledge of stimuli and reinforcement to predict responding.

Organisms are afloat on a raft of responses in a sea of stimuli, reinforcers, and punishers (Fig. 6). Survival entails finding correlations between stimuli, responses, and reinforcers that will maximize aspects of a life-trajectory. The particular tools, such as SDT, that are evolving to understand these relations are simple one- and two-dimensional slices of an evolving multidimensional space. It is the challenge of the next century to evolve models that more closely map the complexities we hope to understand.

3.6. Game theory

Another dimension is added to complexity when organisms operate in a closed-loop environment. These are situations in which their actions change the environment, which in turn changes future actions. Many experimental paradigms open these loops to establish relatively constant


environments. A variable-interval schedule keeps rate of reinforcement relatively invariant over substantial changes in rate of responding. If all environments were open loop, there would be little point in learning. Plants largely inhabit open-loop environments. To the extent that an organism can benefit from sensitivity to the effects of its behavior — that is, to the extent that important features of its interaction with the world are closed-loop — learning will improve the organism’s fitness. The difficulty in analyzing such systems is that small variations from expected reactions at one step quickly become magnified by succeeding steps.
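The claim about variable-interval schedules is easy to check numerically. The sketch below is an illustration, not from the article: the Poisson responder, the VI 30-s value, and the session length are all assumptions.

```python
# On a VI schedule, a reinforcer is 'set up' after an exponential interval
# and collected by the first response thereafter. Varying response rate
# twenty-fold barely moves the obtained rate of reinforcement.
import numpy as np

rng = np.random.default_rng(1)

def vi_session(resp_rate, vi_mean=30.0, session=3600.0):
    """Reinforcers per minute when responses form a Poisson stream (resp/s)."""
    t, next_setup, earned = 0.0, rng.exponential(vi_mean), 0
    while t < session:
        t += rng.exponential(1.0 / resp_rate)   # time of the next response
        if t >= next_setup:                     # a reinforcer was waiting
            earned += 1
            next_setup = t + rng.exponential(vi_mean)
    return earned / (session / 60.0)

for r in [0.1, 0.5, 2.0]:
    print(f"{r:4.1f} resp/s -> {vi_session(r):.2f} reinforcers/min")
```

The obtained rate stays near the programmed maximum of two per minute across a large range of response rates, which is what makes the loop effectively open.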

The non-linear effects found in closed-loop systems are especially important when individuals interact. Consider three societies in which both kindness and meanness are reciprocated: in Society E, each gives as good as he gets; in Society A, each gives 5% more than he gets; and in Society D, each gives 95% of what he gets. Which societies will be stable; which polarized? Can you generate a spreadsheet model of these societies? (One is sketched below.)
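In lieu of the spreadsheet, here is the same model in a few lines of Python. The multipliers come from the text; the starting value and the number of rounds are arbitrary.

```python
# Each round, an individual returns a fixed multiple of what it received.
gains = {"Society E": 1.00, "Society A": 1.05, "Society D": 0.95}

for name, gain in gains.items():
    x = 1.0                      # initial kindness, arbitrary unit
    for _ in range(100):         # 100 rounds of reciprocation
        x *= gain
    print(f"{name}: starts at 1.0, after 100 exchanges -> {x:.3f}")
```

Society E is neutrally stable; Society A escalates without bound (to about 131 units after 100 exchanges); Society D damps toward zero. Small differences in the loop gain, not in the starting conditions, determine the fate of each society.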

If interactions are on a trials basis, with outcomes of each interaction known only after each individual makes a move, game theory provides a modeling system. Classic games include the prisoner’s dilemma. Depending on the reinforcement contingencies, the most effective strategies (ones that optimize reward) may be mixed, with alternate responses chosen probabilistically.

Most models of conditioning assume a power differential — the experimenter sets the rules and motivates the organism. In game theory, however, the players have equal power, and the optimal sequences of responses are ones that not only provide the best short-term payoff, but also condition the opponent/cooperator to behave in ways that sustain payoffs. Since this may mean foregoing the largest short-term payoff, optimal strategies entail self-control. Conversely, self-control may be thought of as a game played against one’s future self, an entity whose goals are similar to, but not the same as, those of the present self.
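A minimal iterated prisoner’s dilemma, with the standard payoff matrix assumed (the article does not specify one), illustrates the point: a strategy that forgoes the largest short-term payoff in order to condition its partner can sustain higher earnings.

```python
# Iterated prisoner's dilemma with the conventional payoffs.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(partner_history):        # cooperate first, then echo partner
    return partner_history[-1] if partner_history else "C"

def always_defect(partner_history):      # always grab the short-term payoff
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (300, 300): mutual cooperation pays
print(play(always_defect, tit_for_tat))  # (104, 99): defection gains little
```

The defector wins the first encounter and almost nothing thereafter; the reciprocator’s restraint conditions its partner to keep the payoffs flowing.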

If the games are real-time, such as the game of chicken, dynamic system models are necessary. If signaling is possible, players will seek signs of character — that is, predictors of future behavior — in posture, couture, and coiffure; strategies of bluff, deception, and seduction are engaged; signal detection becomes a survival skill. Due to the potential instability of such interacting systems, strong reinforcement contingencies are necessary to avoid chaos. These are provided by charismatic leaders, repressive regimes, or elaborate legislative and legal systems reinforced by ritual and myth.

3.7. Automata theory

A toggle switch is a simple automaton. It alternates its state with each input. Each time it is switched on it may send a signal to another switch. Since it takes two operations — on and off — to send a signal, the second switch will be activated at half the frequency of the former. A bank of such switches constitutes a binary counter. Another switch requires two simultaneous inputs to change its state; this constitutes an and gate. Another switch assumes a state complementary to its input; this is a not gate. Wired together properly, these elements constitute a computer. As described, it is a finite-state automaton because it has a finite amount of memory. If an endless tape were attached so that it had unlimited memory, a Turing machine could be constructed. Turing machines are universal, in that they can solve any problem that is, in theory, solvable.
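The toggle-switch counter can be captured in a few lines. This sketch, an illustration rather than anything from the original, propagates a carry whenever a switch turns off, so each stage alternates at half the frequency of its predecessor:

```python
# A bank of toggle switches as a binary counter.
class Toggle:
    def __init__(self):
        self.state = 0

    def pulse(self):
        """Flip state; emit a carry (True) on the 1 -> 0 transition."""
        self.state ^= 1
        return self.state == 0

def count(bank, pulses):
    for _ in range(pulses):
        i = 0
        while i < len(bank) and bank[i].pulse():   # ripple the carries
            i += 1
    return "".join(str(t.state) for t in reversed(bank))

print(count([Toggle() for _ in range(4)], 11))     # '1011' = 11 in binary
```

Four switches suffice to count to fifteen; adding switches adds memory, which is precisely the dimension along which the next paragraph compares organisms.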

Organisms such as rats and humans may be viewed as finite-state automata, differing primarily in the amount of memory that is available to them. This statement does not mean that they are nothing but automata. All models abstract from real phenomena to provide a more comprehensible picture. City maps do not show the trees or litter on the streets they describe, and are of reduced scale. The more memory that is available, the more information about the stimulating context may be maintained to provide probability estimates. More memory means more capacity to retain and relate conditional probabilities. Enhanced ability to conditionalize permits nuanced reactions.

Automata have been used as metaphors for human computational ability for centuries, with analog computers favored 50 years ago, digital computers 30 years ago, parallel architectures 20 years ago, and genetic algorithms/Darwin machines 10 years ago. Within the learning community, however, they have seldom been used to generate models that take the implications of architecture and memory capacity seriously. Is this because they are intrinsically poor models of organisms, or because no one has tried?

4. Last words

The mind of science may be claimed by philosophy, but its heart belongs to tinkerers and problem solvers. Good scientists love puzzles, and the tools that help unravel them. The difference between scientists and anagram fans is the idea that scientific problems are part of a larger puzzle set; that one puzzle solved may make other pieces fall into place. But, then, jigsaw puzzles have that feature too.

Another difference is that Nature, not the New York Times, created the scientist’s problems. But, in fact, all Nature did was exist; scientists themselves transformed aspects of it into problems to be solved. Skinner and his students, not Nature, put two keys in an ice-chest. Another difference is that the puzzles solved by scientists resonate with aspects of the world outside their problem set: once a puzzle is solved, the scientist can turn back to the world from which it was abstracted and find evidence for the mechanisms discovered in the laboratory in the world at large. A model of speed on inclined planes speaks volumes about the velocity of all motions forced by gravity. But such full-cycle research — reinvestment of the solution in the natural world from which the problem was abstracted — is preached more often than practiced. Most scientists are happy to give science away; putting it to work is not in their job description. Because an infinity of puzzles may be posed, society’s selection of which puzzle solvers to support is biased by its perception of communal benefits. It follows that scientific societies must pay their way through applications, not assertions of eternal verities that are often beyond the ken of the community. Scientists must work hand-in-hand with technologists to survive.

One of the practical benefits of science is understanding, but understanding is itself only poorly understood. Physicists from Aristotle through Bohr to Einstein help us understand understanding. Aristotle taught us about its multidimensionality; Bohr, of the complementary limits on its depth and breadth. For Einstein, understanding was the reduction of a problem to another that we already think we understand — essentially, a successful map to a model.

Truth, we learn, is a relation, not a thing, even when preceded by ‘the’ and capitalized. It is a relation between a statement/model and a fact. Assignment of a truth value works best for binary statements concerning binary facts; but most data must be forced into such polarity; hedges and other conditionals are, therefore, common (‘innocent by reason of insanity’). A generalization of the operator truth for continuous variables is provided by the coefficient of determination, which measures the proportion of variance in the data field that is accounted for by the model. This is a mainstay in judging the accuracy of models. A complementary index, the proportion of the variance available from a model that is relevant to the data, provides a measure of specificity or parsimony; it is often crudely approximated by the number of free parameters in a model. Accuracy and parsimony measured in such fashions are the scientist’s metamodels of truth and relevance.
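As a worked example of this 'operator truth' for continuous data: the computation is one line of arithmetic. The data below are invented for illustration; the model is the hyperbola used in Fig. 4.

```python
# Coefficient of determination: proportion of variance in the data
# accounted for by the model (hypothetical data, assumed hyperbolic model).
import numpy as np

t = np.array([1.0, 2.0, 4.0, 8.0, 16.0])         # hypothetical delays
data = np.array([0.93, 0.85, 0.77, 0.60, 0.45])  # hypothetical strengths
model = 1 / (1 + 0.08 * t)                       # the hyperbola of Fig. 4

r2 = 1 - np.sum((data - model) ** 2) / np.sum((data - data.mean()) ** 2)
print(f"proportion of variance accounted for: {r2:.3f}")
```

The residual sum of squares plays the role of error in the statement; the total sum of squares, the role of what there was to be explained.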

The utility of models depends on their accuracy relative to others that are available. A model with mispredictions that still accounts for a significant and useful amount of variance in the data should not necessarily be spurned. It is foolish for a philosopher to deprive a laborer of a shovel because it is dull — unless he offers a better one in its place. The goal of science is not perfect models, because the only perfect renditions are the phenomena sui generis; the goal is better models.

Modeling languages are not models. Algebra and calculus and automata theory provide tools to craft special-purpose models, and it is those that are the cornerstones of scientific progress. The distinction between models and modeling languages is that of relevance; modeling languages can say too much that is not relevant to the data field under study. Models are proper subsets of all that may be said in their language. The more laconic a model, the more likely we can extrapolate its predictions to new situations without substantial tinkering. The most succinct models are called elegant.

There are many modeling tools available for scientists, a small set of which was sketched here to give the flavor of their applications. The community of behavioral scientists has been conservative in exploiting modeling systems that might enrich their practice. Just as the microscope opened a new field of science, so also did tools such as the calculus and the probability calculus. This next century is a fertile one for the deployment of all available modeling tools on the problems of behavioral science; our questions are complex enough to support them. The trick will be to reformulate our questions in ways that make them amenable to such tools. To do this, all we need is practice; and we all need practice.

Acknowledgement

I thank Tom Zentall for a close and helpful reading of the ms, and the National Institutes of Health for giving me time to write it under the aegis of Grant K05 MH01293.

Bibliography

Science as problem solving

Laudan, L., 1977. Progress and its Problems. University of California Press, Berkeley, CA.
Hestenes, D., 1990. Secrets of genius. New Ideas Psych. 8, 231–246.
Hestenes, D., 1992. Modeling games in the Newtonian world. Am. J. Phys. 60, 732–748.
Pais, A., 1991. Niels Bohr’s Times. Oxford University Press, New York. (The original of the opening epigraph read ‘twentieth century’, and ‘physics’ in place of ‘psychology’.)
Stove, D., 1998. Anything Goes: Origins of the Cult of Scientific Irrationalism. Macleay Press, Paddington, NSW, Australia.

Complementarity

French, A.P., Kennedy, P.J., 1985. Niels Bohr: A Centenary Volume. Harvard University Press, Cambridge, MA.

Aristotle on comprehension

Hocutt, M., 1974. Aristotle’s four becauses. Philos. 49, 385–399.

Probability theory

Estes, W.K., 1991. Statistical Models in Behavioral Research. Lawrence Erlbaum Associates, Mahwah, NJ.
Wickens, T.D., 1982. Models for Behavior: Stochastic Processes in Psychology. Freeman, San Francisco, 353 pp.

Correlation in behavior

Baum, W.M., 1973. The correlation-based law of effect. J. Exp. Anal. Behav. 20, 137–153.
Galbicka, G., Platt, J.R., 1989. Response-reinforcer contingency and spatially defined operants: testing an invariance property of phi. J. Exp. Anal. Behav. 51, 145–162.

Bayes and priors on behavior

Green, R.F., 1980. Bayesian birds: a simple example of Oaten’s stochastic model of optimal foraging. Theor. Pop. Biol. 18, 244–256.
Jaynes, E.T., 1986. Bayesian methods: an introductory tutorial. In: Justice, J.H. (Ed.), Maximum Entropy and Bayesian Methods in Applied Statistics. Cambridge University Press, Cambridge, UK.
Keren, G., Lewis, C. (Eds.), 1993. A Handbook for Data Analysis in the Behavioral Sciences. Lawrence Erlbaum Associates, Mahwah, NJ.
Killeen, P.R., Palombo, G.-M., Gottlob, L.R., Beam, J.J., 1996. Bayesian analysis of foraging by pigeons (Columba livia). J. Exp. Psych.: Anim. Behav. Process. 22, 480–496.
Myung, I.J., Pitt, M.A., 1997. Applying Occam’s razor in modeling cognition: a Bayesian approach. Psych. Bull. Rev. 4, 79–95.
Shettleworth, S.J., 1975. Reinforcement and the organization of behavior in golden hamsters: hunger, environment, and food reinforcement. J. Exp. Psych.: Anim. Behav. Process. 1, 56–87.
Skilling, J., 1988. The axioms of maximum entropy. In: Erickson, G.J., Smith, C.R. (Eds.), Maximum-Entropy and Bayesian Methods in Science and Engineering. Kluwer, Dordrecht, pp. 173–187.


Skilling, J., 1989. Maximum Entropy and Bayesian Methods. Kluwer, Dordrecht, The Netherlands.
Staddon, J.E.R., 1988. Learning as inference. In: Bolles, R.C., Beecher, M.D. (Eds.), Evolution and Learning. Lawrence Erlbaum, Hillsdale, NJ, pp. 59–77.

Induction/retroduction

Cialdini, R.B., 1980. Full-cycle social psychological research. In: Beckman, L. (Ed.), Applied Social Psychology Annual, vol. 1. Sage, Beverly Hills, CA.
Rescher, N., 1978. Peirce’s Philosophy of Science. University of Notre Dame Press, Notre Dame, IN.

Decision theory

Alsop, B., 1991. Behavioral models of signal detection and detection models of choice. In: Commons, M.L., Nevin, J.A., Davison, M.C. (Eds.), Signal Detection: Mechanisms, Models, and Applications. Erlbaum, Hillsdale, NJ, pp. 39–55.
Davison, M., Nevin, J.A., 1998. Stimuli, reinforcers, and behavior: an integration. J. Exp. Anal. Behav. 71, 439–482.
Levitt, H., 1972. Decision theory, signal detection theory, and psychophysics. In: David, E.E. Jr., Denes, P.B. (Eds.), Human Communication: A Unified View. McGraw-Hill, New York, pp. 114–174.
Macmillan, N.A., 1993. Signal detection theory as data analysis method and psychological decision model. In: Keren, G., Lewis, C. (Eds.), A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 21–57.
Macmillan, N.A., Creelman, C.D., 1991. Detection Theory: A User’s Guide. Cambridge University Press, Cambridge, England, 407 pp.
Marston, M., 1996. Analysis of cognitive function in animals, the value of SDT. Cog. Brain Res. 3, 269–277.
McNamara, J.M., Houston, A.I., 1980. The application of statistical decision theory to animal behavior. J. Theor. Biol. 85, 673–690.
McNicol, D., 1972. A Primer of Signal Detection Theory. George Allen & Unwin, London, 242 pp.
Swets, J.A., Dawes, R.M., Monahan, J., 2000. Psychological science can improve diagnostic decisions. Psych. Sci. Pub. Interest 1, 1–26.
Wixted, J.T., 1993. A signal detection analysis of memory for non-occurrence in pigeons. J. Exp. Psych.: Anim. Behav. Process. 19, 400–411.

Game theory

Bonta, B.D., 1997. Cooperation and competition in peaceful societies. Psych. Bull. 121, 299–320.
Clements, K.C., Stephens, D.W., 1995. Testing models of non-kin cooperation: mutualism and the prisoner’s dilemma. Anim. Behav. 50, 527–535.
Crowley, P., Sargent, R., 1996. Whence tit-for-tat? Evol. Ecol. 10, 499–516.
Green, L., Price, P.C., Hamburger, M.E., 1995. Prisoner’s dilemma and the pigeon: control by immediate consequences. J. Exp. Anal. Behav. 64, 1–17.
Mero, L., 1998. Moral Calculations: Game Theory, Logic, and Human Frailty. Copernicus/Springer, New York.
Nowak, M.A., May, R.M., Sigmund, K., 1995. The arithmetics of mutual help. Sci. Am. 272, 76–81.
Stephens, D.W., Anderson, J.P., Benson, K.E., 1997. On the spurious occurrence of tit for tat in pairs of predator-approaching fish. Anim. Behav. 53, 113–131.

Automata

Hopkins, D., Moss, B., 1976. Automata. North Holland, New York.
Minsky, M.L., 1967. Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs, NJ.

Functional/conjoint measurement

Anderson, N., 1979. Algebraic rules in psychological measurement. Am. Sci. 67, 555–563.
Shepard, R.N., 1965. Approximation to uniform gradients of generalization by monotone transformations of scale. In: Mostofsky, D.I. (Ed.), Stimulus Generalization. Stanford University Press, Stanford, CA, pp. 94–110.


General

Boas, M.L., 1966. Mathematical Methods in the Physical Sciences. Wiley, New York, 778 pp.
Maor, E., 1994. e: The Story of a Number. Princeton University Press, Princeton, NJ.
Mesterton-Gibbons, M., 1989. A Concrete Approach to Mathematical Modeling. Addison-Wesley, New York.
Shull, R.L., 1991. Mathematical description of operant behavior: an introduction. In: Iversen, I.H., Lattal, K.A. (Eds.), Experimental Analysis of Behavior, vol. 2. Elsevier, New York, pp. 243–282.
Timberlake, W., 1993. Behavior systems and reinforcement: an integrative approach. J. Exp. Anal. Behav. 60, 105–128.
