The effects of counterfactual comparison on learning and reasoning

Ph.D. Thesis

Benjamin Timberlake

31st cycle, Cognitive and Brain Sciences

Center for Mind/Brain Sciences, CIMeC, University of Trento

Supervisor: Prof. Giorgio Coricelli

Co-Supervisor: Dr. Nadège Bault

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.


Abstract

How humans make choices in uncertain and competitive situations is a key determinant of viability and successful living. Improving those choices requires sometimes encountering undesirable outcomes and avoiding them, eventually even anticipating them in novel situations. Learning depends on making choices, encountering errors and updating evaluations of options. Various models extended from the reinforcement learning framework, when compared to human behavior, describe in part how individuals heterogeneously make choices. To peer into the components of these mechanisms, strategic games that emulate real-world situations provide measurable and manageable environments in which to examine slight differences in choice behavior among different people. Some of these differences may be endogenous to participants (e.g. age or learning disposition), while others derive from external events (e.g. emotional induction or brain stimulation). We contrasted such behavior in three situations involving learning or competition, leveraging differences in age, emotional induction and brain stimulation. We aimed to describe the variations in choice behavior across these differences and investigated, when possible, how prior conditions generated a transfer of learning from one domain to another. The work here builds on recent investigations of neural mechanisms underlying choice behavior during strategic or competitive interaction.


Table of Contents

Structure of this thesis
General introduction
Chapter 1: Regret, Responsibility and the Brain
    Introduction
    Counterfactual information
    Learning value
    Responsibility
    Neural Circuits of Regret
    Moral Decision Making
    OFC Lesions Modulate Regret and Morality
    Psychopathy
    A Social Dimension of Regret and Agency
    Conclusion
Chapter 2: The effect of aging on regret in decision making
    Introduction
    Emotional assistance
    Positivity effect
    Risk and emotion
    Counterfactual learning
    Experimental questions
    Methods
    Lottery choice task
    Learning task
    Emotional rating analysis
    Choice behavior analysis
    Learning behavior analysis
    Learning computational models
    Results
    Emotional ratings
    Choice behavior
    Learning behavior
Chapter 3: Priming regret: inducing counterfactual thinking to influence learning
    Introduction
    Experience-Weighted Attraction model
    The Patent Race
    Priming task: Wheel of Fortune
    Mood priming
    Hypotheses
    Methods
    Regret
    Participants
    Procedure
    Computational Learning models
    Results
    Modeling
    Discussion
Chapter 4: Electrical brain stimulation effect on level-k thinking
    Introduction
    Brain areas of level-k thinking
    tDCS
    Methods
    Participants
    Experimental Design and Task
    Time course of experimental tasks
    Payment
    tDCS stimulation
    Protocols
    Statistical analysis
    Results
    Discussion
General Discussion
    Moral decision making
    Aging, regret, risk and learning
    Regret induction
    Electrical stimulation of sophisticated thinking
References


Structure of this thesis

This thesis comprises four sections, each concerned with modulating processes of learning, reasoning and decision making. The notion of each effort was to examine specific features of the decision process via deviations from baseline, be they induced or selected. In the first, a review chapter, I critically examine the intersections between deficits in moral decision making and regret decision making in psychopathic and brain-lesion patients, finding a number of similarities and mutual deficiencies. The next chapter consists of analysis of a previously conducted experiment that considers age as a natural agent of cognitive change. My analysis examines how the relationship between choice behavior and counterfactual learning appears to change with age. The latter two chapters consist of experimental studies that I conducted in their entireties. The first study consists of influencing counterfactual learning via emotional priming in a large group of subjects and using computational modeling to characterize underlying cognitive mechanisms. The final chapter takes a yet more direct approach to test the causal role of two different brain areas in strategic thinking by employing transcranial brain stimulation in an effort to induce higher levels of strategic thinking in a classic economic strategic game.


General introduction

Humans make innumerable choices during waking life, from simple to complex, leading to patterns of behavior. Those choices can be changed by new information, that is, by learning. They can also be influenced by other internal and external factors brought to bear on a decision maker. Similar situations with only slight differences can give rise to different decisions and varied patterns of behavior. State of mind, mood, information, social situation, age, imagination: all may be implicated in modulating choice with other conditions held constant. From which hand to open a door with to whether it is safe to cross a road, from which cake to eat to which job to choose: any individual who has made it far enough in life to be studied in a laboratory experiment has made a mountain of advantageous decisions, though accompanied, without doubt, by plenty of choices whose outcomes were not immediately beneficial. Yet, as many a motivational speaker will remind his audience, as much learning can come from failure as from success. This idea found generalization in the Rescorla-Wagner model’s precept that learning occurs only when events violate expectations.

Yet the remarkable flexibility of learning has allowed adaptation to countless situations. In repeated games, players often exhibit learning behavior in making best choices. Early reinforcement learning models assumed that players responded only to the results of their own choices, repeating choices that led to success and avoiding those that failed. These models were particularly good at explaining behavior in the context of the bounded rationality of specific decision problems. Later, expanded models such as fictitious play, which accounted for foregone choices and, in social situations, for the anticipated actions of other players, proved more descriptive of observed behavior (Fudenberg and Levine 1995). Yet more recently, adaptive models that account for some mixture of behavior described by more than one model alone have accounted for behavior more precisely. These hybrid models, such as experience-weighted attraction, afford individual-level description of sophistication of learning (Camerer and Ho 1998). More sophisticated players make choices in response to anticipated actions of others. The highest-level players further anticipate how their own actions will influence the strategies of competitors and maximize rewards by sometimes taking short-term losses.

In economic models of choice, agents choose options to maximize long-run reward. Various models applied to behavior, however, can account for long-run reward in different ways. In the short term, in fact, people often give up a higher expected reward in favor of avoiding the risk of loss. In repeated probabilistic tasks in limited-information feedback settings, choices are guided by counterfactual thinking that compares the outcome of a choice to the best outcome that could have been obtained with that choice. In settings in which outcomes of unselected choices are known, however, subsequent choices are guided by the difference between the outcome received and the outcome of a choice not made. This adds to counterfactual thinking an element of responsibility, a crucial component of the experience of regret. Anticipated regret is so influential that in choice problems similar to those that have brought regret in the past, people increasingly avoid the choices that present the greatest potential regret, to an even greater extent than they avoid risk (Coricelli, Critchley et al. 2005). A discrete signal in the medial orbitofrontal cortex accompanies both the occurrence of regret during a task and its anticipation during the same task, indicating that experiencing regret is adaptive. Could this mechanism that functions so vigorously within one setting carry over and offer its influence in another similar but novel situation? This document proposes to explore that question at several depths.

Proposed circuits of regret learning share some structures and patterns with the systems implicated in moral decision making, in particular the ventromedial prefrontal cortex. We start the consideration of decision-making regret by comparing its implementation and neural correlates to those involved in moral decision making (Chapter 1). Specifically in instances in which normal function has been interrupted in both domains, the processes share some remarkable similarities.

We examined evidence of transfer between a lottery choice task and an instrumental learning task (Chapter 2). Here, we examined choice behavior as an indicator of how learning might vary between younger adults and older adults. We wanted to see if the relationship and potential transfer between tasks varied depending on age. Aging is marked by selective areas of cognitive decline, particularly in the context of decision making (Tymula, Belmaker et al. 2013). Moreover, adults older than 60 have been observed in one study to incorporate counterfactual information in learning to a lesser extent than younger adults (Tobia, Guo et al. 2016). Examining data from two age-segregated cohorts, we investigated how the choice behavior of older adults and younger adults in the lottery task indicated different learning patterns in the second task.

The question of transfer is probed more pointedly in an experiment in which we tried to make people feel very bad right from the beginning, having them lose a stack of money and showing them what they could have won if they had made a different choice (Chapter 3). Our hypothesis was that this induction of regret would elicit counterfactual thinking and make players learn from counterfactual comparison in a different game they played right after this large loss. We employed a limited-space strategic game, and then compared behavior to the fit of several models of reinforcement learning and belief-based learning that incorporate counterfactual thinking and strategic learning to various extents (Sutton and Barto 1998, Camerer 2003, Zhu, Mathewson et al. 2012). Employment of belief-based learning depends largely on understanding the broader structure of a system, to which a person already attuned to regret may be more sensitive if the experience transfers. In the context of the competitive game, we hypothesized that the consideration of the counterfactual demanded by the prior regret outcome would encourage this type of learning.


Transfer can begin with experience in one activity before commencing another. In the two previous overviews, this is realized with the outcome of a decision. That information may then modulate performance in the subsequent task. In a third study, we proposed to skip the step of introducing information with a behavioral situation and, instead, encourage the brain to reach a target state via electrical stimulation (Chapter 4). Different people engage in strategic thinking at various levels of sophistication, and measured brain activity reflects that diversity. Imaging studies have located some neural correlates of mentalizing in frontal cortical areas (Hampton, Bossaerts et al. 2008, Coricelli and Nagel 2009). If those areas are more active during higher levels of strategic thinking, they may well contribute to the behavior. We therefore suspect that if these areas are stimulated to higher levels of activity, they could give rise to more strategic thinking.

Chapter 1: Regret, Responsibility and the Brain

Regret, Responsibility and the Brain

Ben Timberlake¹, Giorgio Coricelli¹,² and Nadège Bault¹

¹ Center for Mind and Brain Sciences, University of Trento. ² Department of Economics, University of Southern California.

To be published in The Moral Psychology of Regret (2019), A. Gotlib (Ed.). London, United Kingdom: Rowman & Littlefield International.

Introduction

Regret describes an emotion that arises from a variety of circumstances. We focus here on a particular type of regret, decision regret, which comes to the study of decision making by way of traditional economics, along with insights from psychology. This is clearly not the only formal description of regret, but it bears resemblance to variations studied in other fields. The benefits of this regret definition are its formalization, its operationalized measurability and its attendant body of literature in neuroimaging. This last is critical for comparison to the neural bases of other phenomena.

Regret refers to a specific set of conditions and responses, which include learning from an imagined alternative outcome that could have been reached through different action by the person feeling the emotion. This arises after an actor or agent has made a choice, sees its outcome, and then realizes that another outcome (the result of a different choice of hers) is more desirable. Decision-based regret or “decision regret” is proportional to the magnitude of the difference between the obtained and missed outcomes. These elements are the definitive components of decision regret: learning, responsibility and counterfactual information. Other emotions may arise from any one or two of these elements, but all three must be present for regret. These situational requirements have long guided the psychological description of regret (Zeelenberg, Beattie et al. 1996, Zeelenberg and Pieters 2007), and they persist in the economic definition of decision regret (Loomes and Sugden 1982). Decision-making studies operationalize this description, using both behavior and a modified utility function to quantify the effects of the emotional experience (Bell 1982, Loomes and Sugden 1982).

Like most decision processes, moral decision-making pits multiple options against one another in an effort to arrive at the most desirable outcome. Moral norms are personal convictions reflecting rules of conduct one ought to adopt in a given situation. They represent socially derived, internalized values attributed to a pattern of behavior thought to be appropriate (Manstead 2000). Moral norms play an important role in decision making because internalized values attributed to a particular course of action are likely to guide behavior. Consequently, behaving in contradiction to one’s own moral norms is likely to elicit strong negative emotions. In such a situation, regret is likely to arise, especially if the norm violation results in a negative outcome. Some studies suggest that feelings of regret are anticipated at the prospect of violating one’s moral norms (Parker, Manstead et al. 1995). Other studies have shown that anticipated regret and moral norms are confounded in explaining choices, especially those with moral implications (Rivis, Sheeran et al. 2009, Newton, Newton et al. 2013). Despite preliminary evidence from social psychology of a possible overlap between anticipated regret and moral norms, the cognitive mechanisms linking the two concepts have not yet been deeply investigated. Evidence from neuropsychology, however, suggests that the brain mechanisms underlying regret anticipation and the implementation of moral norms might involve similar neural circuits.

By tracing the brain activity associated with moral decision making and decision regret behaviors, it becomes clear that some of the same brain areas are similarly implicated in both processes, suggesting that some connections between the two categories of choices may be identified. Here, we explore this potential connection between moral- and regret-based decisions by reviewing their features and neural bases.

Counterfactual information

Regret arises from comparison to an alternative result: one that has not actually occurred. It requires the imagination of an alternative reality that results from a different choice than the one made. The process of deconstructing the present to imagine a different reality, called counterfactual thinking, is at the core of regret. Counterfactual thoughts are often generated after goal failure (Byrne 2002). The functional role of upward counterfactual thinking, and thus, associated regret, is to learn from mistakes, to generate variant courses of action suspected to prove more successful when similar situations are encountered in the future.

In a simple illustration of the definition and measurement of decision regret, imagine a game of chance: a slot machine. A gambler can pull the lever in exactly one way and take whatever result comes. Win or lose, his actions make no difference (other than the choice to play the game in the first place). Nature, wearing the guise of probability, determines the outcome every time. If he loses, the gambler by definition feels disappointment (and if he wins, satisfaction), but not regret. Now imagine two slot machines next to each other. The gambler must choose one on which to stake his fortunes, yet when he pulls the lever, the wheels spin on both machines, and he can see both outcomes. Now he sees both his actual winnings or losses on the machine he chose and what he would have won or lost had he selected the other machine. If his slot machine loses while the other wins, he can imagine a world in which he made a different, winning, choice. This identification of the counterfactual precipitates regret. A notion, even an imprecise one, that the counterfactual outcome was better may give rise to regret, but the discrepancy between specific values of obtained and foregone outcomes allows for clearer interpretation at this point. Simulations of this situation have been used in various experimental settings to measure and compare regret to disappointment (Camille, Coricelli et al. 2004, Nicolle, Bach et al. 2011, Gillan, Morein-Zamir et al. 2014).

Regret is further characterized by a negative-valence error, which differentiates it from relief. The error is the difference between the obtained outcome and the imagined counterfactual outcome. This is an important distinction in regret: that the error must have negative valence, rather than the obtained outcome itself. This underscores the idea that regret is the negative result of comparison between outcomes, which may give rise to changes in behavior. In the slot machine study, even when subjects won with a certain choice but saw that they could have won more had they made a different choice, the net emotional sensation was negative (Camille, Griffiths et al. 2011). People describe their emotions as more negative with a better foregone choice, even when the obtained outcome is the same. This comparison is so clear that the emotion following a good outcome of a choice made (winning $50) compared to a very good outcome of a foregone choice ($200) can be rated even lower than that following a bad obtained outcome (-$50) compared to a very bad outcome avoided (-$200) (Camille et al. 2004). That is, despite winning more money, people said they felt worse, because they compared their winnings with what they could have won had they made a different choice. This ability to imagine an alternative reality after the fact informs decision problems not yet encountered. In fact, after experiencing regret, subjects made choices in subsequent tasks that were consistent with trying to minimize that feeling of regret (Coricelli, Dolan et al. 2007).
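The comparison at work in the $50 versus $200 example can be made concrete with a small numerical sketch. It is only an illustration: the function names are hypothetical, and a signed difference in dollars is a crude stand-in for the emotional ratings collected in the cited studies.

```python
def counterfactual_error(obtained: float, foregone: float) -> float:
    """Signed gap between the outcome received and the outcome of the unchosen option."""
    return obtained - foregone


def label_emotion(obtained: float, foregone: float) -> str:
    """Hypothetical labeling: the sign of the error, not of the outcome, decides."""
    error = counterfactual_error(obtained, foregone)
    if error < 0:
        return "regret"
    if error > 0:
        return "relief"
    return "neutral"


# Won $50 but could have won $200: a gain that still yields a negative error.
win_case = counterfactual_error(50, 200)
# Lost $50 but avoided losing $200: a loss that yields a positive error.
loss_case = counterfactual_error(-50, -200)

print(win_case, label_emotion(50, 200))
print(loss_case, label_emotion(-50, -200))
```

In this sketch the winning trial carries the negative error and the losing trial the positive one, which mirrors the ordering of ratings described in the text.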

Learning value

In a more complex scenario that employs regret in learning, we might assign the two machines different probabilities of paying out. We could task the decision maker with earning the most money and therefore with the goal of choosing the right (i.e. more likely) machine to play more often over the course of a number of opportunities. Such a sequential task (as employed in Daw, O’Doherty et al. 2006) allows the exploration of learning and the comparison of various models, which can include those that incorporate regret learning. Lohrenz and colleagues adopt the regret-learning model and rename it “fictive learning” to discard emotional connotations and to maintain only the error signal of an unobtained outcome (Lohrenz, McCabe et al. 2007). Subjects played an investment game, in which the researchers saw that incorporating fictive error (the difference between chosen-obtained and foregone-obtained) over gains better predicted the subject’s subsequent bet than simple reward prediction error: the difference between what the subject thought she would win/lose and what she actually won/lost.
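To make the two error signals concrete, the sketch below computes both for a single trial. It is an illustration under stated assumptions, not the model fitted by Lohrenz and colleagues: the function names, the sign conventions and the example numbers are all hypothetical.

```python
def reward_prediction_error(expected: float, obtained: float) -> float:
    """Experienced error: what was actually won or lost minus what was expected."""
    return obtained - expected


def fictive_error(obtained: float, best_foregone: float) -> float:
    """Fictive error: payoff of the best unchosen option minus the payoff obtained.

    The sign convention is an assumption made for this illustration.
    """
    return best_foregone - obtained


# One hypothetical trial: the subject expected 10, won 15, and could have won 40.
expected, obtained, best_foregone = 10.0, 15.0, 40.0
rpe = reward_prediction_error(expected, obtained)
fictive = fictive_error(obtained, best_foregone)
print(rpe, fictive)
```

The point of the contrast is that a trial can beat expectations (a positive prediction error) while still falling well short of what an unchosen option would have paid (a large fictive error).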

In the scenario of sequential choices of two different gambles, the difference between the results of the choice the gambler made and those of the one he did not (precisely the measure we call decision regret) can be described as a signal enlisted to learn to make better choices. That ability depends on computing that difference, then employing it to foresee a possible recurrence before the next choice is made, and finally making a different, presumably better choice (Coricelli, Critchley et al. 2005). Anticipation of regret induces a disposition to change behavioral strategies (Ritov 1996), and characterizes an emotion-motivated learning process in decision making (Zeelenberg, Beattie et al. 1996). In theories of adaptive learning driven by regret-based feedback (Megiddo 1980, Foster and Vohra 1999, Hart and Mas-Colell 2000, Foster and Young 2003, Hart 2005), learning occurs by adjusting the propensity to choose an action according to the difference between the total rewards that could have been obtained with the choice of that action and the realized total rewards. That is, the tendency of choosing machine A depends on how much would have been won by choosing that machine all along compared to how much the gambler has actually won. As gamblers, humans tend to be pretty good at this. Following regret-based learning models, decision makers converge to optimal choices (Coricelli and Rustichini 2010).
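The adjustment rule described in this paragraph can be sketched as a short simulation of two slot machines with full feedback. This is a simplified, unconditional form of regret matching written for illustration; it is not the exact procedure of any of the cited papers, and the payout probabilities, trial count and function name are assumptions.

```python
import random


def regret_matching(payout_probs, n_trials=2000, seed=0):
    """Choose among machines in proportion to positive cumulative regret."""
    rng = random.Random(seed)
    n = len(payout_probs)
    cumulative_regret = [0.0] * n  # (what machine a would have paid) - (what was won)
    choice_counts = [0] * n
    for _ in range(n_trials):
        positive = [max(r, 0.0) for r in cumulative_regret]
        # With no positive regret for any machine, choose uniformly at random.
        weights = positive if sum(positive) > 0 else [1.0] * n
        choice = rng.choices(range(n), weights=weights)[0]
        choice_counts[choice] += 1
        # Full feedback: the outcome of every machine is revealed on each trial.
        outcomes = [1.0 if rng.random() < p else 0.0 for p in payout_probs]
        realized = outcomes[choice]
        for a in range(n):
            cumulative_regret[a] += outcomes[a] - realized
    return choice_counts, cumulative_regret


# Hypothetical machines: the second pays out more often than the first.
counts, regrets = regret_matching([0.3, 0.7])
print(counts, regrets)
```

Under these assumptions the simulated gambler ends up choosing the machine with the higher payout probability far more often, which is the kind of convergence toward better choices that the text attributes to regret-based learning.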

Responsibility

People show strong regularities in the nature of the event they “undo” when reflecting on a bad situation. One of these regularities, the agency effect, is particularly at stake in the experience of regret: though people feel regret both for actions taken and inaction, and although nostalgia and autobiographical retrospection often highlight missed opportunities, people in fact more often generate counterfactuals that undo some undertaken action, rather than inaction (Byrne 2002). Thus, people feel greater regret for actions they have taken than for those they failed to take, at least in the short term. When no action could have been taken to prevent a bad outcome, and in the absence of agency, people report feeling disappointment rather than regret. Disappointment is also elicited by counterfactual thought, though the critical outcome must be due to circumstances beyond the agent’s control, absolving him of responsibility. The key distinction is this: disappointment arises from recognizing that a better outcome might have come given the same choice; regret, from identifying a better outcome given a different choice (Zeelenberg, van Dijk et al. 1998). Both emotions come from examining outcomes and seeing that a better one could have been obtained, but regret is associated with the responsibility of having caused the sub-optimal outcome by taking a specific action. Because regret comes with the outcome of a foregone choice, it does bring with it greater information, but its effect on subsequent decisions amounts to more than simply the addition of that data. Rather, the increased information allows for the recognition of agency, along with counterfactual comparison.

Zeelenberg and colleagues sought to differentiate regret from both disappointment and a general sense of happiness by repeating and expanding on studies by Connolly, Ordoñez, and Coughlan (1997). They asked college students to consider scenarios in which fictional college students changed their class assignments, either by their own choice or by computer fiat. The results of these changes for the fictional students ranged from improvement to neutral to downgrade. The subjects rated how the fictional students would feel along scales measuring happiness, regret and disappointment, as well as to what extent students in the stories were responsible for their outcomes. The researchers found that happiness tracked outcome but not responsibility, while disappointment and regret were assessed inversely depending on level of responsibility: that is, the more responsibility subjects perceived, the greater the amount of regret they believed the character would feel in downgrade outcomes.


Children as young as 5 seem to have some grasp of their agency. In a choice task involving two boxes containing different amounts of stickers, children reported greater happiness or unhappiness when they chose which box to open than when the choice was determined by an experimenter or a roll of dice (Weisberg and Beck 2012). Though it was long unclear at what age the notion of personal responsibility in choices emerges, recent research suggests that agency does not influence the emotional response to outcomes in children younger than 6 (Guerini, FitzGibbon et al. in press). Using a modified Wheels of Fortune task (with stickers rather than money as the winnings) on children between ages 3 and 10, Guerini and colleagues found that children were more sensitive to the outcomes of the choice they made than those the computer made for them, but only in trials with complete feedback, and only significantly for children ages 6 and older. That is, both counterfactual outcome and responsibility were required in order for the child to feel the outcome with greater magnitude. In trials with just partial feedback, the children’s sensitivity to outcomes was similar when they made the choice and when the computer made the choice, situations that generate disappointment rather than regret. This evidence of differentiation at young ages further supports the necessary role of agency in regret.

Neural Circuits of Regret

The comparison between the outcome of a choice and the foregone outcome of an alternative option triggers specific brain responses. The ventromedial prefrontal cortex (vmPFC) encodes the difference between what has been obtained and the outcome of the non-chosen option (Coricelli, Critchley et al. 2005). The vmPFC is a functional area that includes the anatomical medial orbitofrontal cortex (mOFC), an area that encompasses the most central parts of both hemispheres at the very front of the brain. The vmPFC is believed to hold onto reward value over time, possibly through tonic activity, then to send that signal to other areas involved in choice, like the dorsolateral prefrontal cortex and the medial caudate (Hampton, Bossaerts et al. 2006, Behrens, Hunt et al. 2008). Findings from neuroimaging studies support the understanding that responsibility is a necessary component of experiencing regret. Indeed, during the lottery task, activity of the OFC in response to a gain or a loss was modulated by the outcome of the non-chosen lottery (Coricelli, Dolan et al. 2007). However, when the outcome of the non-chosen lottery remained unknown, the counterfactual process between losses (or wins) and any missed outcome of the chosen lottery was accompanied by a weaker effect in OFC activity. Thus, the OFC appears to encode the counterfactual comparison between obtained and unobtained outcomes, but only when the result comes from a choice, rather than misfortune. The vmPFC signals the value of the obtained outcome compared to that of the non-obtained outcome, suggesting that these regret signals are related to the way the brain evaluates choices and their consequences. It exhibits activity that correlates with regret at all stages of the choice process: preference, expectation and reward (Montague, King-Casas et al. 2006).

Correlates of regret have also been measured in parts of the brain considered to have key roles in assessing and communicating the value of choice (Nicolle, Bach et al. 2011). In neuroimaging studies, the anterior cingulate cortex (ACC) and hippocampus have also shown increased activity correlated with regret during choice tasks (Coricelli, Critchley et al. 2005). The hippocampus, a cortical folding below the cerebral cortex, is implicated in consciously accessible declarative memory, which is important for making future decisions based on past events (Coricelli, Dolan et al. 2007), such as trying to avoid previously encountered sub-optimal outcomes. This ability to guide future actions is a key component in anticipating regret based on experience.

The increased vmPFC activity during the reported experience of regret reoccurs in the period just before making subsequent choices, the period leading up to a decision in which regret would be anticipated (Coricelli, Dolan et al. 2007). Because the signal measured in the vmPFC appears in other areas, this reoccurrence suggests that the measurement is not merely of happiness, nor simply an outcome value (Coricelli, Critchley et al. 2005, Van Hoeck, Watson et al. 2015). It suggests that regret is computed by one brain area and then conveyed to others that modulate and implement it in subsequent decisions. Critically, the differentiation of experience and anticipation is clear, though they both involve the vmPFC/mOFC (Coricelli, Critchley et al. 2005). Thanks to that error signal, along with the opportunity to make a different choice, modeling regret anticipation is a reliable predictor of choice probability in certain sequential decision tasks (Coricelli, Critchley et al. 2005, Marchiori and Warglien 2008). Marchiori and Warglien found that incorporating a regret signal into even a simple learning neural network better predicted human behavior than long-employed models like reinforcement learning and a hybrid model that combines reinforcement learning with a player’s beliefs about other players. Coricelli and colleagues observed that, as players experienced more regret in complete-feedback trials of a sequential Wheels of Fortune task, they decreasingly chose options more likely to lead to regret. They also saw that the more a given choice had led to regret before, the less likely the subject was to choose it again (Coricelli, Critchley et al. 2005). Regret, then, is not merely a negative emotion, but a calculated signal that guides agents away from choices that could reproduce that signal. This effort to minimize regret is a key differentiator in its role as a learning mechanism: the emotional experience alone would have little meaning beyond sensation, were it not to guide future behavior.

The examination of choice behavior of patients with lesions in the vmPFC reveals insight into the causal link between regret-related brain activity and behavior. vmPFC patients are typically described as making disastrous life decisions despite apparently intact cognitive abilities. A famous example is the case of EVR, a patient who had a successful career and stable marital life before he developed a meningioma compressing his OFC. He then lost his job and, against his family's advice, invested all his savings in a business partnership with a man of questionable reputation. He went bankrupt, got divorced and then a month later married a prostitute, a union that lasted just six months. Yet he passed all neuropsychological tests of intellectual, memory and verbal skills with normal scores (Damasio, Tranel et al. 1990). Alongside such calamities in their daily lives, experimental evidence shows that people with vmPFC lesions display abnormal emotions elicited by reward and punishment (Bechara, Tranel et al. 1996, Bechara, Tranel et al. 2000). Careful investigation of the underlying computational deficits has revealed a general deficit in integrating values attributed to various actions with the current goals (Camille, Griffiths et al. 2011), a function that has been assigned to the vmPFC in brain imaging studies. Patients are able to assign a subjective value to options; however, they will not commit to the option with the highest value. Additionally, vmPFC lesions result in an inability to feel regret after a bad choice, and consequently to anticipate future regret during the decision process (Camille, Coricelli et al. 2004). Both the reported subjective ratings of the outcome of their choices and the associated skin conductance responses of vmPFC patients were different from those of controls. Behavior of vmPFC patients was not significantly changed by knowing the outcome of the alternative option, an absence of the signature feature of regret. While healthy control subjects changed their choices to avoid regret over the course of the task, vmPFC patients did not.

While the fMRI and lesion studies mentioned above have identified common neural mechanisms for experienced and anticipated regret, more recent findings suggest that people with psychiatric and neurological dysfunction can exhibit one stage of the process but not another (Gillan, Morein-Zamir et al. 2014, Levens, Larsen et al. 2015). Although brain areas associated with the several stages of processing and anticipating regret overlap, they are not coextensive. Damage to the vmPFC may allow the recognition and experience of regret but not its application to future decisions (Levens, Larsen et al. 2015). Various dysfunctions of this regret mechanism offer at least partial explanations of the behavior of people with evidence of neurological disorders. Both obsessive-compulsive disorder patients and people with high indications of psychopathy report feeling regret more keenly but do not avoid it in future choices to the same extent as healthy subjects (Hughes, Dolan et al. 2013, Gillan, Morein-Zamir et al. 2014).

Moral Decision Making

The vmPFC, which represents a crucial portion of a proposed regret circuit, also plays a key role in some emotional components of moral decisions (Moll, Oliveira-Souza et al. 2002, Blair 2007, Koenigs, Young et al. 2007). Brain imaging studies of moral decision making have implicated some of the same areas and networks in the frontal cortex that are associated with emotion and deliberation, often finding these regions to be in competition during difficult choices. A study of moral judgment (without any decision component) implicated the mOFC as part of a neural circuit that showed higher activity when subjects read sentences with a moral component. The same areas, which also included the temporal pole and the superior temporal sulcus, did not show higher activation when subjects read statements with emotional components but no moral element (Moll, Oliveira-Souza et al. 2002). Researchers have developed a range of moral dilemmas to probe the spectrum of moral decision making, and this has yielded distinct differences in choice and brain activity. Among the best-known sets of dilemmas is the family that arises from the trolley problem. Subjects read about a hypothetical situation in which they are standing next to a set of railroad tracks, while some distance away, a group of five workers is standing on the track. The subjects are told that they see a streetcar coming down the tracks with no chance of stopping before striking and killing the five workers. The subjects are told they are standing next to a lever, which, if they pull it, will switch the streetcar onto a side track, where there is a lone worker who will be struck and killed. Though this would be a difficult situation in real life, in the hypothetical, it is characterized as easy and impersonal, because the subject is at a distance from the consequences and most people presented with the question answer quickly and in the same manner (Greene, Nystrom et al. 2004). Most people choose to pull the lever, making a simple utility calculation (Greene, Sommerville et al. 2001). A variant of this dilemma that brings the decision closer to the subject, however, is the footbridge problem. Now, the subject is on a bridge over the railroad tracks. He can still see the workers, and there is still a streetcar barreling toward them, but instead of a switch, the subject has the opportunity to save the workers by pushing a large person, who is also on the bridge, off the bridge and into the path of the streetcar, saving the five workers but killing the innocent person. Given a simple calculation of the number of people saved versus killed, these situations are identical. Yet according to measures of three features of these dilemmas identified by Greene (2007), namely expectation of bodily harm, agency of actor and specificity of victims, some dilemmas are more “up close and personal.” The “closeness” of the action brings the emotional salience of the problem into conflict with the pure utilitarian calculation. This antagonism seems to be carried out in the brain in both processes and areas that bear resemblance to the experience of regret (Koenigs, Young et al. 2007).

Another family of moral decisions brings an even sharper contrast. It starts with the easily solved infanticide dilemma, which poses the question of whether or not a teenage mother should kill her unwanted newborn baby. The prospect of killing a baby to avoid discomfort is easily rejected, and subjects respond quickly and uniformly in the negative. Brain imaging during this decision showed lower levels of activity in the anterior cingulate cortex (ACC) and the dorsolateral prefrontal cortex (dlPFC), suggesting little conflict between the overwhelming emotional aversion to the choice to kill the baby and the low level of utility. Subjects also consider a more difficult analogue of this problem: the crying baby dilemma, in which subjects are asked to imagine a group of people hiding from a group of outlaws. Among the people hiding are a mother and her newborn baby, who begins to cry; this could alert the outlaws to the presence of the hiding people, resulting in the death of all of them, including the baby. Subjects are asked if it is morally permissible for the mother to smother her baby to death, saving the people but killing her own baby. Here, the calculation leads to a simple utilitarian conclusion that more people are saved by killing the baby. Yet this stands in conflict with the stark emotional opposition to killing a baby.

Observations in other brain areas support this framework. Greene and colleagues observed increased activity in ACC and dlPFC during more difficult dilemmas like the crying baby and the footbridge problems, as compared to easier dilemmas. They argue that this indicates that the ACC detects these conflicts and that the dlPFC then deliberates and resolves them. Supporting this proposal, the dlPFC shows even greater activity when the problem results in a utilitarian judgment that violates personal morality. But it is also possible that the dlPFC instigates a period of cognitive control, delaying the decision to allow the ACC enough time to employ a utilitarian cognitive response, thus overriding a more immediate affective response (Greene, Nystrom et al. 2004). If the ACC is a general arbiter of antagonism, then it is no surprise that it would be more active both in cases of difficult moral dilemmas and for discrepancies between predictions and realities, as in experiences of regret. This shared step in decision making connects the two processes and suggests that cognitive resolution of conflicts of any type may be handled with some similarity.

Notably, the several types of moral dilemma (personal and impersonal, distant and close) incorporate degrees of action, though Greene et al. (2004) differentiate between the greater agency of “authoring” and the impersonal deflection of a threat, described as “editing”. Regret similarly requires personal agency, a responsibility that is attenuated only if the choice giving rise to the emotion is shared with others (Nicolle, Bach et al. 2011). The role of responsibility links the two considerations and carries the question of decision-making regret to a moral level. The more a person gauges himself responsible for an outcome, the more keenly he feels regret (Frijda, Kuipers et al. 1989). Both ranges of moral decision (those that favor utilitarian decisions and those with a greater emotional component) employ brain areas that compose part of the regret circuit. This observation suggests that the ability to feel accountable for one’s choice and the phenomenon of feeling regretful in the case of a bad decision might be the premises for making non-utilitarian decisions in moral dilemmas. It does not prove the existence of a causal link between the two. Nevertheless, gathering evidence approaching a causal link, we report the cases of two different populations (patients with lesions in the vmPFC and psychopaths) that exhibit a co-occurrence of difficulties with all previously mentioned processes.

OFC Lesions Modulate Regret and Morality

Patients with particular types of brain damage can demonstrate how those portions of the brain are implicated in specific processes. Brain lesions are deactivations of sections of the brain due to events like tumors, stroke or head injury. Depending on the type of precipitating event, lesions may occur in similar regions. Their specific location, while not uniform, can be established for each patient through the use of anatomical MRI and other brain scanning techniques. By comparing the behavior of healthy controls to that of patients with lesions in the same region, the role of that brain area in the process can be described. So people with lesions to areas implicated in moral decision making or regret decision making may exhibit behavior significantly different from people whose brains are fully functional in that region. Similarly, people with psychological disorders, which have brain-based causes and therefore cognitive implications, may differ from healthy controls in similar ways.

Patients with lesions in the vmPFC, like those who demonstrated difficulty with applying anticipated regret, also exhibit trouble in following social norms. Both types of unusual decision outcome accompany damage to the vmPFC, implicating this area in a key role in both moral and regret choice. Specifically, when presented with the footbridge problem, which demands proximate action, most healthy people cannot overcome their emotional aversion to the proposition. Conversely, vmPFC patients, whose lesions deactivate portions of this brain area, exhibited utilitarian behavior, choosing to sacrifice one life in favor of five, a decision that appears to consider only the final tally of the choice and to ignore the emotional aspects (Koenigs, Young et al. 2007). In a battery of hypothetical situations, these patients were presented with choices of sacrificing one life to save multiple other lives. Among the best-known non-emotionally salient dilemmas is the trolley problem, in which the trolley is diverted by a lever onto a track with one person, avoiding the death of five. In this dilemma, vmPFC lesion patients make the choice to pull the lever about as often as healthy controls do, making a pure calculation about the impersonal action of pulling a lever. Given that these patients had impaired autonomic activity in response to emotionally charged pictures, the authors conclude that the problem in generating “normal” moral judgments comes from impaired emotional processing. This was supported by two other studies showing that vmPFC patients do not experience aversive emotional responses to moral violations (Ciaramelli and Pellegrino 2011, Gu, Wang et al. 2015). When a personal element is involved, healthy people choose to intervene much less frequently (Greene, Sommerville et al. 2001). Not so lesion patients, who continue to make the utilitarian choice at about the same rate as they did in the less-emotional impersonal scenario (Koenigs, Young et al. 2007).

Importantly, vmPFC lesions also impair the experience of self-conscious emotions such as shame or embarrassment (Beer, Heerey et al. 2003). Moreover, the social behavior of lesion patients in social-norm-reinforcing games has been compared to that of psychopaths (Koenigs, Kruepke et al. 2010). It should also be noted that we do not suggest that the moral dilemmas described elicit regret. Rather, because the outcome of the choice has consequences for other people, the anticipated negative counterfactual emotion involved in these situations would better be described as remorse or guilt, which are cognitively distinct from regret (Baskin-Sommers, Stuppy-Sullivan et al. 2016). Nonetheless, the results from the vmPFC patient studies mentioned here suggest that taking responsibility for one’s own actions, questioning oneself, feeling regret and reinforcing social norms rely on the same neural circuitry.

Psychopathy

Psychopathy is characterized by diminished inhibitory control, impulsive behavior and violence. Notably, the psychiatric condition is also attended by unusual moral judgment, including the conflation of conventional and moral violations (Blair 1995). While healthy people see a great difference between a conventional violation such as wearing inappropriate clothes in public and a moral violation such as hitting another person, psychopaths see less difference between the two types of transgression. Psychopaths are also more tolerant of moral transgressions against other people, which may stem from a lack of sufficient aversion to distress in others (Blair 2007). They display a similar deficiency for aversion in cost-benefit choice series.

The impaired decision making by people with psychopathic tendencies has long been attributed to their curtailed experience of emotions involving responsibility (Koenigs, Kruepke et al. 2012), but recent studies suggest that the breakdown in learning via regret happens further downstream, at the point of employing regret values in subsequent choices (Hughes, Dolan et al. 2013, Gillan, Morein-Zamir et al. 2014, Baskin-Sommers, Stuppy-Sullivan et al. 2016). This would suggest that people with psychopathy do indeed feel regret but do not incorporate the signal into future decisions, a model consistent with some findings about the moral decision making of psychopaths. Considering the implication of the vmPFC in such feed-forward mechanisms, the breakdown may well stem from a diminished vmPFC, which in psychopathic individuals has been shown to be reduced in every dimension: volume, thickness and surface area (Yang, Raine et al. 2005, Baskin-Sommers, Neumann et al. 2016). If other considerations are equal, healthy people make the choice that carries the least expected regret, sometimes even at the cost of profit. Yet the higher people scored on a psychopathy scale, the less likely they were to avoid regret in a repeated wheel-of-fortune task (Baskin-Sommers, Stuppy-Sullivan et al. 2016). It was not simply that the missed opportunity bothered them less: they reported negative emotions at about the same level as controls, and sometimes even more. In fact, the highest scorers on the psychopathy self-report scale reported negative emotions after a bad outcome comparison, yet they seemed to ignore that information. The bad outcome comparison that serves as a signal to healthy people was not being used by the people with psychopathy. Their behavior indicated that they employed only the simpler signal of expected value. This suggests some link between psychopathy and regret avoidance, though a study that searched explicitly for such a connection in criminal offenders did not find one (Hughes, Dolan et al. 2013).

People with psychopathic indications are thus apparently capable of imagining alternative realities and of generating and experiencing the negative emotion associated with the comparison to actual reality, suggesting that psychopathy is characterized not by a deficit of emotion but by weakened general cognitive processes like the ability to maintain previous counterfactual information and to apply it to subsequent decisions. If these people experience the emotion but apparently do not employ it in choice tasks immediately following arousal, this raises the possibility that the information is not being applied to guide future choice in the manner of predictive models.

Moral processing in psychopaths is not well understood. Though people with psychopathy have long been observed to engage in amoral behavior, the mechanism of that deficiency has only recently been explored. Psychopathy has been ascribed to a depleted ability to empathize with a person being harmed as well as a deficient mechanism to inhibit violence (Blair 1995). In a study by Blair, criminal offenders considered several scenarios of moral and conventional violations set in a school, showing that psychopaths did not significantly differentiate permissibility between the two types of violations, while non-psychopaths did. Blair rejects several models in which psychopaths experience moral emotions but do not employ them in mentalization or fail to take perspectives of others. Rather, he proposes a fault in a separate system, a “violence inhibition mechanism.” Cima and colleagues (2010), by contrast, argue that while people with psychopathic traits may have some emotional deficits, enough emotion is preserved (or in fact may be unnecessary) to make similar moral judgments to healthy controls. The fact that they can identify the rightness or wrongness of moral actions, but then by definition act in contravention, indicates that they may simply not care about morality, the study suggests. This would again be consistent with psychopaths experiencing regret but not applying it to subsequent choices. Whatever emotional component is lacking in people with psychopathy may be the element responsible for the application of the moral understanding toward future decisions.

Yet by refining groups of people by placement on the psychopathy scale and with greater precision in the moral dilemmas presented, Koenigs and colleagues find that a counterfactual mechanism may indeed be at fault for some abnormal moral choices by people with psychopathy (2012). Using inmates from a Wisconsin prison, the study considered only those participants who scored in the highest and lowest portions of psychopathy indications, further refining the high scorers in terms of assessed anxiety in consideration of a theory that psychopathy is too broad a term for several possible conditions. Using the same situations as in the Greene study, both high-anxious psychopaths and non-psychopaths endorsed the utilitarian outcome of personal dilemmas with approximately the same lower frequency. But low-anxious psychopaths judged the utilitarian choice acceptable more often than either other group. The finding suggests that some subtypes of people with psychopathic indications resolve the emotion-utility conflict in a similarly unusual manner to that with which psychopathic people disregard regret. Where the breakdown occurs in either population and in either mechanism, or even whether the causes are the same, is still up for debate: psychopaths and lesion patients may experience emotion less, or they may experience emotion and simply not apply it. Either way, it is clear that people with psychopathic tendencies do not change their choice behavior in emotional situations to the same extent that healthy people do, both after experiences that typically generate regret and when confronted with moral dilemmas.

A Social Dimension of Regret and Agency

The consideration of others connects with regret not only in representing levels of responsibility. The regret circuit co-locates with neurological phenomena that involve consideration of others via social versus private situations (Bault, Joffily et al. 2011, Zhu, Mathewson et al. 2012). Studies on levels of strategic thinking have shown higher levels associated with the same areas as counterfactual emotions like regret (Bault, Joffily et al. 2011). In an experimental game called the “beauty contest” or guessing game, the choices a player makes indicate the extent to which he is thinking about other players and how much he thinks they are thinking about him. Increased amounts of this recursive thinking are associated with higher levels of brain activity in the mOFC (Coricelli and Nagel 2009), the location of most of the vmPFC, a key component of the regret circuit. As with so many co-located brain activities, however, it is necessary to note that anatomical proximity does not necessarily indicate a functional relationship. Nevertheless, the notion of thinking about the activity in other brains (in the case of the recursive thinking demanded in the beauty contest) is different from other types of input in a similar way that the calculation and experience of counterfactual-based emotions (as in the case of regret) varies from other input: that is, it is largely internal.

Studies have associated the vmPFC/mOFC with thoughts about others (Frith and Frith 1999, Gallagher and Frith 2003, Hampton, Bossaerts et al. 2006, Suzuki, Jensen et al. 2016). These areas become active not only when thinking about others (when evaluating violations of social norms, for example) but also when it comes to representing our own mental state, including emotion (Gallagher and Frith 2003). When subjects were directed to think about a friend or someone who was similar to them, the vmPFC showed stronger activations (Mitchell, Macrae et al. 2006). Given the vmPFC/mOFC association with processing information relevant to the self, Mitchell and colleagues suggest that thinking about related others may depend on self-evaluations in the vmPFC. This introduces the possibility of a connection between internal and external considerations: between regret’s internally oriented self-evaluation and thoughts about others.

In fact, despite regret’s essential interior aspect, it has been shown to be modulated by the actions of others. If an individual experiences regret that comes as the partial result of the actions of others, the brain appears to shift some of the blame for the less-than-optimal outcome to these others, thus reducing at least the anticipation of regret (Nicolle, Bach et al. 2011). As described above, measurable regret is defined by the notion of agency. It is usually addressed in a polar manner, however: with agency, the negative feeling associated with a different outcome is regret; and in its absence, disappointment (Zeelenberg, van Dijk et al. 1998). But within those categorizations, there appears to be room for gradation. Nicolle et al. had participants complete a task in which they made similar gambling choices as in standard regret tasks, but on some trials, the choice was determined not by the participant alone, but by vote (they were told) of a group of which they were a member, ranging from 2 to 8 people in all. In this case, the participant’s action alone did not determine the choice and its attendant result. The measured effect was reduced activity in the amygdala, compared to trials in which the participant was solely responsible for choices. The amygdala, implicated in emotional memory, is associated with activity involving personally relevant information. It is also known to integrate the relationship between stimulus and reward and to send it on to the vmPFC, where the information is used in subsequent choices (Coricelli, Critchley et al. 2005). So increased activity during instances of regret in which the participant is the only decision maker suggests a kind of “self-blame regret”, Nicolle and colleagues argue. The diminished sense of responsibility attenuates the negative feeling of regret, and that consequently appears also to dampen the learning effect. A better response in an alternative reality becomes clearer in the amygdala with greater individual responsibility. A related question, unexplored to this point, is how, if at all, shared responsibility for positive outcomes might modulate brain activity compared to that of negative outcomes, or for positive outcomes that result from solo choices.

Conclusion

The goal of any decision process is to arrive at the optimal outcome, given the conditions. But when several important factors come into conflict in a decision, the brain must mediate among them. Separately, the processes for moral decision making and choices involving decision regret have been further explored via brain imaging and lesion studies. These have shown that segments of these processes share some anatomy and even similar dysfunction among people with psychopathy or lesions to the vmPFC. Our understanding of both systems still needs clarity before they can be considered to play any part in each other, but some recent research proposes frameworks that hint at how they might be joined. Blair argues that the learning systems in the vmPFC are the foundations of moral decisions that concern harm to other people (2007). These same systems undergird error signals that include decision regret, showing heightened activity during both the experience and anticipation of regret. The work on moral decisions by Greene and colleagues suggests that the vmPFC might serve in a regulatory role, delaying decisions during high-conflict or difficult dilemmas, especially those involving competition between emotional and utilitarian outcomes.


Moll and de Oliveira-Souza, however, push back on the Greene model, saying this conflict framework is too complex. They hold instead that the lesions attenuate the prosocial influence of the vmPFC, thus allowing utilitarian decisions without the interference of emotion. The inverse logic is that in healthy people, by contrast, the vmPFC encourages greater consideration of other people, in contravention of purely numeric considerations. Yet this runs against the tonic activity of the vmPFC that maintains value information during a series of choices. Moral and regret decision processes appear to share patterns, but if those are reflections of shared pathways in the brain, studies to this point present contradicting roles for these areas.

Those who see the greatest connections between learning signals and moral decisions include Thomas and colleagues, who argue that the vmPFC’s role is similar across reasoning processes, including moral and complex decision making. In their model, the vmPFC integrates emotion into judgments of complicated decisions, acting as adjudicator when considering future consequences (Thomas, Croft et al. 2011). The vmPFC would be responsible for assimilating the emotional effects of regret experience or imagination of harm to another into a decision that would otherwise address only the utilitarian concerns of economic value or number of people protected from harm. Such a broad function could incorporate either of the Greene or Moll/de Oliveira-Souza proposals.

Separating these competing goals and observing how special populations deviate in their decisions from the typical allows us to see that regret and morality are at least occupying some of the same space in the brain. Moral decisions play serious emotional consequences against preserving the lives (or limbs) of others. Similarly, decision regret pits the possible emotional pain of making a sub-optimal choice against maximizing gains. In both cases, the effort to avoid negative emotions comes into competition with achieving the most utilitarian outcome. Though the implications of moral versus economic decisions are on different scales, the human brain appears to process some portion of them similarly. Crucially, they both require the previous experience or understanding of emotional outcomes and the incorporation of their possible reoccurrence into a new decision. Thus, these complex types of decision require the ability to consider the impact of the choice before it is made: they demand the conception of realities both encountered and imagined. These processes use the past and a conceptual future to put new realities in conflict with each other to judge which is the most desirable.

Author note

This work was supported by a European Research Council Consolidator Grant "Transfer Learning within and between brains" (TRANSFER-LEARNING; agreement No. 617629).

Chapter 2: The effect of aging on regret in decision making



Motivating questions

How do the experience and anticipation of regret during choices evolve with age?

Is the propensity to experience regret after a bad decision linked to counterfactual learning?

Introduction

Older adults make different choices than younger adults under certain conditions, even in identical situations, suggesting an internal change in the decision process correlated with aging. These changes are evident in myriad behavioral decision-making studies, most often to worsening effect (Riggle and Johnson 1996, Denburg, Cole et al. 2007, Löckenhoff and Carstensen 2007). Older adults make hastier investment decisions and have more difficulty justifying those choices (Shivapour, Nguyen et al. 2012). Older adults are also more prone to be duped by scams and are more susceptible to deceptive advertising (Yoon, Cole et al. 2009). Aging is marked by selective areas of cognitive decline, including performance in episodic memory and executive function. While some mental processes remain stable throughout adulthood, others change in ways that result in less-advantageous outcomes (Tymula, Belmaker et al. 2013). Performance in both long-term memory and working memory tasks, which depend on processing capacity, appears to decline beginning almost as soon as adulthood is reached (Park, Lautenschlager et al. 2002). Due to these several cognitive constraints, older adults are more likely to rely on heuristic processing to make decisions (Riggle and Johnson 1996).

Such age-related decline in tasks that involve reward and learning is consistent with reduced density of dopamine receptors in brain areas implicated in encoding reward prediction error and in learning. Midbrain dopamine neurons have been robustly demonstrated to encode reward prediction error (Schultz, Tremblay et al. 1998, Bayer and Glimcher 2005). Older adults exhibit lower dopamine transporter density in the striatum, which correlated with reduced performance in tests assessing episodic memory and executive function (Erixon-Lindroth, Farde et al. 2005, Troiano, Schulzer et al. 2010). Through the combined lenses of imaging and behavioral studies, these changes can be read as markers of cognitive decline with age.

Purely cognitive processes like working memory and reward-based learning have been studied thoroughly, if not exhaustively, but the effect of aging on other aspects of learning and decision making is less well explored, in particular the influence, if any, of emotional affect. Because the neuronal decline is different in areas implicated in emotion, these areas may attenuate (or exacerbate) the decline seen in strictly cognitive tasks. Cognitively enhanced emotions such as regret and envy employ counterfactual reasoning, an additional vector of learning processes (Coricelli and Rustichini 2010). Importantly, they appear to follow discrete pathways, suggesting that their influence on learning over the course of aging may differ not only from that of simpler models but also from each other.

Slipping performance accompanies age in tasks that call on episodic memory (recall of words and figures and face recognition) as well as those that employ executive functioning (visuospatial working memory and verbal fluency), which correlates with reduced density of the striatal transporter for dopamine, a key neurotransmitter in discerning reward (Erixon-Lindroth, Farde et al. 2005, Troiano, Schulzer et al. 2010). Age-related decline also attends structural connections from first-order reward-processing areas in the striatum and basal ganglia to higher-order areas like the prefrontal cortex (Samanez-Larkin, Levens et al. 2012). When these connections are depleted, the processes that represent value and reward predictions are attenuated, impeding identification and exploitation of rewarding decisions and more successful strategies (O'Doherty 2004). The combined effect of a reduction in dopamine density and diminished structural connection in the reward system suggests at least a partial explanation for changing choice behavior in older adults.


Similarly impeding these processes, the level of detail of both past events and future scenarios declines in older adults (Addis, Wong et al. 2008). Suddendorf and Corballis argue that mental time travel plays a crucial role in predicting future situations because it allows the recollection of previous events and the anticipation of outcomes when those events are reencountered (2007). In an evaluation of autobiographical memory, older adults demonstrate a diminished capacity for this form of mentalization, both in self-projection into future events and in situating events in the future with regard to the present (Anelli, Ciaramelli et al. 2016). This trend has bidirectional implications for learning: both a reduction in recall of consequences of previous actions and lesser ability to predict outcomes of future choices.

The types of decision contexts that reveal different performance with age can be characterized by the several types of probability situations that decision makers encounter (Mata, Josef et al. 2011). Mata and colleagues outline that a priori probabilities feature known probabilities and are marked by relatively easy mathematical calculations. Statistical probabilities, by contrast, demand an empirical gauge informed by experience. A third type involving rare events brings extreme uncertainty and prompts individuals to make estimates. Changes in cognitive control may modulate the assessment of the type of probability situation and therefore how to respond to it. Declines in working memory make strategy selection and application more difficult, which compounds the effects of aging since older adults tend to rely on simpler strategies that require reduced information search and integration (Chen and Sun 2003). In a quintessential example, in a gambling task, while younger people used more cognitive skills like learning and memory, older adults relied on valence of recent outcomes (Wood, Busemeyer et al. 2005). The decline in higher cognitive function is made clearer when age-related performance differences are compared to performance in decision tasks that do not feature a key learning component, in which older adults and younger adults do not significantly differ (Brand and Markowitsch 2010, Hosseini, Rostami et al. 2010, Mata, Josef et al. 2011). Meanwhile, older adults perform about as well as younger adults in memory tasks that demand only storage, such as short-term memory span tasks, compared to significant deficiency in tasks that require both storage and processing, as in working memory tasks (Bopp and Verhaeghen 2005). Such differences emerge in tasks that call on subjects to learn from feedback over time.

Emotional assistance

The story of aging, however, is not one of broad, continuous decline. Some meta-features of decision making improve with age, such as performance assessment, in which older adults seem to have greater understanding of the limits of their knowledge (Hershey and Wilson 1997). Though much work has been done to specifically characterize cognitive decline that accompanies aging, the effort is not yet exhaustive. In particular, much remains to be understood about the interaction between affective and cognitive processes in learning and decision making. Those processes that remain relatively intact and that may attenuate declines in decision making could be teased out. Key components of choice behavior, such as risk preference, have been measured as not significantly different in older subjects in some contexts (Tymula, Belmaker et al. 2013). Likewise, some affective processes are relatively resistant to effects of age (Carstensen, Turan et al. 2011). Decision-making processes that incorporate both cognitive and affective functions, with their greater and lesser susceptibility to age, may therefore decline to varying extents. In fact, older adults have been shown to focus on positive outcomes and events, perhaps at the expense of comparisons that encourage learning (Mather and Carstensen 2002). We hypothesize that decisions modulated by cognitively enhanced emotions, such as regret, may maintain stability with age, compared to more drastic declines in more purely cognitive processes, such as memory, and that how these emotions stabilize other processes depends on affective valence.


Positivity effect

A striking difference between older and younger adults is the positivity effect: the tendency for older adults to feel positive outcomes more strongly, as well as to recall them better (Reed, Chan et al. 2014). Yet the variable evidence of subsequent studies suggests that this phenomenon does not have a singular effect on learning. Various studies have investigated how age-related depletion in the density of dopamine neurons changes reward-based learning. A key consideration of feedback is valence, both as it applies to reward itself and as it applies to the various types of errors that feedback informs. While the absolute valence of a reward tends not to be perceived differently depending on age, the valence of its comparison to other amounts in fact may be different for older adults than for younger adults. A small negative result may have a negative prediction error compared to what was predicted, but it may have a positive error if it is compared to some other worse outcome that could have been obtained, given a different result of probability or choice.
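The two comparisons in the preceding sentence can be illustrated with a small numerical sketch. The amounts are hypothetical and chosen only for illustration, and the function names are assumptions introduced here: the prediction error is taken as the obtained outcome minus the expected outcome, and the counterfactual comparison as the obtained outcome minus the unobtained alternative.

```python
def prediction_error(obtained, expected):
    """Signed difference between what was obtained and what was predicted."""
    return obtained - expected


def counterfactual_error(obtained, unobtained):
    """Signed difference between the obtained outcome and the foregone alternative."""
    return obtained - unobtained


# Hypothetical amounts: a small loss, a predicted modest gain,
# and a much larger loss on the option that was not obtained.
obtained = -5.0
expected = 10.0
unobtained = -50.0

pe = prediction_error(obtained, expected)        # negative: worse than predicted
cf = counterfactual_error(obtained, unobtained)  # positive: better than the alternative

print(f"prediction error: {pe:+.1f}")
print(f"counterfactual comparison: {cf:+.1f}")
```

The same small loss thus carries opposite signs depending on the reference point, which is the sense in which the valence of a comparison can diverge from the valence of the reward itself.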

Risk and emotion

Older adults make less-advantageous choices under uncertainty, yet depending on the type of risk that accompanies the task, tolerance of risk (measured as the variance of possible outcomes) of older adults may be the same as that of younger adults, or lower, or even higher, as in tasks that call for decisions from experience (Mata, Josef et al. 2011). One explanation for this variability is that older adults may have difficulty representing changing option values, which can lead to inconsistent choices (Mata, Josef et al. 2011, Tymula, Belmaker et al. 2013, Samanez-Larkin and Knutson 2015). In the Iowa Gambling Task (IGT), the highest reward comes from learning that of two risky options, the initially less-attractive, lower-risk choice is more favorable. The Balloon-Analog Risk Task (BART), by contrast, encourages learning that higher rewards come with riskier choices (Mata, Josef et al. 2011). In some IGT studies, most older adults start out as risk-seeking and become more risk-averse over the duration of the task (Denburg, Cole et al. 2007). Participants choose from four decks of cards in which there are both gain and loss cards. Two contain larger single-win cards and larger losses but average net losses, and the other two, smaller individual gains and losses with average net gains. Most participants start out choosing from the high-gain/high-loss decks, and healthy, unimpaired subjects eventually settle on the decks that provide long-term net gains. Those decks also represent the lower-variance and therefore lower-risk option. The IGT was developed specifically to examine the effect of emotion on cognition, including in an early representative case study of a ventromedial prefrontal cortex (vmPFC) lesion patient in which the authors measured sensitivity to reward, insensitivity to punishment or insensitivity to consideration of consequences (Bechara, Damasio et al. 1994). Individuals with vmPFC lesions, who demonstrate an impaired ability to integrate cognition and emotion, continue selecting the riskier deck in pursuit of high gains (Bechara, Damasio et al. 1997, Bechara, Tranel et al. 2000). A meta-analysis shows that risk-preference differences are more context-dependent: older adults took fewer risks than younger adults in tasks that involved learning, but more risks in tasks that did not (Mata, Josef et al. 2011). When higher rewards require increased risk tolerance, such as in the BART, older adults are more risk-averse throughout, in contrast to the IGT, in which higher rewards come from embracing less-lucrative and less-risky options. Other evidence suggests that risk preference may appear to shift due to differentiation in gain and loss contexts (Tymula, Belmaker et al. 2013). These different performances highlight the variability in cognitive demand. Although both tasks engage statistical probabilities and demand experience-based responses, those responses differ depending on outcome valence: in the IGT, participants should learn to avoid the initially attractive risky option to reach the optimal choice, while the BART rewards learning to become more risk-seeking. Complicating this comparison is the BART’s greater demands on calculating statistical probabilities that lead to taking higher risks. A deficiency in this type of learning in older adults means they do not make the riskier choices demanded for higher reward (Mata, Josef et al. 2011).

Counterfactual learning

In standard models of reward-based learning, agents receive feedback on choices made, prompting them to adjust future actions. Although informative models can be built based only on the information from the choice made, when the outcomes of alternative choices not taken are known, models that incorporate that information better describe decision behavior. This counterfactual information, along with the comparison between unobtained and obtained outcomes, guides learning (Zeelenberg, Beattie et al. 1996). Counterfactual or fictive learning updates information about potential future choices using information from both obtained and unobtained outcomes. An affective accompaniment to this imagined alternative reality gives rise to negative or positive emotions: regret or relief. The negative feeling of regret prompts aversive behavior that guides learning. Individuals avoid potentially rewarding options if it helps them to avoid anticipated regret (Coricelli, Critchley et al. 2005).

The effect of age on counterfactual learning has been little considered. To date, just one published study addresses this question, and it also incorporates underlying neural activity (Tobia, Guo et al. 2016). Though the study found differences in counterfactual thinking between the age groups, their characteristics and causes were not as clear. The strategic sequential investment task employed provided counterfactual information as a vector to arrive at more- or less-desirable final states following a three-choice round, experienced 10 times over the course of a block. Older adults on average invested less money and earned less. They selected the path leading to the least-rewarding outcome more often than the other three paths combined, while more than half of the younger adults developed a preference for the most lucrative path. The study found that older adults were more responsive to counterfactual gains, both behaviorally and neuronally, but that this did not lead to more rewarding choices. Computational modeling also showed that older adults explored more, while younger adults made more stable choices, which is more rewarded in this particular task.

Experimental questions

In this study, our goals were twofold: First, we examined the differences in experience of regret between older and younger adults, and second, we explored the connection between experience of regret and counterfactual learning in terms of inter-individual differences, independent of age. We aimed to further explore the characteristics and trajectories of decision-making change in older adults compared to younger adults. As shown in previous studies, the cognitive decline accompanying age is not uniform and seems to hinder various processes differently (Riggle and Johnson 1996, Erixon-Lindroth, Farde et al. 2005, Tymula, Belmaker et al. 2013). Even within a given task, performance with different demands declines to varied extents (Bopp and Verhaeghen 2005, Wood, Busemeyer et al. 2005, Brand and Markowitsch 2010, Hosseini, Rostami et al. 2010, Mata, Josef et al. 2011). The positivity effect in particular indicates that learning and affect in older adults depend on valence. Therefore, we expect different outcomes from the experience of regret, since it is a valence-dependent phenomenon. Yet it is not a direct experience, as with wins or losses, but a comparative one, based on the counterfactual. Would this give rise to the same effect as standard negative outcomes, or would it generate some sort of modulated or even inverted effect? Because older adults report lower emotional response from the experience of negative emotions like regret (Reed, Chan et al. 2014), in the first part of this study, we hypothesized that the emotional effect of regret experience and anticipation would be lower compared to that of younger-adult participants.

We next set out to examine the connection between regret and counterfactual learning. We suspected some relationship between experience and avoidance of regret in one task and the employment of counterfactual learning in another task. However, the change in counterfactual learning with age is not clear. Due to the asymmetric decline of cognitive processes (Mather and Carstensen 2002, Carstensen, Turan et al. 2011) and the as-yet-unestablished effect of age on counterfactual learning (Tobia, Guo et al. 2016), we realized that any correlations might be evident at an individual level, but not at the level of age groups. We hypothesized that for all participants, individuals who show greater regret sensitivity would also have higher counterfactual learning rates.

To examine the varied effects of regret on older adults as compared to younger adults, we measured the performances in non-counterfactual and counterfactual contexts by age-segregated participants. We employed tasks that allowed discrete characterization of these contexts to examine any differences between age groups. An initial task was selected that would allow us to measure the level of influence of regret anticipation and avoidance for each participant. Then participants would complete a second task in which their learning behavior was described according to several computational models. This would allow us to compare regret sensitivity in the first task to counterfactual learning in the second at group and individual levels.

Methods

One group of 22 adults aged 60 and older (15 female, M_age = 70.1 ± 1.3 years, range 63-86) was recruited in Lyon, France. They were screened for a history of neurological and psychiatric disorders, as well as for depression (score higher than 10 on the Geriatric Depression Scale, French version, Clément, Nassif et al. 1997) and cognitive impairment (score lower than 24 on the Mini-Mental State Examination, French translation, Derouesne, Poitreneau et al. 1999). A group of younger adults comprised 24 participants (11 female, M_age = 24.8 ± 1.8 years, range 18-53). Groups were matched for education (older adults M_edu = 13.2 ± 0.8 years; younger adults M_edu = 14.3 ± 0.4 years).

The experimental session consisted of two main portions: a lottery choice task and a learning/post-learning task.

Lottery choice task

In the first task, participants completed a two-player choice task adapted from Bault, Coricelli et al. (2008). Participants were presented with two Wheel of Fortune lottery circles with possible outcomes of -20, -5, +5, +20 (Fig. 2-1). The probability of obtaining each outcome was indicated by different color segments on the wheel. Probabilities were 0.2, 0.5, 0.8. Green indicated probability of a positive outcome; red, negative. In each trial, the expected values of the two lotteries had the same valence, and the difference between the two expected values never exceeded 7 euro.

Participants completed 80 trials with complete feedback. In these, 40 private trials were intermixed with 40 social trials, in which participants saw the outcome of another player’s lottery choice. In this study, we consider both types of complete-information trials, as well as 20 trials that provided only partial feedback. In those trials, participants saw only the outcome of the wheel they chose and received no information about the outcome of the unselected lottery. To start each trial, the two wheels were displayed, surrounded by a green dashed square (in private trials). At their own pace, participants chose one by pressing the right or left arrow keys on a keyboard. In complete-feedback trials, arrows inside both wheels would spin at the same time, while in partial-feedback trials, only the arrow in the selected wheel spun. When the arrows stopped, the portion of the wheel indicated the outcome of the trial: green for the positive result, red for the negative. To encourage participants to think of each trial as independent, they were told that 20 trials would be randomly selected to determine payment. After each trial, participants gauged their emotional reaction to the outcome by answering the prompt “How do you feel about the outcome of your choice” by selecting a number from -50 (for “Extremely Negative”) through 0 (“Neither Positive nor Negative”) to +50 (“Extremely Positive”).

Fig. 2-1. Wheel of fortune lottery task. In the complete feedback condition, the participant selects one of the two lottery wheels, then sees both arrows spinning. When the arrows stop, the participant sees both her own, obtained outcome and the outcome of the non-chosen lottery. In the partial-information condition, only the arrow of the selected lottery is shown, and the result of the non-chosen lottery remains unknown. Adapted from Bault et al. 2019

Learning task

Participants then performed a two-part probabilistic instrumental learning task adapted from Palminteri et al. (2015). The first section was a learning task that manipulated outcome valence to present either reward or punishment, as well as feedback information (partial or complete), using a 2x2 factorial design (Fig. 2-2). Participants completed 192 trials in two blocks of 96 trials each. In each trial, a participant viewed a fixed pair of abstract symbols (Agathodaimon alphabet characters) on a screen and selected one. Each symbol always appeared in the same pair, and the four fixed pairs were each shown 24 times throughout the block. Each pair was tied to one quadrant of the design: reward-partial, reward-complete, punishment-partial or punishment-complete. They were presented in pseudo-random order. In the reward context, the two outcomes were gaining 50 cents or 0, neither gaining nor losing. In the punishment context, the outcomes were losing 50 cents or 0, neither gaining nor losing. In each pair, one symbol was assigned a 0.75 probability of a positive outcome, and the other was given a 0.25 probability of a positive outcome. Participants were told neither the probability amounts, nor which symbol had a greater probability of a positive outcome. The outcome of each symbol was independent of the other on each trial, so both could yield positive outcomes or both could yield negative outcomes in the same trial.
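The contingencies described above can be illustrated with a minimal simulation sketch. It is not the task code used in the study; the function names (`draw_outcome`, `run_trial`) and the string labels for context and feedback are hypothetical, and the sketch assumes that the "positive" outcome in the punishment context is the 0 (no loss) outcome.

```python
import random

# Probabilities of the positive outcome for the two symbols of a pair (from the text).
P_GOOD, P_BAD = 0.75, 0.25


def draw_outcome(p_positive, context, rng):
    """Return the payoff in euros for one symbol on one trial."""
    positive = rng.random() < p_positive
    if context == "reward":
        return 0.5 if positive else 0.0
    if context == "punishment":
        # Assumption: the positive outcome here is avoiding the 50-cent loss.
        return 0.0 if positive else -0.5
    raise ValueError(f"unknown context: {context}")


def run_trial(context, feedback, chose_good, rng):
    """Simulate one trial; the two symbols' outcomes are drawn independently."""
    good = draw_outcome(P_GOOD, context, rng)
    bad = draw_outcome(P_BAD, context, rng)
    obtained, forgone = (good, bad) if chose_good else (bad, good)
    # The forgone outcome is only displayed under complete feedback.
    shown_forgone = forgone if feedback == "complete" else None
    return obtained, shown_forgone
```

Because the two draws are independent, a complete-feedback trial can show two gains, two zeros, or one of each, which is what allows an upward or downward counterfactual comparison on any given trial.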

As in the previous task, participants sometimes saw outcomes of only the symbol they chose, while in other trials, they were shown outcomes of both symbols. Participants were instructed to gain as many points as possible to increase their payment. They were told that only the symbol they chose would count toward their score, even if they saw the outcome of the other symbol. In the second block, eight new symbols replaced those from the first block. Presentation on the screen was counterbalanced within pairs across the block. Values of the symbols were randomly assigned for each participant. Participants completed 4 practice trials, one for each condition, but using symbols not employed in the rest of the experiment.

Fig. 2-2. Symbolic learning task. Contingencies of the task show probabilities of winning 50 cents in the reward context (left squares) or losing 50 cents in the punishment context (right squares). On each trial, subjects saw one of the four screens, with the color background cueing the specific condition and context pair. In trials featuring the symbol pairs in the partial-feedback condition (top row), only the result of the chosen symbol is shown, while in the complete-feedback condition (bottom row), both results are shown. Outcomes were independent. Adapted from Bault et al. (in preparation)

A trial started with a fixation cross (0.5 s), followed by the symbol pair. A participant selected the symbol by pressing the corresponding arrow key (self-paced). A red arrow indicated the selection for 1 s. A feedback screen indicating the outcome of either the chosen symbol (partial feedback) or both symbols (complete feedback) appeared for 3 s. In reward trials, appearing below the cue were either a 50-cent coin with the label “+0.5 EUR” or a gray square labeled “0 EUR”. In punishment trials, the outcomes were indicated with either a gray square with the “0 EUR” label or a 50-cent coin with an “X” across it and the label “-0.5 EUR”.

After the learning section, participants took a post-learning test of cue values in which only the eight symbols from the second block were re-presented. They were shown in unfixed pairs, each symbol appearing 4 times with every other symbol. This totaled 112 trials for 28 possible combinations. For each pair, participants were instructed to indicate the symbol they believed had the higher value, based on outcomes from the previous section. Instructions were presented only after the learning section was complete, so as not to prompt attempts to memorize values. Participants were told that symbols would not necessarily appear in the same pairs that had been presented in the previous section. Responses were self-paced, and no feedback was presented. There was no financial incentive in this part, though participants were encouraged to play as if they would be rewarded.
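The trial count of the post-learning test follows directly from the pairing rule, as the short sketch below shows. The symbol labels and the function name `post_learning_trials` are placeholders, and presentation order and left/right counterbalancing are deliberately left out.

```python
from itertools import combinations


def post_learning_trials(symbols, repetitions=4):
    """List every unordered symbol pair, repeated `repetitions` times."""
    pairs = list(combinations(symbols, 2))
    return [pair for pair in pairs for _ in range(repetitions)]


# Eight placeholder labels standing in for the second-block symbols.
symbols = [f"S{i}" for i in range(1, 9)]
trials = post_learning_trials(symbols)
```

With eight symbols there are 8 x 7 / 2 = 28 unordered pairs, and four presentations of each give the 112 trials reported above.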

Emotional rating analysis

After each trial in the lottery choice task, participants rated their emotional reaction to the outcome. Trials were categorized as partial-information or complete-information feedback. To analyze ratings, we employed non-parametric tests because we anticipated violation of parametric assumptions. We estimated the significance of differences between behavioral variables and emotional ratings using the Wilcoxon signed rank test (WSRT). We tested differences between groups using the Mann-Whitney U test.

We further analyzed subjective evaluations with mixed-effects linear regressions by age group and by trial information, which allowed us to estimate both random and conditional fixed effects. Only random effects are reported. Parameters were estimated by generalized least squares.

All regressions were run using the statistical software package Stata, StataCorp., College Station, TX. Other analyses were performed using Matlab, The MathWorks, Natick, MA.

Choice behavior analysis

The lottery choice task yields a range of information to analyze. In order to examine how components of lotteries affected choice, we considered what choice told us about an individual’s weighing of expected value, risk, anticipated disappointment and anticipated regret. We further examined the effect these factors had on subsequent choices.

The aspects considered comprised difference in expected value (dEV), anticipated regret (r), and anticipated disappointment (d). These are computed, per Camille et al. (2004) as:

\[ dEV = EV_1 - EV_2 = [p x_1 + (1 - p) y_1] - [q x_2 + (1 - q) y_2] \]

\[ r = |y_2 - x_1| - |y_1 - x_2| \]

\[ d = [\,|y_2 - x_2| (1 - q)\,] - [\,|y_1 - x_1| (1 - p)\,] \]

We also considered risk, following Bault, Joffily et al. (2011), and computing it as the difference in standard deviation (dSD):

\[ dSD = SD_1 - SD_2 = \sqrt{p (x_1 - EV_1)^2 + (1 - p)(y_1 - EV_1)^2} - \sqrt{q (x_2 - EV_2)^2 + (1 - q)(y_2 - EV_2)^2} \]


Here, x1, y1 and x2, y2 are the two possible outcomes (x, y) of two lotteries (1, 2). The probability of the first outcome is p or q, while the probability of the second outcome is 1-p or 1-q.
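As a worked illustration of these four quantities, the following sketch computes them for one pair of lotteries. It is not the analysis code used in the study (which was run in Stata and Matlab); the function name `lottery_components` is hypothetical, and the sketch assumes that x denotes the higher and y the lower outcome of each lottery, and that the standard deviation is the square root of the probability-weighted squared deviations.

```python
import math


def lottery_components(x1, y1, p, x2, y2, q):
    """Return (dEV, r, d, dSD) for lottery 1 versus lottery 2.

    Lottery 1 pays x1 with probability p and y1 otherwise;
    lottery 2 pays x2 with probability q and y2 otherwise.
    """
    ev1 = p * x1 + (1 - p) * y1
    ev2 = q * x2 + (1 - q) * y2
    d_ev = ev1 - ev2
    # Anticipated regret: worst outcome of one wheel against best of the other.
    r = abs(y2 - x1) - abs(y1 - x2)
    # Anticipated disappointment: within-wheel gap weighted by P(lower outcome).
    d = abs(y2 - x2) * (1 - q) - abs(y1 - x1) * (1 - p)
    sd1 = math.sqrt(p * (x1 - ev1) ** 2 + (1 - p) * (y1 - ev1) ** 2)
    sd2 = math.sqrt(q * (x2 - ev2) ** 2 + (1 - q) * (y2 - ev2) ** 2)
    return d_ev, r, d, sd1 - sd2


# Example: lottery 1 = (+20 at 0.5, -5 at 0.5); lottery 2 = (+5 at 0.8, -5 at 0.2).
d_ev, r, d, d_sd = lottery_components(20, -5, 0.5, 5, -5, 0.8)
```

For this example pair, the expected values are 7.5 and 3, so dEV is about 4.5, while r = 15, d is about -10.5 and dSD is about 8.5: lottery 1 has the higher expected value but is also the riskier wheel.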

A positive (negative) dEV coefficient indicates that participants were more likely to choose the lottery with the higher (lower) expected value. A positive (negative) regret (r) coefficient indicates that participants anticipated (minimized) regret and chose the lottery with the lower (higher) anticipated regret. In this calculation, participants considered what would happen if they obtained the worst outcome in the chosen wheel, compared to the better outcome in the unchosen wheel. Anticipated disappointment (d), by comparison, involved the consideration of obtaining the lower outcome on the wheel compared to the higher outcome on the same wheel. Because this is calculated based on the outcomes of a single wheel, it bears some relation to risk, but as can be seen from the formulae above, it is not the same calculation. The absolute value of the difference between the two outcomes is weighted by the probability of obtaining the lower value, correlating with the notion of avoiding more probable losses. A positive (negative) disappointment (d) coefficient indicates that participants anticipated (minimized) disappointment from higher and more probable losses and chose the lottery with the lower (higher) potential disappointment.

We analyzed choice behavior with multi-level mixed logit regressions with participants in groups, which allowed us to estimate both random and conditional fixed effects. Parameters were estimated by maximum likelihood.

To compare the emotional impact of counterfactual outcomes to that of standard outcomes, we took the mean emotional rating of each subject for all complete-feedback trials that had a better (upward-looking) obtained-other outcome than obtained-chosen and subtracted the mean emotional rating for all partial-feedback trials with a negative outcome. This regret-disappointment factor is an indication of the relative strength of the counterfactual. We further described the counterfactual by performing a regression of emotional rating on all outcome values in complete trials. The contribution of the obtained-unchosen outcome constitutes a counterfactual coefficient for each subject. We then categorized all subjects as having either “weak” or “strong” counterfactual coefficients, split at the median point (-0.3528, range: [-2.2013, 0.6419]). Lower values indicated that greater missed opportunities had a more negative emotional effect, so the lower half composed the strong group.
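The median split can be sketched as follows. The function name and the example coefficients are invented for illustration and are not data from the study; how a coefficient exactly equal to the median would be assigned is not stated in the text, so the sketch arbitrarily places such a case in the weak group.

```python
from statistics import median


def split_by_counterfactual_coefficient(coefficients):
    """Split subjects at the median counterfactual coefficient.

    coefficients: dict mapping subject id -> regression coefficient of the
    obtained-unchosen outcome on emotional rating. More negative values mean
    a stronger negative reaction to missed opportunities.
    """
    cutoff = median(coefficients.values())
    groups = {}
    for subject, coef in coefficients.items():
        # Assumption: ties at the median go to the weak group.
        groups[subject] = "strong" if coef < cutoff else "weak"
    return cutoff, groups


# Invented example values, for illustration only.
example = {"s1": -2.0, "s2": -0.5, "s3": -0.2, "s4": 0.6}
cutoff, groups = split_by_counterfactual_coefficient(example)
```

In this invented example the two subjects with the most negative coefficients (s1 and s2) form the strong group, mirroring the rule that the lower half of the distribution is the strong group.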

Learning behavior analysis

From the learning task, we extracted several variables, including earnings and correct choice rate as dependent variables. A “correct” response was determined to be either the more-rewarding choice or the less-punishing choice.

We looked separately at earnings in complete-feedback trials and partial-feedback trials. Because earnings may vary absolutely on an individual basis, we wanted to see how earnings in the two types of trials compared. To obtain an individually internal comparison, we subtracted each participant’s mean earnings in partial-feedback trials from the mean of earnings in complete-feedback trials to arrive at a regret-disappointment differential score for each participant.
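The differential score amounts to a within-participant difference of means, sketched below. The function name and the trial representation as (feedback, earnings) tuples are assumptions made for illustration.

```python
from statistics import mean


def earnings_differential(trials):
    """Mean earnings in complete-feedback trials minus mean in partial-feedback trials.

    trials: iterable of (feedback, earnings) tuples for one participant,
    with feedback equal to "complete" or "partial".
    """
    trials = list(trials)
    complete = [earned for feedback, earned in trials if feedback == "complete"]
    partial = [earned for feedback, earned in trials if feedback == "partial"]
    return mean(complete) - mean(partial)


# Invented example: a participant who wins on half of the complete-feedback trials.
example_trials = [("complete", 0.5), ("complete", 0.0),
                  ("partial", 0.0), ("partial", 0.0)]
differential = earnings_differential(example_trials)
```

A positive score means the participant earned more, on average, when the outcome of the unchosen option was also shown.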

We performed statistical analyses on these variables using Mann-Whitney U-tests and Wilcoxon Signed Rank Tests, with groupings by counterfactual coefficient, age group and feedback type.

Learning computational models

Data were previously analyzed with four reinforcement learning models: Q-learning, counterfactual learning, normalized Q-learning and normalized counterfactual learning (Bault, Palminteri et al. 2018). In the normalized models, learning is considered to occur relative to the average value of the context, which allows learning to occur from zero-value outcomes, which are frequently encountered in this task but still provide information. The reinforcement learning models operate on direct experience by updating values only for chosen options based on outcome. Counterfactual models update both the chosen option and the unchosen, when counterfactual information is available.

All four produced a softmax parameter that indicates how selectively an individual discriminated between the two options and a standard learning rate parameter. The two counterfactual models also generated a counterfactual learning rate parameter. The two normalized (contextual) models also generated a contextual learning rate parameter.
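To make the distinction between the factual and counterfactual updates concrete, here is a minimal sketch using the generic textbook delta rule and a two-option softmax. It is not the fitted implementation from Bault, Palminteri et al.; the parameter names (`alpha`, `alpha_cf`, `beta`) and default values are hypothetical, and the context-normalization step of the normalized models is omitted.

```python
import math


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A given the two option values.

    beta is the softmax (inverse temperature) parameter: larger values mean
    the agent discriminates more sharply between the two options.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_values(q_chosen, q_unchosen, r_chosen, r_unchosen=None,
                  alpha=0.3, alpha_cf=0.3):
    """One trial of value updating.

    The chosen option is always updated with learning rate alpha. The unchosen
    option is updated with the counterfactual learning rate alpha_cf only when
    its outcome was shown (complete feedback); otherwise it is left unchanged.
    """
    q_chosen = q_chosen + alpha * (r_chosen - q_chosen)
    if r_unchosen is not None:
        q_unchosen = q_unchosen + alpha_cf * (r_unchosen - q_unchosen)
    return q_chosen, q_unchosen
```

Calling `update_values` without `r_unchosen` reproduces plain Q-learning for a partial-feedback trial, while passing the forgone outcome gives the counterfactual update, so a single function covers both feedback conditions in this sketch.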

Results

Emotional ratings

As in previous studies, regret in this task is characterized in two ways: by an effect of the outcome of the unchosen lottery on the emotional evaluation, and by a stronger influence on that evaluation of the unchosen lottery outcome in the complete feedback condition than of the unobtained outcome of the chosen lottery in the partial feedback condition. In both instances, it is the imagination of an obtained outcome in an alternative world given a different choice that drives the effect.

Emotional ratings across all trials were not significantly different between older and younger age groups, apart from trials that resulted in regret (complete-information, private trials with a worse outcome than the obtained outcome in the unchosen lottery) (Fig. 2-3). Younger adults (YA) reported feeling worse than older adults (OA) (complete-private trials, Mann-Whitney, Z = 1.97, p = 0.0484). In situations of large upward comparison, when the obtained outcome is less than the unobtained, OA rated emotions higher, indicating that in some contexts, they experienced regret less than YA (Fig. 2-4).


Fig. 2-3. Subjective emotional ratings in lottery task. Mean subjective emotional ratings (error bars: s.e.m.) by older adults and younger adults following partial feedback trials (“Disappointment” and “Satisfaction”) and complete feedback trials (“Regret” and “Relief”).

Fig. 2-4. Emotional ratings comparison by unobtained outcome. Mean emotional ratings by older adults and younger adults for two obtained outcomes (–5 and +5) as functions of the unobtained outcome (blue line, +20; red line, +20) in partial and complete feedback conditions. In partial feedback, the unobtained outcome is the unobtained amount in the selected lottery; in complete feedback, it is the obtained amount on the unchosen lottery.


For all participants, emotional reactions were significantly affected by both the amount of the outcome (Subjective rating, Obtained-chosen, partial-feedback trials, Coeff. 1.392, Z = 41, p < 0.001) and the amount of the unobtained result of the chosen wheel (Subjective rating, Unobtained-chosen, partial-feedback trials, Coeff. -0.102, Z = -3.17, p < 0.01) in partial-feedback trials. This relationship held true in complete-feedback trials as well, with similar effects of both the amounts of the obtained outcome (Subjective rating, Obtained-chosen, Coeff. 1.345, Z = 53.78, p < 0.001) and unobtained-chosen (Subjective rating, Unobtained-chosen, Coeff. -0.126, Z = -5.12, p < 0.001). However, the emotional rating of the obtained outcome was modulated to a greater extent by the obtained outcome of the un-chosen lottery (Subjective rating, Obtained-unchosen, complete-feedback trials, Coeff. -0.305, Z = 12.68, p < 0.001) than the unobtained outcome of the chosen lottery, demonstrating an amplification effect. That is, the result of the lottery that was not chosen had a greater effect on the emotional rating than the result that was not obtained in the chosen lottery.

Both age groups gave increasingly negative ratings with higher obtained amounts in the un-chosen lottery, meaning the higher the amount they could have obtained with a different choice, the worse they felt. But the effect this had on the rating was significantly different for the two groups, with YA having a greater negative reaction to higher values in the obtained amount on the un-chosen lottery than OA (Table 2-1). Here, the negative coefficient indicates the adverse reaction to a higher value. YA had a stronger negative reaction to missed opportunities, i.e. the higher value in the un-chosen lottery. OA, though still reporting a negative emotion, did not experience it as much, indicating that they experienced regret to a lesser extent for the same amount of a missed opportunity. No other value had a significant interaction by group.

In YA, emotional reactions to upward counterfactual comparisons (i.e. relative losses) were significantly stronger in the complete-feedback condition than in the partial-feedback condition (WSRT, Z = -3.77, p < 0.001). The same comparison was not significantly different among OA (WSRT, Z = -1.76, p = 0.079).

A key characteristic of regret is the greater affect reported in upward-looking complete-feedback trials compared to upward-looking partial-feedback trials. The individual-level difference between ratings in upward-looking complete-feedback and upward-looking partial-feedback trials was stronger for YA (all trials, Mann-Whitney, Z = 2.11, p = 0.0345).

Table 2-1. Scale: Effects of potential outcome amounts on emotional self-evaluation by age group. Mixed-effects linear regressions modeling the effect of lottery components on rating in older adults and younger adults. Both groups had stronger negative emotional ratings with higher amounts obtained in the unchosen wheel, that is, missed opportunities. Younger adults, however, had a significantly stronger negative reaction than older adults.

                            Younger                         Older
Subjective ratings          Coeff    Std Error  Z           Coeff    Std Error  Z
obtained-chosen             1.413    0.048      29.49***    1.364    0.050      27.21***
unobtained-chosen           -0.067   0.046      -1.470      -0.118   0.050      -2.34*
obtained-other              -0.457   0.045      -10.14***   -0.290   0.050      -5.8***
unobtained-other            0.068    0.049      1.370       0.042    0.052      0.800
Group                       3.285    1.964      1.670
obtained-chosen X group     -0.049   0.069      -0.700
unobtained-chosen X group   -0.051   0.068      -0.750
obtained-other X group      0.167    0.067      2.48*
unobtained-other X group    -0.026   0.072      -0.360
constant                    2.337    1.358      1.720
Wald Chi2 = 4088.89***
Trials (both columns): complete-private feedback trials
* p < .05; ** p < .01; *** p < .001

Choice behavior

We tested a model of choice that comprised as choice predictors the difference in expected value of the two lotteries, risk and anticipated regret. Mixed logistic regressions showed that while both groups sought higher EV when choosing which lottery to play, YA were more likely to do so (Table 2-2). Expected value is the variable with the greatest influence on choice, and higher EV has a greater influence on the choices of YA than on those of OA.

YA did not significantly account for risk, but OA did, avoiding it across all trial types to a significantly greater extent (Table 2-2). That preference is driven largely by complete-feedback trials, because OA consider risk to a significantly higher extent in those compared to partial trials (Table 2-4). The same regression for partial-information trials was not significant in either OA or YA. These analyses suggest that OA are more risk averse and less considerate of expected value than YA.

Because the variable r represents anticipated regret, a positive coefficient indicates an attempt to avoid regret by choosing the lottery with the smaller difference between worst outcome on chosen and best outcome on unchosen. Both YA and OA minimize regret across both types of trials, but not to any significantly different extent (Tables 2-3, 2-4). Surprisingly, a trend shows that higher anticipated regret has a greater influence during choice in partial-feedback trials for both age groups. Although participants would be aware that they would not see the outcome of the unchosen lottery in partial-information trials, they may still anticipate regret as an ordering effect, due to the intermixing of trial types throughout the task. Both groups anticipate regret, with OA doing so to a greater extent across all trials, but not significantly so.

Table 2-2. Choice: Effects of expected value, risk and regret on lottery choice by age group. Multi-level mixed logit regression modeling the effect of lottery components on choice in older adults and younger adults. Both older and younger adults made choices at significant levels that favored expected value and minimized anticipated regret. Older adults significantly made choices that minimized risk, while younger adults did not.

                            Younger                         Older
Choice                      Coeff    Std Error  Z           Coeff    Std Error  Z
dev                         0.155    0.012      12.86***    0.069    0.012      5.76***
dsd                         0.013    0.008      1.550       -0.037   0.008      -4.42***
r                           0.015    0.003      4.77***     0.025    0.003      7.57***
Group                       -0.011   0.064      -0.170
dev X group                 -0.086   0.017      -5.06***
dsd X group                 -0.050   0.012      -4.26***
r X group                   0.010    0.005      2.21*
constant                    0.051    0.044      1.150       0.040    0.045      0.900
Wald Chi2                   281.31***                       111.67***
Trials (both columns): partial, complete-private, complete-social trials
* p < .05; ** p < .01; *** p < .001

Table 2-3. Choice-younger: Effects of expected value, risk and regret on lottery choice by trial type in younger adults. Younger adults made choices at significant levels that favored expected value and minimized anticipated regret in both types of trials. They did not significantly account for risk in either type of trial.

                            Partial                         Complete
Choice                      Coeff    Std Error  Z           Coeff    Std Error  Z
dev                         0.121    0.026      4.60***     0.165    0.014      12.08***
dsd                         0.018    0.018      0.970       0.011    0.009      1.260
r                           0.021    0.007      3.00**      0.013    0.003      3.83***
Feedback                    0.064    0.108      0.600
dev X feedback              0.044    0.030      1.490
dsd X feedback              -0.006   0.020      -0.300
r X feedback                -0.008   0.008      -1.000
constant                    -0.000   0.097      0.000       0.064    0.049      1.320
Wald Chi2                   174.00***                       150.98***
Trials: complete-private, complete-social trials
* p < .05; ** p < .01; *** p < .001

Table 2-4. Choice-older: Effects of expected value, risk and regret on lottery choice by trial type in older adults. Older adults made choices at significant levels that favored expected value and minimized anticipated regret in both types of trials. They significantly minimized risk in complete feedback trials only.

                            Partial                         Complete
Choice                      Coeff    Std Error  Z           Coeff    Std Error  Z
dev                         0.081    0.027      2.98**      0.067    0.013      4.98***
dsd                         0.015    0.019      0.760       -0.050   0.009      -5.31***
r                           0.031    0.008      4.07***     0.024    0.004      6.44***
Feedback                    0.076    0.112      0.680
dev X feedback              -0.013   0.030      -0.450
dsd X feedback              -0.065   0.021      -3.01**
r X feedback                -0.007   0.008      -0.840
constant                    -0.020   0.100      -0.200      0.056    0.051      1.110
Wald Chi2                   119.40***                       99.91***
Trials: complete-private, complete-social trials
* p < .05; ** p < .01; *** p < .001

Learning behavior

We compared earnings from the learning task by age group and found that YA earned significantly more than OA in complete-information feedback trials (Mann-Whitney, Z = -2.177, p = 0.0295) (Fig. 2-5). Both age groups earned significantly more in complete-information trials than in partial-information trials (YA, WSRT, Z = 3.34, p < .001; OA, WSRT, Z = 2.24, p = .0249) (Fig. 2-5). Then we compared earnings differentials between types of trial. For each subject, we subtracted earnings in partial-feedback trials from earnings in complete-feedback trials. There was no significant difference between OA and YA in this earnings differential in the learning task (Mann-Whitney, Z = -0.82, p = 0.41) (Fig. 2-6).

We then considered two types of player based on measurements from the lottery task, dividing all participants at the median counterfactual coefficient calculated from that task. We compared the two groups, asking if either one earned significantly more than the other in complete-information trials (Fig. 2-7), and found that the half of subjects who felt worse about missed opportunities in the lottery task (stronger counterfactual coefficient) earned significantly more than the weaker group (Mann-Whitney, Z = 1.96, p = 0.0495). We then asked if either group earned more in complete-feedback than in partial-feedback trials in the learning task (Fig. 2-7). The difference between earnings in feedback types was not significant in the weaker counterfactual-coefficient group (WSRT, Z = 1.69, p = .0911). The stronger counterfactual-coefficient group, however, earned significantly more in complete-information trials compared to partial-information trials (WSRT, Z = 3.93, p < .001).

Next we considered the complete-partial earnings differential, as we did for age groups. Both counterfactual-coefficient groups on average earned more in complete-feedback trials of the learning task compared to what they earned in partial-feedback trials (Fig. 2-8). The stronger counterfactual-coefficient group, however, had a significantly higher differential than that of the weaker counterfactual-coefficient group (WRST, Z = 2.21, p = 0.027). Their earnings differential between complete and partial trials was higher by a larger margin than that of those who were less unhappy about the counterfactual outcome in the lottery task. That weaker group followed the same trend but did not have a significant difference.

Fig. 2-5. Learning task: earnings by feedback and age group. Mean earnings (error bars are s.e.m.) of older adults and younger adults in complete feedback trials and partial feedback trials in an instrumental learning task. Circles represent individual mean earnings for the feedback type they overlay.

Fig. 2-6. Learning task: earnings differential by age group. Mean earning differential (and s.e.m.), calculated as the amount earned in complete feedback trials, less the amount earned in partial feedback trials.


Fig. 2-7. Learning task: earnings by feedback and counterfactual-coefficient group. Mean earnings (and s.e.m.) of Stronger and Weaker counterfactual coefficient subjects in complete feedback trials and partial feedback trials in an instrumental learning task. Circles represent individual mean earnings for the feedback type they overlay.

Fig. 2-8. Learning task: earnings differential by counterfactual-coefficient group. Mean earning differential (and s.e.m.), calculated as the amount earned in complete feedback trials, less the amount earned in partial feedback trials.

Discussion

Older and younger adults performed a lottery choice task, followed by an instrumental probabilistic learning task. In both tasks, behavior could be guided by measurable factors, such as risk, expected value, anticipated disappointment and anticipated regret. In both tasks, different feedback configurations made decision comparisons possible. In partial-information trials, subjects saw the result of their choice only. In complete-information trials, they also saw the outcome of the choice they decided against, introducing elements of foregone possibilities and responsibility, therefore enabling counterfactual thinking. In the lottery task, we aimed to gauge individual and group sensitivities to counterfactual outcomes. We asked if comparing these measurements to learning behavior in the subsequent task would reveal differences at an inter-individual level, thus suggesting implications of regret sensitivity on emotion-related counterfactual thinking. We hypothesized that emotional experience and effect in the gambling task would correlate with counterfactual learning in the learning task at an individual level. We categorized subjects by these measurements to gauge how choices in and reactions to the lottery task correlated with performance in the learning task.

Both age and behavior characterization in the lottery choice task yielded significant effects in the learning task. Subjective emotional ratings in the lottery choice task indicate that younger adults felt worse when they saw the better results of a lottery they did not choose. This negative reaction to a missed opportunity is the experience of regret. The affective rating of older adults was also significant, but not as high. Despite this disparity in the experience of regret, we found that both age groups chose to minimize regret more often and at about the same rate. The notable differentiation from the lottery choice task, then, was that while older adults appear to be less disturbed by regret situations, they continue to employ anticipation of regret as an avoidance behavior.

In the learning task, participants faced situations in which they might encounter disappointment and other situations that could bring regret. It is in these complete-information feedback trials that we would expect them to employ anticipated regret as a learning signal. Indeed, both age groups demonstrated greater ability via higher earnings in complete feedback trials over partial feedback trials. Younger adults showed greater mean earnings in complete feedback trials. This may have been an effect of higher earnings across all trials, however, since the differential between mean earnings in complete feedback trials versus partial feedback trials was not significantly different between the two groups.

To gain a better comprehension of how performance in the two tasks interacts, we considered how much the alternative reality of the outcome on the unchosen lottery wheel influenced the emotional rating. We expected this counterfactual coefficient to indicate the strength of the experience of regret. Though younger adults (YA) and older adults (OA) performed differently in the learning task, we did not find a significant difference by age group in the correlation between experience of regret and use of counterfactual learning. We did find, however, that emotional experience correlated with counterfactual learning at an individual level, as we hypothesized. We found that the individuals who felt this more strongly had a similarly significant differentiation between earnings in complete feedback trials in the learning task versus partial feedback trials. The weaker-experience group did not have a significant differentiation, indicating that the lesser experience of regret may lead to a diminished ability to employ counterfactual thinking in learning tasks. The stronger counterfactual group earned more than the weaker group in complete feedback trials, a differentiation confirmed by a significantly higher complete-partial differential in the stronger counterfactual group compared to the weaker group. Individuals who felt worse about regret, according to their ratings during the lottery task, appeared to be more motivated in complete feedback trials during the learning task than those who were less affected by regret. One explanation for this is that the emotional effect that was strong enough to significantly guide choice in the lottery task continued, soon after, to drive counterfactual learning toward greater reward by avoiding aversive outcomes.

Aging is accompanied by reduced preference for negative stimuli in both attention and memory. This well-established positivity effect emerges in middle and late adulthood. Based on the positivity effect, a decreased attention to negativity should lead to reduced experience of regret and therefore reduced anticipation and avoidance of regret. Yet only the first part of that conjecture bears out, raising the question of how important the experience of regret is to its later use in anticipation for avoidance. We suspected that the positivity effect would yield reduced learning from regret situations in older adults. We saw some evidence of that in the lower experience of regret in the lottery choice task among older adults, as well as some indications in the learning task, in which they earned less than younger adults in situations that might employ regret learning. We also considered that the emotional component of counterfactual learning might stabilize declines in performance that have been shown to otherwise accompany aging.

Brassen and colleagues found that, following outcomes that would elicit regret, older adults showed increased activity in the anterior cingulate cortex, an area associated with emotional regulation (2012). The authors propose that this is a cognitive-control mechanism that re-assesses regretful experience as less negative. They also suggest that healthy older adults externalize the causes of regret situations, attributing them to factors they could not control and removing the responsibility that is a key component of regret. This is consistent with the positivity effect, which states that minimizing negative experiences in older adults is not an emotional regulation strategy, but rather goal-oriented cognitive processing (Carstensen and DeLiema 2018). According to the socioemotional selectivity theory, this is consistent with aging, since it is accompanied by changing goals (due to diminishing time horizons) that trigger increasing occurrence of the positivity effect. Brassen and colleagues argue that the positivity effect in general and diminished regret experience in particular are adaptive for emotional well-being in older age. Disengagement from regret constitutes a special case of the positivity effect. In addition to the emotional well-being derived from avoiding negative emotions, disregarding regret experience is protective for older adults because they have reached a time in life when opportunities to undo regretted behavior are diminishing to the point of vanishing (Brassen, Gamer et al. 2012).

Previous studies are mixed in their assessment of how risk preference changes, if at all, with age, and the context of how risk can influence learning and earnings seems to carry outsize importance in how it changes. Older adults in this study accounted for risk in the lottery choice task, while younger adults did not, suggesting that minimizing risk becomes more important with age in a task that does not have a tendency of rewarding or punishing risk seeking. Our findings expand on those shown by Tobia and colleagues (Tobia, Guo et al. 2016), who found that older adults were more responsive to counterfactual gains. We found that older adults are conversely less responsive to missed opportunities. We further saw, however, that this does not give rise to any significant difference in choice behavior.

These results suggest differences in the experience and anticipation of regret in decision making and learning. The wheel of fortune lottery is a reliable indicator of preference for risk, anticipation of regret and importance of expected value. To further explore the relation between choice behavior and counterfactual learning, the lottery task could be paired with other learning tasks that can employ regret learning.

Due to the necessity of two types of complete-information feedback trials in the lottery choice task, partial-information feedback trials were limited to just one-quarter of all the complete feedback trials, making regression comparisons to the complete feedback trials less reliable. A future study could increase the number of partial feedback trials.

Chapter 3: Priming regret: inducing counterfactual thinking to influence learning



Experimental questions

Can the experience of a complete-information counterfactual, specifically regret, in one task modulate learning in a separate task?

If so, does the transfer encourage more sophisticated learning behavior?

Introduction

In decision making, the brain compares actual outcomes of choices to other possible outcomes, both alternatives from the choice made and those foregone (Loomes and Sugden 1982). This counterfactual information from imagined alternative realities gives rise to a set of emotional situations. Alternative choice outcomes are not always known to us, but when we do see them, the comparison between those and actual outcomes corresponds to the emotions that we label regret or relief (Zeelenberg, van Dijk et al. 1998). These are distinct from the emotions we call disappointment and satisfaction, which are still counterfactual emotions but result from the comparison to alternative outcomes of the same choice due to nature. A signature component of regret is agency: it requires the imagining of an alternative reality that could have been realized via a different choice. The information from these comparisons can guide learning within a task (Zeelenberg, Beattie et al. 1996). But in the framework of Transfer Learning, we ask if this mechanism can cross over from one decision context to another: if learning from counterfactual emotions in one task produces a different state, would that modulate learning in a different task?

In various states, people may be better or worse prepared to learn rapidly (Young and Nusslock 2016). Several learning models with robust reflections of behavioral and brain activity suggest that learning incorporates not only choices made and their actual outcomes, but also paths not taken and imagined rewards or punishments from some alternative reality (Zeelenberg, Beattie et al. 1996, Camerer and Ho 1998, Zeelenberg, van Dijk et al. 1998, Fudenberg and Levine 1999). The weight given to these alternative realities can vary from session to session, person to person, and even trial to trial. Certain types of thinking may manifest as different behavioral strategies. Counterfactual thinking, including the cognitively enhanced emotions, drives learning within the same task (Camille, Coricelli et al. 2004). Traditional models of regret are based on adaptive learning, in which the probability of making a choice varies depending on the difference between actual reward and the rewards that option would have yielded if it had been chosen in the past (Foster and Vohra 1999, Hart and Mas-Colell 2000, Foster and Young 2003).

In a repeated game, a player who recognizes that a different strategy would have brought higher reward could change strategies in the next iteration (Coricelli, Dolan et al. 2007). Moreover, simulated behavior in neural networks showed that incorporating regret into choice models yielded improved performance (Marchiori and Warglien 2008). This improvement appears not to be a phenomenon purely of additive information, learning actual and imagined outcomes of a single game, but rather the trigger for a particular learning mechanism. Coricelli and colleagues observed BOLD activity that led them to suggest this integration of cognition and emotion occurs in the orbitofrontal cortex (OFC) following feedback from a decision, in this case a gambling task trial, but before the presentation of a subsequent choice. The supplemental emotional component of regret raises the possibility of sustained affect that may transfer into an unrelated and novel task and, once there, potentially accelerate learning as it has been seen to do within a single repeated game.

After experiencing regret, individuals make choices to avoid the negative feeling, often in violation of normative behavior (Ritov 1996). The emotional motivation of avoiding regret can modulate choices away from purely rational expected utility. In an ongoing, adaptive context, it constitutes a learning behavior (Zeelenberg, Beattie et al. 1996). Punishment as an effective learning signal presents a paradox, because once individuals learn to avoid it, the reinforcer is not encountered. Instead, successful avoidance, which is intrinsically neutral, gains a positive sign by way of the counterfactual, comparing the successful, neutral avoidance to another possible outcome that would have been worse (Kim, Shimojo et al. 2006, Palminteri, Khamassi et al. 2015). In adaptive learning models, the regret signal modulates the tendency to make a given choice by comparing the rewards that choice would have brought and actual rewards (Megiddo 1980, Foster and Vohra 1999, Hart and Mas-Colell 2000, Hart 2005). In regret models in repeated games, the probability of switching to another choice varies depending on how much reward that choice would have brought if it had been chosen throughout the game, compared to actual reward (Hart 2005). Players that minimize regret converge on optimal solutions, sometimes more quickly or with fewer losses than those who do not minimize (Coricelli and Rustichini 2010). Because regret carries a connotation of affective influence, some, including Lohrenz and colleagues (2007), differentiate between the emotion regret as described by Bell, Loomes and Sugden (Bell 1982, Loomes and Sugden 1982) and the signal the later study observes as the result of Fictive Learning. Though both terms are used in the literature, here, we use regret to describe this effect.
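As an illustration of the regret models in repeated games described above, the following is a minimal sketch of an unconditional regret-matching rule, in which the probability of choosing an option grows with how much more it would have earned had it been chosen throughout the game. It is a simplified reading of that idea, not the exact procedure of any cited study; the function names and the toy payoff function `match_payoff` are assumptions made for the example.

```python
def cumulative_regrets(payoff_fn, actions, own_history, opp_history):
    """Regret per action: payoff the action would have earned if played in
    every past round, minus the payoff actually earned."""
    actual = sum(payoff_fn(a, o) for a, o in zip(own_history, opp_history))
    return {k: sum(payoff_fn(k, o) for o in opp_history) - actual
            for k in actions}


def regret_matching_probs(regrets):
    """Choice probabilities proportional to positive regret.
    Falls back to a uniform choice when no action has positive regret."""
    positive = {k: max(r, 0.0) for k, r in regrets.items()}
    total = sum(positive.values())
    if total == 0:
        return {k: 1.0 / len(regrets) for k in regrets}
    return {k: p / total for k, p in positive.items()}


def match_payoff(own, opp):
    """Hypothetical toy game: earn 1 when the two choices match, else 0."""
    return 1 if own == opp else 0


# The player always chose 0 while the opponent mostly chose 1.
regrets = cumulative_regrets(match_payoff, [0, 1], [0, 0, 0], [1, 1, 0])
probs = regret_matching_probs(regrets)
```

In this toy history, only the unchosen option carries positive regret, so the rule shifts all probability toward it in the next round.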

Measures of primate dopaminergic neurons examined the temporal difference (TD) model in situations that correspond to disappointment (e.g. a conditioned stimulus not followed by an expected reward) (Dayan 1994, Schultz, Dayan et al. 1997, Schultz, Tremblay et al. 1998, Schultz 2002) and showed that dopamine neurons react not to reward itself but to reward prediction error, the difference between expected reward and received (Schultz, Dayan et al. 1997). Counterfactual outcomes appear to register in single neurons in the anterior cingulate cortex (ACC) (Hayden, Pearson et al. 2009). In the task, monkeys selected one of eight screen positions in an effort to get a single highest juice reward. Single neurons showed higher activity when the best option was chosen, varying depending on the magnitude of that option in a given trial. Yet even when the monkey failed to choose the best option but saw the magnitude of that missed reward, the ACC neuron (and local population) encoded it at the same magnitude relative to best-option amounts in other trials. More recent work in humans showed that a model combining prediction error and counterfactual learning better predicted striatal dopaminergic activity (Kishida, Saez et al. 2016). This work suggests a more elaborate and complex role for dopamine, encoding not only rewards and losses, but also the results of computations comparing actual outcomes to alternative, imagined realities (Platt and Pearson 2016).

Hsu & Zhu (2012) compared neuronal manifestation of two models of regret-related learning in a competitive game, finding broader evidence for a belief-based Fictitious Play model. One reward prediction error (RPE) signal corresponded to the difference between the received reward and the highest possible, as with the monkeys, which does not take into account the actions of the opponent. This signal correlated with activity only in the bilateral putamen. The fictitious play-based signal by contrast considered the difference between actual reward and an expected reward that is based on the frequency of the opponent’s previous choices. This signal correlated with activity in the bilateral putamen and in an area comprising the mPFC, the OFC and the ACC.

Experience-Weighted Attraction model

Previous research has described learning by comparing players' choice strategies to these various learning models. Reinforcement learning (RL) models best describe players who value strategies that have paid off in the past. Because reinforcement learners pay the most attention to their own choice history, their behavior is often marked by sequentially dependent choices – that is, choices that have led to positive outcomes are most likely to be made again. Though RL has some clear shortcomings because it does not incorporate all available information, it does describe well behavior in a number of mixed-strategy games (Roth and Erev 1995). Belief-Based learning (BBL) models, by contrast, reflect the choices of players who take into account their opponent's decisions. In BBL, the outcomes of decisions not made are incorporated, taking into consideration beliefs about the actions of another player, allowing subsequent choices that have not been rewarding before or may not even have been made previously.

Players seem to incorporate some combination of these types of play, even varying the amount within a series of choices (Ansari, Montoya et al. 2012). To capture the relative use of each type of learning, Camerer and Ho developed the hybrid Experience-Weighted Attraction (EWA) model, which nests both Reinforcement and Belief-Based learning models. Behavior that reflects either of these models is accounted for in EWA, and EWA’s key benefit is its production of parameter δ that indicates the relative weight of action values. In RL, the most recently rewarding decisions are most valuable and attractive and therefore continue to be chosen. The BBL model nested in EWA is itself a nested model of several belief models, including fictitious play (Fudenberg and Levine 1998), which incorporates all past actions. But unlike BBL, fictitious play applies no temporal decay. In an update rule for the attraction to strategy k, the decay of the strength of past attractions is the weighted parameter φ, as in

$$A^{B}_{ik}(t) = \frac{\phi_i \, A^{B}_{ik}(t-1) \, N_i(t-1) + \pi_i\big(s_i^k, s_{-i}(t)\big)}{\phi_i \, N_i(t-1) + 1}$$

where $A^{B}_{ik}(t)$ is the attraction of strategy k to individual i after time period t and $N_i(t)$ represents updated game experience (Ansari, Montoya et al. 2012). The parameter φ indicates the level of fictitious play (φ=1) versus single-period Cournot belief learning (φ=0), and π is the payoff function.

When behavioral data reflects more sequentially dependent play, EWA-modeled behavior indicates RL, whereas when data indicates deletion of dominated strategies, the model should yield BBL. The tendency toward one strategy or the other derives from a weighted combination of the actual payoff and those foregone, as well as an average of all past attractions – rather than a sum of those prior measures (Rapoport and Amaldoss 2000). The EWA dynamically combines the apparent best parts of RL (reinforced chosen strategies) and BBL (consideration of unchosen strategies by all players). It is a flexible model because the extent to which it incorporates these two components varies. And numerous studies have shown that hybrid models like EWA better predict behavioral data than models that employ just one type of learning (Camerer and Ho 1998). This comes out even despite penalties for the higher number of parameters in many versions of EWA. Further, Zhu and colleagues (2012) found that BBL and RL models performed about equally well, but still not as well as EWA.

The constituents of RL and BBL are evident in the EWA update rule for strategy k of player i (Zhu, Mathewson et al. 2012):

$$
V_i^k(t) =
\begin{cases}
\dfrac{\phi \cdot N(t-1) \cdot V_i^k(t-1) + \pi_i\big(s_i^k, s_{-i}(t)\big)}{N(t)}, & \text{if } s_i^k = s_i(t) \\[2ex]
\dfrac{\phi \cdot N(t-1) \cdot V_i^k(t-1) + \delta_i \cdot \pi_i\big(s_i^k, s_{-i}(t)\big)}{N(t)}, & \text{if } s_i^k \neq s_i(t)
\end{cases}
$$

The φ parameter discounts previous attractions, and N(t) represents the decaying weight of past experience. The δ indicates how much weight is given to strategies not taken. Giving them full weight (δ=1) would reflect fully belief-based learning, while δ=0 indicates no consideration of foregone choices, and therefore reinforcement learning. The δ parameter then provides a clear and continuous measure of a player's relative use of RL and BBL. The indicator I, written out here as the two cases of the update rule, makes the switch between the weight of the chosen strategy (1) and the foregone strategy weight (between 0 and 1). N(t) is estimated initially and then is updated each period, according to the decay represented by ρ.

The previous expected reward of a given strategy is depreciated by φ, a conception of the opponent's adaptation speed, and a discount rate for past experience. It is then increased by the reward for that strategy, given the opponent's actual choice in the previous period, and that is divided by all past experience to arrive at the new value of the strategy in question. A small φ means that the player believes her opponent adapts quickly, so previous values are depreciated more quickly. A large ρ, which updates the past-experience discount, indicates a rapid decline of prior beliefs.

The hybrid model reduces to RL when parameters δ and ρ are 0 and initial experience N=1. The model is pure BBL when δ=1 and φ=ρ. So the update to a value of an action is given full weight when it was the one chosen – exactly as it would be in RL. But in BBL, the value is weighted by the beliefs the player has about the future actions of other players (Zhu, Mathewson et al. 2012). So δ can be seen to describe the player's tendency toward either RL or BBL.
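The role of δ in nesting the two learning types can be illustrated with a short sketch of a single EWA update step. This is a minimal illustration rather than the estimation code used in any cited study. The function name and argument names are hypothetical, and the experience update N(t) = ρ·N(t−1) + 1 is assumed here as the usual form from the EWA literature, since the text states only that N(t) is updated according to ρ.

```python
def ewa_update(values, n_prev, chosen, payoffs, phi, delta, rho):
    """One EWA step for a single player.

    values  : V_i^k(t-1) for each strategy k
    n_prev  : experience weight N(t-1)
    chosen  : index of the strategy actually played, s_i(t)
    payoffs : payoff each strategy k would earn against the opponent's
              actual choice, pi_i(s_i^k, s_-i(t))
    """
    # Assumed experience update: N(t) = rho * N(t-1) + 1
    n_new = rho * n_prev + 1.0
    new_values = []
    for k, (v, pay) in enumerate(zip(values, payoffs)):
        # Chosen strategy gets full weight; foregone strategies get delta.
        weight = 1.0 if k == chosen else delta
        new_values.append((phi * n_prev * v + weight * pay) / n_new)
    return new_values, n_new


start = [0.0, 0.0, 0.0]
payoffs = [5.0, 10.0, 2.0]

# Reinforcement learning limit: delta = 0, rho = 0, initial N = 1.
rl_values, rl_n = ewa_update(start, 1.0, 1, payoffs, phi=0.9, delta=0.0, rho=0.0)

# Belief-based limit: delta = 1 and phi = rho.
bbl_values, bbl_n = ewa_update(start, 1.0, 1, payoffs, phi=0.9, delta=1.0, rho=0.9)
```

In the reinforcement limit only the chosen strategy gains value, whereas in the belief-based limit every strategy is credited with what it would have earned against the opponent's actual choice.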

The Patent Race

The patent race game provides a framework in which to observe iterative thinking in limited strategy space. In this configuration, two players compete for a prize in one of two asymmetric roles: one with an endowment of five cards (strong role); the other, four (weak). In each round, the endowment is renewed, and each player must invest from 0 to the full amount of the endowment. The player who invests strictly more wins a prize of 10 cards. Any endowment cards that the player does not invest go into her winnings but do not carry over into the next round’s endowment. In case of a tie, neither player wins the reward but each retains that portion of the endowment not invested.
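The payoff rule just described can be written as a short function. This is a sketch for illustration only; the names `payoff`, `PRIZE`, `STRONG_ENDOWMENT` and `WEAK_ENDOWMENT` are chosen here and do not come from the experimental software.

```python
PRIZE = 10             # cards won by the player who invests strictly more
STRONG_ENDOWMENT = 5   # cards renewed each round in the strong role
WEAK_ENDOWMENT = 4     # cards renewed each round in the weak role


def payoff(endowment, invest, opp_invest):
    """Round earnings: cards not invested, plus the prize on a strict win."""
    if not 0 <= invest <= endowment:
        raise ValueError("investment must be between 0 and the endowment")
    kept = endowment - invest
    return kept + (PRIZE if invest > opp_invest else 0)
```

For example, a strong player who invests everything always earns 10, a weak player who invests 2 against an investment of 1 earns 12, and a tie at 4 leaves the weak player with nothing.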


To understand and predict the opponent’s choices, players benefit by examining the structure of the game from the beginning, including its asymmetrical aspect. To wit, the strong player might realize that she can win the prize every time by always investing the full endowment. She would lose the entire five-card endowment but win the prize. The weak player might invest some or all of his endowment several times, losing what he invested and never winning the prize, before realizing the futility, then reducing his investment to zero so as to retain the entire endowment in his earnings. Seeing this, the strong player might realize that she does not have to invest her entire amount in order to win, leading her to increase earnings by occasionally investing less than the full amount. This then provides openings to the weak player to predict when the strong player will play less than the full amount and to invest more in order to win the prize, even with a smaller endowment.

The strong (weak) player can employ six (five) strategies: one for each possible investment choice. “Strategy” in this case refers simply to the choice of how many cards to invest in each round. Players with more iterative strategic thinking may realize that some strategies almost never make sense in a given role. These so-called iteratively dominated strategies derive from a knowledge of the structure of the game. The strong player may not need to invest her entire amount, but it would never make sense for her to invest 0, thus guaranteeing a payment of 5, since she can guarantee a payment of 10 simply by investing the full amount. So 0 is a dominated strategy.

A high-level reasoner will consider whether or not his opponent understands the structure of the game. To that end, he would observe that it never makes sense for the strong player to play 0 cards. If the weak player realizes this, he would see that it would never make sense for him to invest 1, since it would never beat any strategy played by the strong player. If the strong player believes that the weak player understands the structure well enough to reach this level, she may conclude that it never makes sense for her to invest 2 cards. It might result in a win, as well as some retained endowment. But if the weak player is unlikely to play 1, she loses an additional card of earnings by investing 2 rather than 1. This continues until the iteratively eliminated dominated strategies of the strong (weak) player are 0, 2 and 4 (1 and 3).
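The chain of reasoning above can be checked mechanically with a sketch of iterated elimination of strictly dominated strategies. The code assumes only the payoff rule of the patent race as described in this section (prize of 10, endowments of 5 and 4, no prize on a tie); all function and variable names are hypothetical.

```python
PRIZE = 10


def payoff(endowment, invest, opp_invest):
    """Cards kept plus the prize when investing strictly more."""
    return endowment - invest + (PRIZE if invest > opp_invest else 0)


def dominated(own, opp, endowment):
    """Strategies in `own` that some other strategy in `own` strictly beats
    against every opponent strategy still in `opp`."""
    out = set()
    for s in own:
        for alt in own:
            if alt != s and all(
                payoff(endowment, alt, o) > payoff(endowment, s, o) for o in opp
            ):
                out.add(s)
                break
    return out


def iterated_elimination(strong_end=5, weak_end=4):
    strong = list(range(strong_end + 1))
    weak = list(range(weak_end + 1))
    order = []  # (role, strategy) pairs in the order they are deleted
    changed = True
    while changed:
        changed = False
        for role, own, opp, end in (
            ("strong", strong, weak, strong_end),
            ("weak", weak, strong, weak_end),
        ):
            for s in sorted(dominated(own, opp, end)):
                own.remove(s)
                order.append((role, s))
                changed = True
    return strong, weak, order


strong_left, weak_left, deletion_order = iterated_elimination()
```

Under these assumptions the deletion order alternates between the roles (strong 0, weak 1, strong 2, weak 3, strong 4), leaving investments of 1, 3 and 5 for the strong player and 0, 2 and 4 for the weak player.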

A player more reliant on reinforcement learning would be slower to adapt to the opponent’s behavior, continuing for more rounds to make the choices that brought higher reward more recently. The influence of either some belief-based learning or a tendency to explore new options may eventually induce the player to try a different strategy.

However, if the player observes or assumes that his opponent does not understand the structure of the game – i.e. these dominated strategies – he may well choose the dominated strategies of his role. Rapoport and Amaldoss found that iteratively eliminated dominated strategies were played more often than the probability prediction, yet that those higher in the hierarchy of deletion (i.e. those for which the thought process takes longer to get to: 2 and 4 for the strong player, 3 for the weak) were played less frequently than lower-level dominated strategies (2000).

The patent race is particularly well suited to EWA measurements because the asymmetrical endowments generate different secure strategies: an investment of 0 for the weak player and 5 for the strong player. These investments act as a sort of "safety net," since they yield the same payoff, regardless of the opponent's choice. When played as an asymmetric mixed-strategy game, the patent race is shown to employ both RL and BBL (Camerer and Ho 1998, Rapoport and Amaldoss 2000). Yet it is possible that EWA could measure a mixed pool of purely RL players and purely BBL players. To ensure that the two learning methods are both present at various strengths in individuals in one population, the BBL parameter should be distributed along an interval, rather than clustered at either end, which by contrast would indicate two distinct populations (Zhu, Mathewson et al. 2012). Hsu and Zhu characterize the combination of learning types not as a true hybrid, but rather as that of two systems in conflict (2012). Indeed, over the course of the task, the model mix may change in the same individual, employing at first more BBL and, as rhythms of the game and habits of the opponent become clear, relying more on the cognitively less-demanding RL (Ansari, Montoya et al. 2012).

Priming task: Wheel of Fortune

The Wheel of Fortune gambling task adapted by Camille and colleagues (2004) from an earlier task (Mellers, Schwartz et al. 1999) can elicit with little manipulation emotions borne of either responsibility or nature, that is, regret or disappointment. The only change is one of presenting alternative results, so participants primed with either type of emotion undergo largely the same task. The emotion an individual feels depends on both the obtained outcome and one of the unobtained outcomes. Even if the obtained outcome is positive, the prevailing emotion can be negative if an individual sees that the unobtained outcome was better (Camille, Coricelli et al. 2004).

Subjects who obtained an outcome of -50 reported emotional ratings of -20 when the other, unobtained possibility on the wheel was 200 (Fig. 3-1). Yet when subjects who obtained -50 saw that the unobtained option on the wheel was -200, they reported net positive emotional ratings. A similar shift was evident in subjects who won 50: in cases in which the unobtained result was 200, the emotion was slightly positive, versus highly positive when the unobtained result was -200. In complete feedback, the result on the unchosen wheel overwhelmed the effect of the unobtained result on the chosen wheel. The disparity in emotional ratings for the same obtained amount follows the same pattern for the result on the unchosen wheel as for the unobtained amount in the partial feedback condition. In fact, the modulating effect of the unobtained outcome on the unchosen wheel is so strong that in complete-information feedback conditions, the relief at obtaining a smaller loss (obtaining -50 on the chosen wheel instead of -200 obtained on the foregone wheel) produces a higher mean emotional rating than 50 obtained on the chosen wheel in light of an unobtained 200 result on the unchosen wheel. Notably, the effect of regret is more pronounced than that of disappointment, even with the magnitudes of obtained and foregone held constant.

Fig. 3-1 Effect of unobtained outcome on emotional rating of obtained outcome in healthy subjects following (A) partial-information feedback and (B) complete-information feedback. Image adapted from Camille, Coricelli et al. 2004.

Patients with lesions in the orbitofrontal cortex exhibit the same pattern of emotional shift depending on foregone outcome in partial-information feedback conditions. The shift disappears, however, in the complete feedback condition. Both negative emotions at losing and positive emotions with wins are constant, whether the foregone wheel’s outcome was better or worse.

Mood priming

Moods, sustained affective states, have been likened to a climate with gradual changes, as contrasted to the more sudden and brief occurrences of emotions, which are comparable to daily weather events (Rottenberg and Gross 2007, Kohn, Falkenberg et al. 2014). Immediate emotional reactions are modulated by external events: improved by reward and worsened by loss. They are similarly improved by downward counterfactuals (in which the outcome is better than the alternative) and worsened by upward counterfactuals (seeing that what was foregone was better than what was obtained) (Markman, Gavanski et al. 1993, Roese 1994, Sanna and Turley 1996, Sanna 1997). The sustained nature of mood, however, allows the effect to run the other direction, forming a potential positive feedback loop between mood, emotion and reward. Indeed, mood and mental simulation are so related that they are both cause and consequence of each other (Sanna 1998).

Mood can change, affecting the valuation of choices (Tamir and Robinson 2007). The outcome of a wheel of fortune (WoF) game changed participants’ moods, then influenced their feelings during an immediately subsequent task (Eldar and Niv 2015). Participants chose between pairs of marked slot machines with different but stable probabilities of a reward with the goal of maximizing reward. They then participated in a Wheel of Fortune, an unrelated task with no choice but with a relatively large payout, and those who won reported being in a better mood. Afterward, they played the slot machine task again with different sets of differentiating markings but with (unbeknownst to them) similar probabilities. After the slot machine learning tasks, participants assigned values to all the slot machines they had seen. People who won the wheel of fortune assigned higher values to slot machines they encountered after the WoF game of chance, even though their values were similar to those encountered before the WoF. Participants had no influence over the game of chance, yet the outcome reliably predicted whether they were happier with the slot machines in the later task.

At the neuronal level, positive mood induction is accompanied by cortico-striatal activity during reward anticipation versus loss anticipation, as compared to differences with neutral mood induction (Young and Nusslock 2016). Critically for regret learning, positive mood induction brought greater activity in the vmPFC during anticipation of reward versus anticipation of loss. Notably, those differences are not evident in any of the areas during the outcome phases of win or loss. These measurements suggest that people who are already feeling good assign more importance to positive outcomes, effectively enhancing them. They further suggest that positive feelings may constitute some insulation against the negative feelings required of regret learning. The obverse raises the possibility that the presence of negative feelings could make a person particularly susceptible to learning via anticipation of regret. However, it is possible that instating a negative mood may bring subsequent losses into lower relief, which is consistent with prospect theory (Kahneman and Tversky 1979). That is, if a player is already feeling bad, she will not be “brought down” by subsequent smaller losses. How that high-pass filter might bear on regret anticipation has not been explored.

Hypotheses

As demonstrated in previous studies, the experience of regret and the subsequent anticipation and avoidance of regret is a form of learning (Zeelenberg, Beattie et al. 1996, Foster and Vohra 1999, Hart and Mas-Colell 2000, Camille, Coricelli et al. 2004). In repeated interactions, this adaptive behavior leads to more rewarding outcomes (Coricelli, Dolan et al. 2007, Marchiori and Warglien 2008), as well as more rapid arrival at equilibrium (Coricelli and Rustichini 2010). As described by the EWA model, people may employ either RL or BBL to differing extents while playing an asymmetric repeated strategy game (Camerer and Ho 1998, Rapoport and Amaldoss 2000, Young and Nusslock 2016). The precise ratio of RL to BBL even appears to vary throughout a task (Ansari, Montoya et al. 2012). Yet to this point, the causes of this variation have not been characterized.

Why do people use RL and BBL to varying extents instead of at the same rates? Why do some people employ BBL more than others? One possibility for the variation between people is their state at initialization, that is, the disposition of the player at the start of the game. Whereas one person might come to a task more naively and consider implications of structure as he goes along, another might be prepared from the outset to consider the entire framework of the task. We hypothesized that players already in the midst of a counterfactual consideration would be more inclined to continue similar consideration. Such a player would more quickly come to understand the structure of the game, and her choices would indicate greater employment of BBL. To test this, we would have all subjects play a strategy game that would measure their learning mix, with one type of learning more indicative of behavior arising from counterfactual learning. We would segregate them into groups, priming them with different counterfactual outcomes in an unrelated choice task (or not prime them at all). We randomly assigned various players to one of these five pre-game conditions. Our greatest interest was in the highly salient negative counterfactual regret condition, but we employed two active controls (one each for valence and feedback) and one passive control (no priming). If the player had already experienced a regret situation with large consequences, we hypothesized that it would ready her learning processes in such a way as to produce different choice patterns during a new task. That is, once she had already engaged in consideration of the better alternatives to her choices, this learning would transfer to a greater preparation to anticipate and avoid regret in an unrelated situation.

Regret is common in the patent race because the game incorporates elements of regret in any round that does not include a perfect win. In most configurations of choice that end in a loss, the player could either have made a different choice to avoid the loss or could have made a different choice to attenuate the loss (i.e. maintain more of the endowment). Likewise, in most strategy configurations that result in a win, the player could have made a different choice to optimize the win, earning more. Any of these non-optimal outcomes results in regret: the recognition that a different choice would have yielded a better outcome. This regret signal, its anticipation and likely avoidance give rise to learning and a better understanding of the opponent's play. Such increased utilization of the regret signal and more sophisticated understanding should generate a greater incorporation of belief-based learning in the hybrid model of choice behavior.
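The sense in which any round short of a perfect win carries regret can be made concrete with a small sketch. It computes, for one round, the gap between what the player earned and what the best alternative investment would have earned against the opponent's actual investment. The payoff rule follows the description of the patent race given earlier (prize of 10, endowments of 5 and 4); the function names are hypothetical, and this per-round quantity is offered as an illustration, not as the regret measure used in the analysis.

```python
PRIZE = 10


def payoff(endowment, invest, opp_invest):
    """Cards kept plus the prize when investing strictly more."""
    return endowment - invest + (PRIZE if invest > opp_invest else 0)


def round_regret(endowment, invest, opp_invest):
    """Best achievable payoff against the opponent's actual investment,
    minus the payoff actually earned. Zero only for a perfect choice."""
    actual = payoff(endowment, invest, opp_invest)
    best = max(payoff(endowment, alt, opp_invest) for alt in range(endowment + 1))
    return best - actual


# A win that could have been larger: strong invests 5 against 2.
win_regret = round_regret(5, 5, 2)
# A loss that could have been attenuated: weak invests 2 against 5.
loss_regret = round_regret(4, 2, 5)
# A perfect win: strong invests 1 against 0.
perfect_regret = round_regret(5, 1, 0)
```

In the first case, investing 3 would have won the prize while keeping two more cards; in the second, investing nothing would have preserved the whole endowment; only the third leaves no better alternative.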


Methods

We designed an experiment incorporating a well-documented and researched Wheel of Fortune (WoF) lottery task (Mellers, Schwartz et al. 1999, Camille, Coricelli et al. 2004), along with a simple competitive strategy game. Subjects first play the wheel of fortune lottery task and receive either partial feedback or complete feedback (Fig. 3-2). The result primes them in different ways: either with counterfactual emotions (regret/relief in complete feedback) or with non-counterfactual emotions (disappointment/satisfaction in the case of partial feedback). They then immediately play the competitive strategy card game called the patent race. The opponent is a learning computer algorithm that responds to the subject's game play.

[Figure panels. Wheel of Fortune priming task: 1. Presentation, 2. Selection, 3. Spin, 4. Result, shown for complete feedback (Regret) and partial feedback (Disappointment), with wheel values +€8/-€16 and +€18/-€6 and the example result "You lost 16 euro". Patent race: 1. Presentation, 2. Choice, 3. Adversary’s choice and result, shown for the strong role and the weak role.]

Fig. 3-2 In the Wheel of Fortune (WoF) priming task, participants choose one of two wheels presented to determine their win or loss. Subjects selected for complete feedback see the arrow spin both on the wheel they selected and on the one they did not, even though it has no bearing on the amount they win or lose. In partial feedback, subjects see the arrow spin only on the wheel they chose; they are not informed of the result of the wheel that they did not choose.


The Wheel of Fortune task presents two wheels, each divided into two result sectors (red and green) whose sizes indicate the probability of that result. Results are matched to value pairs, in this case one positive and one negative, each presented in the same color as the corresponding portion of the wheel. Participants completed 10 practice trials with values less than 1 and no currency symbol. They were told these outcomes would not affect their score. When directed, they proceeded to the main lottery task, where, as before, they used the arrow keys to select which wheel they wished to play.

All participants saw the same set of wheels: +€8/-€16 at probabilities of .66 and .33, and +€18/-€6 at probabilities of .25 and .75. Amounts and probabilities were set to have an equal expected outcome of 0. The potential regret error differs between the two, making the choice an indicator of either regret tolerance (in complete feedback) or risk tolerance (in partial feedback). In partial feedback sessions, the arrow inside the unselected wheel disappeared, while the arrow inside the selected wheel began to spin, indicating the outcome where it stopped. Outcomes were predetermined by seating position. The number of participants who won or lost corresponded to the presented probabilities. In complete-information feedback sessions, the arrow on the opposing wheel was set to land in the area of opposite valence to that of the obtained outcome.
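As a quick check of the zero-expected-value design, the two wheels can be evaluated with exact fractions. This is a minimal sketch: it assumes that the printed probabilities .66/.33 are rounded values of 2/3 and 1/3, and the names `WHEELS`, `expected_value` and `worst_case_gap` are illustrative rather than part of the original task code. The "worst-case gap" is one plausible reading of the potential regret of each wheel (losing on the chosen wheel while the other wheel wins), not a definition taken from the study.

```python
from fractions import Fraction

# Each wheel: (win amount, win probability, loss amount, loss probability).
# Assumption: .66/.33 in the text are rounded values of 2/3 and 1/3.
WHEELS = {
    "wheel_1": (8, Fraction(2, 3), -16, Fraction(1, 3)),
    "wheel_2": (18, Fraction(1, 4), -6, Fraction(3, 4)),
}


def expected_value(wheel):
    """Probability-weighted sum of the two outcomes of one wheel."""
    win, p_win, loss, p_loss = wheel
    return win * p_win + loss * p_loss


def worst_case_gap(chosen, other):
    """Gap between the other wheel's win and the chosen wheel's loss."""
    return other[0] - chosen[2]


for name, wheel in WHEELS.items():
    print(name, "expected value:", expected_value(wheel))

print("gap after choosing wheel 1:", worst_case_gap(WHEELS["wheel_1"], WHEELS["wheel_2"]))
print("gap after choosing wheel 2:", worst_case_gap(WHEELS["wheel_2"], WHEELS["wheel_1"]))
```

Under these assumptions both wheels have the same expected value, while the worst-case gap is larger for the €8/-€16 wheel, which is what makes the choice informative about regret tolerance.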

Participants played the patent race against a computer algorithm that adjusted its play based on the choices of the participant and a belief-based learning “fictitious play” algorithm (see Box 3-1). Participants were informed beforehand that their opponent was a learning computer algorithm. Previous studies have had participants play against each other or against randomized round-matched responses from a pool of past human players. We sought consistent opponent play to provide stable progressions of learning over the course of the two blocks. We reasoned that randomized pool play would not provide a sense of consistency and that human versus human interaction would introduce elements of reputation and mentalizing considerations (Zhu, Mathewson et al. 2012). This limited the human player’s considerations to recent actions by the computer or its pattern history, which are the levels of differentiation our modeling sought to describe. Though the game features secure strategies for each role, neither has a pure Nash equilibrium because payoff can be increased by changing strategy. The relatively large strategy space provides a breadth of choice and prompts, which affords greater modeling distinction between reinforcement and belief-based behavior.

Box 3-1. Computer player value update

Fictive earnings $f(s)$ is the amount that would have been earned by playing strategy $s$ in the current trial $t$, in which $P_t$ is the human player’s investment, given computer endowment $e$ and reward $r$:

$$f(s) = \begin{cases} e + r - s_t, & \text{if } P_t > s_t \\ e - s_t, & \text{if } P_t \le s_t \end{cases}$$

Value update $V(s)_t$ is the value of strategy $s$ at the end of trial $t$, after being updated by $f(s)$. The learning rate constant $\alpha \in [0,1]$ determines how much effect new data has on the previous value:

$$V(s)_t = V(s)_{t-1} + \alpha \cdot f(s)$$

Regret

Our hypothesis rests on the idea that individuals exposed to regret in the priming task will engage to a greater extent in belief-based learning during the patent race. One potential vector for this more extensive use of the more sophisticated learning type is an avoidance of regret in the patent race. Regret is a particularly useful signal in the patent race because of the task’s asymmetric roles that result in frequent disparities between outcome and regret. We calculate regret error in each round, $RE_t$, as the difference between actual earnings and highest-possible earnings:

$$RE_t = V_i^k - \max V_t$$

Regret error in round t is the reward returned by chosen strategy k by player i minus the maximum reward that any strategy would have returned in round t had it been chosen. In rounds in which a player lost but could have played a strategy that would have won, regret is easy to identify. Suppose the strong player plays 3 and loses to the weak player playing 4. Here, the strong player has kept 2, while playing 5 would have won him 10, producing a regret error of 8. The weak player, meanwhile, has played the equilibrium strategy and has no regret error. In this scenario, if the strong player had invested just 1, he would still have regret, but it would be lower: having retained 4 cards of the endowment, he would have a regret error of just 6. In a winning scenario, however, there can still be regret. Suppose the strong player invests all 5, a fairly common investment, and the weak player has realized the frequent inutility of investing anything and so invests 0. The strong player wins, taking 10, but sees that he could have kept even more if he had invested as little as 1 card. The difference, and thus the regret error, is 4, even though he won the round.
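The worked examples above can be reproduced with a short sketch. It assumes, following those examples, a prize of 10 cards, endowments of 5 (strong role) and 4 (weak role), that the larger investment takes the prize, and that a tie wins nothing (the tie rule is an assumption). The function names are illustrative. The value returned is actual earnings minus the best available earnings, so it is zero or negative; the text quotes its magnitude.

```python
PRIZE = 10  # cards awarded to the higher investor (taken from the worked examples)
ENDOWMENT = {"strong": 5, "weak": 4}


def earnings(role, invest, opponent_invest):
    """Cards kept plus the prize if this player out-invests the opponent.

    Assumption: a tie wins nothing for either player.
    """
    kept = ENDOWMENT[role] - invest
    return kept + (PRIZE if invest > opponent_invest else 0)


def regret_error(role, invest, opponent_invest):
    """Actual earnings minus the best earnings available against the same opponent move."""
    best = max(
        earnings(role, option, opponent_invest)
        for option in range(ENDOWMENT[role] + 1)
    )
    return earnings(role, invest, opponent_invest) - best


print(regret_error("strong", 3, 4))  # strong plays 3 and loses to a weak 4
print(regret_error("strong", 1, 4))  # strong loses but keeps 4 cards
print(regret_error("strong", 5, 0))  # strong wins but over-invests
print(regret_error("weak", 4, 3))    # weak wins with the minimum winning investment
```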

Participants

We recruited 259 healthy volunteers (124 female) via the Cognitive and Experimental Economics Laboratory at the University of Trento, Trento, Italy. Subjects had a mean age of 21.8 ± 2.8 with a range of 18 to 38. Subjects were randomly assigned to seating positions, which determined the outcome of the priming experiment.



Procedure

Participants read on-screen instructions (also duplicated in print) on the rules of both the Wheel of Fortune game and the patent race before completing 10 practice rounds of the Wheel of Fortune, which they were told would not affect their payment. They were then told the next Wheel of Fortune task would have the single largest effect on their payment. All experimental subjects were presented with identical probability wheels (probabilities of 66.6/33.3% for €8/-€16 and 25/75% for €18/-€6) and chose either the right or the left.

Subjects undergoing complete feedback priming saw the arrow on the wheel they chose spin, as well as the arrow on the unchosen wheel; those undergoing partial feedback priming saw the arrow on the chosen wheel spin, while the other arrow disappeared. After three revolutions, the arrows stopped spinning and rested on the outcome, then an on-screen alert told the subject whether she had won or lost and the amount. Participants completed an emotional evaluation via an on-screen Likert item, then the Wheel of Fortune outcome was presented again for 10 seconds.

Immediately following, the patent race began (Fig. 3-3). Subjects began in either the weak or strong role (132 and 127, respectively). They played 50 rounds of each role, counterbalanced for order. Their opponent was a computer algorithm programmed to use fictive learning to determine its strategy. For the algorithm, each strategy (five strategies in the weak role, playing 0-4, or six in the strong role, 0-5 cards) was assigned an initial value of 5. The function updated the values for each strategy at the end of each round. It calculated the earnings each of its own choices would have brought, had it been played (including the choice actually made), then found the difference between those values and the existing values assigned to each strategy. The function attenuated each difference by multiplying it by a learning rate of 0.5, then added those amounts to the existing values to arrive at a new array of values for the next trial.
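The update described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the win condition follows the worked regret examples (the larger investment takes a prize of 10 cards, a tie wins nothing), the function names are hypothetical, and how the algorithm turned values into a choice is not specified in the text, so no choice rule is shown.

```python
def fictive_earnings(strategy, human_invest, endowment, prize=10):
    """Earnings the computer would have had if it had invested `strategy` cards.

    Assumption: the larger investment wins the prize and a tie wins nothing.
    """
    kept = endowment - strategy
    return kept + (prize if strategy > human_invest else 0)


def update_values(values, human_invest, endowment, learning_rate=0.5, prize=10):
    """Move every strategy value toward its fictive earnings for this round."""
    return [
        value + learning_rate * (fictive_earnings(s, human_invest, endowment, prize) - value)
        for s, value in enumerate(values)
    ]


# Computer in the strong role: six strategies (0-5 cards), all starting at 5.
values = [5.0] * 6
values = update_values(values, human_invest=2, endowment=5)
print(values)
```

Because every strategy is updated each round, including those not played, a single human investment immediately raises the value of all computer investments that would have beaten it.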


Participants were clearly told that they would be playing against a computer. But they began the priming task and subsequent experimental game in groups of 2-4 so that no subject was playing the game alone at any time. Subjects were told the result of the Wheel of Fortune would be added to or deducted from their total, which included a show-up fee of €5. For the patent race, they were told that each card won or retained was worth €0.01.

In each trial, a fixation cross appeared for 4-8 s, followed by the explanation screen, and a graphical illustration of the endowment for the subject and opponent, as well as the possible earnings for the subject. Subjects selected the amount of investment using arrow keys, then confirmed at their own pace. Then, 2-6 s later, the opponent's choice was revealed, along with the subject's earnings. After 50 trials, the subject was informed of earnings for the round, as well as the opponent's earnings. The subject then switched roles and performed another 50 trials.

[Figure 3-3 panel labels: 1. Presentation; 2. Choice; 3. Adversary’s choice and result; shown for the strong role and the weak role.]

Fig. 3-3 Course of play in each of two roles in the patent race. Players invest all or a portion of their endowment (first row blue outlines; selection in filled blue) and keep any uninvested portion (third row, filled blue). The computer’s investment is revealed (second row, filled gray), and then the player’s prize, if any, is displayed (third and fourth rows, filled green).

Control subjects had no priming and began the card game when they finished reading instructions. We designed a simple visual representation of the card game based on that of Zhu and colleagues (2012), displaying both the player’s endowment and the opponent’s, along with the potential reward.

Computational learning models

We analyzed participant choice behavior by performing estimations using four models: Q-learning, a type of reinforcement learning (RL); Counterfactual Q-learning, a type of reinforcement learning that incorporates unchosen outcomes (CRL); Fictitious Play, a belief-based learning model (BBL) (Hampton, Bossaerts et al. 2008); and a hybrid model that nests the RL and BBL models, a simplified version of the experience-weighted attraction model (EWA) (Zhu, Mathewson et al. 2012). The RL and BBL models constitute special cases of EWA, with a parameter δ indicating the relative weight of BBL in the examined behavior. EWA updates according to two rules, depending on the player’s most recent choice:

$$V_i^k(t) = \begin{cases} \phi \cdot N(t-1) \cdot V_i^k(t-1) + \pi_i(s_i^k, s_{-i}(t)), & s_i^k = s_i(t) \\ \phi \cdot N(t-1) \cdot V_i^k(t-1) + \delta_i \cdot \pi_i(s_i^k, s_{-i}(t)), & s_i^k \neq s_i(t) \end{cases}$$

Here, $s_i^k$ represents the strategy (choice) k of player i. $s_i(t)$ is the chosen strategy in period t, so these two equations update differently depending on whether the update applies to the chosen action or not. $s_{-i}(t)$ is the strategy played by the opponent in period t. The player’s expected reward for playing a given strategy k in period t is $V_i^k(t)$. It is determined by three parameters: $\theta_i$, which depreciates past values at different rates, depending on how fast an adapter the player believes the opponent to be. The key parameter is $\delta_i$, which determines how much weight an unplayed option has on updated values. If a player believes foregone strategies deliver as much information as those played, then $\delta_i$ reaches 1, and the model reduces to the BBL model. At no weight, $\delta_i = 0$, and the model reduces to RL.
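The role of the weight on foregone payoffs can be seen in a small sketch of the two-case update. It implements only the terms written out in the rule above: `payoffs[k]` stands for the payoff strategy k would have earned against the opponent's actual move, `n_prev` stands for N(t-1), and any normalization by an experience weight, as used in the full EWA model, is deliberately left out. The function name and argument names are illustrative.

```python
def ewa_update(values, chosen, payoffs, phi, delta, n_prev=1.0):
    """Apply the two-case update to every strategy value.

    The chosen strategy is reinforced by its full payoff; every other strategy
    is reinforced by its foregone payoff scaled by `delta`.
    Assumption: no normalization by an experience weight is applied.
    """
    updated = []
    for k, (value, payoff) in enumerate(zip(values, payoffs)):
        weight = 1.0 if k == chosen else delta
        updated.append(phi * n_prev * value + weight * payoff)
    return updated


start = [0.0, 0.0, 0.0]
payoffs = [4.0, 10.0, 2.0]  # hypothetical payoffs against the opponent's move

print(ewa_update(start, chosen=1, payoffs=payoffs, phi=1.0, delta=0.0))  # RL-like
print(ewa_update(start, chosen=1, payoffs=payoffs, phi=1.0, delta=1.0))  # BBL-like
print(ewa_update(start, chosen=1, payoffs=payoffs, phi=1.0, delta=0.5))  # hybrid
```

With a weight of 0 only the played strategy changes, as in reinforcement learning; with a weight of 1 every strategy is credited with what it would have earned, as in belief-based learning.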

For each model and subject by subject, we performed individual maximum likelihood estimation with a grid search over a range of values. We calculated predicted decision probabilities over the full range of each set of parameters and compared them to the subject's actual choices, selecting that set of parameters with the maximum log likelihood. We then performed individual and group level estimations.
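A grid-search maximum likelihood estimation of this kind can be sketched for the simplest of the four models. The example below is illustrative only: it uses a basic Q-learning update, and it assumes a softmax choice rule with an inverse-temperature parameter `beta`, which the text does not specify. The parameter grids, data and function names are hypothetical.

```python
import math
from itertools import product


def softmax(values, beta):
    """Choice probabilities from values (assumed choice rule, not from the text)."""
    top = max(values)
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [x / total for x in exps]


def log_likelihood(choices, rewards, alpha, beta, n_options, initial_value=0.0):
    """Log likelihood of one subject's choices under simple Q-learning."""
    values = [initial_value] * n_options
    total = 0.0
    for choice, reward in zip(choices, rewards):
        total += math.log(softmax(values, beta)[choice])
        values[choice] += alpha * (reward - values[choice])  # update chosen option only
    return total


def grid_search(choices, rewards, n_options, alphas, betas):
    """Return the (alpha, beta) pair on the grid with the highest log likelihood."""
    return max(
        product(alphas, betas),
        key=lambda params: log_likelihood(choices, rewards, params[0], params[1], n_options),
    )


# Hypothetical subject who always picks option 1 and always earns 10.
choices = [1, 1, 1, 1]
rewards = [10, 10, 10, 10]
best = grid_search(choices, rewards, n_options=3, alphas=[0.0, 0.5, 1.0], betas=[0.5, 2.0])
print(best)
```

The same loop applies to the other models by swapping the value update inside `log_likelihood`; a finer grid trades computation time for precision of the estimates.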

Results

We asked whether priming players with exposure to a large gambling loss outcome that typically induces regret would modulate their strategy in a different task played immediately afterward. Subjects were grouped depending on the type of priming they underwent: by positive or negative valence, by complete or partial feedback, and those who received no priming.

Fig. 3-4 Performance in the patent race game, grouped by priming condition: A) earnings as a percentage of total possible (to allow comparison between strong and weak conditions) across all 50 trials of block 1. B) Mean Regret Error (actual earnings less highest possible earnings) across all block 1 trials. C) Earnings percentage for first 20 trials (during which priming would likely be stronger). D) Earnings percentage for first 5 trials (during which priming would likely be strongest). E) Mean distance from equilibrium prediction in first 5 trials, accounting for dominated strategies. Error bars are s.e.m.

We first considered only block 1 of the patent race because it was the closest in time to the priming task. To detect differences between priming groups, we characterized their game play, examining what portion of possible earnings they won (Fig. 3-4A) as well as how much regret error they were exposed to, a predictive measure of regret avoidance (Fig. 3-4B). We calculated earnings percentage to normalize between the weak and strong roles. Earnings percentage is calculated as amount won divided by total possible earnings regardless of opponent’s strategy: 13 for the weak role and 14 for strong. We analyzed these results across all 50 block 1 trials, as well as in the first 20 trials and the first 5 trials. We expected any effect of priming to be strongest in these early trials, so we also calculated for each participant’s first five trials the mean distance from Nash equilibrium. Equilibrium probabilities account for the iterated elimination of dominated strategies, as well as the fact that a single strategy is most likely for each role, but not absolute. For each trial, we measured the distance from equilibrium as the probability of not choosing that strategy, or the probability of choosing that strategy subtracted from 1. The predicted probabilities (p) are listed in Table 3-1. Distance from each is calculated as 1-p.
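The distance measure can be written out directly. The sketch below copies the predicted probabilities from Table 3-1 and averages 1-p over a participant's first five investments; the function names and the example investments are illustrative.

```python
# Predicted strategy probabilities from Table 3-1, keyed by cards invested.
PREDICTED_P = {
    "weak": {0: 0.6, 1: 0.0, 2: 0.2, 3: 0.0, 4: 0.2},
    "strong": {0: 0.0, 1: 0.2, 2: 0.0, 3: 0.2, 4: 0.0, 5: 0.6},
}


def distance_from_equilibrium(role, investment):
    """1 - p for the strategy actually played on one trial."""
    return 1.0 - PREDICTED_P[role][investment]


def mean_distance(role, investments):
    """Mean distance over a sequence of trials, e.g. the first five."""
    return sum(distance_from_equilibrium(role, x) for x in investments) / len(investments)


# Hypothetical first five investments of a strong-role player.
print(mean_distance("strong", [5, 5, 3, 4, 5]))
```

A player who always chooses the most likely equilibrium strategy still has a distance of 0.4 on every trial, so the measure is best read comparatively between groups rather than against zero.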

In both strong and weak roles, we observed mild trends of regret-primed players earning more than disappointment-primed players, as well as relief-primed players earning more than satisfaction-primed players. Surprisingly, non-primed individuals earned more in block 1 than all priming groups when they started out in the weak role but earned less in block 1 than all primed groups when they started out in the strong role.

Regret error indicated little differentiation between priming types in the weak role. In the strong role, participants primed with the two types of negative outcome had higher average regret error than those primed with positive outcome. Participants who were not primed also had higher average regret error.

There was little differentiation and high variance among all groups and across both roles in earnings percentage during the first 20 trials of the block. The same lack of differentiation characterized the first 5 trials. Analyzing the distance from Nash equilibrium during the first 5 trials suggested that all four priming categories played expected strategies less frequently than the participants who did not undergo priming.

Table 3-1: Predicted strategy probabilities

Investment    Weak (p)    Strong (p)
0             0.6         0
1             0           0.2
2             0.2         0
3             0           0.2
4             0.2         0
5             -           0.6

Fig. 3-5 Mean Regret Error (actual earnings less highest possible earnings) across all block 1 trials of the patent race game, grouped by priming condition and segregated by trial outcome: A) weak role winning trials B) strong role winning trials C) weak role losing trials D) strong role losing trials. Error bars are s.e.m.

We then looked at RE segregated by winning trials and losing trials, hypothesizing that across groups, players would generally behave differently in a win versus a loss (Fig. 3-5). Because of the tendency to win in the strong role and to lose in the weak role, we included in each category only subjects with four or more wins or losses (i.e. if a player won all but two of her rounds in the strong role, the outcomes following her two losses would not be considered in the loss-trials analyses). Of the 192 subjects, 187 were considered in RE-win calculations and 181 in RE-loss calculations. We found no differentiating trend among the five groups.


Fig. 3-6 Performance in the patent race game, grouped by priming feedback and valence: A) Mean regret error (RE) in loss trials across all 50 trials of block 1, by priming feedback information level. B) Mean RE in win trials across all 50 trials of block 1, by priming feedback information level. C) Mean RE in loss trials across all 50 trials of block 1, by priming valence. D) Mean RE in win trials across all 50 trials of block 1, by priming valence. Error bars are s.e.m.


Due to similarities in performance across both axes of feedback type (i.e. partial and complete) and feedback valence (i.e. positive or negative), we pooled priming types across those two vectors and analyzed their performance in other measures.

Playing in the weak role, subjects primed with full-information feedback (i.e. regret and relief) exhibited a trend of lower RE in win trials than those who did not undergo priming (Fig. 3-6A). Because the weak role presents few opportunities to win, this may represent successful attempts to eke out wins on opportune occasions. Meanwhile in loss trials, RE trends higher for the full-information feedback priming group. This may suggest similar attempts to win by investing high but losing anyway.

Considering priming by outcome valence, compared to both no-priming and negative-valence subjects, players who had a positive priming outcome trended lower in RE during winning trials and higher in loss trials when playing the weak role. By contrast, negatively primed subjects trended higher in regret error in winning trials and lower in loss trials when playing the strong role. This suggests that they were not efficient in their wins in the strong role, indicating possible lack of sophisticated understanding of the game or concern about investing too low and losing. Low regret error in losses indicates a slight misreading of the opponent, leading to a loss by a marginal amount.

Fig. 3-7 Model frequency estimations. A group-level Bayesian model comparison showed that for all types of priming, as well as no priming, the belief-based learning (BBL) Fictitious Play model best described behavior. The red line at the 0.25 level represents chance, which none of the other models exceeds.

Modeling

In order to determine the relative contribution of different types of learning, we compared data to four learning models: Reinforcement Learning (RL), Counterfactual Reinforcement Learning (CRL), Belief-Based Learning (BBL) and the hybrid model Experience-Weighted Attraction (EWA). The RL and BBL models constitute special cases of EWA, with a parameter δ indicating the relative weight of BBL in the examined behavior. We used a group-level Bayesian Model Comparison to compare log likelihoods of model fit to choice behavior. To our surprise, BBL outperformed all other models (Fig. 3-7). Employing the learning rate parameter from the BBL model, we compared it among the priming types but found no significant relationships.

Discussion

The manipulation failed to yield any significant difference in measurements of choice behavior. No studies to our knowledge have attempted to change choice behavior in the patent race game, so the field was wide open as to how to attempt the perturbation and how to measure its effects. After considering a number of ways to induce regret, we settled on a large monetary result in the hope that its magnitude and salience would be effective. But it is possible that other, more visceral forms of regret, such as autobiographical recollection, or repeated forms, such as several rounds of the lottery, would bring changes to patent race play. It may be that the magnitude of regret was not high enough to yield appreciable changes in the latter task. It is also of course possible that there were effects of the priming but that our measures were not precise enough to record them. Before the analyses, however, we considered many approaches to measurement, including hidden-variable modeling, summary statistics and calculated variables, along with which portion of the task to approach first. We had little hope, for example, that any effect would persist into the second block of the task but included it for purposes of comparison with the first.

We chose to use an algorithm as the opponent for the purposes of consistency, but its behavior does not match that of human game play in that it did not avoid the iteratively eliminated dominated strategies, other than as a result of adaptation to the human choice behavior. This likely gave rise to different play behavior from human subjects than they would have exhibited had they played against other humans. The algorithm could be maintained but its behavior changed by simply reducing the initial values of each dominated strategy in an amount commensurate with its theoretical rate of avoidance. These values are updated each round and represent the relative attractiveness of each strategy and, in part, the likelihood of that strategy being selected by the algorithm. Though it would be important to maintain some balance between consistency and natural play by the opponent, a future study might aim for more human-like choices.

After trying to answer our hypotheses with the more specific priming types, we pooled priming groups to examine effects more broadly: by valence and by feedback type. Here, as in the more specific groupings, there were no notable trends or significant effects. Although these indicators suggested little hope for effects yielded by model estimations, we conducted them and found that even our assumption about which model would fit best was incorrect. The failure of EWA to better predict subject behavior made our original hypothesis impossible to test. We could not compare the balance of RL and BBL between groups because our analysis told us that BBL alone predicted choice behavior.


In control experiments using an algorithm as the opponent, RL has typically fit participant behavior better, possibly because players view the contest as a simple reward situation rather than a true competition (Zhu, Mathewson et al. 2012). The use of BBL, however, suggests that participants treated the game as a competition in which beliefs about the opponent’s actions informed subsequent decisions. This could be an effect of the fictitious play algorithm we employed, which uses a form of belief-based learning to guide its choices. Because of the already large dimensions of the study, we used only one type of play in the opponent algorithm. Employing different algorithms in a future study might indicate if this is a behavior-mirroring effect.

Chapter 4: Electrical brain stimulation effect on level-k thinking



Experimental question

How does stimulation of prefrontal cortex areas change the consideration of other players and accounting for their actions in an iterative thinking contest?

Introduction

In competitive situations, choices that accurately account for the actions of others lead to greater success. Because individuals demonstrate ranges of sophistication, this assessment presents several challenges: understanding that there is a range, what its bounds are and where on it competitors lie. The most successful players are not necessarily those who engage at the highest levels of sophistication, but rather those who most accurately assess the level of sophistication of others. Competitive strategic games present a framework to measure these assessments because they call on a player’s ability to mentalize, that is, to consider the state of mind of opponents. Correlations of activity in the prefrontal cortex to reasoning levels during such tasks present reasonable candidate targets for manipulation via electrical stimulation.

Level-k models assume that individuals heterogeneously employ a cognitive hierarchy of thinking types. These levels correspond to the number of recursions in consideration of the beliefs of others. People at level 0 assume all others are acting randomly, essentially a naive lack of consideration of others. People at level 1 consider the beliefs of the other but no more. Level 2 thinkers believe that other individuals are at level 1 and account for the iterative thinking at that level.

The most rational action in such a situation is to continue the steps of iterative thinking until reaching the Nash equilibrium of 0, but the fact that people stop well short of this suggests that they are rational within certain limits. This bounded rationality may be due to cognitive limitations and the greater computational demands of each additional step (Camerer, Ho et al. 2004).

John Maynard Keynes encapsulated this notion by describing an old type of contest newspapers used to run. They would print a page full of headshot photos of 100 women, from which readers chose the six prettiest and sent them in. The winner was selected by finding which entry most closely matched the average preferences from all entrants (i.e. the most popular choices). So the task before the reader was not to rely on his own estimation of beauty but to imagine that of all the other entrants, most of whom would be unknown to him and whom he would have to consider in a general way. A player in this game might ignore the method of finding the winner and simply select photos according to her own preference, not considering other players at all. She might rank them according to how she thinks other players would prefer them. And she might rank them according to how she thinks other players will think all other players will prefer them. And so on. In the end, individual assessments of beauty did not matter at all; what mattered was the ability to gauge how all others would assess beauty (or how all others would assess the assessment of attractiveness).

In a modern, quantifiable laboratory version of this task, participants are instead directed to guess the number that is some fraction of the average of guesses by all participants. Responses to the game can be reasonably described with a cognitive hierarchy model. Given the parameters of a number between 0 and 100 and 2/3 the average of all players, at the lowest level, 0, a player responds randomly, without consideration of the structure of the game or interaction with other players. At one level up, level 1, the player considers all other players to be level-0 players and bases his choice on their random play, guessing their average guess to be something around 50 and then multiplying by 2/3, reaching 33. At level 2, the player figures all the other players are level 1, that they have submitted 33, so she multiplies that figure by 2/3 (effectively, 50*2/3*2/3) and arrives at 22. Continuing on iterated levels of thinking leads to elimination of dominated strategies until the player eventually arrives at 0, the game’s Nash equilibrium.
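The iteration described above amounts to repeated multiplication by the target fraction. A minimal sketch, assuming as in the text that level-0 play averages about 50 (the function name and the default arguments are illustrative):

```python
def level_k_guess(k, multiplier=2 / 3, level0_mean=50.0):
    """Guess of a level-k player who believes everyone else reasons at level k-1.

    For k = 0 this returns the assumed mean of random play, not an individual
    level-0 guess, which would be drawn at random.
    """
    return level0_mean * multiplier ** k


for k in range(6):
    print(k, round(level_k_guess(k), 1))
```

Each additional level shrinks the guess by the same factor, so with a multiplier below 1 the sequence approaches 0 without reaching it in any finite number of steps.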

At level 0, the player is considered naive because he responds without consideration of the structure of the game or other players. Level-1 players employ a model of the game space and respond to the actions they believe other players will take. Notably, there is no opportunity to adapt strategy based on the actions of other players, since those actions are not revealed. Nevertheless, level-2 players iterate a step further by imagining that their own actions are being considered by other players and therefore influencing the choices of the other players. At levels higher than level 2, the players consider to what extent they and their opponents are aware of mutual awareness (i.e. "I know that you know that I know…" etc.).

A key understanding of the level-k model is that it does not directly describe strategic sophistication ability, but rather the individual’s assessment of others. An individual might be capable of high-level thinking, but if she assesses others as naive, she might only make choices indicative of level-1 thinking. Regardless of the actual outcome, the player's guess indicates her evaluation of the other participants and, therefore, her own k-level of thinking.

The mental calculations required for higher levels of thinking in the beauty contest game demand the multiplication of integers by fractions and fractions by fractions. Poor performance in the beauty contest (BC) might indicate not a low level of reasoning but rather poor mental mathematical abilities. For this reason, the second section of the task comprises calculations of integers multiplied by fractions and integers multiplied by a fraction, then multiplied by the same fraction, direct mimics of the mental calculations required of level 1 and level 2 thinking, respectively.


Brain areas of level-k thinking

Functional imaging studies have located several areas in the orbitofrontal cortex that covary with components of iterative thinking. The medial prefrontal cortex (mPFC) has long been associated with mentalizing, the consideration of the mental states of others (Frith and Frith 1999, Amodio and Frith 2006, Mitchell, Macrae et al. 2006). Notably, increased activity in the mPFC correlated with computational signals associated with a strategy that incorporates a player’s consideration of the influence of his own choices on the decisions of others (Hampton, Bossaerts et al. 2008). That study concludes that the mPFC is part of a network that performs computations used in mentalization. Ventral portions of the mPFC in particular have been associated with various self-referential tasks as well as with mentalizing tasks, suggesting that they may be engaged in assessing the mental states of similar others by referencing the understanding of one's own personal feelings (Mitchell, Macrae et al. 2006). Similarly, ventral areas of the mPFC were particularly active in high-level reasoning players during beauty contest trials against another human, as compared to those against a computer (Coricelli and Nagel 2009). In against-human trials only, higher activity was observed in that area in both high- and low-level reasoners, suggesting the mPFC is a center for strategic thinking about others’ behavior. That same study also found higher activity in right and left dorsolateral prefrontal cortex (dlPFC) in high-level reasoning players at greater magnitudes than in low-level reasoning players, implicating the areas in a process of higher-level reasoning about others. They did not observe commensurate activity during mental calculation tasks that made similar computational demands.

To investigate how and at what point these constituent areas have a causal role in the process, they can be electrically stimulated for excitatory or inhibitory neuronal effect, accompanied by measurement of any behavioral changes, in particular the level of thinking demonstrated by members of the experimental group.


tDCS

Noninvasive electrical stimulation allows for the modulation of neuronal activity at the regional level. In particular, transcranial direct-current stimulation (tDCS) provides for both excitation and inhibition, depending on the orientation of electrodes. The technique involves a low-level electrical current (often ≤2 mA) between two electrodes, from anodal to cathodal, through the brain. The areas under the anodal electrode are generally thought to undergo excitatory stimulation via sub-threshold depolarization of neurons. Crucially, tDCS is not believed to trigger action potentials in neurons, but rather to change the likelihood that an action potential will result in post-synaptic firing (Nitsche, Fricke et al. 2003, Coricelli and Rusconi 2010). The areas underlying the cathodal electrode, near where the current leaves the body, are believed to be hyperpolarized, resulting in an inhibitory effect.

Various studies have successfully stimulated the mPFC (Civai, Miniussi et al. 2015, Hämmerer, Bonaiuto et al. 2016, Zheng, Huang et al. 2016) and the dlPFC (Fecteau, Knoch et al. 2007, Fecteau, Pascual-Leone et al. 2007, Boggio, Campanhã et al. 2010, Hecht, Walsh et al. 2010, Minati, Campanhã et al. 2012, Pripfl, Neumann et al. 2013) in decision-making tasks. Though similarly implicated in strategic thinking tasks, to our knowledge, modulation had not been attempted via tDCS. We proposed to influence behavior by altering neuronal activity using anodal tDCS. Because fMRI studies suggest an increase in mPFC and dlPFC activations during higher level-k thinking, we aimed to confirm the causal involvement of the areas. We hypothesized that anodal stimulation of the mPFC and the dlPFC would encourage excitatory activity, resulting in higher level-k performance during the beauty contest task in more participants, while sham stimulation groups would have fewer members exhibiting higher level-k thinking. We expected stimulation and sham groups would have similar outcomes of other measurements of tasks under stimulation, such as of memory and calculation.

To expand upon brain imaging findings of level-k thinking, we applied electrical stimulation to participants playing the same game used in a previous study. Results from that study described the frontal activity during the task (Coricelli and Nagel 2009). The selectively heightened activity of mPFC in trials against other humans and in dlPFC among higher-level-thinking participants suggested that these areas may play some role in generating strategic thinking. We hypothesized that electrical stimulation on the scalp above each of these areas separately would increase neuronal activity, which could in turn give rise to higher levels of iterative thinking. If that result were indeed found, it would suggest a causative role in iterative thinking for the targeted area.

Methods

Participants played the Beauty Contest game against other present participants, during which some underwent stimulation while others had sensors placed on their scalps but experienced sham stimulation. This experiment comprised the iterative thinking task called the Beauty Contest, which was conducted during transcranial direct-current stimulation (tDCS) or sham stimulation. That main task was followed by a calculation task, two digit-span memory tasks and finally a series of questionnaires. The Beauty Contest requires participants to guess an average number, but that figure is influenced by both their own selection and the numbers they believe will be chosen by others. Because the target number is modulated by the choices of all other players, the most successful players consider how the other participants will choose. Based on the choices, each participant was assigned a precise level of thinking score and then categorized as high- or low-level thinking.

Participants

We recruited 64 healthy volunteers (32 female) to take part in a two-part study at the Mattarello Research Center of the University of Trento, Italy. Mean age of participants was 23.9 years (SD 4.25). Volunteers gave fully informed consent for the project, which was covered by an umbrella tDCS project approval from the University Ethical Committee. Each participant was screened to exclude risk of epileptic seizure, psycho-active medication and conditions including psychological or physical illness or history of head injury. For the first experiment, we recruited throughout the university, but after observing difficulty in the mathematical portions of the experiment, for the second experiment, we recruited in areas of the university frequented by students studying science, mathematics and engineering. All sessions were conducted with exactly 8 volunteers.

Experimental design and task

Experimenters fitted two electrodes over (experiment 1) the mPFC (Brodmann area 10) and visual cortex (BA 17) and (experiment 2) the right and left dlPFC (BA 9). After all had been fitted, participants then underwent 30 minutes of transcranial direct-current stimulation while performing 50 trials of the experimental task. The experimental task consisted of a first session of 26 trials of the Beauty Contest game and then a second session of 24 trials of a mental calculation task. Next, they executed a two-part memory task: forward digit span, measuring short-term memory and consisting of 2 to 14 trials, dependent on performance; and backward digit span, measuring working memory and consisting of 15 trials, regardless of performance. The experiment lasted about 90 minutes, with the 30 minutes of stimulation covering instructions for the beauty contest (so that any effect from stimulation also affected reading and comprehension of the instructions), completion of the beauty contest and completion of the calculation task. The stimulation period ended for 47 participants during the digit span memory tasks. Participants completed tDCS questionnaires, Raven’s Progressive Matrices and a cognitive reflection task to measure interindividual differences unrelated to effects of the stimulation. As it is unknown how long the effects of the stimulation last, we cannot exclude that participants were still under the fading influence of stimulation during these tasks.

The Beauty Contest game as designed by Coricelli and Nagel (2009) consists of a human condition and a computer condition. In the human condition, participants are directed to select an integer between 0 and 100 (inclusive), with the aim of being closest to a fraction M of the average of the numbers chosen by all participants: M*mean. Six values are M<1 (1/8, 1/5, 1/3, 1/2, 2/3, 3/4) and another six values are M>1 (9/8, 6/5, 4/3, 3/2, 5/3, 7/4). We also included no-multiplier control trials in which M=1.

In the computer condition, participants are told that the computer will select seven numbers from 0 to 100 at random. Playing only against the computer, and not against other participants, the participant wins if her number is closest to the product of M and the average of all eight numbers (i.e. her number and the seven randomly selected numbers). The same 13 values of M are used in the computer condition, for a total of 26 trials. Computer and human condition trials were intermixed.
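The target and winner rules of the two conditions can be sketched in a few lines. This is a minimal illustration, not the task code used in the experiment (which ran in MATLAB); the function names `target`, `winners` and `computer_trial` are hypothetical, and the uniform draw of the computer's integers is an assumption based on the phrase "at random".

```python
import random
from fractions import Fraction

# The 13 multipliers listed in the text: six below 1, the M = 1 control, six above 1.
MULTIPLIERS = [Fraction(1, 8), Fraction(1, 5), Fraction(1, 3), Fraction(1, 2),
               Fraction(2, 3), Fraction(3, 4), Fraction(1, 1),
               Fraction(9, 8), Fraction(6, 5), Fraction(4, 3), Fraction(3, 2),
               Fraction(5, 3), Fraction(7, 4)]


def target(choices, m):
    """Target number: multiplier M times the mean of all submitted numbers."""
    return m * Fraction(sum(choices), len(choices))


def winners(choices, m):
    """Indices of the entries closest to the target (several indices mean a tie)."""
    t = target(choices, m)
    best = min(abs(c - t) for c in choices)
    return [i for i, c in enumerate(choices) if abs(c - t) == best]


def computer_trial(choice, m, rng):
    """Computer condition: the participant's number joins seven random integers.

    Assumption: the seven numbers are drawn uniformly from 0..100.
    Returns True if the participant (index 0) is closest or tied for closest.
    """
    entries = [choice] + [rng.randint(0, 100) for _ in range(7)]
    return 0 in winners(entries, m)


# Human condition with eight hypothetical choices and M = 2/3:
session = [10, 20, 30, 40, 50, 60, 70, 80]
print(target(session, Fraction(2, 3)), winners(session, Fraction(2, 3)))
```

Exact fractions are used so that ties are detected without floating-point error, which matters because tied winners split the prize.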

In the second session, which comprised the calculation trials, participants were instructed to find the product of a two-digit integer N and either M or M*M, in which M was drawn from the same set of multipliers used in the Beauty Contest, other than M=1, which was excluded. Each of the M values, other than M=1, appeared once in the N*M calculation and once in the N*M*M calculation (in which both Ms are the same multiplier). A response was judged correct if it fell within +/-1 of the exact answer rounded down or rounded up to the nearest integer. E.g., if N*M=22.2, the range 21-24 would be judged correct. Participants received 50 euro cents for each correct response. Participants received no feedback between trials. To avoid behavioral priming, the calculation task followed the Beauty Contest for all participants. Tasks were presented and responses recorded using MATLAB (The MathWorks, Inc., Natick, MA) with Psychtoolbox extensions. Participants were seated at divided computer stations and could not interact with or see each other during the task.
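The tolerance rule for the calculation task can be made explicit with a short sketch. It assumes our reading of the rule, namely that the accepted range runs from one below the exact product rounded down to one above the exact product rounded up; the function names are hypothetical.

```python
import math
from fractions import Fraction


def exact_product(n, m, squared=False):
    """Exact answer for an N*M trial, or an N*M*M trial when squared is True."""
    return n * m * m if squared else n * m


def accepted_range(exact):
    """Inclusive range of integer responses judged correct for an exact answer."""
    return math.floor(exact) - 1, math.ceil(exact) + 1


def is_correct(response, n, m, squared=False):
    """True if the integer response falls inside the accepted range."""
    low, high = accepted_range(exact_product(n, m, squared))
    return low <= response <= high


# The worked example from the text: an exact answer of 22.2 accepts 21 through 24.
print(accepted_range(Fraction(111, 5)))
```

Under this reading, an exact answer that is already an integer accepts only three responses (the answer and its two neighbors), whereas a non-integer answer accepts four.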


Fig. 4-1 Time course of beauty contest and calculation trials A) The screen progression for the beauty contest shows the condition (Human or Computer) at the beginning of each trial, along with a multiplier, in red, and the instruction. After 1-2 sec, the response prompt appeared. Participants answered at their own pace with no time limit. For each human-condition round, a reward of €5 was paid to a single winner from the session, or in case of a tie, divided evenly. For each computer round, the participant won €1 if she was closest to the target or €0.50 in case of a tie with one of the computer’s numbers. B) In the calculation task, participants were instructed to calculate the product of (in M*1 condition) a fraction multiplied by an integer or (in M*2 condition) the product of a fraction, the same fraction and an integer, answering with an integer. The multipliers were the same set as those encountered in the beauty contest. When the prompt appeared, the participant had to press “enter” to continue to the response screen, at which point the prompt disappeared to encourage a response instead of continued consideration. If the participant did not press enter within 21 sec, a warning appeared that there were only 9 seconds remaining. If the participant did not enter a response within 30 seconds, the trial ended and the response was categorized as incorrect. Participants won €0.20 for each correct calculation. A calculation was considered correct if it was within 1 of the correct answer rounded up and rounded down to the nearest integers. All participants performed the calculation task after the beauty contest task.



Time course of experimental tasks

Each trial of the Beauty Contest game (Fig. 4-1A) consisted of an information screen displayed for 1-2 sec, which included the condition of the trial (i.e. human or computer), the multiplier (M) and the instruction. The information screen remained visible while a response prompt appeared, where the participant could type her response using the computer number pad, followed by the “enter” key. Choice was self-paced with no time limit, and the task continued as soon as the participant responded. Once the response was entered, a fixation cross appeared for 1-3 sec before the next trial began. Participants received no feedback between trials.

In session 2, the calculation task, a similar time course was followed: an information screen with type of calculation (N*M or N*M*M), multiplier M, integer N and instruction (2 sec), followed by the response prompt (Fig. 4-1B). Choice was self-paced, but response time was constrained to 30 sec, with a warning after 21 sec. As soon as the response was entered, a fixation cross appeared for 1-2 sec.

Questionnaires

After the stimulation period, participants were asked four debriefing questions:

1. Please explain your reasoning in your first choice, M = 2/3, in the human condition.
2. Please explain your reasoning in your choice when M = 1/4 in the computer condition.
3. Did you have a general rule for the trials in the human condition?
4. Did you have a general rule for the trials in the computer condition?

Short-term and working memory tasks

Participants then completed a forward digit span task, in which a fixation cross appeared, followed by a series of digits, appearing on screen singly and sequentially for 1 sec each. Once the series was complete, a row of blank lines equal in number to the digits in the series appeared on screen with the instruction for the participant to enter the sequence in order. Participants were allowed to make corrections before pressing “enter”, which began the next trial. The response was required to have exactly the same number of digits as the prompt in order to proceed. The sequence began with a series of three digits, increasing by one digit on each trial until an incorrect response or completion of a 9-digit series. At that point, a second sequence began with a series of three digits, again increasing each round until an incorrect response or completion of the 9-digit series. We generated pseudo-random series of non-repeating digits, then used the same series and orders for all participants. The score was determined by the longest correctly completed response in either of the two sequences.

At the completion of the forward digit span task, instructions appeared explaining the backward digit span task, in which participants were told they would once again see series of numbers, but that in order to respond correctly, they had to enter the series in reverse order. A fixation cross once again appeared, followed by a series of single digits for 1 sec each, then followed by the series of blank spaces. Participants once again entered their response before pressing “enter”. In the backward digit span task, all participants completed three series at each length from 4 to 8, for a total of 15 series. In the backward task, each digit entered in the correct position was awarded a point. For the backward task, we used a prescribed set of series from Devetag & Warglien (2003).
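The two scoring rules differ in kind: the forward task yields a span (a length), while the backward task yields a point total. A minimal sketch follows, assuming each trial is stored as a pair of digit strings (presented, response); that data layout and the function names are illustrative assumptions, not the original task code.

```python
def forward_span_score(trials):
    """Length of the longest series reproduced exactly, across both sequences.

    trials: list of (presented, response) digit strings. Returns 0 if no
    series was reproduced correctly.
    """
    return max((len(p) for p, r in trials if p == r), default=0)


def backward_span_score(trials):
    """One point for every digit in the correct position of the reversed series."""
    points = 0
    for presented, response in trials:
        expected = presented[::-1]  # the correct answer is the series reversed
        points += sum(1 for e, r in zip(expected, response) if e == r)
    return points


# Hypothetical forward data: first sequence fails at 5 digits, second at 4.
forward = [("123", "123"), ("1234", "1234"), ("12345", "12354"),
           ("482", "482"), ("4821", "4812")]
print(forward_span_score(forward))

# Hypothetical backward data: one perfect reversal and one with two digits swapped.
backward = [("1234", "4321"), ("1234", "4312")]
print(backward_span_score(backward))
```

Position-wise scoring gives partial credit in the backward task, so a single transposition costs two points rather than the whole series.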

Payment

Participant performance was financially motivated. Each subject received a €5 show-up fee and €0.20 for each correct calculation. In each 8-subject session, one human trial of the Beauty Contest was chosen at random, and one computer trial was chosen at random. Anonymized results were presented to all subjects on the computer screen. The participants who came closest in each selected trial were awarded an additional €5. In the case of ties, the prize was split evenly among all tying players. Each participant’s total appeared on the screen at the end of the session, and they were later paid by bank transfer.

tDCS stimulation

The tDCS procedure applies a weak direct current into and out of the scalp via two electrodes, each sandwiched in saline-soaked sponges and spread with a layer of conductive gel. The constant current is delivered by a BrainSTIM battery-powered stimulator (E.M.S. Medical, Bologna, Italy). If participants reported uncomfortable tickling or itching sensations, experimenters added gel under the electrode to increase contact between electrode and scalp. None complained of pain during the session. During the debriefing session, subjects reported mild sensations of tickling, tingling, warmth or pain, mostly at the beginning of the session, but some at the end and a few in the middle. They reported that the sensations subsided quickly. The direction of the current can have different effects on the target area. Anodal stimulation encourages cortical excitability, while cathodal stimulation inhibits it (Nitsche, Doemkes et al. 2007). Participants in experiment 1 were randomly assigned to receive anodal tDCS over the mPFC (N=16, 10 female, mean age=24.4) or sham stimulation (N=16, 7 female, mean age=24.4). Participants in experiment 2 were randomly assigned to receive anodal tDCS over the right dlPFC (N=16, 7 female, mean age=22.4) or sham stimulation (N=16, 8 female, mean age=24.5).

Stimulation and reference points were selected by simulating tDCS stimulation in SimNIBS software (Thielscher, Antunes et al. 2015) with various electrode placements and sizes, along with varied current strengths. Simulations were viewed in GMSH software (Geuzaine and Remacle 2009). Current density, the strength of current divided by the area of the electrode, has an effect on stimulation efficiency. For excitatory purposes, it is desirable to make the anodal electrode smaller to focus current and to make the cathodal electrode larger to diffuse it (Nitsche, Doemkes et al. 2007). We set a current density target of 0.07 mA/cm2, which meant that a decrease in the size of the electrode in service of greater precision might require an attendant reduction in current strength, which would reduce both the reach and intensity of the stimulation (Fig. 4-2).

In experiment 1, the smaller, anodal electrode (4x4 cm) was placed over the vmPFC at the FPz position, according to the international EEG 10/20 system, and the cathodal electrode (5x7 cm) was placed over the visual cortex at the Oz position. At 16 cm2 and a current of 1 mA, the current density at the anodal position was 0.0625 mA/cm2.

In experiment 2, the smaller, anodal electrode (5x5 cm) was placed over the right dlPFC at the F4 position, according to the international EEG 10/20 system, and the cathodal electrode (5x7 cm) was placed over the left dlPFC at the F3 position, both positions calculated using an online location system (Beam, Borckardt et al. 2009). At 25 cm2 and a current of 2 mA, the current density at the anodal position was 0.08 mA/cm2.
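The current densities quoted for the two montages follow directly from the definition (current divided by electrode area). The short check below reproduces them from the electrode dimensions and currents given in the text; the function name is illustrative, and the cathodal values are computed here for comparison and are not reported in the text.

```python
def current_density(current_ma, width_cm, height_cm):
    """Current density in mA/cm2: current divided by the electrode's contact area."""
    return current_ma / (width_cm * height_cm)


TARGET_DENSITY = 0.07  # mA/cm2, the target stated in the text

# Experiment 1: 1 mA, 4x4 cm anode, 5x7 cm cathode
exp1_anode = current_density(1.0, 4, 4)
exp1_cathode = current_density(1.0, 5, 7)

# Experiment 2: 2 mA, 5x5 cm anode, 5x7 cm cathode
exp2_anode = current_density(2.0, 5, 5)
exp2_cathode = current_density(2.0, 5, 7)

for label, value in [("exp 1 anode", exp1_anode), ("exp 1 cathode", exp1_cathode),
                     ("exp 2 anode", exp2_anode), ("exp 2 cathode", exp2_cathode)]:
    print(f"{label}: {value:.4f} mA/cm2 (target {TARGET_DENSITY})")
```

In both montages the larger cathode carries the same current over more area, so its density is lower than the anode's, which is the intended focusing of the anodal site and diffusion at the reference site.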

For stimulated participants, the current was ramped up over 30 sec to 1 mA in experiment 1 and 2 mA in experiment 2, then kept constant for the length of the experimental tasks (no more than 29 min), followed by 30 sec of ramping down. In the sham condition, the electrodes were placed as in the stimulation condition, but stimulation halted after the 30 sec ramp-up, unbeknownst to the participant. The procedure sometimes produces an itching sensation at the beginning of a session, and because sham participants were also exposed to that telltale sign, it was unclear to them whether they were being stimulated or not (Gandiga, Hummel et al. 2006).

Experimenters and assistants set scalp locations by measuring fiducial points, making measurements from those points, then marking the scalp at electrode locations. Electrodes were held in place by a hairnet and surgical rubber straps. Conductance with the scalp was facilitated by applying a conductive gel to the underside of sponges soaked in a physiological saline solution.


Protocols

We set two stimulation protocols for each experiment: (full) stimulation and sham. For each session, four of each protocol were randomly distributed among participants. Participants were told they might undergo stimulation or sham. The experiment was double blind: neither participants nor experimenters (during the testing and analysis phases) were aware of which protocol was real and which was sham stimulation. Each session’s stimulation was initiated and monitored from a central PC (schematic Fig. 4-2C, D).
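A balanced random assignment of this kind (four stimulation, four sham per eight-person session) can be sketched as follows. This is an illustration only: the function name, the opaque labels "A" and "B" and the idea of a separately stored key are assumptions about how blinding could be preserved, not a description of the software actually used.

```python
import random


def assign_protocols(participant_ids, rng):
    """Randomly give exactly four participants each of two blinded protocol codes.

    Returns (assignment, key): assignment maps participant id to an opaque
    code, and key maps each code to the real protocol. In a double-blind
    design the key would be held by someone outside testing and analysis.
    """
    if len(participant_ids) != 8:
        raise ValueError("each session is assumed to have exactly 8 participants")
    codes = ["A"] * 4 + ["B"] * 4
    rng.shuffle(codes)
    protocols = ["stimulation", "sham"]
    rng.shuffle(protocols)  # which code means "sham" is itself randomized
    key = dict(zip(["A", "B"], protocols))
    return dict(zip(participant_ids, codes)), key


assignment, key = assign_protocols(list(range(1, 9)), random.Random(2024))
print(assignment)
```

Shuffling a fixed list of four and four codes, rather than flipping a coin per participant, guarantees the balanced split within every session.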

Fig. 4-2 Simulated electrode placement over mPFC and visual cortex and simulated effects of current. A) Electrodes were placed over a simulated brain according to MNI reference coordinates for mPFC and visual cortex, based on Zheng and colleagues (2016). The sizes of the electrodes and the current were adjusted to achieve a focused simulated stimulation, as viewed in a cross-section of the brain (B). Schematics of the electrode placement (anode and cathode) are shown for C) mPFC and visual cortex: 1 mA for 30 min and D) right dlPFC and left dlPFC: 2 mA for 30 min.


Statistical analysis

First, we calculated for each trial the quadratic distance QDM between the response and the theoretical level-k values based on the Cognitive Hierarchy model using the equation

$$QDM_{ijk} = \left(x_{ijM} - 50 \cdot M^{k}\right)^{2},$$

where x is the choice of participant i in human/computer condition j for multiplier M (Coricelli and Nagel 2009). The equation is solved for each level of k = (0, 1, 2). The level with the minimum quadratic distance indicated the level-k for that trial. When a participant had a majority (seven or more) of trials at minimum distance for a level, she was assigned that level-k. If the participant did not have seven occurrences of any level, her level was set at level 0, random (Table 4-1).

Next, we used subject responses to calculate a precise level-k for each trial. The k is determined by the number of times the multiplier fraction is applied to the mean of the integer range, which means it can also be found in continuous values using the equation

$$k_{i} = \frac{\log(r_{i}/50)}{\log(M_{i})},$$

where r is the response to multiplier M (e.g., for M=2/3, a response of 25 indicates a level-k of 1.7). A response of 0 leaves the precise level-k undefined, because the logarithm of 0 is undefined, so in those instances we used a corrected precise value, in which responses were indexed by adding 1 to the response integer. They were then divided by 50.5 instead of 50, ensuring that responses of 100 would be the same as in the uncorrected calculation, since 101/50.5=100/50.

Table 4-1: Modal minimum QD

Experiment               1 - vmPFC                      2 - dlPFC
Protocol                 sham      full stimulation     sham      full stimulation
Versus-human trials      Level 0   Level 0              Level 1   Level 1
Versus-computer trials   Level 2   Level 1              Level 1   Level 1

Table 4-1 Level-k instances by stimulation protocol and opponent type. For each trial, we solved for the quadratic distance from levels 0, 1 and 2, classifying the choice as the level with the lowest quadratic distance. If a participant had seven or more trials of one level, she was classified as that level. Otherwise, she was classified as level 0 (random). These are the most common level types for each treatment and condition.

Fig. 4-3 Level-k in Beauty Contest by opponent and by stimulation protocol. Precise level-k was calculated for each trial and then averaged by participant. These graphs illustrate the median level-k for the two experiments. There was no significant difference in the medians. Error bars are s.e.m. Panels: A) Experiment 1: human trials; B) Experiment 1: computer trials; C) Experiment 2: human trials; D) Experiment 2: computer trials.

Then, for each experiment, we compared level-k outcomes between the two stimulation protocols. To accommodate outliers and non-normal distributions, we used the Kruskal-Wallis test to compare each subject’s median precise level-k, first for human trials and then for computer trials (illustrated in Fig. 4-3). We also considered data from the calculation task, calculating the absolute distance (AD) from the correct answer. We compared AD across all trials, as well as categorized into trials with a single multiplier (M*1) and trials with a double multiplier (M*2) (Fig. 4-4). Higher AD values indicate worse performance. We used mixed ANOVA to test for significance within subjects for multiplier level and between subjects for stimulation protocol, as well as for interaction between the two factors. Because tests of sphericity failed, we used Greenhouse-Geisser corrected tests for interpretation.
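The level-k scoring described in this section (quadratic distance to the theoretical level-0, level-1 and level-2 choices, the seven-trial majority rule, and the continuous "precise" level-k with its correction for responses of 0) can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, ties between levels are broken in favor of the lower level, and M = 1 control trials are rejected by the precise calculation because log(1) = 0 makes the ratio undefined.

```python
import math
from collections import Counter
from fractions import Fraction

LEVELS = (0, 1, 2)
MIDPOINT = 50  # mean of the 0-100 integer range, the level-0 reference point


def quadratic_distance(x, m, k):
    """Squared distance between response x and the theoretical level-k choice 50 * M^k."""
    return (x - MIDPOINT * m ** k) ** 2


def trial_level(x, m):
    """Level with the smallest quadratic distance (ties go to the lower level)."""
    return min(LEVELS, key=lambda k: quadratic_distance(x, m, k))


def participant_level(responses, threshold=7):
    """Level reached on at least `threshold` trials, otherwise level 0 (random).

    responses: list of (x, M) pairs from one opponent condition.
    """
    counts = Counter(trial_level(x, m) for x, m in responses)
    level, n = counts.most_common(1)[0]
    return level if n >= threshold else 0


def precise_level(r, m, corrected=False):
    """Continuous level-k: log(r / 50) / log(M).

    With corrected=True the response is shifted to (r + 1) / 50.5, which keeps
    a response of 0 finite and leaves a response of 100 unchanged.
    """
    if m == 1:
        raise ValueError("precise level-k is undefined for M = 1")
    ratio = (r + 1) / 50.5 if corrected else r / 50
    return math.log(ratio) / math.log(float(m))


# Worked example from the text: M = 2/3 and a response of 25.
print(round(precise_level(25, Fraction(2, 3)), 1))
```

Because 101/50.5 equals 100/50 exactly, the corrected and uncorrected values coincide at a response of 100 and diverge most near 0, which is where the correction is needed.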

Fig. 4-4 Absolute distances in calculation task by stimulation protocol. Mean absolute distance from correct answer in all calculation trials (A, C) by stimulation protocol, and mean distance in M*1 and M*2 trials (B, D), by protocol. Mixed ANOVA showed in both experiments an effect of multiplier level but not stimulation protocol alone, nor the interaction between stimulation protocol and multiplier level. Error bars are s.e.m. Panels: A) Experiment 1: all calculation trials; B) Experiment 1: calculation trials by multiplier level (M*1 trials, M*2 trials); C) Experiment 2: all calculation trials; D) Experiment 2: calculation trials by multiplier level (M*1 trials, M*2 trials).


Kruskal-Wallis tests were run with SPSS Statistics (IBM Corp., Armonk, NY). Mixed ANOVA tests were run with Stata (StataCorp, College Station, TX).

Results

In the first experiment, tests showed a significant effect of the multiplier on calculation performance (F(1,30)=32.36, p<.001) but no significant effect of stimulation protocol, nor of the interaction between multiplier level and protocol. In the second experiment, we found the same pattern: a significant effect of the multiplier (F(1,30)=130.28, p<.001) but no significant effect of stimulation protocol, nor of the interaction.

In experiment 1, a Kruskal-Wallis test was conducted to examine differences in level-k according to the stimulation protocol undergone. No significant differences were found between the two stimulation protocols in human trials (chi-square=1.04, p=.309, mean rank sham=18.19, mean rank full=14.81), nor in computer trials (chi-square=1.74, p=.187, mean rank sham=18.69, mean rank full=14.31). In experiment 2, a Kruskal-Wallis test was likewise conducted to examine differences in level-k according to the stimulation protocol undergone. No significant differences were found between the two stimulation protocols in human trials (chi-square=0.05, p=.821, mean rank sham=16.13, mean rank full=16.88), nor in computer trials (chi-square=0, p=.955, mean rank sham=16.41, mean rank full=16.59) (Table 4-2).

Table 4-2 Kruskal-Wallis test statistics table

Experiment 1 (vmPFC)
                   chi-square   p       mean rank sham   mean rank full
versus humans      1.04         0.309   18.19            14.81
versus computer    1.74         0.187   18.69            14.31

Experiment 2 (dlPFC)
                   chi-square   p       mean rank sham   mean rank full
versus humans      0.05         0.821   16.13            16.88
versus computer    0            0.955   16.41            16.59
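The tabulated statistics can be checked against one another with a few lines of arithmetic. The sketch below recomputes the Kruskal-Wallis H statistic from the reported mean ranks and converts it to a p value using the chi-square distribution with one degree of freedom. It assumes 16 participants per group (as stated in the methods) and ignores the correction for tied ranks that statistical packages apply, so small discrepancies from the reported values are expected; the function names are illustrative.

```python
import math


def kruskal_h_from_mean_ranks(mean_ranks, group_sizes):
    """Kruskal-Wallis H from group mean ranks, without tie correction.

    H = 12 / (N (N + 1)) * sum_i n_i * (mean_rank_i - (N + 1) / 2)^2
    """
    n_total = sum(group_sizes)
    grand_mean_rank = (n_total + 1) / 2
    spread = sum(n * (r - grand_mean_rank) ** 2
                 for r, n in zip(mean_ranks, group_sizes))
    return 12 / (n_total * (n_total + 1)) * spread


def chi2_sf_df1(h):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(h / 2))


rows = {
    "exp 1, versus humans": (18.19, 14.81),
    "exp 1, versus computer": (18.69, 14.31),
    "exp 2, versus humans": (16.13, 16.88),
    "exp 2, versus computer": (16.41, 16.59),
}
for label, ranks in rows.items():
    h = kruskal_h_from_mean_ranks(ranks, (16, 16))
    print(f"{label}: H = {h:.2f}, p = {chi2_sf_df1(h):.3f}")
```

With two groups, the test has one degree of freedom, which is why the tail probability reduces to a complementary error function and needs no statistics library.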


Fig. 4-5 Participant description visualizations Visualizations for typical level-k players show 1) performance in the calculation task, with the solid line representing the correct answer and subject responses shown as red diamonds for single-multiplier trials and blue diamonds for double-multiplier trials; 2) beauty contest responses, with the solid red line representing the theoretical level 1 and the dashed blue line, the theoretical level 2, with participant responses shown as red diamonds for computer trials and blue diamonds for human trials; 3) trial-by-trial quadratic distance from level 1 versus computer (red line), level 1 versus humans (dashed blue line) and level 2 versus humans (solid blue line); 4) trial-by-trial level-k precise and corrected versus humans.

A) A level-0 player’s results show proficiency in calculations, but irregularity in the beauty contest against both computer and humans. B) A level-1 player has responses against both computer and human along the theoretical L1 line (graph 2) and shows consistent level-1 choices across all trials. C) A level-2 player has versus-human choices closer to the theoretical level 2 dashed blue line and has level-k choices above level 1 across all trials.



After finding no significant difference in the experimental task due to stimulation conditions, we explored the data in the hope of finding guidance for a subsequent experiment. We first plotted calculation answers to ensure that a given participant did not have any mathematical limitations (Fig. 4-5[i]). We then compared participant answers in both human and computer conditions to theoretical level 1 (the target for responses in computer trials) and theoretical level 2 (Fig. 4-5[ii]). Next we plotted the time course of quadratic distances over the course of the task for each opponent type (Fig. 4-5[iii]). And finally, we plotted trial-by-trial level-k for human trials only (Fig. 4-5[iv]). We categorized players by their choices over the course of the task (Fig. 4-6) in an effort to see if their understanding of the task appeared to change (as illustrated in Fig. 4-7). We found no difference in categorization types between the types of stimulation in either experiment. Several players in each experiment, and in both stimulation protocols, appeared to change their levels of thinking over the course of the experiment: some from level 0 to level 1, and some from level 1 to level 2.

Fig. 4-6 Player categorizations by stimulation protocol. Panels: Experiment 1 (mPFC); Experiment 2 (dlPFC). Categories in each panel: Bad Calculators, Random, Random to L1, L1, L1 to L2, L2. Bars: sham, full stimulation. Count axis: 0 to 10.


Discussion

We used tDCS to test 64 volunteers, applying full or sham stimulation over first the mPFC and later the dlPFC while participants played a strategic thinking game against each other. Results showed a mild trend for higher levels of thinking among the sham group in the first experiment (Fig. 4-3[A]), which aimed to stimulate the mPFC with anodal tDCS. Meanwhile, results from the second experiment, in which the dlPFC was targeted, did not indicate even a trend in level-k thinking between sham and stimulation (Fig. 4-3[C]).

The lack of a significant outcome may indicate a problem with our experimental design. We may have targeted areas with less precision than necessary. That could be improved with a different arrangement of electrodes, as well as by consulting anatomical scans of participants from which to map electrode positioning.

Fig. 4-7 Evolving participant description visualizations. Some players’ choices suggested an understanding of the task that changed throughout the session. A) An example of a player who at first seems to play at level 0 and in later trials reduces the quadratic distance to level 1 in both computer and human trials. B) A player who begins by making choices around level 1 or level 0 and, by the late trials of the task, makes several level-2 choices in a row.

On the other hand, the problem may not lie with the design. It may be that even if we successfully stimulated the targeted areas, any change in neuronal activity has insufficient effect on task performance, or that any change is below measurable levels. It is also possible that the areas we targeted for manipulation, while involved in the process of iterative thinking as described in Coricelli and Nagel (2009), do not play a singularly sufficient role in that process. Despite results from previous studies, it is possible that the mPFC and dlPFC do not play causal roles in level-k thinking.

The tDCS technique itself is not fully proven and may not have an effect on brain activity that can modulate decision making at all. Though tDCS has been used with success in stimulating motor areas, its effectiveness in decision-making studies has been far less thoroughly demonstrated. The dlPFC has been a promising area in those successful studies, but the mPFC less so.

Problems could lie within the task as well. Though the Beauty Contest game has been successfully deployed by many groups across numerous studies, the presentation of the task proved particularly difficult in this setting. A key component is ensuring the task instructions are clear but without guiding participants to higher levels of thinking. Examples and practice trials have high potential to prompt the higher levels we are trying to measure as arising from stimulation and so were excluded.

The trends we did see in the data suggest that stimulation may have a surprising effect on level-k thinking, that is, attenuation (Fig. 4-3). Anodal stimulation of the mPFC, which typically induces cortical excitability, appears to have diminished the number of high-level thinkers. This could suggest greater specificity is needed in identifying and stimulating locations of level-k reasoning in the mPFC, possibly at the individual level.


Consistent with previous studies, we positioned electrodes to stimulate the right dlPFC and to diminish current under the reference electrode so as not to inhibit the left dlPFC. If that dispersal was insufficient, it would create an inhibitory effect in the left dlPFC. If the area’s role in iterative thinking is bilateral, then an effective stimulation would both excite and inhibit, possibly producing a net-zero change in behavior.

Without feedback, participants should not learn over the course of the task, but with repeated exposure to similar prompts with minimal variations, reasoning may change, resulting in players who at first act as level 0 or level 1 players realizing a more sophisticated strategy, a phenomenon that some have called Epiphany Learning (Chen and Krajbich 2017). Indeed we saw evidence of this phenomenon in nine participants whose behavior indicated a transition from level 0 to level 1 in the course of the task and eight with indications of moving from level 1 to level 2 (examples in Fig. 4-7).

In both studies, we found a significant effect of the multiplier level in the calculation task, but not of the stimulation protocol alone, nor of its interaction with multiplier level. This suggests that our groups comprised a spectrum of mathematical abilities, with some participants able to calculate well enough on M*1 trials but fewer calculating well on M*2 trials. In a group where most participants were not able to calculate either multiplier level well, there would be no effect of multiplier level on the absolute distance from the correct answer. Furthermore, because this was a control task ensuring that any effect detected in the Beauty Contest is not due to inhibition or encouragement of calculation abilities, equal performance between stimulation protocols is a prerequisite for further inferences from the Beauty Contest. An effect accompanying stimulation of the mPFC would be a surprise, since Coricelli and Nagel (2009) detected no activity in the mPFC associated with the control calculation task.

General Discussion

The moment-to-moment decisions human beings make throughout their waking lives come atop a mountain of prior experience encountered in a variety of domains and situations. Because we know that these occurrences are not independent and come together to form a continuous experience, it is reasonable to suspect that some of these choices and consequences must commingle and inform one another. Yet many implications and mechanisms of transfer are to this point unexplored. We poked at this problem, asking how differences in decision-making conditions might influence changes in later choices and learning. In four chapters, we considered similarities between decision-making regret and moral decision making, differences in learning with age, emotional priming in learning and electrical stimulation of iterative thinking.

Moral decision making

Understanding how humans make low-importance, quantifiable decisions may aid understanding of broader choices with larger impact like moral decisions. The brain processes observed to underlie certain types of economic decision making and moral decision making appear to overlap. Perhaps not surprisingly, similar injuries to and deficiencies in the areas implicated in these processes give rise to similar hindrances to those processes. Both people with high psychopathy indications and patients with lesions in the ventromedial prefrontal cortex (vmPFC) experience regret but do not apply it as fully to future decisions as healthy subjects. People in both of these groups also make more utilitarian moral decisions, rejecting the emotional attenuation seen in the choices of healthy subjects.

The confluence of these deviations from behavior seen in healthy populations suggests possibilities for the vmPFC’s particular role. It could be the site for learning that both provides error signals that inform regret and provides a foundation for moral decisions regarding others. It may be a location for conflict between utilitarian and emotional considerations; or more basically, a way station that simply delays decisions until the conflict can be worked out elsewhere. It may also be an integrator of emotion into complicated decisions across many domains, putting to use the experiences of the past in consideration of future consequences.

Aging, regret, risk and learning

We examined data from two decision-making tasks designed to compare choice behavior and learning between older and younger adult age groups. In the first task, participants chose between two lotteries with different probabilities of winning or losing uniform amounts. In some trials, only the outcome of the chosen lottery was revealed, while in others, both outcomes were shown. In a subsequent probabilistic learning task, participants selected between pairs of symbols, each of which had hidden probabilities of delivering rewards or punishments. For some symbol pairs, both outcomes were shown, while for other symbol pairs, only the outcome of the chosen symbol was shown. We characterized the influences on their choices in the lottery task using mixed regressions, and we analyzed learning behavior via computational modeling. Based on the findings of a previous study, we hypothesized that older adults would experience regret to the same extent as younger adults, but that they would anticipate and avoid it in subsequent choices to a lesser extent. We further hypothesized that the two age groups would learn similarly in partial-information feedback contexts but that learning rates would differ in complete-information counterfactual contexts.

We found in fact that in the lottery task, both groups were significantly emotionally affected by complete-feedback negative outcomes, the condition for regret. However, younger adults reported significantly more negative reactions to these outcomes. Yet when it came to anticipating or avoiding regret, both groups incorporated it into their choices, and to a similar degree. The attenuated reaction of older adults to negative outcomes is consistent with a positivity effect that accompanies aging. Older adults direct selective attention away from the negative early in the processing of experience. Even later in appraisal, younger adults tend to dwell on negative experiences (Charles and Carstensen 2010, Carstensen and DeLiema 2018). Older adults, while not suffering the negative emotional consequences to the same extent, still applied the experience to future choices at about the same level. This suggests a benefit of age: avoidance of potentially negative outcomes but at lower emotional cost. In the learning task, younger adults had better outcomes in terms of earnings, but this appeared to be due to overall performance, rather than sustained ability in counterfactual learning. Both age groups earned more in complete-feedback trials than in partial-feedback trials, and younger adults earned more in complete-feedback trials than older adults did. The difference between trial types, however, was not significantly different between age groups. This indicates that both groups are more successful when incorporating counterfactual learning and that younger adults simply outperformed older adults generally in the learning task. This fails to support the findings of Tobia and colleagues (2016), who found that older adults were more responsive to counterfactual gains but that this actually hindered subsequent choices. Further analyses should explore the differences in gains between positive and negative counterfactual outcomes between age groups. This experiment would also benefit from extension to neuroimaging to compare to the fMRI results of the Tobia study.

Although our hypotheses did not directly address risk preference, the lottery task incorporated possibilities of both gain and loss, so we considered risk as a regression factor in our analysis. We found that it did not play a significant role in the choices of younger adults but that it did in those of older adults, who avoided it, particularly in complete-feedback trials. Their risk aversion and the risk tolerance of younger adults are partially consistent with Tymula and colleagues’ general assessments of risk preference variations across adulthood (2013). Our results show lower risk tolerance among older adults after encountering regret situations but risk preference on par with younger adults in partial-feedback contexts. Our results support a similar distinction in risk preference observed between younger adults and older adults with multiple sclerosis, in which younger adults were risk neutral in a wheel of fortune lottery, while older adult patients were risk averse (Simioni, Schluep et al. 2012). Another study that addressed both regret and risk showed that healthy older adults did not change risk tolerance after experiencing regret, while younger adults and depressed older adults did (Brassen, Gamer et al. 2012). The regret-eliciting task in this case was a “hot” devil game, like the balloon analogue risk task, in which risk computation is not explicit, and learning to tolerate more risk eventually leads to higher rewards. Our results bolster support for the notion that regret leads to risk-seeking behavior in younger adults but that older adults are resistant to this and may even become more risk-averse after experiencing regret. A future study with this as its hypothesis could address this possibility more specifically.

In tasks where risk is explicitly stated, decision-from-description paradigms such as the wheel of fortune lotteries, older and younger adults typically perform similarly (Mata, Josef et al. 2011), consistent with our results in partial-feedback trials, but in contrast to the increased risk aversion we saw in complete-feedback contexts. Yet older adults reported feeling less badly about regret outcomes. This raises the possibility of a relationship between reduced experience of regret and increased risk aversion, even with stability of regret anticipation. A future study could explore this potential relationship specifically. Other studies should examine risk and regret in variations of paired tasks with attention to the conditions that give rise to regret and to the particular types of risk each task employs.

Regret induction

We had participants play a strategic competitive investment game with asymmetric roles, encouraging variations in strategy to reveal patterns of learning behavior. In previous studies, this game has been used to characterize individual learning, specifically when behavior is compared to a hybrid Experience-Weighted Attraction (EWA) model that nests both reinforcement learning (RL) and belief-based learning (BBL). BBL requires an understanding of the structure of the game as well as anticipation of the strategy of the other player. Just prior to playing the game, participants played a wheel of fortune lottery designed to induce regret, relief, disappointment or satisfaction. In repeated games, regret has been shown to influence learning (Camille, Coricelli et al. 2004). Our hypothesis was that those exposed to complete-feedback counterfactual emotions would be primed for thinking about alternative situations rather than only the choice they had made. We suspected that they would engage to greater extents and at greater rates in the more sophisticated BBL than in simpler RL. We planned to measure these outcomes with a parameter in the hybrid EWA model that indicates the balance of BBL and RL.
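
The role of such a balance parameter can be illustrated with a minimal sketch of a single EWA update in the standard formulation of Camerer and Ho. This is not the parameterization fitted in this thesis: the function names, the parameter names `phi`, `rho`, `delta` and `lam`, and the 2x2 payoff matrix are all illustrative assumptions. With `delta = 0` only the chosen strategy is reinforced by its realized payoff (RL-like); with `delta = 1` foregone payoffs count fully, as in belief-based updating.

```python
import math


def ewa_update(attractions, n_prev, chosen, opp_action, payoff, phi, rho, delta):
    """Apply one standard EWA update and return (new_attractions, new_n).

    attractions : list of A_j(t-1), one value per own strategy
    n_prev      : experience weight N(t-1)
    chosen      : index of the strategy actually played
    opp_action  : index of the opponent's observed action
    payoff      : payoff[j][k] = own payoff for strategy j against action k
                  (hypothetical matrix, for illustration only)
    phi, rho    : decay of past attractions and of past experience
    delta       : weight on foregone payoffs (0 = RL-like, 1 = belief-like)
    """
    n_new = rho * n_prev + 1.0
    updated = []
    for j, a_prev in enumerate(attractions):
        # The chosen strategy gets full weight; unchosen ones get weight delta.
        weight = 1.0 if j == chosen else delta
        updated.append((phi * n_prev * a_prev + weight * payoff[j][opp_action]) / n_new)
    return updated, n_new


def choice_probabilities(attractions, lam):
    """Logit (softmax) choice rule with sensitivity lam."""
    top = max(attractions)  # subtract the maximum for numerical stability
    exps = [math.exp(lam * (a - top)) for a in attractions]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical payoffs: rows are own strategies, columns are opponent actions.
example_payoff = [[3.0, 1.0], [0.0, 2.0]]
for d in (0.0, 0.5, 1.0):
    new_a, _ = ewa_update([0.0, 0.0], 1.0, chosen=0, opp_action=1,
                          payoff=example_payoff, phi=1.0, rho=0.0, delta=d)
    print(d, new_a, choice_probabilities(new_a, lam=1.0))
```

In this sketch, comparing fitted values of `delta` across groups would be the analogue of the group comparison described above.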

A requirement of computational modeling is demonstrating that the model used is the best of those available. Because we planned to use a model that incorporated both RL and BBL, it was necessary to consider those comparatively simpler models on their own. Our model comparison showed, to our surprise, that the BBL model outperformed RL, EWA and a reinforcement learning model that also incorporates counterfactual outcomes. Our BBL model features a learning rate parameter, but this is distinct from the weighting parameter of the EWA model that indicates relative utilization of RL and BBL. Gauging this weight among different groups was central to our hypothesis. Because we had to reject the use of the EWA model, we could not test our hypothesis. Previous studies that modeled patent race game play have consistently found EWA to be the best-fitting model. A possible reason for the unexpected outcome of our model estimations is the opponent algorithm we programmed. The algorithm was based on fictitious play, a form of belief-based learning. In past studies, players have played against other humans or against pooled responses by humans, in which the opponent’s choice was selected from a number of human choices on that trial number (Zhu, Mathewson et al. 2012). Though we believed participants would behave and learn similarly against an algorithm set to play with similar choices to a human, it may have inadvertently prompted mirroring behavior from our participants. It is unlikely that the priming task had such a broad effect, since we detected no pattern across priming types, nor among control participants who did not undergo priming.
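
For readers unfamiliar with fictitious play, the following is a minimal, generic sketch of the idea: the player tracks how often the opponent has chosen each action, treats those frequencies as beliefs, and best-responds to them. It is not the opponent algorithm used in this study, whose details are not reproduced here; the function name, the pseudo-count initialization and the payoff matrix are hypothetical.

```python
def fictitious_play_choice(opponent_counts, payoff):
    """Best-respond to the empirical distribution of the opponent's past actions.

    opponent_counts : opponent_counts[k] = times the opponent played action k
                      (start from positive pseudo-counts to avoid dividing by zero)
    payoff          : payoff[j][k] = own payoff for action j against action k
                      (hypothetical matrix, for illustration only)
    Returns (best_action, beliefs, expected_payoffs).
    """
    total = sum(opponent_counts)
    beliefs = [count / total for count in opponent_counts]
    # Expected payoff of each own action under the current beliefs.
    expected = [sum(b * row[k] for k, b in enumerate(beliefs)) for row in payoff]
    best_action = max(range(len(payoff)), key=lambda j: expected[j])
    return best_action, beliefs, expected


# Usage: update the counts after each round, then choose again.
counts = [1, 1]  # uniform pseudo-counts before any observation
example_payoff = [[3.0, 0.0], [1.0, 2.0]]
for observed in (1, 1, 0, 1):
    counts[observed] += 1
    action, beliefs, expected = fictitious_play_choice(counts, example_payoff)
    print(observed, beliefs, expected, action)
```

Because such an opponent is itself belief-based, a participant who mirrors it would also look belief-based to a model comparison, which is the concern raised above.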

In future studies, we should consider using a different type of opponent, be it a variety of computerized opponents with varied models and parameters, actual human opponents or the pooled opponent play employed by Zhu and colleagues. We at first rejected using the pooled play because we believed the play would seem disjointed, and we wanted a realistic opponent. We thought that even if participants knew their opponent was an algorithm, they would still use the same learning behaviors to try to win.

Electrical stimulation of sophisticated thinking

In competitive strategic conditions, consideration of others and an ability to gauge their mental state are an advantage. Previous studies show that individuals consider the actions of others to a range of extents: from not at all to assessment of others as also highly considerate. We used a beauty contest calculation competition in which each participant guessed a target number that would be influenced by their own choice, as well as the choices of others. The average of all the choices multiplied by a fraction determined the target number, and so for fractions less than 1, the equilibrium goal was 0. The degree of sophistication of thinking, then, was measurably greater as a participant’s choice approached 0. In previous imaging studies, these higher levels of thinking were accompanied by increased activity in the medial prefrontal cortex (mPFC) and in both right and left dorsolateral prefrontal cortex (dlPFC) (Coricelli and Nagel 2009). A well-established limitation of imaging studies is their inability to establish causation, and in the case of fMRI, even sequential order. One aim of stimulation studies is to fill that gap by interrupting or encouraging processes by targeting brain areas during activities whose results are well-studied. If observable behavior is changed, it indicates that the targeted area plays a particular role in the process.
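
The arithmetic of the beauty contest can be sketched as follows. The sketch assumes the common convention that choices lie between 0 and 100 and that a non-strategic (level-0) player picks the midpoint, 50; these anchors, and the function names, are illustrative assumptions rather than the scoring used in this study. It also ignores a player's own influence on the average, which is reasonable only in larger groups.

```python
import math


def target_number(choices, fraction):
    """Target = fraction times the average of all choices."""
    return fraction * sum(choices) / len(choices)


def level_k_choice(k, fraction, level0=50.0):
    """Choice of a level-k player who best-responds to level-(k-1) players.

    Each extra step of reasoning multiplies the anticipated average by
    `fraction`, so for 0 < fraction < 1 the choices shrink toward 0.
    """
    return level0 * fraction ** k


def implied_level(choice, fraction, level0=50.0):
    """Level of reasoning implied by a choice (requires choice > 0, 0 < fraction < 1)."""
    return math.log(choice / level0) / math.log(fraction)


# With fraction 2/3, successive levels approach the equilibrium of 0.
for k in range(5):
    print(k, round(level_k_choice(k, 2 / 3), 2))
```

Under these assumptions, a choice closer to 0 maps to a higher implied level, which is the sense in which sophistication is measurable in this game.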

We hypothesized that the “neural signatures” of higher-level strategic thinking in mPFC and dlPFC indicated responsibilities of these areas for more sophisticated reasoning. To test the hypothesis, we targeted those areas using transcranial direct-current stimulation (tDCS). If the group of subjects receiving stimulation had higher levels of thinking or higher levels at higher rates, it would indicate a causative role for the targeted area. Our stimulations, however, did not produce different behavioral results between sham and stimulation. This outcome may be an indication that the targeted areas of dlPFC and mPFC do not play causative roles in higher-level reasoning. The higher activity observed in these areas in fMRI studies during higher-level reasoning simply may not drive the process that gives rise to the behavior.

It would be an overreach to say this is conclusive, however, especially given the unproven nature of tDCS as a technique. Though studies targeting motor actions and visual perception have seen successful manipulation via tDCS, the record in strategic decision making is shorter. A mild trend in the stimulation of mPFC was in the opposite direction of our hypothesized result: lower levels of thinking in the group that received full stimulation. This suggests less-sophisticated thinking as a result of excitatory activity in the mPFC. If further study supported this trend, it could, along with the established imaging results, indicate a regulatory or mediating role for mPFC, rather than a generative one. However, some evidence from previous studies indicates that anodal tDCS applied for more than 26 minutes at some point ceases to have excitatory effects and reverses to inhibitory (Thair, Holloway et al. 2017). This is a less likely explanation because we examined trial-by-trial level-k activity at the individual level and did not see indications of a reversal from higher- to lower-level activity. If anything, the level of thinking appeared to increase in some participants over the course of the experiment. Other studies have demonstrated inhibitory effects of cathodal stimulation but no behavioral change due to anodal stimulation, suggesting that targeted brain areas may be active at an optimal level and unable to be further excited (Antal, Nitsche et al. 2001).

Reviews of tDCS studies have found variability depending on age and sex, as well as mental states of alertness, sleep debt, time of day, and even recent caffeine consumption (Krause and Cohen Kadosh 2014). Future studies could control for more of these factors, conducting all experimental sessions at the same time of day, requesting participants abstain from caffeine for a period before the session and asking for information about recent sleep habits.

These multiple attempts to modulate learning and decision-making processes largely did not produce measurable effects. Though these were often failures to reject null hypotheses, these inquiries together suggest the resilience of a set of learning and thinking processes that can withstand perturbations in the lab.

References

Addis, D. R., A. T. Wong and D. L. Schacter (2008). "Age-related changes in the episodic simulation of future events." Psychological science 19(1): 33-41.

Amodio, D. M. and C. D. Frith (2006). "Meeting of minds: the medial frontal cortex and social cognition." Nature Reviews Neuroscience 7(4).

Anelli, F., E. Ciaramelli, S. Arzy and F. Frassinetti (2016). "Age-Related Effects on Future Mental Time Travel." Neural Plasticity: 1-8.

Ansari, A., R. Montoya and O. Netzer (2012). "Dynamic learning in behavioral games: A hidden Markov mixture of experts approach." Quantitative Marketing and Economics 10(4).

Antal, A., M. A. Nitsche and W. Paulus (2001). "External modulation of visual perception in humans." Neuroreport 12(16): 3553-3555.

Baskin-Sommers, A., A. M. Stuppy-Sullivan and J. W. Buckholtz (2016). "Psychopathic individuals exhibit but do not avoid regret during counterfactual decision making." Proceedings of the National Academy of Sciences 113(50): 14438-14443.

Baskin-Sommers, A. R., C. S. Neumann, L. M. Cope and K. A. Kiehl (2016). "Latent-variable modeling of brain gray-matter volume and psychopathy in incarcerated offenders." Journal of abnormal psychology 125(6): 811.

Bault, N., G. Coricelli and A. Rustichini (2008). "Interdependent Utilities: How Social Ranking Affects Choice Behavior." PLoS ONE 3(10).

Bault, N., M. Joffily, A. Rustichini and G. Coricelli (2011). "Medial prefrontal cortex and striatum mediate the influence of social comparison on the decision process." Proceedings of the National Academy of Sciences 108(38): 16044-16049.

Bault, N., S. Palminteri, V. Aglieri and G. Coricelli (2018). Reduced value contextualization impairs punishment avoidance learning during aging, Center for Mind/Brain Sciences (Cimec), University of Trento, Trento, Italy; Laboratoire de Neurosciences Cognitives, INSERM U960, Ecole Normale Supérieure, Paris, France; Institut de Neurosciences de la Timone, UMR 7289, CNRS and Université Aix-Marseille, France; Department of Economics, University of Southern California, USA.

Bayer, H. M. and P. W. Glimcher (2005). "Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal." Neuron: 129-141.

Beam, W., J. J. Borckardt, S. T. Reeves and M. S. George (2009). "An efficient and accurate new method for locating the F3 position for prefrontal TMS applications." Brain Stimulation 2(1): 50-54.

Bechara, A., A. R. Damasio, H. Damasio and S. W. Anderson (1994). "Insensitivity to future consequences following damage to human prefrontal cortex." Cognition 50(1-3): 7-15.

Bechara, A., H. Damasio, D. Tranel and A. R. Damasio (1997). "Deciding Advantageously Before Knowing the Advantageous Strategy." Science 275(5304): 1293-1295.

Bechara, A., D. Tranel and H. Damasio (2000). "Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions." Brain 123(11): 2189-2202.

Bechara, A., D. Tranel, H. Damasio and A. R. Damasio (1996). "Failure to Respond Autonomically to Anticipated Future Outcomes Following Damage to Prefrontal Cortex." Cerebral Cortex 6(2): 215-225.

Beer, J. S., E. A. Heerey, D. Keltner, D. Scabini and R. T. Knight (2003). "The regulatory function of self-conscious emotion: Insights from patients with orbitofrontal damage." Journal of Personality and Social Psychology 85(4): 594.

Behrens, T. E. J., L. T. Hunt, M. W. Woolrich and M. F. S. Rushworth (2008). "Associative learning of social value." Nature 456(7219): 245-249.

Bell, D. E. (1982). "Regret in Decision Making under Uncertainty." Operations Research 30(5): 961-981.

Blair, R. J. (1995). "A cognitive developmental approach to morality: investigating the psychopath." Cognition 57(1): 1-29.

Blair, R. J. R. (2007). "The amygdala and ventromedial prefrontal cortex in morality and psychopathy." Trends in Cognitive Sciences 11(9): 387-392.

Boggio, P., C. Campanhã, C. A. Valasek, S. Fecteau, A. Pascual-Leone and F. Fregni (2010). "Modulation of decision-making in a gambling task in older adults with transcranial direct current stimulation." European Journal of Neuroscience 31(3): 593-597.

Bopp, K. L. and P. Verhaeghen (2005). "Aging and verbal memory span: a meta-analysis." The journals of gerontology. Series B, Psychological sciences and social sciences 60(5): 33.

Brand, M. and H. J. Markowitsch (2010). "Aging and decision-making: a neurocognitive perspective." Gerontology 56(3): 319-324.

Brassen, S., M. Gamer, J. Peters, S. Gluth and C. Büchel (2012). "Don't Look Back in Anger! Responsiveness to Missed Chances in Successful and Nonsuccessful Aging." Science 336(6081): 612-614.

Byrne, R. (2002). "Mental models and counterfactual thoughts about what might have been." Trends in Cognitive Sciences 6(10): 426-431.

Camerer, C. and T.-H. Ho (1998). "Experience-Weighted Attraction Learning in Coordination Games: Probability Rules, Heterogeneity, and Time-Variation." Journal of Mathematical Psychology 42(2-3): 305-326.

Camerer, C. F. (2003). "Behavioural studies of strategic thinking in games." Behavioural studies of strategic thinking in games.

Camerer, C. F., H. T. Ho and K. J. Chong (2004). "A Cognitive Hierarchy Model of Games." The Quarterly Journal of Economics 119(3): 861-898.

Camille, N., G. Coricelli, J. Sallet, P. Pradat-Diehl, J.-R. Duhamel and A. Sirigu (2004). "The involvement of the orbitofrontal cortex in the experience of regret." Science (New York, N.Y.) 304(ii): 1167-1170.

Camille, N., C. A. Griffiths, K. Vo, L. K. Fellows and J. W. Kable (2011). "Ventromedial frontal lobe damage disrupts value maximization in humans." The Journal of neuroscience: the official journal of the Society for Neuroscience 31(20): 7527-7532.

Carstensen, L. L. and M. DeLiema (2018). "The positivity effect: a negativity bias in youth fades with age." Current Opinion in Behavioral Sciences 19: 7-12.

Carstensen, L. L., B. Turan, S. Scheibe, N. Ram, H. Ersner-Hershfield, G. R. Samanez-Larkin, K. P. Brooks and J. R. Nesselroade (2011). "Emotional experience improves with age: evidence based on over 10 years of experience sampling." Psychology and aging 26(1): 21-33.

Charles, S. T. and L. L. Carstensen (2010). "Social and Emotional Aging." Annual Review of Psychology 61(1): 383-409.

Chen, W. and I. Krajbich (2017). "Computational modeling of epiphany learning." Proceedings of the National Academy of Sciences 114(18): 4637-4642.

Chen, Y. and Y. Sun (2003). "Age Differences in Financial Decision-Making: Using Simple Heuristics." Educational Gerontology 29(7): 627-635.

Ciaramelli, E. and G. d. Pellegrino (2011). "Ventromedial Prefrontal Cortex and the Future of Morality." Emotion Review (3): 308-309.


Cima, M., F. Tonnaer and M. D. Hauser (2010). "Psychopaths know right from wrong but don’t care." Social Cognitive and Affective Neuroscience 5(1): 59-67.

Civai, C., C. Miniussi and R. I. Rumiati (2015). "Medial prefrontal cortex reacts to unfairness if this damages the self: a tDCS study." Social Cognitive and Affective Neuroscience 10(8): 1054-1060.

Clément, J. P., R. F. Nassif, J. M. Léger and F. Marchan (1997). "[Development and contribution to the validation of a brief French version of the Yesavage Geriatric Depression Scale]." L'Encéphale: 91-99.

Connolly, T., L. D. Ordóñez and R. Coughlan (1997). "Regret and Responsibility in the Evaluation of Decision Outcomes." Organizational Behavior and Human Decision Processes 70(1): 73-85.

Coricelli, G., H. D. Critchley, M. Joffily, J. P. O'Doherty, A. Sirigu and R. J. Dolan (2005). "Regret and its avoidance: a neuroimaging study of choice behavior." Nature Neuroscience 8(9): 1255-1262.

Coricelli, G., R. J. Dolan and A. Sirigu (2007). "Brain, emotion and decision making: the paradigmatic example of regret." Trends in Cognitive Sciences 11(6): 258-265.

Coricelli, G. and R. Nagel (2009). "Neural correlates of depth of strategic reasoning in medial prefrontal cortex." Proceedings of the National Academy of Sciences 106(23): 9163-9168.

Coricelli, G. and E. Rusconi (2010). Probing the decisional brain with rTMS and tDCS. A Handbook of Process Tracing Methods for Decision Research: A Critical Review and User’s Guide. M. Schulte-Mecklenbeck, A. Kuehberger and R. Ranyard. Hove, UK, Psychology Press: 205-222.

Coricelli, G. and A. Rustichini (2010). "Counterfactual Thinking and Emotions: Regret and Envy Learning." Philosophical Transactions of the Royal Society B: Biological Sciences 365(1538).

Damasio, A. R., D. Tranel and H. Damasio (1990). "Individuals with sociopathic behavior caused by frontal damage fail to respond autonomically to social stimuli." Behavioural Brain Research 41(2): 81-94.

Dayan, P. (1994). "Computational modelling." Current Opinion in Neurobiology 4(2): 212-217.

Denburg, N. L., C. A. Cole, M. Hernandez, T. H. Yamada, D. Tranel, A. Bechara and R. B. Wallace (2007). "The Orbitofrontal Cortex, Real-World Decision Making, and Normal Aging." Annals of the New York Academy of Sciences 1121(1): 480-498.

Derouesne, C., J. Poitreneau, L. Hugonot, M. Kalafat, B. Dubois and B. Laurent (1999). "[Mini-Mental State Examination: a useful method for the evaluation of the cognitive status of patients by the clinician. Consensual French version]." Presse médicale (Paris, France: 1983): 1141-1148.


Devetag, G. and M. Warglien (2003). "Games and phone numbers: Do short-term memory bounds affect strategic behavior?" Journal of Economic Psychology 24(2): 189-202.

Eldar, E. and Y. Niv (2015). "Interaction between emotional state and learning underlies mood instability." Nature Communications 6: 6149.

Erixon-Lindroth, N., L. Farde, T.-B. Wahlin, J. Sovago, C. Halldin and L. Bäckman (2005). "The role of the striatal dopamine transporter in cognitive aging." Psychiatry Research: Neuroimaging 138(1).

Fecteau, S., D. Knoch, F. Fregni, N. Sultani, P. Boggio and A. Pascual-Leone (2007). "Diminishing Risk-Taking Behavior by Modulating Activity in the Prefrontal Cortex: A Direct Current Stimulation Study." The Journal of Neuroscience 27(46): 12500-12505.

Fecteau, S., A. Pascual-Leone, D. H. Zald, P. Liguori, H. Théoret, P. S. Boggio and F. Fregni (2007). "Activation of Prefrontal Cortex by Transcranial Direct Current Stimulation Reduces Appetite for Risk during Ambiguous Decision Making." The Journal of Neuroscience 27(23): 6212-6218.

Foster, D. P. and R. Vohra (1999). "Regret in the On-Line Decision Problem." Games and Economic Behavior 29(1-2): 7-35.

Foster, D. P. and H. P. Young (2003). "Learning, hypothesis testing, and Nash equilibrium." Games and Economic Behavior 45(1): 73-96.

Frijda, N. H., P. Kuipers and E. t. Schure (1989). "Relations among emotion, appraisal, and emotional action readiness." Journal of Personality and Social Psychology 57(2): 212-228.

Frith, C. D. and U. Frith (1999). "Interacting minds–a biological basis." Science 286(5445).

Fudenberg, D. and D. Levine (1999). "Learning in games." European Economic Review 42(3-5): 631-639.

Fudenberg, D. and D. K. Levine (1995). "Consistency and cautious fictitious play." Journal of Economic Dynamics and Control 19(5-7): 1065-1089.

Fudenberg, D. and D. K. Levine (1998). "The theory of learning in games." MIT press 2.

Gallagher, H. L. and C. D. Frith (2003). "Functional imaging of theory of mind." Trends in cognitive sciences 7(2): 77-83.

Gandiga, P. C., F. C. Hummel and L. G. Cohen (2006). "Transcranial DC stimulation (tDCS): A tool for double-blind sham-controlled clinical studies in brain stimulation." Clinical Neurophysiology 117(4): 845-850.


Geuzaine, C. and J. F. Remacle (2009). "Gmsh: A 3-D finite element mesh generator with built-in pre- and post-processing facilities." International Journal for Numerical Methods in Engineering 79(11): 1309-1331.

Gillan, C. M., S. Morein-Zamir, M. Kaser, N. A. Fineberg, A. Sule, B. J. Sahakian, R. N. Cardinal and T. W. Robbins (2014). "Counterfactual Processing of Economic Action-Outcome Alternatives in Obsessive-Compulsive Disorder: Further Evidence of Impaired Goal-Directed Behavior." Biological Psychiatry 75(8): 639-646.

Greene, J. D. (2007). "Why are VMPFC patients more utilitarian? A dual-process theory of moral judgment explains." Trends in Cognitive Sciences 11(8): 322-323.

Greene, J. D., L. E. Nystrom, A. D. Engell, J. M. Darley and J. D. Cohen (2004). "The Neural Bases of Cognitive Conflict and Control in Moral Judgment." Neuron 44(2): 389-400.

Greene, J. D., R. B. Sommerville, L. E. Nystrom, J. M. Darley and J. D. Cohen (2001). "An fMRI investigation of emotional engagement in moral judgment." Science (New York, N.Y.) 293(5537): 2105-2108.

Gu, X., X. Wang, A. Hula, S. Wang, S. Xu, T. M. Lohrenz, R. T. Knight, Z. Gao, P. Dayan and P. R. Montague (2015). "Necessary, Yet Dissociable Contributions of the Insular and Ventromedial Prefrontal Cortices to Norm Adaptation: Computational and Lesion Evidence in Humans." The Journal of Neuroscience 35(2): 467-473.

Guerini, R., L. FitzGibbon and G. Coricelli (in press). "The role of agency in regret and relief in 3- to 10-year-old children."

Hämmerer, D., J. Bonaiuto, M. Klein-Flügge, M. Bikson and S. Bestmann (2016). "Selective alteration of human value decisions with medial frontal tDCS is predicted by changes in attractor dynamics." Scientific Reports 6(1).

Hampton, A. N., P. Bossaerts and J. P. O'Doherty (2008). "Neural correlates of mentalizing-related computations during strategic interactions in humans." Proceedings of the National Academy of Sciences of the United States of America 105(18).

Hampton, A. N., P. Bossaerts and J. P. O'Doherty (2006). "The Role of the Ventromedial Prefrontal Cortex in Abstract State-Based Inference during Decision Making in Humans." The Journal of Neuroscience 26(32): 8360-8367.

Hart, S. (2005). "Adaptive Heuristics." Econometrica 73(5): 1401-1430.

Hart, S. and A. Mas-Colell (2000). "A Simple Adaptive Procedure Leading to Correlated Equilibrium." Econometrica: 1127-1150.

Hayden, B. Y., J. M. Pearson and M. L. Platt (2009). "Fictive Reward Signals in the Anterior Cingulate Cortex." Science 324(5929): 948-950.

Hecht, D., V. Walsh and M. Lavidor (2010). "Transcranial Direct Current Stimulation Facilitates Decision Making in a Probabilistic Guessing Task." The Journal of Neuroscience 30(12): 4241-4245.

Hershey, D. A. and J. A. Wilson (1997). "Age Differences in Performance Awareness on a Complex Financial Decision-Making Task." Experimental Aging Research 23(3): 257-273.

Hosseini, S. M., M. Rostami, Y. Yomogida, M. Takahashi, T. Tsukiura and R. Kawashima (2010). "Aging and decision making under uncertainty: behavioral and neural evidence for the preservation of decision making in the absence of learning in old age." NeuroImage 52(4): 1514-1520.

Hsu, M. and L. Zhu (2012). "Learning in games: neural computations underlying strategic learning." Recherches économiques de Louvain 78(3): 47-72.

Hughes, M. A., M. C. Dolan and J. C. Stout (2013). "Regret in the context of unobtained rewards in criminal offenders." Cognition and Emotion 28(5): 913-925.

Kahneman, D. and A. Tversky (1979). "Prospect Theory: An Analysis of Decision under Risk." Econometrica 47(2).

Kim, H., S. Shimojo and J. P. O'Doherty (2006). "Is avoiding an aversive outcome rewarding? Neural substrates of avoidance learning in the human brain." PLoS biology 4(8).

Kishida, K. T., I. Saez, T. Lohrenz, M. R. Witcher, A. W. Laxton, S. B. Tatter, J. P. White, T. L. Ellis, P. E. M. Phillips and R. P. Montague (2016). "Subsecond dopamine fluctuations in human striatum encode superposed error signals about actual and counterfactual reward." Proceedings of the National Academy of Sciences 113(1).

Koenigs, M., M. Kruepke and J. P. Newman (2010). "Economic decision-making in psychopathy: A comparison with ventromedial prefrontal lesion patients." Neuropsychologia 48(7): 2198-2204.

Koenigs, M., M. Kruepke, J. Zeier and J. P. Newman (2012). "Utilitarian moral judgment in psychopathy." Social cognitive and affective neuroscience 7(6): 708-714.

Koenigs, M., L. Young, R. Adolphs, D. Tranel, F. Cushman, M. Hauser and A. Damasio (2007). "Damage to the prefrontal cortex increases utilitarian moral judgements." Nature 446(7138): 908-911.

Kohn, N., I. Falkenberg, T. Kellermann, S. B. Eickhoff, R. C. Gur and U. Habel (2014). "Neural correlates of effective and ineffective mood induction." Social Cognitive and Affective Neuroscience 9(6): 864-872.

Krause, B. and R. Cohen Kadosh (2014). "Not all brains are created equal: the relevance of individual differences in responsiveness to transcranial electrical stimulation." Frontiers in systems neuroscience 8: 25.

Levens, S. M., J. T. Larsen, J. Bruss, D. Tranel, A. Bechara and B. A. Mellers (2015). "What might have been? The role of the ventromedial prefrontal cortex and lateral orbitofrontal cortex in counterfactual emotions and choice." Neuropsychologia 54.

Löckenhoff, C. E. and L. L. Carstensen (2007). "Aging, emotion, and health-related decision strategies: Motivational manipulations can reduce age differences." Psychology and Aging 22(1): 134.

Lohrenz, T., K. McCabe, C. F. Camerer and P. R. Montague (2007). "Neural signature of fictive learning signals in a sequential investment task." Proceedings of the National Academy of Sciences of the United States of America 104(22): 9493-9498.

Loomes, G. and R. Sugden (1982). "Regret Theory: An Alternative Theory of Rational Choice Under Uncertainty." The Economic Journal 92(368): 805.

Manstead, A. S. R. (2000). The role of moral norm in the attitude–behavior relation. Applied social research. Attitudes, behavior, and social context: The role of norms and group membership. D. J. Terry and M. A. Hogg. Mahwah, NJ, Lawrence Erlbaum Associates: 11-30.

Marchiori, D. and M. Warglien (2008). "Predicting Human Interactive Learning by Regret-Driven Neural Networks." Science 319(5866): 1111-1113.

Markman, K. D., I. Gavanski, S. J. Sherman and M. N. McMullen (1993). "The Mental Simulation of Better and Worse Possible Worlds." Journal of Experimental Social Psychology 29(1): 87-109.

Mata, R., A. K. Josef, G. R. Samanez-Larkin and R. Hertwig (2011). "Age differences in risky choice: a meta-analysis." Annals of the New York Academy of Sciences 1235(1): 18-29.

Mather, M. and L. L. Carstensen (2002). "Aging and Attentional Biases for Emotional Faces." Psychological Science 14(5): 409-415.

Megiddo, N. (1980). "On repeated games with incomplete information played by non-Bayesian players." International Journal of Game Theory 9(3): 157-167.

Mellers, B., A. Schwartz and I. Ritov (1999). "Emotion-based choice." Journal of Experimental Psychology: General 128(3).

Minati, L., C. Campanhã, H. D. Critchley and P. Boggio (2012). "Effects of transcranial direct-current stimulation (tDCS) of the dorsolateral prefrontal cortex (DLPFC) during a mixed-gambling risky decision-making task." Cognitive Neuroscience 3(2): 80-88.

Mitchell, J. P., C. N. Macrae and M. R. Banaji (2006). "Dissociable medial prefrontal contributions to judgments of similar and dissimilar others." Neuron 50(4): 655-663.

Moll, J., R. d. Oliveira-Souza, I. E. Bramati and J. Grafman (2002). "Functional Networks in Emotional Moral and Nonmoral Social Judgments." NeuroImage 16(3): 696-703.

Montague, P. R., B. King-Casas and J. D. Cohen (2006). "Imaging Valuation Models in Human Choice." Annual Review of Neuroscience 29: 417-448.

Newton, J. D., F. J. Newton, M. T. Ewing, S. Burney and M. Hay (2013). "Conceptual overlap between moral norms and anticipated regret in the prediction of intention: Implications for theory of planned behaviour research." Psychology & Health 28(5): 495-513.

Nicolle, A., D. R. Bach, C. Frith and R. J. Dolan (2011). "Amygdala involvement in self-blame regret." Social Neuroscience 6(2): 178-189.

Nitsche, M. A., S. Doemkes, T. Karaköse, A. Antal, D. Liebetanz, N. Lang, F. Tergau and W. Paulus (2007). "Shaping the Effects of Transcranial Direct Current Stimulation of the Human Motor Cortex." Journal of Neurophysiology 97(4): 3109-3117.

Nitsche, M. A., K. Fricke, U. Henschke, A. Schlitterlau, D. Liebetanz, N. Lang, S. Henning, F. Tergau and W. Paulus (2003). "Pharmacological Modulation of Cortical Excitability Shifts Induced by Transcranial Direct Current Stimulation in Humans." The Journal of Physiology 553(1): 293-301.

O'Doherty, J. P. (2004). "Reward representations and reward-related learning in the human brain: insights from neuroimaging." Current opinion in neurobiology 14(6): 769-776.

Palminteri, S., M. Khamassi, M. Joffily and G. Coricelli (2015). "Contextual modulation of value signals in reward and punishment learning." Nature Communications 6: 8096.

Park, D. C., G. Lautenschlager, T. Hedden, N. S. Davidson, A. D. Smith and P. K. Smith (2002). "Models of visuospatial and verbal memory across the adult life span." Psychology and Aging 17(2): 299.

Parker, D., A. S. R. Manstead and S. G. Stradling (1995). "Extending the theory of planned behaviour: The role of personal norm." British Journal of Social Psychology 34(2): 127-138.

Platt, M. L. and J. M. Pearson (2016). "Dopamine: Context and counterfactuals." Proceedings of the


Pripfl, J., R. Neumann, U. Köhler and C. Lamm (2013). "Effects of transcranial direct current stimulation on risky decision making are mediated by ‘hot’ and ‘cold’ decisions, personality, and hemisphere." European Journal of Neuroscience 38(12): 3778-3785.

Rapoport, A. and W. Amaldoss (2000). "Mixed strategies and iterative elimination of strongly dominated strategies: An experimental investigation of states of knowledge." Journal of Economic Behavior & Organization 42(4): 483-521.

Reed, A. E., L. Chan and J. A. Mikels (2014). "Meta-analysis of the age-related positivity effect: Age differences in preferences for positive over negative information." Psychology and Aging 29(1): 1.

Riggle, E. D. B. and M. M. S. Johnson (1996). "Age difference in political decision making: Strategies for evaluating political candidates." Political Behavior 18(1): 99-118.

Ritov, I. (1996). "Probability of Regret: Anticipation of Uncertainty Resolution in Choice." Organizational Behavior and Human Decision Processes 66(2): 228-236.

Rivis, A., P. Sheeran and C. J. Armitage (2009). "Expanding the Affective and Normative Components of the Theory of Planned Behavior: A Meta-Analysis of Anticipated Affect and Moral Norms." Journal of Applied Social Psychology 39(12): 2985-3019.

Roese, N. J. (1994). "The functional basis of counterfactual thinking." Journal of Personality and Social Psychology 66(5): 805.

Roth, A. E. and I. Erev (1995). "Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term." Games and economic behavior.

Rottenberg, J. and J. J. Gross (2007). "Emotion and Emotion Regulation: A Map for Psychotherapy Researchers." Clinical Psychology: Science and Practice 14(4): 323-328.

Samanez-Larkin, G. R. and B. Knutson (2015). "Decision making in the ageing brain: changes in affective and motivational circuits." Nature Reviews Neuroscience 16(5).

Samanez-Larkin, G. R., S. M. Levens, L. M. Perry, R. F. Dougherty and B. Knutson (2012). "Frontostriatal White Matter Integrity Mediates Adult Age Differences in Probabilistic Reward Learning." The Journal of Neuroscience 32(15): 5333-5337.

Sanna, L. J. (1997). "Self-Efficacy and Counterfactual Thinking: Up a Creek with and without a Paddle." Personality and Social Psychology Bulletin 23(6): 654-666.

Sanna, L. J. (1998). "Defensive Pessimism and Optimism: The Bitter-Sweet Influence of Mood on Performance and Prefactual and Counterfactual Thinking." Cognition & Emotion 12(5): 635-665.

Sanna, L. J. and K. Turley (1996). "Antecedents to Spontaneous Counterfactual Thinking: Effects of Expectancy Violation and Outcome Valence." Personality and Social Psychology Bulletin 22(9): 906-919.

Schultz, W. (2002). "Getting Formal with Dopamine and Reward." Neuron 36(2): 241-263.

Schultz, W., P. Dayan and R. P. Montague (1997). "A neural substrate of prediction and reward." Science 275(5306): 1593-1599.

Schultz, W., L. Tremblay and J. R. Hollerman (1998). "Reward prediction in primate basal ganglia and frontal cortex." Neuropharmacology 37(4-5): 421-429.

Shivapour, S. K., C. M. Nguyen, C. A. Cole and N. L. Denburg (2012). "Effects of Age, Sex, and Neuropsychological Performance on Financial Decision-Making." Frontiers in Neuroscience 6: 82.

Simioni, S., M. Schluep, N. Bault, G. Coricelli, J. Kleeberg, R. A. Du Pasquier, M. Gschwind, P. Vuilleumier and J.-M. Annoni (2012). "Multiple Sclerosis Decreases Explicit Counterfactual Processing and Risk Taking in Decision Making." PLOS ONE 7(12): e50718.

Suddendorf, T. and M. C. Corballis (2007). "The evolution of foresight: What is mental time travel, and is it unique to humans?" The Behavioral and brain sciences 30(3): 299.

Sutton, R. S. and A. G. Barto (1998). Reinforcement learning: An introduction, MIT press Cambridge.

Suzuki, S., E. L. S. Jensen, P. Bossaerts and J. P. O'Doherty (2016). "Behavioral contagion during learning about another agent’s risk-preferences acts on the neural representation of decision-risk." 113(14).


Tamir, M. and M. D. Robinson (2007). "The happy spotlight: positive mood and selective attention to rewarding information." Personality & social psychology bulletin 33(8): 1124-1136.

Thair, H., A. L. Holloway, R. Newport and A. D. Smith (2017). "Transcranial Direct Current Stimulation (tDCS): A Beginner's Guide for Design and Implementation." Frontiers in Neuroscience 11: 641.

Thielscher, A., A. Antunes and G. B. Saturnino (2015). "Field Modeling for Transcranial Magnetic Stimulation: A Useful Tool to Understand the Physiological Effects of TMS?" 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2015: 222-225.

Thomas, B. C., K. E. Croft and D. Tranel (2011). "Harming Kin to Save Strangers: Further Evidence for Abnormally Utilitarian Moral Judgments after Ventromedial Prefrontal Damage." Journal of Cognitive Neuroscience 23(9): 2186-2196.

Tobia, M. J., R. Guo, J. Gläscher, U. Schwarze, S. Brassen, C. Büchel, K. Obermayer and T. Sommer (2016). "Altered behavioral and neural responsiveness to counterfactual gains in the elderly." Cognitive, Affective, & Behavioral Neuroscience 16(3): 457-472.

Troiano, A. R., M. Schulzer and F. F.-R. L. Synapse (2010). "Dopamine transporter PET in normal aging: dopamine transporter decline and its possible role in preservation of motor function." Synapse.

Tymula, A., L. A. Belmaker, L. Ruderman, P. W. Glimcher and I. Levy (2013). "Like cognitive function, decision making across the life span shows profound age-related changes." Proceedings of the

Van Hoeck, N., P. D. Watson and A. K. Barbey (2015). "Cognitive neuroscience of human counterfactual reasoning." Frontiers in Human Neuroscience 9.

Weisberg, D. P. and S. R. Beck (2012). "The development of children's regret and relief." Cognition & Emotion 26(5): 820-835.

Wood, S., J. Busemeyer, A. Koling, C. R. Cox and H. Davis (2005). "Older Adults as Adaptive Decision Makers: Evidence From the Iowa Gambling Task." Psychology and Aging 20(2): 220.

Yang, Y., A. Raine, T. Lencz, S. Bihrle, L. LaCasse and P. Colletti (2005). "Volume Reduction in Prefrontal Gray Matter in Unsuccessful Criminal Psychopaths." Biological Psychiatry 57(10): 1103-1108.

Yoon, C., C. A. Cole and M. P. Lee (2009). "Consumer decision making and aging: Current knowledge and future directions." Journal of Consumer Psychology 19(1): 2-16.

Young, C. B. and R. Nusslock (2016). "Positive mood enhances reward-related neural activity." Social Cognitive and Affective Neuroscience 11(6): 934-944.

Zeelenberg, M., J. Beattie, J. v. d. Pligt and N. K. d. Vries (1996). "Consequences of Regret Aversion: Effects of Expected Feedback on Risky Decision Making." Organizational Behavior and Human Decision Processes 65(2): 148-158.

Zeelenberg, M. and R. Pieters (2007). "A Theory of Regret Regulation 1.0." Journal of Consumer Psychology 17(1).

Zeelenberg, M., W. W. van Dijk and A. S. R. Manstead (1998). "Reconsidering the Relation between Regret and Responsibility." Organizational Behavior & Human Decision Processes 74(3).

Zeelenberg, M., W. W. van Dijk, A. S. R. Manstead and J. van der Pligt (1998). "The Experience of Regret and Disappointment." Cognition and Emotion 12(2).

Zeelenberg, M., W. W. van Dijk, J. van der Pligt, A. S. R. Manstead, P. van Empelen and D. Reinderman (1998). "Emotional Reactions to the Outcomes of Decisions: The Role of Counterfactual Thought in the Experience of Regret and Disappointment." Organizational Behavior & Human Decision Processes 75(2).

Zheng, H., D. Huang, S. Chen, S. Wang, W. Guo, J. Luo, H. Ye and Y. Chen (2016). "Modulating the Activity of Ventromedial Prefrontal Cortex by Anodal tDCS Enhances the Trustee’s Repayment through Altruism." Frontiers in Psychology 7: 1437.

Zhu, L., K. E. Mathewson and M. Hsu (2012). "Dissociable neural representations of reinforcement and belief prediction errors underlie strategic learning." Proceedings of the National Academy of Sciences 109(5): 1419-1424.
