
Incidental Learning of Rewarded Associations Bolsters Learning on an Associative Task

Michael Freedberg, Jonathan Schacherer, and Eliot Hazeltine
University of Iowa

Reward has been shown to change behavior as a result of incentive learning (by motivating the individual to increase their effort) and instrumental learning (by increasing the frequency of a particular behavior). However, Palminteri et al. (2011) demonstrated that reward can also improve the incidental learning of a motor skill even when participants are unaware of the relationship between the reward and the motor act. Nonetheless, it remains unknown whether these effects of reward are the indirect results of manipulations of top–down factors. To identify the locus of the benefit associated with rewarded incidental learning, we used a chord-learning task (Seibel, 1963) in which the correct performance of some chords was consistently rewarded with points necessary to complete the block whereas the correct performance of other chords was not rewarded. Following training, participants performed a transfer phase without reward and then answered a questionnaire to assess explicit awareness about the rewards. Experiment 1 revealed that rewarded chords were performed more quickly than unrewarded chords, and there was little awareness about the relationship between chords and reward. Experiment 2 obtained similar findings with simplified responses to show that the advantage for rewarded stimulus combinations reflected more efficient binding of stimulus–response (S-R) associations, rather than a response bias for rewarded associations or improved motor learning. These results indicate that rewards can be used to significantly improve the learning of S-R associations without directly manipulating top–down factors.

Keywords: reward, associative learning, incidental learning, feedback

An organism strives to maximize environmental gains, or rewards, so it is not surprising that reward has a powerful impact on a range of behaviors. Previously, researchers have examined the effect of reward on learning during procedural tasks (Abe et al., 2011; Palminteri et al., 2011; Wächter, Lungu, Lui, Willingham, & Ashe, 2009). Typically, it is assumed that these effects reflect either instrumental learning, in which reward increases the frequency of a particular behavior through reinforcement, or incentive learning, in which reward bolsters the effort of the learner. Thus, in instrumental learning, the presentation of reward directly increases the likelihood of a response, whereas in incentive learning, the presentation of the reward increases effort, which may lead to better performance and associative encoding.

However, in most experiments examining the role of reward in learning, participants are aware of how to obtain the reward (usually because an informative cue lets the participant know that they will be rewarded for successful task performance). Therefore, it is difficult to know whether reward itself improves learning or if reward simply boosts top–down factors (such as attention and motivation) that cause stronger learning gains. For this reason, it is an open question whether reward can bolster associative learning while reducing the impact of top–down factors by minimizing awareness of how to achieve the reward. To examine this, we tested whether reward can improve the incidental learning of rewarded associations by training participants in a stimulus–response (S-R) learning task in which the relationship between behavior and reward was not revealed.

Reward and Incidental Learning

Evidence that reward can modulate incidental learning was reported by Wächter et al. (2009). In their experiment, participants performed the serial reaction time (RT) task (SRTT) under one of three conditions. Participants in the reward group were told that responses that were made faster than their baseline RT would be rewarded with money (an example of incentive learning), while participants in the punishment group lost money for performance that was slower than baseline. A third group received a similar amount of money to the other two groups, but this amount was not performance-based. Participants in the reward group expressed significantly better retention of the sequence over the punishment and neutral groups, indicating that although rewards and punishments can be used to improve performance on a procedural learning task, only rewards yielded significant learning gains. These results suggest two possibilities: (a) that reward directly bolsters incidental learning, or (b) that rewards can increase motivation, which results in better performance.

One complication with studying the effect of reward on incidental learning is that it can be difficult to tease apart from

This article was published Online First November 16, 2015.
Michael Freedberg, Department of Psychological and Brain Sciences, Interdisciplinary Neuroscience Program, University of Iowa; Jonathan Schacherer, Department of Psychological and Brain Sciences, University of Iowa; Eliot Hazeltine, Department of Psychological and Brain Sciences, Interdisciplinary Neuroscience Program, University of Iowa.

We thank Teresa Treat for providing statistical expertise analyzing the data with nonlinear mixed-effects models.

Correspondence concerning this article should be addressed to Michael Freedberg, Department of Psychology, University of Iowa, E11 Seashore Hall E, Iowa City, IA 52242-1407. E-mail: [email protected]

Journal of Experimental Psychology: Learning, Memory, and Cognition
© 2015 American Psychological Association
2016, Vol. 42, No. 5, 786–803
0278-7393/16/$12.00 http://dx.doi.org/10.1037/xlm0000201



incentive learning. In other words, reward may modulate learning or it may modulate top–down factors (such as attention or motivation) that then cause learning or performance gains. To address this ambiguity, Palminteri et al. (2011) presented either negligible or substantial rewards with no cue, and only after the production of the motor response (in this case, three-finger triplets): some triplets consistently received a substantial reward, whereas others received a negligible reward. Participants did not explicitly know whether the trial would be rewarded or not when they responded. Nonetheless, participants responded more quickly to highly rewarded triplets than negligibly rewarded triplets by the end of training, even though they were presumably unaware of which triplets were rewarded. Because this type of learning cannot be classified as incentive learning, the researchers claimed that the performance of these triplets benefitted from reward beyond the effects of practice and motivation.

Palminteri et al.’s (2011) results also speak to a possible mechanism responsible for this effect: modulation of dopaminergic release in the striatum. The researchers posited that dopaminergic activity was responsible for the performance differences between high and low reward triplets. To test their hypothesis, patients with Tourette’s syndrome, who experience increased dopaminergic transmission to the striatum (Graybiel & Canales, 2001), performed the same task. Strikingly, unmedicated patients with Tourette’s syndrome experienced disproportionally better performance on highly rewarded triplets (compared with low reward triplets) relative to comparisons. It is important to note that this finding was unrelated to motor learning in general since unmedicated patients with Tourette’s syndrome experienced significantly worse performance relative to comparisons despite a large effect of reward. This finding suggests that reward may influence the encoding of S-R contingencies through a mechanism associated with dopamine transmission rather than simply the encoding of a motor skill.

How Does Reward Affect Incidental Learning?

Palminteri et al. (2011) provided clear evidence that reward can improve incidental learning and that this benefit appears to be linked to dopamine. However, the locus of the reward effect remains unknown. In the present study, we consider five possibilities:

1. Reward may improve motor learning, so responses relying on rewarded motor programs are produced more quickly (e.g., Abe et al., 2011; Palminteri et al., 2011; Wächter et al., 2009).

2. Participants may become biased to produce rewarded responses, and thus produce them quickly (e.g., Rescorla, 1968, 1988).

3. Reward may strengthen the learning of associations between stimuli and responses, so that performance is speeded regardless of whether a reward is presented, but only when the particular response is signaled by a particular stimulus or stimulus combination.

4. Performance may be speeded in anticipation of a reward (e.g., Haith, Reppert, & Shadmehr, 2012; Opris, Lebedev, & Nelson, 2011).

5. Stimuli that are rewarded may be attended and perceived more quickly than stimuli that receive smaller rewards (e.g., Anderson, Laurent, & Yantis, 2011; Roper, Vecera, & Vaidya, 2014).

The purpose of the current study is to identify which of these accounts best describes the locus of the reward-related learning effects observed by Palminteri et al. (2011). Table 1 summarizes the predictions of each of our five starting hypotheses across the training and transfer phases of the two experiments. In Experiment 1, we used a task in which individual stimuli were rewarded or not rewarded with equal probability to examine whether the benefit of reward depended on the individual stimuli or the combination of stimuli and their combined response. Furthermore, after a training phase in which rewards were presented for specific combinations of stimuli, we presented blocks of trials in which no rewards were given to determine whether the benefits of reward were associated with learning and/or performance. If responses to individual stimuli are speeded by reward (Hypothesis 5), then no difference in performance should be observed in Experiment 1. Because the speed of performance may be sensitive to the anticipation of reward (Haith et al., 2012), it is possible that differences will not be observed, or will quickly decay, during transfer (consistent with Hypotheses 4 and 5). On the other hand, the hypotheses that focus on the effects of reward on encoding rather than performance (Hypotheses 1, 2, and 3) predict an advantage for previously rewarded combinations during training and transfer.

Table 1
List of Hypotheses to Explain the Effect of Reward on Incidental Learning Along With Their Respective Predictions

Hypothesis | Exp. 1 training | Exp. 1 transfer | Exp. 2 training | Exp. 2 transfer
1: Reward improves motor learning | R↑ | R↑ | R = U | R = U
2: Reward biases response production | R↑ | R↑ | R = U | R = U
3: Reward strengthens the learning of associations between stimuli and responses | R↑ | R↑ | R↑ | R↑
4: Reward anticipation speeds performance | R↑ | R = U | R↑ | R = U
5: Reward biases attention/perception | R = U | R = U | R = U | R = U

Note. We offer one of two predictions for each condition and for each hypothesis: (a) that rewarded combinations will be performed faster and/or more accurately than unrewarded combinations (R↑) or (b) that both chord types will be performed similarly (R = U). Exp. = experiment.



In Experiment 2, we used a task in which combinations of stimuli were mapped to just two equally rewarded single-keypress responses. By choosing to include overlapping responses for rewarded and unrewarded combinations, we attempted to determine whether reward selectively reinforces S-R associations (Hypothesis 3), biases the production of particular responses (Hypothesis 2), or improves motor learning (Hypothesis 1). If reward improves the learning of motor programs or biases the production of particular responses, then no differences between rewarded and unrewarded responses should be observed in Experiment 2. However, if reward improves the learning of S-R associations, then rewarded combinations should be performed better late in training and the advantage should persist when rewards are eliminated. Note that only Hypothesis 3 predicts an advantage for rewarded combinations over unrewarded combinations during training and transfer for both experiments.

Rewards were only presented after performance of half of the available combinations and only if the trial was performed correctly, as in Palminteri et al. (2011). We hypothesized that rewarded combinations would be performed more efficiently by the end of training compared with unrewarded combinations. Additionally, we predicted that rewarded associations would be better encoded than unrewarded associations and thus would be performed with greater efficiency even when the rewards were withdrawn (Hypothesis 3).

The Chord Task

To examine whether rewarded associations are formed more effectively than unrewarded associations, we needed a task that (a) shows robust incidental learning, and (b) in which learned associations can be easily compared with unlearned associations. Therefore, we adapted Seibel’s (1963) chord task in which participants are presented with multiple stimuli and respond simultaneously with multiple effectors. This procedure allows for the comparison of responses to practiced combinations of stimuli to unpracticed combinations of stimuli, which consistently reveals better performance with practiced combinations (Freedberg, Wagschal, & Hazeltine, 2014; Hazeltine, Aparicio, Weinstein, & Ivry, 2007; Wifall et al., 2014).

In the current experiment, correct performance was rewarded for some chords but not for others, and we assessed whether there were performance differences between the two sets. Performance differences between rewarded and unrewarded chords were assessed during training. However, to determine whether training with rewarded responses significantly bolstered learning, we compared the performance of rewarded and unrewarded chords during transfer, where no rewards were given. Here, in the absence of reward, we compared performance on three chord types: rewarded chords, which consistently yielded rewards when participants responded correctly; unrewarded chords, which never yielded rewards; and novel chords, which were not practiced during training and only appeared in the transfer block at the end of the experiment. The novel chords allowed us to assess combination-specific learning by comparing the performance of rewarded and unrewarded chords to novel chords that were withheld during training.

Experiment 1

Method

Participants. Thirty-two participants with normal or corrected-to-normal vision were recruited from the undergraduate population at the University of Iowa (M age = 20.5 ± 3.02 years; 16 female). All participants received credit toward completion of an introductory psychology course at the University. All procedures were approved by the University of Iowa’s institutional review board. Consent was obtained prior to the experiment.

Apparatus and stimuli. The experiment was implemented using E-Prime 2.0 software (Schneider, Eschman, & Zuccolotto, 2002). Participants responded manually to stimuli presented on a 19-inch computer monitor. The visual stimuli consisted of 8 faces of Caucasian males on a gray background. The same race and gender were used for all faces to minimize grouping strategies. Face stimuli were used so that the current design could be easily transferred to a functional imaging protocol in future work. Each stimulus was 2° wide and appeared 1.87° to the left and right of the fixation cross (1° × 1°). Responses were made using the four nonthumb fingers of the left and right hand on a standard “qwerty” keyboard. Four of the stimuli (Set A) corresponded to left hand responses and the remaining four stimuli (Set B) corresponded to right hand responses on each trial. To respond to one of the four stimuli to the left of the fixation, participants pushed the q, w, e, or r key. To respond to one of the four right stimuli, participants pressed the u, i, o, or p key. There was a direct mapping of each stimulus to a key press such that each face corresponded to one button (see Figure 1). The mapping of the 8 faces to the keys was counterbalanced across participants so that half of all participants responded to Set B with the left hand and Set A with the right hand.
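The direct face-to-key mapping and the hand counterbalance described above can be sketched as follows (a minimal illustration only; the face identifiers and the function name are our placeholders, not taken from the authors' E-Prime implementation):

```python
# Hypothetical sketch of the stimulus-to-key mapping: four left-hand keys
# (q, w, e, r), four right-hand keys (u, i, o, p), and a counterbalance
# flag that swaps which stimulus set is assigned to which hand.
LEFT_KEYS = ["q", "w", "e", "r"]
RIGHT_KEYS = ["u", "i", "o", "p"]
SET_A = ["faceA1", "faceA2", "faceA3", "faceA4"]   # placeholder face IDs
SET_B = ["faceB1", "faceB2", "faceB3", "faceB4"]

def build_mapping(counterbalance: int) -> dict:
    """counterbalance 0: Set A -> left hand; 1: Set B -> left hand."""
    left_set, right_set = (SET_A, SET_B) if counterbalance == 0 else (SET_B, SET_A)
    mapping = dict(zip(left_set, LEFT_KEYS))       # each face maps to one key
    mapping.update(zip(right_set, RIGHT_KEYS))
    return mapping
```

Each face maps to exactly one key, and flipping the counterbalance flag reassigns Set B to the left hand for the other half of the participants.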

Procedure. Participants were instructed to respond as accurately and as simultaneously as possible. On average, participants responded with an asynchrony of 44 ms between hands. Asynchronies did not differ between rewarded and unrewarded chords for the last four blocks of training, t(31) = 1.07, ns, or during transfer (t < 1). Each trial began with the presentation of a fixation cross for 1,000 ms, followed by the presentation of two of the eight described faces (each presented on one side of the fixation cross). The stimuli remained on the screen until both responses were produced.

Participants first performed a practice block of 50 trials in order to learn the response mapping. Two stimuli were presented and participants made two responses (one with each hand). The participants had three seconds to make their responses or the trial was counted as incorrect. Only the first response made with each hand was recorded, so responses could not be changed. If the participant responded correctly to both stimuli within the time limit, the word “Correct” appeared below the stimuli for 500 ms in green text. If the participant responded incorrectly with either hand or did not respond within the time limit, the words “Incorrect Response” appeared below the stimuli for 2,000 ms in red text, followed by a screen that provided the correct S-R mapping for the participant to review. The rationale for providing 500 ms of feedback for correct trials and 2,000 ms of feedback for incorrect trials was to create a contrast between incorrect trials and unrewarded trials; the penalty



of 1,500 ms was added to incorrect trials. The average accuracy for the last 10 trials of practice was 0.84 ± 0.15.

Following the practice block, participants were again presented with pairs of stimuli, and they were instructed to make the two corresponding keypresses simultaneously. They were told that accurate trials would either be rewarded with points or not, and that inaccurate trials would never be rewarded with points (see Figure 2). To complete a block, participants had to accumulate 2,000 points, so the reward had practical value. Half of the chords were consistently rewarded when the response was correct and the other half were never rewarded (see Figure 1), but participants were not given this information.

Rewards were distributed in increments of 10 between 50 and 200 points (e.g., 50, 60, 70). The actual value of the reward was determined randomly on a trial-by-trial basis. Reward values were randomized to discourage awareness and to encourage interest in gaining rewards.1 The rewards followed accurate trials and were presented immediately below the stimuli in the center of the screen in green text on a black background with the accumulated total for each block for 500 ms. Chords that yielded no reward were followed by the presentation of “+0” in gray text on a black background with the accumulated total immediately below for 500 ms (see Figure 2). Inaccurate responses to stimuli were followed by the presentation of the words “Incorrect Response” in red text for 2,000 ms. With an average of 125 points per rewarded trial, the theoretical length of a block equaled at least 32 trials, 16 rewarded (16 × 125 points = 2,000) and 16 unrewarded. Because performance improved with practice, the trials per block generally decreased through training.
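The reward schedule and the block-length arithmetic above can be sketched as follows (a minimal sketch; the names are ours, and the uniform draw over the stated values is our assumption, since the text specifies only the 50–200 range in steps of 10):

```python
import random

# Rewards are drawn in steps of 10 between 50 and 200 points (mean = 125);
# a block ends once 2,000 points have been accumulated.
REWARD_VALUES = list(range(50, 201, 10))  # 50, 60, ..., 200
BLOCK_TARGET = 2000

def draw_reward() -> int:
    """Random per-trial reward value (uniform draw is our assumption)."""
    return random.choice(REWARD_VALUES)

def theoretical_min_trials(p_rewarded: float = 0.5, mean_reward: int = 125) -> int:
    # 2,000 / 125 = 16 rewarded trials; with half of the chords rewarded,
    # this implies a theoretical block length of at least 32 trials.
    rewarded_needed = BLOCK_TARGET // mean_reward
    return int(rewarded_needed / p_rewarded)
```

With the mean reward of 125 points, this reproduces the 32-trial theoretical minimum block length given in the text.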

With four left-hand and four right-hand stimuli, there were 16 possible pairs (chords). The assignments of rewards to chords were done in such a way that each individual stimulus belonged to one rewarded chord and one unrewarded chord. Thus, individual stimuli were completely uninformative as to whether a reward would be given for a correct response (see Figure 1). With four rewarded chords and four unrewarded chords, there remained 8 chords that were not practiced during training. These withheld chords were used to assess learning of the practiced chords in the transfer block. Each stimulus appeared in two withheld chords.
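The constraint that each stimulus belongs to exactly one rewarded and one unrewarded chord amounts to choosing two disjoint one-to-one pairings of left and right stimuli. A minimal sketch of one such construction (our own illustration, with placeholder stimulus labels; it ignores the spatial-compatibility and finger-distance balancing the experiment also imposed):

```python
from itertools import product

left = ["L0", "L1", "L2", "L3"]    # placeholder left-hand stimuli
right = ["R0", "R1", "R2", "R3"]   # placeholder right-hand stimuli

# Two disjoint perfect matchings: each stimulus appears in exactly one
# rewarded and one unrewarded chord, so a single stimulus carries no
# information about reward.
rewarded = [(left[i], right[i]) for i in range(4)]
unrewarded = [(left[i], right[(i + 1) % 4]) for i in range(4)]

all_chords = set(product(left, right))                    # 16 possible pairs
withheld = all_chords - set(rewarded) - set(unrewarded)   # 8 chords for transfer

# Sanity checks mirroring the design: one rewarded, one unrewarded, and
# two withheld chords per stimulus.
for s in left + right:
    assert sum(s in chord for chord in rewarded) == 1
    assert sum(s in chord for chord in unrewarded) == 1
    assert sum(s in chord for chord in withheld) == 2
```

Because every stimulus occurs in four chords total, one rewarded and one unrewarded assignment automatically leaves two withheld chords per stimulus, matching the counts in the text.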

The average distance between digits in each chord (determined by the number of nonthumb fingers between the left and right response finger) was held constant across chord type (rewarded, unrewarded, and withheld; average number of fingers between

1 In a follow-up experiment, we used the same procedure with reward values that did not change for low reward (+1) and high reward chords (+4). There, we observed a marginally significant effect of reward on training accuracies, asymptote: t(444) = 1.90, p = .058, d = 0.67, but not RTs, and a significant effect of reward on transfer accuracies, t(31) = 2.27, p < .05, d = 0.50, but not RTs.

Figure 1. Example training set for Experiment 1. The correct keypress is shown adjacent to each face, and each box indicates whether the combination was linked to reward. Faces are from the Face Database of Neutralized Faces by Natalie Ebner (2008) at the Park Aging Mind Laboratory (http://agingmind.utdallas.edu/facedb/view/neutralized-faces-by-natalie-ebner).



responses equaled 3.0). To avoid results driven by correspondence effects (e.g., Ben-Artzi & Marks, 1995), each rewarded and unrewarded group of chords included one spatially compatible chord (e.g., the pinky of the left hand and the index finger of the right hand), one mirror chord (e.g., the pinky of the left hand and the pinky of the right hand), and two spatially incompatible chords (e.g., the pinky of the left hand and the ring finger of the right hand). Finally, rewarded and unrewarded chord assignments were counterbalanced among participants.

Completion of training blocks was immediately followed by one 50-trial transfer block. The transfer block was similar to the training blocks, except that withheld chords were now introduced and no rewards were given on any trial. Incorrect trials resulted in the presentation of the words “Incorrect Response” in red text for 2,000 ms.

Explicit knowledge questionnaire. Following completion of the task, a questionnaire assessing explicit knowledge was administered. This questionnaire was divided into two parts. First, participants reported their ability to predict when rewards were given. In this part, participants responded to the question “Could you predict when you would get points?” by responding with either “yes,” “no,” or “sometimes/maybe,” which were coded as a 1, 0, or 0.5, respectively. For the second part of the questionnaire, each of the rewarded and unrewarded combinations from training was presented and participants were asked to determine if the chord was followed by a reward by pressing either the y (yes) or n (no) button on the keyboard.

Statistical analysis. Participants’ RTs and accuracies served as the dependent variables. For the analysis of RT, trials were eliminated if (a) an incorrect response was made by either hand or no response was made (18.7% of trials), (b) the trial followed an incorrect response (18.7%), (c) the stimulus combination was repeated from the previous trial (1.4%), or (d) an RT less than 300 ms or greater than 2,000 ms was recorded (5.4%). In sum, this eliminated 34.3% of the data from training and transfer. Note that when we performed the same analyses on the training data while retaining trials after errors and repeat trials (81% of the available data), the pattern of results did not change for this analysis. The longer RT of either the left or right hand responses for each trial was used in our analysis.
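The four exclusion criteria can be expressed as a simple filter (an illustrative sketch only; the trial-record field names are hypothetical, not taken from the authors' pipeline):

```python
# Keep a trial's RT only if: (a) the trial was correct, (b) the previous
# trial was correct, (c) the chord is not a repeat of the previous trial,
# and (d) the RT falls within 300-2,000 ms. The trial RT is the slower of
# the two hands, as in the text.
def filter_rts(trials):
    kept = []
    prev = None
    for trial in trials:
        rt = max(trial["rt_left"], trial["rt_right"])  # longer of the two hands
        ok = (trial["correct"]
              and (prev is None or prev["correct"])
              and (prev is None or trial["chord"] != prev["chord"])
              and 300 <= rt <= 2000)
        if ok:
            kept.append(rt)
        prev = trial
    return kept

trials = [
    {"correct": True,  "chord": ("f1", "f5"), "rt_left": 900, "rt_right": 950},
    {"correct": False, "chord": ("f2", "f6"), "rt_left": 700, "rt_right": 800},
    {"correct": True,  "chord": ("f3", "f7"), "rt_left": 600, "rt_right": 650},  # follows error
    {"correct": True,  "chord": ("f3", "f7"), "rt_left": 500, "rt_right": 550},  # repeat chord
    {"correct": True,  "chord": ("f4", "f8"), "rt_left": 100, "rt_right": 150},  # too fast
]
print(filter_rts(trials))  # -> [950]
```

Only the first trial survives the filter in this toy example; the remaining four each violate one of the criteria (a)–(d).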

Training. Average RTs and accuracies were modeled across blocks using a 3-parameter asymptotic exponential function, which prior work has indicated produces a good fit to data typically obtained in the chord learning task (see Wifall, McMurray, & Hazeltine, 2014). In the asymptotic exponential function, a indexes the asymptote for learning, b indexes the change (in RT or accuracy) from the initial block to the asymptote, and c indexes the rate of change. This model was fit to block-specific performance using R’s (R Development Core Team, 2011) nonlinear mixed-effects package (Pinheiro & Bates, 2000). Maximum likelihood was used to estimate all fixed and random effects simultaneously. The most complex model for each analysis included (a) fixed-effect intercepts for a, b, and c; (b) fixed effects of reward on a, b, and c; and (c) random subject-specific effects for a, b, and c. Model-comparison procedures based on the Bayesian information criterion (Schwarz, 1978) were used to trim this complex model. Fixed effects for reward on a specific parameter were eliminated if a more complex model did not produce a better fit than the simpler model. Random effects for specific parameters also were eliminated if the complex model did not fit better, indicating that parameter estimates did not vary significantly across participants and only a fixed-effect estimate was necessary. The parameter estimates for the best-fitting model (e.g., for a, b, and c, as well as the fixed effects of reward on a, b, and c) then were interpreted, as described in the results.
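The functional form can be illustrated with a fixed-effects-only fit on synthetic data. This is a sketch, not the authors' analysis: they fit the full nonlinear mixed-effects model in R's nlme package, whereas this uses SciPy on made-up data, and the parameterization RT = a + b·exp(−c·(block − 1)) is our reading of the a/b/c description above:

```python
import numpy as np
from scipy.optimize import curve_fit

# a = asymptote, b = change from the initial block to the asymptote,
# c = rate of change across blocks.
def learning_curve(block, a, b, c):
    return a + b * np.exp(-c * (block - 1))

blocks = np.arange(1, 11)
rng = np.random.default_rng(0)
# Synthetic RTs generated from the rewarded-chord estimates reported in the
# Results (a ~ 1,202 ms, b ~ 190 ms) plus noise; the rate value is arbitrary.
rts = learning_curve(blocks, 1202.0, 190.0, 0.5) + rng.normal(0, 5, blocks.size)

(a_hat, b_hat, c_hat), _ = curve_fit(learning_curve, blocks, rts, p0=(1000, 200, 0.5))
# The recovered estimates land near the generating values.
```

The mixed-effects version additionally lets a, b, and c vary across participants and tests whether reward shifts each parameter, which is what the model-trimming procedure in the text adjudicates.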

Transfer. To determine whether differences existed between rewarded and unrewarded chords during the transfer block without reward, RTs and accuracies were contrasted using two-tailed paired t tests. Combination-specific learning was assessed by contrasting the RTs and accuracies of withheld chords with both types of practiced chords (rewarded and unrewarded) using paired t tests.
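These contrasts reduce to two-tailed paired t tests over per-participant condition means. A sketch with simulated data follows; the effect size, variances, and sample values are invented purely for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical per-participant mean transfer RTs (ms) for 32 participants.
rng = np.random.default_rng(0)
rewarded = rng.normal(1270, 80, size=32)
withheld = rewarded + rng.normal(90, 40, size=32)  # withheld chords simulated as slower

# Two-tailed paired t test (the default for ttest_rel).
t_stat, p_value = ttest_rel(withheld, rewarded)

# Cohen's d for paired samples: mean difference over SD of the differences.
diffs = withheld - rewarded
cohens_d = diffs.mean() / diffs.std(ddof=1)
```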

Figure 2. Illustration of trial sequence for rewarded and unrewarded chords during training for Experiment 1.

790 FREEDBERG, SCHACHERER, AND HAZELTINE


Results

Training—RT. The best-fitting model excluded the rate parameter as a fixed effect of reward and as a random effect between subjects. The fixed effect of reward was reliable on the asymptote, t(444) = 2.44, p < .05, d = 0.88, and magnitude, t(444) = -2.09, p < .05, d = 0.75, of the learning function (Figure 3, left panel). The average asymptote values for rewarded and unrewarded chords were 1,202 ms and 1,228 ms, respectively, and the average magnitudes for rewarded and unrewarded chords were 190 ms and 145 ms, respectively. These results indicate that by the end of training rewarded chords were performed significantly faster than unrewarded chords.

Training—Accuracy. The fixed effect of reward was reliable on asymptote, t(443) = -2.41, p < .05, d = 0.87, but not on magnitude or rate (|ts| < 1). The average asymptote values for rewarded and unrewarded chords were 0.90 and 0.87, respectively (Figure 3, right panel). The average magnitude values for rewarded and unrewarded chords were 0.08 and 0.07, respectively, and the average rates were 0.45 and 0.88, respectively. In sum, rewarded chords were performed more accurately than unrewarded chords by the end of training.

Training—After rewards. It is possible that receiving a reward affected performance, perhaps through motivation, on the immediately subsequent trial. Thus, to determine whether reward affected performance on the subsequent trial, we modeled the same data according to whether a reward was given on the previous trial. The best-fitting model for this analysis excluded the rate parameter as an effect of reward and as a random effect. The fixed effect of prior reward was not reliable on asymptote (after reward: 1,210 ms; after no reward: 1,225 ms), t(444) = -1.52, ns, or on the magnitude of RT decrements (after reward: 172 ms; after no reward: 169 ms), t < 1.

For accuracy, the best-fitting model excluded the asymptote parameter as a fixed effect of reward and as a random effect between subjects. The magnitude parameters were similar for trials after rewards (0.11) and trials after no rewards (0.10; |t| < 1). We detected a marginally significant difference in the rate parameter between trials after rewards (0.11) and trials after no rewards (0.38), t(444) = 1.77, p = .08, d = 0.63. Overall, this analysis shows that reward had very little effect on the subsequent trial.

Transfer—RT. During transfer, eight novel chords were introduced along with the previously practiced rewarded and unrewarded chords. Critically, no rewards were presented during this portion of the experiment. Both rewarded (1,273 ms) and unrewarded (1,282 ms) chords were performed significantly faster than withheld chords (1,370 ms), withheld versus rewarded: t(31) = 3.42, p < .005, d = 0.57; withheld versus unrewarded: t(31) = 3.95, p < .001, d = 0.49, but rewarded chords did not differ from unrewarded chords (t < 1; Figure 4, left panel). Note that for each t test in both experiments we also performed a similar analysis of covariance using composite awareness (see below) as a covariate, and the same pattern of significance was observed.

Transfer—Accuracy. Withheld chords (0.80) were performed significantly less accurately than rewarded chords (0.91), t(31) = 4.05, p < .001, d = 0.96, but not less accurately than unrewarded chords (0.83), t(31) = 0.90, p = .38. In addition, we observed significantly higher accuracies for rewarded chords than for unrewarded chords, t(31) = 2.68, p < .01, d = 0.67 (Figure 4, right panel). Overall, these results indicate that strong combination-specific learning occurred for rewarded, but not unrewarded, chords, and that rewarded chords were performed significantly more accurately than unrewarded chords.2

2 We divided the subjects according to gender to see if using male faces influenced learning. Thus, the gender of the participant was added as a factor to the transfer accuracy analysis from Experiment 1 and the RT analysis from Experiment 2 (in which learning differences were identified between rewarded and unrewarded combinations). It is surprising that the transfer data revealed a significant Gender × Reward interaction, F(1, 30) = 6.073, p < .05, ηp² = 0.168, indicating that females showed a larger reward effect than males. We found somewhat similar results in Experiment 2; again, females showed a larger reward effect than males, F(1, 44) = 3.29, p = .08, ηp² = 0.070. Further work is necessary to determine whether this difference relates to the choice of stimuli or other factors.

Figure 3. Training results for Experiment 1. Solid bold lines represent model-fitted exponential curves for rewarded and unrewarded chords for reaction time (RT; left panel) and accuracy (right panel) data. Dashed lines represent average observed data, and the shaded regions represent the SEMs for the observed data. Rewarded chords showed a significantly smaller RT asymptote and a significantly greater accuracy asymptote than unrewarded chords. Rewarded chords also showed a significantly larger magnitude in RT change from Block 1 compared with unrewarded chords. See text for details.



Questionnaire. The first part of the questionnaire assessed participants’ self-reported awareness, which is their subjective confidence in being able to discern which chords were rewarded and which were not. The average rating of self-reported awareness was 0.53 ± 0.42. The second part of the questionnaire measured participants’ “recall,” which is their ability to identify which combinations were actually rewarded or unrewarded. The average recall score was 0.65 ± 0.20. These two measures did not correlate significantly with each other (R² = 0.16, ns), indicating that participants’ subjective confidence was not related to their ability to recall which combinations were rewarded.

Participants were 65% accurate (SD = 0.20) at indicating which combinations were rewarded and which were not rewarded, which was not significantly better than chance performance (Z = 0.73, ns). Nonetheless, to assess whether the difference in accuracies between rewarded and unrewarded chords during transfer may have been associated with awareness, a correlational analysis was conducted between the sum of the awareness scores and the difference between rewarded and unrewarded accuracy during transfer. To account for both participants’ awareness (measured by self-report) and recall (measured by identification of the rewarded combinations during recall) of rewarded combinations, we summed each participant’s self-report and recall scores to form a composite awareness score. No significant correlation between composite awareness and the difference between rewarded and unrewarded chord accuracies during transfer was revealed (r = -0.222, N = 32, ns; Figure 5). In addition, we performed the same analysis for each awareness score separately. Self-reported confidence did not correlate significantly with the difference in accuracy between rewarded and unrewarded chords (r = -0.132, ns), nor did recall scores (r = -0.172, ns). Thus, awareness appears to have had little to no effect on accuracy differences between rewarded and unrewarded chords during transfer.
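The composite-awareness analysis reduces to a Pearson correlation between each participant’s summed awareness scores and that participant’s reward effect. A sketch with placeholder data (all values invented; only the structure of the computation matches the description above):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-participant scores for N = 32 participants.
rng = np.random.default_rng(1)
self_report = rng.uniform(0.0, 1.0, 32)      # normed subjective confidence, 0-1
recall = rng.uniform(0.3, 1.0, 32)           # proportion of rewards correctly identified
composite = self_report + recall             # composite awareness score

# Reward effect: rewarded minus unrewarded transfer accuracy per participant.
reward_effect = rng.normal(0.08, 0.05, 32)

# Pearson correlation between composite awareness and the reward effect.
r, p = pearsonr(composite, reward_effect)
```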

Discussion

The results of Experiment 1 indicate that participants learned to perform rewarded chords significantly faster and more accurately than unrewarded chords. Thus, the training data from Experiment 1 are consistent with four of our starting hypotheses: (a) reward may improve motor learning; (b) participants become biased to produce rewarded responses; (c) reward may strengthen the learning of associations between stimuli and responses; and (d) performance may be speeded in anticipation of a reward. However, we assessed learning in a transfer block in which no rewards were given so that the effects of reward on learning could be distinguished from the effects of reward on performance, and we again observed significantly higher accuracy for rewarded chords than unrewarded chords. In this way, the results from the transfer blocks, along with the accuracy advantage for the rewarded chords during training, contraindicate the fourth and fifth hypotheses that reward led to more rapid response production. Thus, the results indicate that reward affects learning, be it in terms of improved encoding of motor programs (Hypothesis 1), biases to make particular responses (Hypothesis 2), or stronger S-R associations (Hypothesis 3).

Figure 4. Transfer reaction times (left panel) and accuracies (right panel) for rewarded, unrewarded, and withheld chords.

This advantage for rewarded chords does not appear to be related to motivation or to the degree of awareness the participant garnered during the course of the experiment, as we observed no relationship between participants’ awareness of which chords were rewarded (which was generally poor) and their improved performance for rewarded chords during transfer. Thus, these results demonstrate that participants need not be fully aware of how to obtain a reward to experience reward-related learning gains; rewarding certain chords was enough to produce stronger learning than for unrewarded chords.

Experiment 2

Experiment 1 was designed so that four unique S-R combinations presented during training were indicative of a reward, while the remaining four were not. However, it is possible that our results demonstrate learning related to making specific responses rather than the binding of specific S-R combinations. In Experiment 2, we mapped the same eight combinations to one of two possible single-keypress responses that were each equally likely to yield a reward (50%). In this case, the motor demands are minimal (i.e., a single keypress), making motor processes an unlikely target of learning. To be clear, we are not arguing that the task complexity differs between tasks. Although Experiment 1 required two simultaneous keypresses and Experiment 2 required only one, Experiment 2 requires learning an arbitrary mapping of two faces to one of two keypresses, compared with the consistent 1:1 mapping of faces to keys in Experiment 1. However, it is not unreasonable to suggest that the motor demands decreased from Experiment 1 to Experiment 2.

Because the two responses were rewarded with equal frequency, the hypotheses that reward biases the production of particular responses (Hypothesis 2) or improves motor learning of particular responses (Hypothesis 1) hold that no differences in performance on rewarded and unrewarded stimulus combinations should be observed during training or transfer. In contrast, if reward affects the encoding of S-R associations (Hypothesis 3), then differences between the two types of combinations should be observed even though they involve the same set of equally rewarded responses.

Method

Participants. Forty-eight right-handed participants with normal or corrected-to-normal vision were recruited from the undergraduate population at the University of Iowa (mean age = 19.37 ± 1.25; 34 female). All participants received credit toward completion of an introductory psychology course at the University. Two participants were excluded for performing at chance level during transfer. All procedures used in the experiment were approved by the University of Iowa’s institutional review board.

Figure 5. Correlation between reward effect at transfer and composite awareness in Experiment 1, r = -0.22. See the online article for the color version of this figure.

Apparatus and stimuli. The stimuli were the same as in Experiment 1, but the responses were now one of two keys on a serial response box (see Figure 6). Participants responded with either their right index or middle finger. Four of the face pairs were mapped to the left button (Key 1, index finger) and four pairs were mapped to the right button (Key 2, middle finger). The remaining eight pairs were not presented. Each individual stimulus belonged to a pair mapped to Key 1 and a pair mapped to Key 2, so individual stimuli were uncorrelated with the correct response. In addition, two of the combinations mapped to the left and right keys were linked to rewards (random point values between 50 and 150), and the other two combinations were never rewarded. The combinations were chosen so that each individual face stimulus belonged to a rewarded pair and an unrewarded pair, so individual stimuli were also uncorrelated with reward. Moreover, half of the pairs linked to the left button and half of the pairs linked to the right button were rewarded. Rewarded pairs were counterbalanced across participants. With an average of 100 points per rewarded trial, the theoretical length of a block equaled at least 40 trials: 20 rewarded (20 × 100 points = 2,000) and 20 unrewarded. Generally, trials per block decreased through training.

Procedure. To learn the mappings between face pairs and responses, participants first completed a 75-trial familiarization phase in which they initially guessed and were given feedback, “correct” or “incorrect,” based on their response. Because there were only two response options, an “incorrect” response indicated that the participant should have pressed the other key. The main purpose of the familiarization phase was to familiarize the participant with the demands of the task. Thus, it is unlikely that participants acquired a significant amount of information during this phase. Indeed, overall accuracy for the last 10 trials of the familiarization block was 59% across participants.

After familiarization, participants performed an 8-block training phase, which was identical to the training phase of Experiment 1 with the exception that the response demands changed (only one response was required as opposed to two). Correct responses to rewarded combinations yielded points necessary to complete the block, and correct responses to unrewarded combinations yielded no points. Incorrect responses yielded no points, but the word “Incorrect” was displayed as corrective feedback.

Following training, participants immediately performed a 50-trial transfer phase in which they performed the same task as during training, but without rewards. No withheld combinations were included in this experiment.

Explicit knowledge questionnaire. Upon completing the transfer phase, participants were given a postexperiment questionnaire exactly as in Experiment 1, with the exception that we now assessed self-reported awareness using a 0-to-9 Likert scale, where 0 indicated a complete lack of awareness and 9 indicated complete awareness. Self-reported awareness scores were normed by dividing by 9 so that they ranged from 0 to 1.

Figure 6. Example training set for Experiment 2. Each box indicates the correct keypress for each combination of faces and whether the combination was linked to a reward. Blank boxes represent combinations not used in the experiment.



Statistical analysis. Participants’ RTs and accuracies served as the main dependent variables. For the analysis of RT, trials were eliminated if (a) an incorrect response was made or no response was made (27.1% of trials), (b) the trial followed an incorrect response (27.2%), (c) the stimulus combination was repeated from the previous trial (12.7%), or (d) an RT less than 300 ms or greater than 3,000 ms was recorded (3.6%). In sum, we eliminated 54% of all transfer and training data.3 All training and transfer data were analyzed as in Experiment 1.

Results

Training—RT. For the analysis of RT, the best-fitting model excluded the rate parameter as a fixed effect of reward. The fixed effect of reward was reliable on asymptote (rewarded: 1,195 ms; unrewarded: 1,250 ms), t(637) = 2.76, p < .01, d = 0.81, but not on magnitude (rewarded: 38 ms; unrewarded: 88 ms), t(637) = 1.34, ns. These results indicate that by the end of training rewarded combinations were performed faster than unrewarded combinations (Figure 7, left panel). Note that in Experiment 2 the RTs increased slightly with block, whereas in Experiment 1 RTs decreased with block. Despite this fundamental difference in the pattern of RTs between the two experiments, the effect of reward was similar.

Training—Accuracy. The best-fitting model excluded the rate parameter as a fixed effect of reward and as a random effect between subjects. The effect of reward was reliable on asymptote (rewarded: 1.02; unrewarded: 0.94), t(639) = -2.41, p < .05, d = 0.71, but not on magnitude (rewarded: 0.46; unrewarded: 0.40), t(639) = 1.18, ns. These results indicate that by the end of training rewarded combinations were performed with higher accuracy than unrewarded combinations (Figure 7, right panel).

Training—After Reward. To determine whether reward affected the subsequent trial, we modeled the same data across training for trials after rewarded and unrewarded trials, as in Experiment 1. The RT asymptote values for trials after reward (1,204 ms) were not statistically different from those for trials after no reward (1,221 ms), t(639) = -1.08, ns. Additionally, we saw no statistical difference for model-predicted magnitude (after reward: 49 ms; after no reward: 29 ms), |t| < 1. However, the effect of reward was reliable on rate (after reward: 0.48; after no reward: 0.78), t(639) = 2.10, p < .05, d = 0.62, indicating a faster rate for trials following unrewarded combinations than for trials following rewarded combinations. Overall, these results suggest that reward had little effect on the subsequent trial, with the exception that trials after rewarded trials were produced slightly slower by the end of training than trials after unrewarded trials.

The same analysis was performed for accuracy. The best-fitting model excluded the rate parameter as a fixed effect of reward and as a random effect between subjects. The model-predicted asymptote for trials after reward (0.90) was statistically similar to that for trials after no reward (0.89; |t| < 1). No statistical difference was revealed for model-predicted magnitude values (after reward: 0.33; after no reward: 0.32; |t| < 1). In short, reward had no significant impact on the RT or accuracy of the following trial.

Transfer—RT. The RTs for rewarded combinations (1,160 ms) during the transfer block, when no rewards were given on any trials, were shorter than those for unrewarded combinations (1,244 ms), t(46) = 3.02, p < .05, d = 0.37 (Figure 8, Panel A, left).

3 Although this proportion is high, it is attributable to the fact that participants needed to acquire information about the mapping through trial and error. Indeed, most of the data we eliminated were from the first four blocks of training (65%); 45% of data were eliminated from the last four blocks of training, and only 34.5% of data were eliminated from transfer. Moreover, we performed the same analyses on the training data retaining trials after errors and repeat trials (including 69% of the available data). The significant results did not change for this analysis.

Figure 7. Training results for Experiment 2. Solid bold lines represent model-fitted exponential curves for rewarded and unrewarded combinations for reaction time (RT; left panel) and accuracy (right panel) data. Dashed lines represent average observed data, and the shaded regions represent the SEMs for the observed data. Rewarded combinations showed a significantly smaller RT asymptote and a significantly greater accuracy asymptote than unrewarded combinations. See text for details. See the online article for the color version of this figure.



Figure 8. Transfer results for Experiment 2. The left panels represent reaction times (RTs) of the two combination types, and the right panels represent accuracies. These data are presented for all participants (A), participants whose RTs decreased through training (B), and participants whose RTs did not decrease through training (C). Error bars represent the SEM. All comparisons were made with paired t tests.



Transfer—Accuracy. Average accuracies for rewarded and unrewarded combinations during the transfer block were 0.86 and 0.85, respectively (t < 1; Figure 8, Panel A, right).

Questionnaire. The average self-reported awareness of the combination-reward relationship was 0.47 (SD = 0.32), where 1 represents full awareness of which combinations were rewarded and 0 equals no awareness. The mean accuracy score indicating which combinations were rewarded or unrewarded was 0.59 (SD = 0.18), which did not differ significantly from chance (Z = 0.48, ns). These two measures did not correlate significantly with each other (R² = 0.04, ns). No significant correlation was revealed between awareness and the difference between rewarded and unrewarded combination RTs (R² = 0.13, N = 45, p = .39). In addition, self-reported confidence (R² = 0.18, ns) and recall (R² = 0.04, ns) did not correlate significantly with the difference in RT between rewarded and unrewarded combinations. This suggests that awareness had very little effect on RT differences between rewarded and unrewarded combinations during transfer (see Figure 9).

Increasing versus decreasing RTs in Experiment 2. The results of Experiment 2 demonstrated that rewarded combinations were performed more quickly than unrewarded combinations during training. However, the shape of the learning curve was unconventional, as mean RT increased during training. Given that the task demanded that participants learn the S-R mapping through trial and error, it is possible that some participants simply made quick responses early in training in order to rapidly receive feedback. Using this strategy would reduce those participants’ RTs early in training, because some of those quick responses would be correct by chance, and would affect the change in RT across training. Indeed, further scrutiny of the data revealed that half of the participants showed a drop in RTs from Block 1 to Block 8 (n = 23), whereas the other half showed increasing RTs (n = 23). Therefore, we analyzed these groups of participants separately to determine whether reward affected performance and learning differently between groups.

Results—Participants Whose RTs Decreased Across Training

Training—RT. For this analysis, eliminating the rate parameter as a fixed effect of reward and as a random effect between subjects produced the best-fitting model. The effect of reward was reliable on the asymptote (rewarded: 1,053 ms; unrewarded: 1,133 ms), t(316) = 2.15, p < .05, d = 0.63, but not on the magnitude (rewarded: 225 ms; unrewarded: 180 ms), |t| < 1. These results indicate that by the end of training rewarded combinations were performed faster than unrewarded combinations for these participants (Figure 10, Panel A, left).

Training—Accuracy. Eliminating the asymptote parameter as a fixed effect of reward and eliminating magnitude and rate as random between-subjects effects produced the best model fit. The effect of reward was not reliable on magnitude (rewarded: 0.39; unrewarded: 0.43), t(317) = -1.18, ns, or rate (rewarded: 0.21; unrewarded: 0.15), t(317) = -1.25, ns. These results indicate no differences between rewarded and unrewarded combinations in terms of accuracy for this group (Figure 10, Panel A, right).4

Transfer—RT. The RTs for rewarded combinations (1,065 ms) during the transfer block were shorter than those for unrewarded combinations (1,206 ms), t(22) = 4.75, p < .001, d = 0.66 (Figure 8, Panel B, left) for this group.

4 To determine whether reward affected the subsequent trial for both groups, we modeled the same data across training for trials after rewarded and unrewarded trials. We saw no significant effect of prior reward on any parameter for either group.

Figure 9. Correlation between reward effect at transfer and composite awareness for Experiment 2, R² = 0.13. See the online article for the color version of this figure.

Transfer—Accuracy. Average accuracies for rewarded and unrewarded combinations during the transfer block were 0.86 and 0.89, respectively, t < 1 (Figure 8, Panel B, right).

Questionnaire. One participant chose not to fill out the postexperiment questionnaire. The average self-reported awareness of the combination-reward relationship was 0.47 (SD = 0.34). Participants in this group were 58% accurate (SD = 0.17) at indicating which combinations were rewarded or unrewarded, which did not differ significantly from chance (Z = 0.50, ns). These two measures did not correlate significantly with each other (R² = 0.12, ns). No significant correlation between composite awareness and the difference between rewarded and unrewarded combination RTs was revealed (R² = 0.08, N = 22, p = .73). In addition, self-reported confidence (R² = 0.09, ns) and recall (R² = 0.0002, ns) did not correlate significantly with the effect of reward. These results suggest that awareness had very little effect on RT differences between rewarded and unrewarded combinations during transfer for this group.

Results—Participants Whose RTs Increased Across Training

Figure 10. Training results for Experiment 2. Panel A represents the training data for participants whose reaction times (RTs) decreased over the course of training, while Panel B represents the training data for the remaining participants. Solid bold lines represent model-fitted exponential curves for rewarded and unrewarded combinations for RT (left panels) and accuracy (right panels) data. Dashed lines represent average observed data, and the shaded regions represent the SEMs for the observed data. Rewarded combinations showed a significantly smaller RT asymptote only for participants whose RTs decreased during training. See the online article for the color version of this figure.

Training—RT. For this analysis, a model comparison revealed that eliminating the rate parameter as a fixed effect of reward and as a random effect between subjects produced the best-fitting model. The effect of reward was not reliable on asymptote (rewarded: 1,273 ms; unrewarded: 1,316 ms), t(317) = 1.3, ns, or magnitude (rewarded: 261 ms; unrewarded: 317 ms), t(317) = -1.00. These results indicate that reward had no effect on RTs for these participants (Figure 10, Panel B, left).

Training—Accuracy. For this analysis, a model comparison revealed that eliminating the asymptote parameter as a fixed effect of reward and magnitude and rate as random effects produced the best-fitting model. The effect of reward was not reliable on magnitude (rewarded: 0.49; unrewarded: 0.49), t < 1, or rate (rewarded: 0.13; unrewarded: 0.10), t < 1. Overall, reward had no effect on RTs or accuracy for this group (Figure 10, Panel B, right).

Transfer—RT. The RTs for rewarded (1,255 ms) and unrewarded combinations (1,282 ms) during the transfer block did not differ (t < 1; Figure 8, Panel C, left) for this group.

Transfer—Accuracy. Average accuracies for rewarded and unrewarded combinations during the transfer block were 0.86 and 0.81, respectively. This difference was not statistically significant, t(22) = 1.35, ns (Figure 8, Panel C, right).

Questionnaire. The average self-reported awareness of the combination-reward relationship was 0.47 (SD = 0.30). Participants were 59% accurate (SD = 0.19) at indicating which combinations were rewarded or unrewarded, which did not differ significantly from chance (Z = 0.44, ns). These two measures did not correlate significantly with each other (r = -0.02, ns). No significant correlation between composite awareness and the difference between rewarded and unrewarded combination RTs was revealed (R² = 0.334, N = 23, p = .12). In addition, self-reported confidence (R² = 0.29, ns) and recall (R² = 0.16, ns) did not correlate significantly with the effect of reward. These results suggest that awareness had very little effect on RT differences between rewarded and unrewarded combinations during transfer for this group.

Discussion

The motor demands of Experiment 2 were greatly simplified (single keypresses), and both responses were equally likely to be rewarded. Thus, the advantage for rewarded combinations does not reflect a response bias (Hypothesis 2) or an advantage in learning the motor programs associated with the responses (Hypothesis 1). Instead, reward appears to facilitate the learning of the S-R mappings between the combination of stimuli and the particular response (Hypothesis 3). Moreover, in both experiments, advantages for previously rewarded combinations extended to the transfer blocks, when there was no possibility of reward, suggesting that the advantage stemmed from learning rather than anticipation of reward. Also, as in Experiment 1, participants demonstrated poor awareness of the combination-reward relationships, and the correlation between awareness and the reward effect at transfer (R² = 0.15) was not significant.

Intriguingly, we split the participants into two groups: those whose RTs decreased through training and those whose RTs did not. One reason for this difference is that some participants may have responded quickly during the first blocks of training in order to learn the S-R mapping. Consistent with this hypothesis, both groups of participants showed increases in accuracy throughout training, and those who responded quickly tended to have lower accuracies in the early blocks. The effect of reward was significant only for the former group, although both groups showed numerically larger decrements in RT for rewarded combinations. Critically, participants who showed a drop in RT and a rise in accuracy across training showed strong transfer of the reward effect.

General Discussion

The present experiments provide evidence that rewards can bolster incidental learning of S-R associations, extending the work of Palminteri et al. (2011) by showing that benefits for rewarded associations were obtained even when response biases were eliminated. Thus, we demonstrate that reward can facilitate the acquisition of S-R associations in an incidental learning task rather than merely affecting motor production. Although we cannot rule out the possibility that reward alters performance in other ways, such as biasing response selection or improving the learning of motor programs, it appears that rewarding performance strengthens the means by which complex stimuli activate particular response options (see Wifall et al., 2014). This would be sufficient to explain the results of both experiments, but reward may target other mechanisms as well.

Participants’ responses on a postexperiment questionnaire revealed little awareness of which combinations were rewarded, indicating that our manipulation successfully minimized such awareness. These results parallel previous research by Pessiglione et al. (2008) demonstrating instrumental learning using subliminal rewards. Thus, it appears that reward can influence instrumental as well as incidental learning, and that direct manipulation of top–down factors is not needed to observe these effects.

It is possible that some participants became aware of which combinations were rewarded and that this awareness was not accurately expressed through our postexperiment questionnaire. However, it is unlikely that participants were able to explicitly recognize which combinations were rewarded, because the best strategy for obtaining rewards in these experiments was to ensure an accurate response on these trials by slowing their responses. Although rewarded pairs were consistently performed more accurately than unrewarded pairs, they were also performed more quickly. This pattern, along with the poor ability to report which pairs were rewarded, suggests that the performance advantage for rewarded pairs resulted from stronger learned associations between the stimuli and responses.

Reward and Motivation

There is clear evidence that reward can motivate improved performance on declarative learning (e.g., Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Wittmann, Dolan, & Düzel, 2011; Wittmann et al., 2005), probabilistic learning (e.g., Delgado, Miller, Inati, & Phelps, 2005; Galvan et al., 2005), and procedural learning tasks (e.g., Abe et al., 2011; Wächter et al., 2009). Considering that reward can influence top–down factors, including attention (e.g., Anderson et al., 2011; Anderson & Yantis, 2012, 2013; Roper, Vecera, & Vaidya, 2014), cognitive control (e.g., van Steenbergen, Band, & Hommel, 2009, 2012), and visual working memory (e.g., Gong & Li, 2014), there are many possible ways in which reward can improve performance in learning tasks. However, given that we found that the benefit of reward generalized to a situation where rewards are no longer available, we conclude that reward can also improve learning without direct manipulation of top–down factors. Although enhancing motivation is a useful tool for teachers, it appears that rewards do not necessarily need to be used as a motivational tool in order to be effective; simply reinforcing associative learning with rewards is enough to incur the benefit.

The Basal Ganglia, Dopamine, and Incidental Learning

These findings are consistent with current models of the basal ganglia, in which dopamine acts as a teaching signal that trains the dorsal striatum to make accurate and efficient responses (Frank, 2005; Wickens, 1997). Although we did not measure neural activity in our experiments, the current results are consistent with gradual, basal ganglia–mediated learning (Ashby & Maddox, 2011; Delgado et al., 2005; Frank, 2005; Galvan et al., 2005; Palminteri et al., 2011; Palminteri et al., 2009).

Overwhelming evidence links incidental learning to the basal ganglia (e.g., Ashby & Maddox, 2011; Frank, 2005; Grafton, Hazeltine, & Ivry, 1995; Packard & Knowlton, 2002; Seger, 2006). Rats with dorsal striatum lesions are impaired at S-R learning (Packard, Hirsh, & White, 1989; Packard & McGaugh, 1996), and patients with degeneration of the basal ganglia, such as those with Huntington’s disease (Paulsen, Butters, Salmon, Heindel, & Swenson, 1993) and Parkinson’s disease (PD; Knowlton, Mangels, & Squire, 1996; Shohamy, Myers, Grossman, Sage, & Gluck, 2005; Siegert, Taylor, Weatherall, & Abernethy, 2006), show incidental learning deficits.

The specific contributions of dopaminergic modulation of the basal ganglia to incidental learning are beginning to be understood (de Vries, Ulte, Zwitserlood, Szymanski, & Knecht, 2010; Frank, 2005; Frank, Seeberger, & O’Reilly, 2004; Palminteri et al., 2009; Palminteri et al., 2011). Within the basal ganglia, midbrain dopaminergic neurons synapse on neurons in the dorsal and ventral striatum (Parent, 1990). Midbrain dopaminergic neurons are especially sensitive to the presence of an unexpected reward as well as the absence of an expected reward (Schultz, 1998; Zaghloul et al., 2009). Because dopaminergic axon terminals synapse on the striatum, it is logical to assume that reward might play a significant role in forming associations. Evidence from patients with Tourette’s syndrome, who experience increased dopaminergic transmission to the striatum (Graybiel & Canales, 2001), supports the notion that dopamine can reinforce motor skill learning (Palminteri et al., 2011): unmedicated patients with Tourette’s syndrome show significantly greater reward-related improvements on a motor-learning task than age-matched comparisons. Additionally, dopamine agonists have been shown to improve the formation of S-R relationships in rats (Packard & White, 1991) and the acquisition of statistical regularities in humans (de Vries et al., 2010). If dopamine supports incidental learning, is it possible that increased dopaminergic transmission to the striatum in anticipation of reward can bolster the encoding of associations?

Michael Frank and colleagues developed a model to explain the reinforcing role of dopamine and the incidental learning deficits observed in patients with PD (Frank, 2005, 2006). Although its release is diffuse, occurring through varicosities in the striatum, dopamine can selectively reinforce associations through two types of medium spiny neurons (MSNs): D1- and D2-receptor-expressing neurons (Ince, Ciliax, & Levey, 1997). MSNs expressing D1 receptors are potentiated by dopamine and stimulate the direct, or “go,” pathway, ultimately enhancing cortical activation. Conversely, MSNs expressing D2 receptors experience decreased potentiation in the presence of dopamine, which deactivates the indirect, or “no-go,” pathway. Although the release of dopamine may serve to reinforce particular associations, a decrease in dopamine release can decrease the activity of the “go” pathway while increasing the activity of the “no-go” pathway. Thus, associations that are linked to nigrostriatal dopamine release may be selectively strengthened within the cortex, whereas those that are accompanied by dips in nigrostriatal dopaminergic transmission may be selectively weakened.
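The opponent logic of this account can be caricatured in a few lines of code. The sketch below is a deliberate simplification, not Frank’s published neural network model; the learning rule, learning rate, and names are our own illustrative assumptions. A dopamine burst potentiates a D1-like “go” weight for the just-performed association, a dip potentiates the D2-like “no-go” weight, and the net “go minus no-go” drive determines how strongly the association is expressed.

```python
# Illustrative caricature of opponent D1 ("go") / D2 ("no-go") learning.
# A hedged simplification for exposition, not Frank's (2005) model.

ALPHA = 0.1  # arbitrary learning rate

def update(go, nogo, assoc, rewarded):
    """A dopamine burst potentiates the 'go' weight for the performed
    association; a dip (omitted reward) potentiates its 'no-go' weight."""
    if rewarded:
        go[assoc] += ALPHA * (1.0 - go[assoc])
    else:
        nogo[assoc] += ALPHA * (1.0 - nogo[assoc])

associations = ["rewarded", "unrewarded"]
go = {a: 0.0 for a in associations}
nogo = {a: 0.0 for a in associations}

for _ in range(100):  # 100 training trials per association
    update(go, nogo, "rewarded", rewarded=True)
    update(go, nogo, "unrewarded", rewarded=False)

def net_drive(assoc):
    # Net cortical facilitation: direct ("go") minus indirect ("no-go") pathway.
    return go[assoc] - nogo[assoc]

print(net_drive("rewarded") > net_drive("unrewarded"))  # True
```

Only the consistently rewarded association accumulates net facilitation, mirroring the selective strengthening described above.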

This model has been supported by functional imaging evidence demonstrating positive changes in striatal activation immediately following reward-related learning (Galvan et al., 2005; Knutson, Adams, Fong, & Hommer, 2001; Liu et al., 2007; Wächter et al., 2009; Wittmann et al., 2005). To conclude: if we assume, based on prior research, that (a) nigrostriatal dopamine release is predictive of reward (Schultz, 1998) and (b) nigrostriatal dopamine release is attenuated by the absence of an expected reward (Schultz, 1998), then rewarded associations should be reinforced within fronto-striatal circuitry more effectively than unrewarded ones. Such results would be consistent with several findings that support dopamine’s role in reinforcement learning (Berns & Sejnowski, 1998; Doya, 2000; Hikosaka, Nakamura, Sakai, & Nakahara, 2002; Suri & Schultz, 1998).
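Premises (a) and (b) together describe a reward-prediction-error signal. A minimal Rescorla-Wagner-style sketch (our illustration; the variable names and learning rate are arbitrary, and nothing here is fit to data) makes the sign logic concrete: the error is positive when an unexpected reward arrives and negative when an expected reward is omitted.

```python
# Minimal Rescorla-Wagner-style sketch of a dopamine-like reward prediction
# error: delta > 0 for an unexpected reward, delta < 0 for an omitted one.

ALPHA = 0.2  # arbitrary learning rate

def trial(value, reward):
    delta = reward - value          # prediction error ("dopamine-like" signal)
    return value + ALPHA * delta, delta

value = 0.0

# Early in training the reward is unexpected: a positive error (a "burst").
value, first_delta = trial(value, reward=1.0)

for _ in range(50):                 # continued training: reward becomes expected
    value, _ = trial(value, reward=1.0)

# Omitting the now-expected reward yields a negative error (a "dip").
_, omission_delta = trial(value, reward=0.0)

print(first_delta, omission_delta)  # 1.0 and approximately -1.0
```

Under this sign logic, consistently rewarded associations receive repeated positive teaching signals, whereas unrewarded ones do not, which is exactly the asymmetry the conclusion above requires.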

Future Directions

Several issues remain to be addressed to better understand the relationship between reward, dopamine, and learning. For example, one critical question is whether dopamine release in the basal ganglia causes learning or is the result of learning (Berridge, 2007). If dopamine in the basal ganglia does in fact cause the increased binding of associations that occurred in our experiments, we would expect the effect observed in our two experiments to be modulated by dopamine manipulation, as in previous research (Frank et al., 2004; Palminteri et al., 2009). Along these lines, we would expect dopamine agonists or antagonists to disrupt this effect, especially in patients with PD, in whom selective dopamine neurotransmission to rewards would be minimized. Thus, one would predict that medicated patients with PD would show significantly smaller reward-related incidental learning gains than age-matched comparisons.

It will also be vital to dissociate this effect from incentive learning (Berridge, 2007; Flagel et al., 2011). It is possible that reward simply acts to link incentive to a conditioned stimulus. Therefore, future research will need to determine whether dopamine simply updates what is incentivized or in fact directly causes the formation of associations. By mitigating the role of incentive learning with the chord-learning paradigm, it may be possible to further examine the role of dopaminergic neurotransmission in rewarded incidental learning. Additionally, mouse models may offer a viable path to test whether our findings are a function of dopaminergic neurotransmission, and it may be possible to tease apart the motivational role of reward from its effects on learning through fMRI (Javadi, Schmidt, & Smolka, 2014; Miller, Shankar, Knutson, & McClure, 2014).

Finally, it is vital not to discount the involvement of other brain areas that may contribute to this effect, such as the amygdala and orbitofrontal cortex (Holland & Gallagher, 2004; O’Doherty, 2004; Small et al., 2003; Watanabe, Sakagami, & Haruno, 2013) and the medial temporal cortex (Wimmer & Shohamy, 2012; Wittmann et al., 2005). Dopamine release reaches a myriad of structures in the brain through the mesolimbic and nigrostriatal pathways, including the motor cortex (Molina-Luna et al., 2009). Thus, the direct instantiation of rewarded incidental learning in the brain could occur through one or more of these areas.

Limitations of the Current Study

In the current study, we used reward to manipulate behavior. However, we cannot claim that our manipulation (i.e., points toward completing the experimental blocks) induced positive feelings. Indeed, our reward in these experiments was abstract (time) and not a primary reward (such as food). However, as noted by Palminteri et al. (2009), money is not necessary to induce reward-related behavioral changes (Frank et al., 2004; Schmidt et al., 2008). Thus, although we cannot make claims about the emotional aspect of the reward granted in our experiments, it is clear that our manipulation had a robust behavioral effect.

One might also argue that participants could have viewed rewards as a neutral condition and the absence of reward as a punishment. However, previous studies have noted that punishments do not induce significant and long-lasting gains in motor learning (Abe et al., 2011; Wächter et al., 2009). Therefore, we see it as unlikely that this effect was mediated through punishment circuitry. Even if punishment circuitry does play a role, however, the direct and indirect pathways of the basal ganglia could still have mediated this type of learning (Frank, 2005), and the effect may still be driven by changes in dopamine release.

Conclusion

We examined the learning of rewarded and unrewarded associations while minimizing the role of top–down factors such as attention and motivation. Consistent with previous findings from Palminteri et al. (2011), we demonstrated that participants can benefit from reward even when awareness of which combinations are rewarded remains low. This effect was observed even when multiple rewarded and unrewarded combinations shared a single response. We conclude that incidental learning processes are affected by reward in such a way as to allow the specific strengthening of rewarded associations.

References

Abe, M., Schambra, H., Wassermann, E. M., Luckenbaugh, D., Schweighofer, N., & Cohen, L. G. (2011). Reward improves long-term retention of a motor memory through induction of offline memory gains. Current Biology, 21, 557–562. http://dx.doi.org/10.1016/j.cub.2011.02.030

Adcock, R. A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli, J. D. (2006). Reward-motivated learning: Mesolimbic activation precedes memory formation. Neuron, 50, 507–517. http://dx.doi.org/10.1016/j.neuron.2006.03.036

Anderson, B. A., Laurent, P. A., & Yantis, S. (2011). Value-driven attentional capture. Proceedings of the National Academy of Sciences of the United States of America, 108, 10367–10371. http://dx.doi.org/10.1073/pnas.1104047108

Anderson, B. A., & Yantis, S. (2012). Value-driven attentional and oculomotor capture during goal-directed, unconstrained viewing. Attention, Perception, & Psychophysics, 74, 1644–1653. http://dx.doi.org/10.3758/s13414-012-0348-2

Anderson, B. A., & Yantis, S. (2013). Persistence of value-driven attentional capture. Journal of Experimental Psychology: Human Perception and Performance, 39, 6–9. http://dx.doi.org/10.1037/a0030860

Ashby, F. G., & Maddox, W. T. (2011). Human category learning 2.0. Annals of the New York Academy of Sciences, 1224, 147–161. http://dx.doi.org/10.1111/j.1749-6632.2010.05874.x

Ben-Artzi, E., & Marks, L. E. (1995). Visual–auditory interaction in speeded classification: Role of stimulus difference. Perception & Psychophysics, 57, 1151–1162. http://dx.doi.org/10.3758/BF03208371

Berns, G. S., & Sejnowski, T. J. (1998). A computational model of how the basal ganglia produce sequences. Journal of Cognitive Neuroscience, 10, 108–121. http://dx.doi.org/10.1162/089892998563815

Berridge, K. C. (2007). The debate over dopamine’s role in reward: The case for incentive salience. Psychopharmacology, 191, 391–431. http://dx.doi.org/10.1007/s00213-006-0578-x

Delgado, M. R., Miller, M. M., Inati, S., & Phelps, E. A. (2005). An fMRI study of reward-related probability learning. NeuroImage, 24, 862–873. http://dx.doi.org/10.1016/j.neuroimage.2004.10.002

de Vries, M. H., Ulte, C., Zwitserlood, P., Szymanski, B., & Knecht, S. (2010). Increasing dopamine levels in the brain improves feedback-based procedural learning in healthy participants: An artificial-grammar-learning experiment. Neuropsychologia, 48, 3193–3197. http://dx.doi.org/10.1016/j.neuropsychologia.2010.06.024

Doya, K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology, 10, 732–739. http://dx.doi.org/10.1016/S0959-4388(00)00153-7

Ebner, N. (2008). Age of face matters: Age-group differences in ratings of young and old faces. Behavior Research Methods, 40, 130–136. http://dx.doi.org/10.3758/BRM.40.1.130

Flagel, S. B., Clark, J. J., Robinson, T. E., Mayo, L., Czuj, A., Willuhn, I., . . . Akil, H. (2011, January 6). A selective role for dopamine in stimulus–reward learning. Nature, 469, 53–57. http://dx.doi.org/10.1038/nature09588

Frank, M. J. (2005). Dynamic dopamine modulation in the basal ganglia: A neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. Journal of Cognitive Neuroscience, 17, 51–72. http://dx.doi.org/10.1162/0898929052880093

Frank, M. J. (2006). Hold your horses: A dynamic computational role for the subthalamic nucleus in decision making. Neural Networks, 19, 1120–1136. http://dx.doi.org/10.1016/j.neunet.2006.03.006

Frank, M. J., Seeberger, L. C., & O’Reilly, R. C. (2004, December 10). By carrot or by stick: Cognitive reinforcement learning in Parkinsonism. Science, 306, 1940–1943. http://dx.doi.org/10.1126/science.1102941

Freedberg, M., Wagschal, T. T., & Hazeltine, E. (2014). Incidental learning and task boundaries. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40, 1680–1700. http://dx.doi.org/10.1037/xlm0000010

Galvan, A., Hare, T. A., Davidson, M., Spicer, J., Glover, G., & Casey, B. J. (2005). The role of ventral frontostriatal circuitry in reward-based learning in humans. The Journal of Neuroscience, 25, 8650–8656. http://dx.doi.org/10.1523/JNEUROSCI.2431-05.2005

Gong, M., & Li, S. (2014). Learned reward association improves visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 40, 841–856. http://dx.doi.org/10.1037/a0035131

Grafton, S. T., Hazeltine, E., & Ivry, R. (1995). Functional mapping of sequence learning in normal humans. Journal of Cognitive Neuroscience, 7, 497–510.

Graybiel, A. M., & Canales, J. J. (2001). The neurobiology of repetitive behaviors: Clues to the neurobiology of Tourette syndrome. Advances in Neurology, 85, 123–131.

Haith, A. M., Reppert, T. R., & Shadmehr, R. (2012). Evidence for hyperbolic temporal discounting of reward in control of movements. The Journal of Neuroscience, 32, 11727–11736. http://dx.doi.org/10.1523/JNEUROSCI.0424-12.2012

Hazeltine, E., Aparicio, P., Weinstein, A., & Ivry, R. B. (2007). Configural response learning: The acquisition of a nonpredictive motor skill. Journal of Experimental Psychology: Human Perception and Performance, 33, 1451–1467. http://dx.doi.org/10.1037/0096-1523.33.6.1451

Hikosaka, O., Nakamura, K., Sakai, K., & Nakahara, H. (2002). Central mechanisms of motor skill learning. Current Opinion in Neurobiology, 12, 217–222. http://dx.doi.org/10.1016/S0959-4388(02)00307-0

Holland, P. C., & Gallagher, M. (2004). Amygdala-frontal interactions and reward expectancy. Current Opinion in Neurobiology, 14, 148–155. http://dx.doi.org/10.1016/j.conb.2004.03.007

Ince, E., Ciliax, B. J., & Levey, A. I. (1997). Differential expression of D1 and D2 dopamine and m4 muscarinic acetylcholine receptor proteins in identified striatonigral neurons. Synapse, 27, 357–366. http://dx.doi.org/10.1002/(SICI)1098-2396(199712)27:4<357::AID-SYN9>3.0.CO;2-B

Javadi, A. H., Schmidt, D. H., & Smolka, M. N. (2014). Adolescents adapt more slowly than adults to varying reward contingencies. Journal of Cognitive Neuroscience, 26, 2670–2681. http://dx.doi.org/10.1162/jocn_a_00677

Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996, September 6). A neostriatal habit learning system in humans. Science, 273, 1399–1402. http://dx.doi.org/10.1126/science.273.5280.1399

Knutson, B., Adams, C. M., Fong, G. W., & Hommer, D. (2001). Anticipation of increasing monetary reward selectively recruits nucleus accumbens. The Journal of Neuroscience, 21, 1–5.

Liu, X., Powell, D. K., Wang, H., Gold, B. T., Corbly, C. R., & Joseph, J. E. (2007). Functional dissociation in frontal and striatal areas for processing of positive and negative reward information. The Journal of Neuroscience, 27, 4587–4597. http://dx.doi.org/10.1523/JNEUROSCI.5227-06.2007

Miller, E. M., Shankar, M. U., Knutson, B., & McClure, S. M. (2014). Dissociating motivation from reward in human striatal activity. Journal of Cognitive Neuroscience, 26, 1075–1084. http://dx.doi.org/10.1162/jocn_a_00535

Molina-Luna, K., Pekanovic, A., Röhrich, S., Hertler, B., Schubring-Giese, M., Rioult-Pedotti, M. S., & Luft, A. R. (2009). Dopamine in motor cortex is necessary for skill learning and synaptic plasticity. PLoS ONE, 4, e7082. http://dx.doi.org/10.1371/journal.pone.0007082

O’Doherty, J. P. (2004). Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769–776. http://dx.doi.org/10.1016/j.conb.2004.10.016

Opris, I., Lebedev, M., & Nelson, R. J. (2011). Motor planning under unpredictable reward: Modulations of movement vigor and primate striatum activity. Frontiers in Neuroscience, 5, 61.

Packard, M. G., Hirsh, R., & White, N. M. (1989). Differential effects of fornix and caudate nucleus lesions on two radial maze tasks: Evidence for multiple memory systems. The Journal of Neuroscience, 9, 1465–1472.

Packard, M. G., & Knowlton, B. J. (2002). Learning and memory functions of the basal ganglia. Annual Review of Neuroscience, 25, 563–593. http://dx.doi.org/10.1146/annurev.neuro.25.112701.142937

Packard, M. G., & McGaugh, J. L. (1996). Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiology of Learning and Memory, 65, 65–72. http://dx.doi.org/10.1006/nlme.1996.0007

Packard, M. G., & White, N. M. (1991). Dissociation of hippocampus and caudate nucleus memory systems by posttraining intracerebral injection of dopamine agonists. Behavioral Neuroscience, 105, 295–306. http://dx.doi.org/10.1037/0735-7044.105.2.295

Palminteri, S., Lebreton, M., Worbe, Y., Grabli, D., Hartmann, A., & Pessiglione, M. (2009). Pharmacological modulation of subliminal learning in Parkinson’s and Tourette’s syndromes. Proceedings of the National Academy of Sciences of the United States of America, 106, 19179–19184. http://dx.doi.org/10.1073/pnas.0904035106

Palminteri, S., Lebreton, M., Worbe, Y., Hartmann, A., Lehéricy, S., Vidailhet, M., . . . Pessiglione, M. (2011). Dopamine-dependent reinforcement of motor skill learning: Evidence from Gilles de la Tourette syndrome. Brain: A Journal of Neurology, 134, 2287–2301. http://dx.doi.org/10.1093/brain/awr147

Parent, A. (1990). Extrinsic connections of the basal ganglia. Trends in Neurosciences, 13, 254–258. http://dx.doi.org/10.1016/0166-2236(90)90105-J

Paulsen, J. S., Butters, N., Salmon, D. P., Heindel, W. C., & Swenson, M. R. (1993). Prism adaptation in Alzheimer’s and Huntington’s disease. Neuropsychology, 7, 73–81. http://dx.doi.org/10.1037/0894-4105.7.1.73

Pessiglione, M., Petrovic, P., Daunizeau, J., Palminteri, S., Dolan, R. J., & Frith, C. D. (2008). Subliminal instrumental conditioning demonstrated in the human brain. Neuron, 59, 561–567. http://dx.doi.org/10.1016/j.neuron.2008.07.005

Pinheiro, J. C., & Bates, D. M. (2000). Mixed-effects models in S and S-PLUS. New York, NY: Springer Publishing.

R Development Core Team. (2011). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66(1), 1.

Rescorla, R. A. (1988). Behavioral studies of Pavlovian conditioning. Annual Review of Neuroscience, 11(1), 329–352.

Roper, Z. J., Vecera, S. P., & Vaidya, J. G. (2014). Value-driven attentional capture in adolescence. Psychological Science, 25, 1987–1993. http://dx.doi.org/10.1177/0956797614545654

Schmidt, L., d’Arc, B. F., Lafargue, G., Galanaud, D., Czernecki, V., Grabli, D., . . . Pessiglione, M. (2008). Disconnecting force from money: Effects of basal ganglia damage on incentive motivation. Brain, 131, 1303–1310.

Schneider, W., Eschman, A., & Zuccolotto, A. (2002). E-Prime user’s guide. Pittsburgh, PA: Psychology Software Tools Inc.

Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1–27.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464. http://dx.doi.org/10.1214/aos/1176344136

Seger, C. A. (2006). The basal ganglia in human learning. The Neuroscientist, 12, 285–290. http://dx.doi.org/10.1177/1073858405285632

Seibel, R. (1963). Discrimination reaction time for a 1,023-alternative task. Journal of Experimental Psychology, 66, 215–226. http://dx.doi.org/10.1037/h0048914

Shohamy, D., Myers, C. E., Grossman, S., Sage, J., & Gluck, M. A. (2005). The role of dopamine in cognitive sequence learning: Evidence from Parkinson’s disease. Behavioural Brain Research, 156, 191–199. http://dx.doi.org/10.1016/j.bbr.2004.05.023

Siegert, R. J., Taylor, K. D., Weatherall, M., & Abernethy, D. A. (2006). Is implicit sequence learning impaired in Parkinson’s disease? A meta-analysis. Neuropsychology, 20, 490–495. http://dx.doi.org/10.1037/0894-4105.20.4.490

Small, D. M., Gregory, M. D., Mak, Y. E., Gitelman, D., Mesulam, M. M., & Parrish, T. (2003). Dissociation of neural representation of intensity and affective valuation in human gustation. Neuron, 39, 701–711. http://dx.doi.org/10.1016/S0896-6273(03)00467-7

Suri, R. E., & Schultz, W. (1998). Learning of sequential movements by neural network model with dopamine-like reinforcement signal. Experimental Brain Research, 121, 350–354. http://dx.doi.org/10.1007/s002210050467

van Steenbergen, H., Band, G. P., & Hommel, B. (2009). Reward counteracts conflict adaptation. Evidence for a role of affect in executive control. Psychological Science, 20, 1473–1477. http://dx.doi.org/10.1111/j.1467-9280.2009.02470.x

van Steenbergen, H., Band, G. P., & Hommel, B. (2012). Reward valence modulates conflict-driven attentional adaptation: Electrophysiological evidence. Biological Psychology, 90, 234–241. http://dx.doi.org/10.1016/j.biopsycho.2012.03.018

Wächter, T., Lungu, O., Lui, T., Willingham, D., & Ashe, J. (2009). Differential effect of reward and punishment on procedural learning. The Journal of Neuroscience, 29, 436–443.

Watanabe, N., Sakagami, M., & Haruno, M. (2013). Reward prediction error signal enhanced by striatum-amygdala interaction explains the acceleration of probabilistic reward learning by emotion. The Journal of Neuroscience, 33, 4487–4493. http://dx.doi.org/10.1523/JNEUROSCI.3400-12.2013

Wickens, J. (1997). Basal ganglia: Structure and computations. Network, 8, 77–109. http://dx.doi.org/10.1088/0954-898X_8_4_001

Wifall, T., McMurray, B., & Hazeltine, E. (2014). Perceptual similarity affects the learning curve (but not necessarily learning). Journal of Experimental Psychology: General, 143, 312–331. http://dx.doi.org/10.1037/a0030865

Wimmer, G. E., & Shohamy, D. (2012, October 12). Preference by association: How memory mechanisms in the hippocampus bias decisions. Science, 338, 270–273. http://dx.doi.org/10.1126/science.1223252

Wittmann, B. C., Dolan, R. J., & Düzel, E. (2011). Behavioral specifications of reward-associated long-term memory enhancement in humans. Learning & Memory, 18, 296–300. http://dx.doi.org/10.1101/lm.1996811

Wittmann, B. C., Schott, B. H., Guderian, S., Frey, J. U., Heinze, H. J., & Düzel, E. (2005). Reward-related FMRI activation of dopaminergic midbrain is associated with enhanced hippocampus-dependent long-term memory formation. Neuron, 45, 459–467. http://dx.doi.org/10.1016/j.neuron.2005.01.010

Zaghloul, K. A., Blanco, J. A., Weidemann, C. T., McGill, K., Jaggi, J. L., Baltuch, G. H., & Kahana, M. J. (2009, March 13). Human substantia nigra neurons encode unexpected financial rewards. Science, 323, 1496–1499. http://dx.doi.org/10.1126/science.1167342

Received March 28, 2015
Revision received August 20, 2015
Accepted August 24, 2015