Working Memory Training Harms Recognition Memory Performance 1
Practice Makes Imperfect: Working Memory Training Can Harm Recognition Memory
Performance
Laura E. Matzen1, Michael C. Trumbo1,2, Michael J. Haass1, Michael A. Hunter2, Austin Silva1,
Susan M. Stevens-Adams1, Michael F. Bunting3, Polly O’Rourke3
1Sandia National Laboratories
2University of New Mexico
3Center for the Advanced Study of Language
Supplemental Materials
Analysis of Spacing and Testing Effects in the Recognition Memory Task
The recognition memory task included several conditions that allowed us to investigate
the impact of WM training on repetition effects, spacing effects, and testing effects. Some of the
study words in the recognition task were presented only once and others were presented twice
with either short or long lags between the two presentations. Another set of words was quizzed
during the study session, with quizzes occurring at a short or long lag after the corresponding
study item. In general, participants should have better memory for repeated items than for items
that were presented only once. This benefit should be larger for repetitions that occur after a long
lag than for repetitions that occur after a short lag (e.g., Braun & Rubin, 1998; Cepeda, Pashler,
Vul, Wixted, & Rohrer, 2006). Participants should also have better memory for items that are
quizzed during the study session, particularly for quizzes that occur at longer lags (Karpicke &
Roediger, 2007). There are numerous explanations of the spacing effect and the testing effect,
many of which relate to the amount of time that individual items spend in WM. For example,
participants may have superior memory for spaced items because they spend additional time in
WM and serve as retrieval cues for prior presentations of the same items (Braun & Rubin, 1998).
Similarly, for the testing effect, retrieving a previously-studied item seems to provide the biggest
benefit to subsequent memory performance if it occurs after the item has been cleared from
working memory (Karpicke & Roediger, 2007). Therefore, we hypothesized that an improved
WM span might reduce the size of the spacing and testing effects. If the initial presentation of
each item spends more time in WM, it is more likely to be remembered at test. However, the
benefit of additional opportunities to retrieve the item during study will be reduced if the item is
still active in WM when the repetitions and quizzes appear.
Pre-training performance. To assess the effectiveness of the spacing and testing manipulations,
we analyzed the participants’ performance on each condition in the pre-training recognition memory test.
For this analysis, performance was collapsed across training groups because the participants had not yet
been exposed to any of the training interventions. Paired t-tests showed that, as predicted, the
participants’ d’ scores for items on the final memory test were significantly higher for words that were
repeated or quizzed during study relative to words that were studied only once (all ts > 10.45, all ps <
0.001). A two-way repeated measures ANOVA for the repeated and quizzed items showed that there was
a significant main effect of spacing (F(1, 216) = 37.24, p < 0.001, ηp2 = 0.15), a significant main effect of
testing (F(1, 216) = 48.70, p < 0.001, ηp2 = 0.18), and a significant interaction between the two (F(1, 216)
= 3.93, p < 0.05, ηp2 = 0.02). Post-hoc paired t-tests showed that there was a spacing effect for both the
repeated and the quizzed study items such that participants had significantly higher d’ scores for long lag
items than for short lag items (t(72) = 3.29, p < 0.001, Hedges’s gav = 0.18 for repeated words; t(72) =
5.70, p < 0.001, Hedges’s gav = 0.35 for quizzed words). There was also a testing effect where participants
had significantly higher d’ scores for conditions that were quizzed relative to conditions where the words
were merely studied twice (t(72) = 4.09, p < 0.001, Hedges’s gav = 0.22 for short lag items; t(72) = 5.67, p
< 0.001, Hedges’s gav = 0.37 for long lag items). The results from the pre-training session indicate that the
structure of the recognition task successfully produced spacing and testing effects.
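The partial eta squared values reported with each ANOVA can be recovered from the F statistic and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The following is a minimal sketch of that check for the pre-training ANOVA; the function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared) for the pre-training ANOVA.
reported = {
    "spacing": (37.24, 1, 216, 0.15),
    "testing": (48.70, 1, 216, 0.18),
    "interaction": (3.93, 1, 216, 0.02),
}

for name, (f_value, df_effect, df_error, reported_eta) in reported.items():
    computed = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{name}: computed {computed:.2f}, reported {reported_eta:.2f}")
```

The same identity can be applied to the other ANOVAs in this supplement as a quick consistency check on the reported effect sizes.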
Spacing and testing effects after training. For the post-training recognition memory test, memory
performance was analyzed separately for each training group. Paired t-tests showed that, as in the pre-
training session, the participants’ d’ scores for items on the final memory test were significantly higher
for words that were repeated or quizzed than for words that were studied only once. This was true for all
three training groups (for the control group: all ts > 4.39, all ps < 0.001, all Hedges’s gav > 0.43; for the
imagery training group: all ts > 6.48, all ps < 0.001, all Hedges’s gav > 0.47; for the WM training group:
all ts > 7.37, all ps < 0.001, all Hedges’s gav > 0.57).
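For readers who wish to compute a comparable effect size for paired comparisons, the sketch below shows one common way to obtain Hedges’s gav: the mean difference is divided by the average of the two standard deviations, and a small-sample correction is then applied. The exact correction used in the original analysis is not stated here, so the factor 1 − 3 / (4(n − 1) − 1) is an assumption, as are the function name and the example data.

```python
from statistics import mean, stdev


def hedges_g_av(first: list[float], second: list[float]) -> float:
    """Effect size for paired scores, standardized by the average SD.

    The small-sample correction below is one common approximation and is an
    assumption; the original analysis may have used a different variant.
    """
    if len(first) != len(second):
        raise ValueError("Paired samples must have the same length.")
    n = len(first)
    mean_difference = mean(first) - mean(second)
    average_sd = (stdev(first) + stdev(second)) / 2
    d_av = mean_difference / average_sd
    correction = 1 - 3 / (4 * (n - 1) - 1)
    return d_av * correction


# Hypothetical d' scores for four participants in two conditions.
long_lag = [2.0, 3.0, 4.0, 5.0]
short_lag = [1.0, 2.0, 3.0, 4.0]
print(f"Hedges's g_av = {hedges_g_av(long_lag, short_lag):.3f}")
```

Because the correction shrinks the estimate more strongly for small samples, it matters most for the per-group comparisons, which involve roughly two dozen participants each.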
For each training group, a two-way repeated measures ANOVA was used to test the spacing and
testing effects for the repeated and quizzed items in the post-training session. For the control group, there
was a significant main effect of testing (F(1, 72) = 4.68, p < 0.04, ηp2 = 0.06), but there was not a
significant main effect for spacing (F(1, 72) = 1.01, p = 0.32, ηp2 = 0.01). Post-hoc paired t-tests showed
that participants in the control group had significantly better memory performance for the long-lag
quizzes than for the long-lag repetitions (t(24) = 2.04, p < 0.03, Hedges’s gav = 0.16), but no significant
testing effect for the short-lag items (t(24) = 1.12, p = 0.14, Hedges’s gav = 0.11). The imagery training
group had a significant main effect of spacing (F(1, 69) = 9.23, p < 0.01, ηp2 = 0.12), but no significant
main effect of testing (F(1, 69) = 1.94, p = 0.17, ηp2 = 0.03). Post-hoc paired t-tests showed that the
spacing effect was significant for the quizzed items (t(23) = 3.09, p < 0.01, Hedges’s gav = 0.25) but not
for the repeated items (t(23) = 1.31, p = 0.10, Hedges’s gav = 0.12). Like the control group, the WM
training group had a significant main effect of testing (F(1, 69) = 13.19, p < 0.01, ηp2 = 0.16), but no
significant main effect of spacing (F(1, 69) = 0.40, p = 0.52, ηp2 = 0.01). Post-hoc paired t-tests showed
that participants in the WM training group had significantly better memory performance for the long-lag
quizzes than for the long-lag repetitions (t(23) = 3.94, p < 0.01, Hedges’s gav = 0.39), but no significant
testing effect for the short-lag items (t(24) = 1.40, p = 0.09, Hedges’s gav = 0.17).
To assess how training condition impacted the spacing and testing effects for each group, we
calculated the average benefit from spacing and the average benefit from testing for each participant, both
before and after training. To calculate the size of the spacing effect for each participant, we calculated the
difference in d’ scores (Δd’) between all long-lag items (including both repetitions and quizzes) and all
short-lag items. Similarly, for the testing effect, we calculated the difference in d’ scores between all
quizzed items and all repeated items. The average spacing and testing effects for the pre- and post-
training sessions for each training group are shown in Figure S3.
For the average spacing effect size, a 3x2 repeated measures ANOVA (training group x session)
showed that there was a significant main effect of training group (F(2, 70) = 4.57, p < 0.02, ηp2 = 0.12)
and a significant main effect of session (pre-training or post-training; F(1, 70) = 5.47, p = 0.02, ηp2 =
0.07). Post-hoc paired t-tests showed that participants in the WM training group had significantly smaller
average spacing effects in the post-training session relative to the pre-training session (t(23) = 3.08, p <
0.01, Hedges’s gav = 0.81). The average spacing effect sizes for the other two groups did not differ
significantly across training sessions (control group: t(24) = 0.08, p = 0.47, Hedges’s gav = 0.03; imagery
group: t(23) = 1.30, p = 0.10, Hedges’s gav = 0.27). For the average testing effect size, a 3x2 repeated
measures ANOVA (training group x session) showed that there was a significant main effect of training
group (F(2, 70) = 3.47, p < 0.04, ηp2 = 0.09) and a significant main effect of session (pre-training or post-
training; F(1, 70) = 5.21, p < 0.03, ηp2 = 0.07). Post-hoc paired t-tests showed that participants in the WM
training group had significantly smaller average testing effects in the post-training session relative to the
pre-training session (t(23) = 2.15, p = 0.02, Hedges’s gav = 0.54). The average testing effect sizes for the
other two groups did not differ significantly across training sessions (control group: t(24) = 1.39, p =
0.09, Hedges’s gav = 0.26; imagery group: t(23) = 0.48, p = 0.32, Hedges’s gav = 0.12).
Relationship between self-reported memory strategies and the spacing and testing effects.
Our analyses of the participants’ overall memory performance showed that there were significant
differences between participants who reported using mental imagery, semantic, and shallow
strategies. We also assessed how the magnitude of the spacing and testing effects differed for
participants who reported different memory strategies. In the post-training session, the average
spacing effect was 0.18 (SD = 0.24) for participants who reported using an imagery strategy,
0.05 (SD = 0.22) for participants who reported using a semantic strategy, and -0.06 (SD = 0.21)
for participants who reported using a shallow memory strategy. A one-way ANOVA showed that
the average spacing effect size differed significantly across the three groups (F(2, 53) = 5.83, p <
0.01, ηp2 = 0.18). Post-hoc t-tests showed that the participants who reported using an imagery
strategy had an average spacing effect size that was significantly larger than participants who
reported using a shallow memory strategy (t(40) = 3.38, p < 0.001, Hedges’s gav = 0.98) and
marginally larger than participants who reported using a semantic strategy (t(33) = 1.52, p =
0.07, Hedges’s gav = 0.55). There was also a marginal difference between the semantic and
shallow strategy groups (t(33) = 1.55, p = 0.06, Hedges’s gav = 0.45).
The average testing effect sizes in the post-training session were -0.05 (SD = 0.25) for
the participants who reported using an imagery strategy, 0.14 (SD = 0.18) for participants who
reported using a semantic strategy, and 0.20 (SD = 0.17) for participants who reported using a
shallow memory strategy. A one-way ANOVA showed that the average testing effect size
differed significantly across the three groups (F(2, 53) = 8.01, p < 0.001, ηp2 = 0.23). Post-hoc t-
tests showed that the participants who reported using an imagery strategy had an average testing
effect size that was significantly smaller than participants who reported using a shallow memory
strategy (t(40) = 3.73, p < 0.001, Hedges’s gav = 1.11) and participants who reported using a
semantic strategy (t(33) = 2.46, p < 0.01, Hedges’s gav = 0.80). There was not a significant
difference between the testing effect sizes for the semantic and shallow strategy groups (t(33) =
0.88, p = 0.19).
Discussion of the Spacing and Testing Effect Results
The recognition memory task’s conditions produced spacing and testing effects that
differed for the three memory training groups and for participant groupings that were based on
self-reported memory strategies. We initially hypothesized that if the WM training truly
improved the participants’ WM capacity, the studied words would spend more time in working
memory, leading to better memory overall, but potentially a reduction in the size of the spacing
and testing effects. We found that while the sizes of the spacing and testing effects were
consistent before and after training for the control and imagery training groups, both were
significantly smaller after training for the WM training group. However, we did not see an
improvement in memory performance for the once-presented items or short-lag repetitions as we
would expect if all items were spending more time in WM. Given the overall pattern of results
and the fact that the participants in the WM training group did not outperform the other two
groups on the near-transfer WM tasks (the rotation span and listening span tasks), it seems
unlikely that the reduction in the size of the spacing and testing effects is due to enhanced WM
capacity. Rather, the changes to the spacing and testing effect sizes seem to be driven by the WM
training group’s poorer overall recognition memory performance.
Additional support for this interpretation comes from the analysis of the spacing and
testing effects when the participants were grouped according to their self-reported memory
strategies. This analysis showed that the participants who reported using an imagery strategy had
larger spacing effects than participants who reported using shallow memory strategies. This
finding is consistent with past research showing large spacing effects for participants using deep
encoding strategies, and no spacing effect for participants using a rote rehearsal strategy
(Delaney & Knowles, 2005). It is not clear why participants who reported using an imagery
strategy had marginally larger spacing effects than participants who reported other deep
encoding strategies, but it is possible that mental imagery offers a unique benefit for the spacing
effect. The encoding variability account of the spacing effect posits that spaced repetitions are
more likely to have unique encoding contexts than massed repetitions, providing additional cues
for retrieval (cf. Balota, Duchek & Logan, 2007; Cepeda et al., 2006). In the case of mental
imagery, participants may be more likely to encode different mental images for words that are
spaced at long lags, increasing their chances of remembering one of those images at test.
The analysis of the testing effect showed that the participants who reported using a
mental imagery strategy had smaller testing effects than participants who reported using other
strategies. Since the testing effect is stronger when the information is retrieved after being
cleared from working memory (Karpicke & Roediger, 2007), this pattern suggests that study
items may have spent more time in WM for participants who used mental imagery relative to
participants who used other strategies. Given the small numbers of participants in each strategy
group, these interpretations are speculative. However, the patterns of results indicate that the
interplay between WM capacity, strategy use, and spacing and testing effects warrants further
investigation.
Additional analyses of the imagery training sessions
The participants’ ratings of how difficult it was to create mental images (where 1 was easy
and 5 was difficult) increased from an average of 3.23 for the memory tests in the first training
session, to an average of 3.60 for the memory tests in the second training session, and finally to
an average of 3.76 in the third training session. A repeated measures ANOVA showed that there
was a significant effect of test number on the participants’ difficulty ratings (F(13, 273) = 2.39, p
< 0.01, ηp2 = 0.10). Paired t-tests showed that participants’ ratings of difficulty increased
significantly on the second training session relative to the first (t(23) = 2.00, p < 0.03, Hedges’s
gav = 0.53), but there was not a significant difference between the participants’ average difficulty
ratings on the second and third training sessions (t(21) = 0.87). Similarly, the participants’
ratings of the effectiveness of the memory strategy (where 1 is not effective and 5 is very
effective) decreased from an average of 3.77 in the first training session to 3.02 in the second
training session, and 2.33 in the third training session. A repeated measures ANOVA showed
that there was a significant effect of test number on the participants’ effectiveness ratings (F(13,
273) = 5.35, p < 0.001, ηp2 = 0.20). Paired t-tests showed that participants’ ratings of the
effectiveness of the imagery strategy were significantly lower during the second training session
than the first (t(23) = 3.59, p < 0.001, Hedges’s gav = 0.86) and significantly lower during the
third training session than during the second training session (t(21) = 2.02, p < 0.03, Hedges’s gav
= 0.37). Therefore, improvement with regard to imaging proficiency was countered by
heightened task difficulty, such that participants did not feel that mental imagery was as
efficacious at the end of training (for difficult study lists) as it was at the beginning (for easy
study lists).
References
Balota, D. A., Duchek, J. M., & Logan, J. M. (2007). Is expanded retrieval practice a superior form of
spaced retrieval? A critical review of the extant literature. In J. S. Nairne (Ed.), The foundations of
remembering: Essays in honor of Henry L. Roediger, III (pp. 83-105). New York: Psychology
Press.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal
recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354-380.
Delaney, P. F., & Knowles, M. E. (2005). Encoding strategy changes and spacing effects in the free recall
of unmixed lists. Journal of Memory and Language, 52, 120-130.
Karpicke, J. D., & Roediger, H. L. (2007). Expanding retrieval practice promotes short-term retention, but
equally spaced retrieval enhances long-term retention. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 33, 704-719.
Table S1.
Design of recall tests used in mental imagery training sessions.
Imagery Training   Training   Total Number     Number of Study Words    Encoding Time
Study List         Session    of Study Words   With Low Imagability     Per Word
1                  1          10               1                        Self-paced
2                  1          10               2                        3 seconds
3                  2          10               2                        3 seconds
4                  2          10               3                        3 seconds
5                  2          10               3                        2 seconds
6                  2          10               3                        2 seconds
7                  2          12               3                        2 seconds
8                  2          12               4                        2 seconds
9                  3          12               4                        3 seconds
10                 3          12               4                        2 seconds
11                 3          14               4                        2 seconds
12                 3          14               5                        2 seconds
13                 3          16               5                        2 seconds
14                 3          16               6                        2 seconds
Table S2.
Mean and Standard Deviation Proportion Correct for Each Condition of the Recognition
Memory Task for Each Training Group
Condition               Training Group    Pre-training         Post-training        Change in Performance
                                          Proportion Correct   Proportion Correct   (Post – Pre)

Words quizzed during study session
Short-lag quizzes       Control Group     0.92 (0.08)          0.93 (0.09)          0.01 (0.07)
                        Mental Imagery    0.96 (0.06)          0.98 (0.03)          0.02 (0.06)
                        Working Memory    0.92 (0.01)          0.91 (0.01)          -0.01 (0.08)
Long-lag quizzes        Control Group     0.69 (0.17)          0.69 (0.18)          0.00 (0.16)
                        Mental Imagery    0.74 (0.24)          0.80 (0.24)          0.06 (0.23)
                        Working Memory    0.69 (0.20)          0.62 (0.24)          -0.07 (0.15)*
New words quizzed       Control Group     0.88 (0.20)          0.92 (0.09)          0.04 (0.15)
                        Mental Imagery    0.91 (0.10)          0.93 (0.11)          0.02 (0.09)
                        Working Memory    0.93 (0.05)          0.93 (0.07)          0.00 (0.06)

Words tested during subsequent memory test
Once-studied words      Control Group     0.47 (0.16)          0.49 (0.18)          0.02 (0.13)
                        Mental Imagery    0.57 (0.15)          0.64 (0.20)          0.07 (0.19)*
                        Working Memory    0.46 (0.17)          0.42 (0.20)          -0.05 (0.15)*
Short-lag repetitions   Control Group     0.59 (0.15)          0.60 (0.20)          0.01 (0.16)
                        Mental Imagery    0.68 (0.15)          0.76 (0.17)          0.08 (0.16)*
                        Working Memory    0.60 (0.16)          0.57 (0.19)          -0.03 (0.12)
Long-lag repetitions    Control Group     0.60 (0.16)          0.60 (0.22)          0.00 (0.20)
                        Mental Imagery    0.75 (0.14)          0.78 (0.19)          0.03 (0.16)
                        Working Memory    0.63 (0.16)          0.54 (0.22)          -0.09 (0.17)*
Short-lag quizzes       Control Group     0.64 (0.15)          0.63 (0.18)          -0.01 (0.15)
                        Mental Imagery    0.72 (0.13)          0.77 (0.15)          0.05 (0.12)*
                        Working Memory    0.66 (0.16)          0.59 (0.18)          -0.07 (0.14)*
Long-lag quizzes        Control Group     0.67 (0.15)          0.65 (0.21)          -0.02 (0.14)
                        Mental Imagery    0.78 (0.14)          0.81 (0.20)          0.03 (0.14)
                        Working Memory    0.76 (0.15)          0.63 (0.20)          -0.13 (0.12)*
New words               Control Group     0.83 (0.14)          0.82 (0.15)          -0.01 (0.09)
                        Mental Imagery    0.83 (0.10)          0.84 (0.11)          0.01 (0.13)
                        Working Memory    0.84 (0.09)          0.83 (0.11)          -0.01 (0.08)
Table S3.
Representative Examples of Each Memory Strategy
Category    Participants’ Descriptions

Imagery     “Tried to create a mental image that represented the word.”
            “Tried to create a vivid image of the word. If it was an object it was much easier than something abstract.”
            “String them together in groups that form a little story while thinking of pictures to go with the story.”
            “I tried to put pictures with the words or associate them with other words or feelings.”

Semantic    “Associated them with something I know or experienced.”
            “I made sentences in my head.”
            “Tried to blend them together in a story.”
            “I tried to associate the words with each other.”

Shallow     “Saying the words in my head.”
            “Repetition.”
            “Just tried to remember.”
            “No real strategy just like absorbing them and add them to a list.”
            “Not much of a strategy… I let the words sort of float to the back of my mind.”
            “I would remember the first letters of the words. Certain words would have meaning and were easily remembered without trying.”
Figure Captions
Figure S1. Breakdown of age, gender and education levels of study participants.
Figure S2. Average d’ score on each condition of the post-training recognition memory test for
participants who reported using imagery, semantic, and shallow memory strategies on the post-
training recognition test. Error bars show standard error.
Figure S3. Average spacing effect size (A) and average testing effect size (B) on the pre-training and
post-training recognition memory task for participants in each training group.
Figure S1. [Figure not reproduced. Two bar charts: the number of participants by age range (18-20 through 60-63, plus not reported), with a female/male legend, and the number of participants by highest level of education completed (High School, Some college, Associate Degree, Bachelor's Degree, Master's Degree, Ph.D., Not reported).]
Figure S2. [Figure not reproduced. Bar chart of the average d’ score on the post-training recognition memory test (y-axis: average d’ score, 0 to 2.5) for the imagery, semantic, and shallow strategy groups, with bars for items studied once; studied twice, short lag; studied twice, long lag; quizzed, short lag; and quizzed, long lag.]
Figure S3. [Figure not reproduced. Panel A: average spacing effect size (long-lag d’ scores minus short-lag d’ scores; y-axis 0 to 0.3), pre-training and post-training, for the control, imagery training, and WM training groups. Panel B: average testing effect size (quizzed item d’ scores minus repeated item d’ scores; y-axis 0 to 0.35), pre-training and post-training, for the same three groups.]