static-content.springer.com10.3758/s134… · web viewin the case of mental imagery, ... journal...

Working Memory Training Harms Recognition Memory Performance

Practice Makes Imperfect: Working Memory Training Can Harm Recognition Memory Performance

Laura E. Matzen1, Michael C. Trumbo1,2, Michael J. Haass1, Michael A. Hunter2, Austin Silva1,

Susan M. Stevens-Adams1, Michael F. Bunting3, Polly O’Rourke3

1Sandia National Laboratories

2University of New Mexico

3Center for the Advanced Study of Language

Supplemental Materials

Analysis of Spacing and Testing Effects in the Recognition Memory Task

The recognition memory task included several conditions that allowed us to investigate

the impact of WM training on repetition effects, spacing effects, and testing effects. Some of the

study words in the recognition task were presented only once and others were presented twice

with either short or long lags between the two presentations. Another set of words was quizzed

during the study session, with quizzes occurring at a short or long lag after the corresponding

study item. In general, participants should have better memory for repeated items than for items

that were presented only once. This benefit should be larger for repetitions that occur after a long

lag than for repetitions that occur after a short lag (e.g., Braun & Rubin, 1998; Cepeda, Pashler,

Vul, Wixted, & Rohrer, 2006). Participants should also have better memory for items that are

quizzed during the study session, particularly for quizzes that occur at longer lags (Karpicke &

Roediger, 2007). There are numerous explanations of the spacing effect and the testing effect,

many of which relate to the amount of time that individual items spend in WM. For example,

participants may have superior memory for spaced items because they spend additional time in

WM and serve as retrieval cues for prior presentations of the same items (Braun & Rubin, 1998).

Similarly, for the testing effect, retrieving a previously-studied item seems to provide the biggest

benefit to subsequent memory performance if it occurs after the item has been cleared from

working memory (Karpicke & Roediger, 2007). Therefore, we hypothesized that an improved

WM span might reduce the size of the spacing and testing effects. If the initial presentation of

each item spends more time in WM, it is more likely to be remembered at test. However, the

benefit of additional opportunities to retrieve the item during study will be reduced if the item is

still active in WM when the repetitions and quizzes appear.

Pre-training performance. To assess the effectiveness of the spacing and testing manipulations, we analyzed the participants’ performance on each condition in the pre-training recognition memory test. For this analysis, performance was collapsed across training groups because the participants had not yet been exposed to any of the training interventions. Paired t-tests showed that, as predicted, the participants’ d’ scores for items on the final memory test were significantly higher for words that were repeated or quizzed during study relative to words that were studied only once (all ts > 10.45, all ps < 0.001). A two-way repeated measures ANOVA for the repeated and quizzed items showed that there was a significant main effect of spacing (F(1, 216) = 37.24, p < 0.001, ηp² = 0.15), a significant main effect of testing (F(1, 216) = 48.70, p < 0.001, ηp² = 0.18), and a significant interaction between the two (F(1, 216) = 3.93, p < 0.05, ηp² = 0.02). Post-hoc paired t-tests showed that there was a spacing effect for both the repeated and the quizzed study items such that participants had significantly higher d’ scores for long-lag items than for short-lag items (t(72) = 3.29, p < 0.001, Hedges’s g_av = 0.18 for repeated words; t(72) = 5.70, p < 0.001, Hedges’s g_av = 0.35 for quizzed words). There was also a testing effect where participants had significantly higher d’ scores for conditions that were quizzed relative to conditions where the words were merely studied twice (t(72) = 4.09, p < 0.001, Hedges’s g_av = 0.22 for short-lag items; t(72) = 5.67, p < 0.001, Hedges’s g_av = 0.37 for long-lag items). The results from the pre-training session indicate that the structure of the recognition task successfully produced spacing and testing effects.
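To illustrate the paired comparisons reported above, the following Python sketch computes a paired t statistic and Hedges’s g_av for two sets of within-participant scores. The supplement does not state how g_av was calculated, so this is only one common formulation and should be read as an assumption: the mean difference is divided by the average of the two condition standard deviations, then multiplied by the small-sample correction 1 - 3/(4(n - 1) - 1). The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_hedges_g_av(scores_a, scores_b):
    """Return (t, g_av) for paired scores from the same participants.

    Assumption (not taken from the supplement): g_av is the mean difference
    divided by the average of the two condition SDs, multiplied by the
    small-sample correction 1 - 3 / (4 * (n - 1) - 1).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]

    # Paired t statistic: mean difference over its standard error (df = n - 1).
    t = mean(diffs) / (stdev(diffs) / sqrt(n))

    # Standardize by the average of the two condition standard deviations.
    d_av = mean(diffs) / ((stdev(scores_a) + stdev(scores_b)) / 2)

    # Bias correction for small samples (assumed form, see docstring).
    correction = 1 - 3 / (4 * (n - 1) - 1)
    return t, d_av * correction


# Hypothetical d' scores for four participants in two conditions.
long_lag = [2.0, 2.5, 3.0, 3.5]
short_lag = [1.0, 2.0, 2.0, 3.0]
t_value, g_av = paired_t_and_hedges_g_av(long_lag, short_lag)
```

Because the standardizer uses the two condition standard deviations and not the standard deviation of the difference scores, g_av stays comparable to effect sizes from between-group designs, at the cost of ignoring the correlation between the paired measures.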

Spacing and testing effects after training. For the post-training recognition memory test, memory performance was analyzed separately for each training group. Paired t-tests showed that, as in the pre-training session, the participants’ d’ scores for items on the final memory test were significantly higher for words that were repeated or quizzed than for words that were studied only once. This was true for all three training groups (for the control group: all ts > 4.39, all ps < 0.001, all Hedges’s g_av > 0.43; for the imagery training group: all ts > 6.48, all ps < 0.001, all Hedges’s g_av > 0.47; for the WM training group: all ts > 7.37, all ps < 0.001, all Hedges’s g_av > 0.57).

For each training group, a two-way repeated measures ANOVA was used to test the spacing and testing effects for the repeated and quizzed items in the post-training session. For the control group, there was a significant main effect of testing (F(1, 72) = 4.68, p < 0.04, ηp² = 0.06), but there was not a significant main effect of spacing (F(1, 72) = 1.01, p = 0.32, ηp² = 0.01). Post-hoc paired t-tests showed that participants in the control group had significantly better memory performance for the long-lag quizzes than for the long-lag repetitions (t(24) = 2.04, p < 0.03, Hedges’s g_av = 0.16), but no significant testing effect for the short-lag items (t(24) = 1.12, p = 0.14, Hedges’s g_av = 0.11). The imagery training group had a significant main effect of spacing (F(1, 69) = 9.23, p < 0.01, ηp² = 0.12), but no significant main effect of testing (F(1, 69) = 1.94, p = 0.17, ηp² = 0.03). Post-hoc paired t-tests showed that the spacing effect was significant for the quizzed items (t(23) = 3.09, p < 0.01, Hedges’s g_av = 0.25) but not for the repeated items (t(23) = 1.31, p = 0.10, Hedges’s g_av = 0.12). Like the control group, the WM training group had a significant main effect of testing (F(1, 69) = 13.19, p < 0.01, ηp² = 0.16), but no significant main effect of spacing (F(1, 69) = 0.40, p = 0.52, ηp² = 0.01). Post-hoc paired t-tests showed that participants in the WM training group had significantly better memory performance for the long-lag quizzes than for the long-lag repetitions (t(23) = 3.94, p < 0.01, Hedges’s g_av = 0.39), but no significant testing effect for the short-lag items (t(24) = 1.40, p = 0.09, Hedges’s g_av = 0.17).

To assess how training condition impacted the spacing and testing effects for each group, we

calculated the average benefit from spacing and the average benefit from testing for each participant, both

before and after training. To calculate the size of the spacing effect for each participant, we calculated the

difference in d’ scores (Δd’) between all long-lag items (including both repetitions and quizzes) and all

short-lag items. Similarly, for the testing effect, we calculated the difference in d’ scores between all

quizzed items and all repeated items. The average spacing and testing effects for the pre- and post-

training sessions for each training group are shown in Figure S3.
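The per-participant effect sizes described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: it assumes that "all long-lag items" and "all quizzed items" are summarized by averaging the d’ scores of the two relevant conditions, which the supplement does not specify. The function name, dictionary keys, and example values are hypothetical.

```python
from statistics import mean


def spacing_and_testing_effects(d_prime):
    """Compute one participant's spacing and testing effects (differences in d').

    d_prime maps four hypothetical condition labels to d' scores:
    'rep_short', 'rep_long', 'quiz_short', 'quiz_long'.
    Assumption: conditions are combined by averaging their d' scores.
    """
    long_lag = mean([d_prime["rep_long"], d_prime["quiz_long"]])
    short_lag = mean([d_prime["rep_short"], d_prime["quiz_short"]])
    quizzed = mean([d_prime["quiz_short"], d_prime["quiz_long"]])
    repeated = mean([d_prime["rep_short"], d_prime["rep_long"]])
    return {
        "spacing": long_lag - short_lag,  # long-lag minus short-lag items
        "testing": quizzed - repeated,    # quizzed minus repeated items
    }


# Hypothetical d' scores for a single participant in one session.
example = {"rep_short": 1.0, "rep_long": 1.2, "quiz_short": 1.4, "quiz_long": 1.8}
effects = spacing_and_testing_effects(example)
```

Running the function once on a participant's pre-training scores and once on the post-training scores gives the paired values that a pre/post comparison would use.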

For the average spacing effect size, a 3 × 2 repeated measures ANOVA (training group × session) showed that there was a significant main effect of training group (F(2, 70) = 4.57, p < 0.02, ηp² = 0.12) and a significant main effect of session (pre-training or post-training; F(1, 70) = 5.47, p = 0.02, ηp² = 0.07). Post-hoc paired t-tests showed that participants in the WM training group had significantly smaller average spacing effects in the post-training session relative to the pre-training session (t(23) = 3.08, p < 0.01, Hedges’s g_av = 0.81). The average spacing effect sizes for the other two groups did not differ significantly across training sessions (control group: t(24) = 0.08, p = 0.47, Hedges’s g_av = 0.03; imagery group: t(23) = 1.30, p = 0.10, Hedges’s g_av = 0.27). For the average testing effect size, a 3 × 2 repeated measures ANOVA (training group × session) showed that there was a significant main effect of training group (F(2, 70) = 3.47, p < 0.04, ηp² = 0.09) and a significant main effect of session (pre-training or post-training; F(1, 70) = 5.21, p < 0.03, ηp² = 0.07). Post-hoc paired t-tests showed that participants in the WM training group had significantly smaller average testing effects in the post-training session relative to the pre-training session (t(23) = 2.15, p = 0.02, Hedges’s g_av = 0.54). The average testing effect sizes for the other two groups did not differ significantly across training sessions (control group: t(24) = 1.39, p = 0.09, Hedges’s g_av = 0.26; imagery group: t(23) = 0.48, p = 0.32, Hedges’s g_av = 0.12).

Relationship between self-reported memory strategies and the spacing and testing effects. Our analyses of the participants’ overall memory performance showed that there were significant differences between participants who reported using mental imagery, semantic, and shallow strategies. We also assessed how the magnitude of the spacing and testing effects differed for participants who reported different memory strategies. In the post-training session, the average spacing effect was 0.18 (SD = 0.24) for participants who reported using an imagery strategy, 0.05 (SD = 0.22) for participants who reported using a semantic strategy, and -0.06 (SD = 0.21) for participants who reported using a shallow memory strategy. A one-way ANOVA showed that the average spacing effect size differed significantly across the three groups (F(2, 53) = 5.83, p < 0.01, ηp² = 0.18). Post-hoc t-tests showed that the participants who reported using an imagery strategy had an average spacing effect size that was significantly larger than that of participants who reported using a shallow memory strategy (t(40) = 3.38, p < 0.001, Hedges’s g_av = 0.98) and marginally larger than that of participants who reported using a semantic strategy (t(33) = 1.52, p = 0.07, Hedges’s g_av = 0.55). There was also a marginal difference between the semantic and shallow strategy groups (t(33) = 1.55, p = 0.06, Hedges’s g_av = 0.45).

The average testing effect sizes in the post-training session were -0.05 (SD = 0.25) for the participants who reported using an imagery strategy, 0.14 (SD = 0.18) for participants who reported using a semantic strategy, and 0.20 (SD = 0.17) for participants who reported using a shallow memory strategy. A one-way ANOVA showed that the average testing effect size differed significantly across the three groups (F(2, 53) = 8.01, p < 0.001, ηp² = 0.23). Post-hoc t-tests showed that the participants who reported using an imagery strategy had an average testing effect size that was significantly smaller than that of participants who reported using a shallow memory strategy (t(40) = 3.73, p < 0.001, Hedges’s g_av = 1.11) and that of participants who reported using a semantic strategy (t(33) = 2.46, p < 0.01, Hedges’s g_av = 0.80). There was not a significant difference between the testing effect sizes for the semantic and shallow strategy groups (t(33) = 0.88, p = 0.19).

Discussion of the Spacing and Testing Effect Results

The recognition memory task’s conditions produced spacing and testing effects that

differed for the three memory training groups and for participant groupings that were based on

self-reported memory strategies. We initially hypothesized that if the WM training truly

improved the participants’ WM capacity, the studied words would spend more time in working

memory, leading to better memory overall, but potentially a reduction in the size of the spacing

and testing effects. We found that while the sizes of the spacing and testing effects were

consistent before and after training for the control and imagery training groups, both were

significantly smaller after training for the WM training group. However, we did not see an

improvement in memory performance for the once-presented items or short-lag repetitions as we

would expect if all items were spending more time in WM. Given the overall pattern of results

and the fact that the participants in the WM training group did not outperform the other two

groups on the near-transfer WM tasks (the rotation span and listening span tasks), it seems

unlikely that the reduction in the size of the spacing and testing effects is due to enhanced WM

capacity. Rather, the changes to the spacing and testing effect sizes seem to be driven by the WM

training group’s poorer overall recognition memory performance.

Additional support for this interpretation comes from the analysis of the spacing and

testing effects when the participants were grouped according to their self-reported memory

strategies. This analysis showed that the participants who reported using an imagery strategy had

larger spacing effects than participants who reported using shallow memory strategies. This

finding is consistent with past research showing large spacing effects for participants using deep

encoding strategies, and no spacing effect for participants using a rote rehearsal strategy

(Delaney & Knowles, 2005). It is not clear why participants who reported using an imagery

strategy had marginally larger spacing effects than participants who reported other deep

encoding strategies, but it is possible that mental imagery offers a unique benefit for the spacing

effect. The encoding variability account of the spacing effect posits that spaced repetitions are

more likely to have unique encoding contexts than massed repetitions, providing additional cues

for retrieval (cf. Balota, Duchek & Logan, 2007; Cepeda et al., 2006). In the case of mental

imagery, participants may be more likely to encode different mental images for words that are

spaced at long lags, increasing their chances of remembering one of those images at test.

The analysis of the testing effect showed that the participants who reported using a

mental imagery strategy had smaller testing effects than participants who reported using other

strategies. Since the testing effect is stronger when the information is retrieved after being

cleared from working memory (Karpicke & Roediger, 2007), this pattern suggests that study

items may have spent more time in WM for participants who used mental imagery relative to

participants who used other strategies. Given the small numbers of participants in each strategy

group, these interpretations are speculative. However, the patterns of results indicate that the

interplay between WM capacity, strategy use, and spacing and testing effects warrants further

investigation.

Additional analyses of the imagery training sessions

The participants’ ratings of how easy or difficult it was to create mental images (where 1 was easy and 5 was difficult) increased from an average of 3.23 for the memory tests in the first training session, to an average of 3.60 for the memory tests in the second training session, and finally to an average of 3.76 in the third training session. A repeated measures ANOVA showed that there was a significant effect of test number on the participants’ difficulty ratings (F(13, 273) = 2.39, p < 0.01, ηp² = 0.10). Paired t-tests showed that participants’ ratings of difficulty increased significantly on the second training session relative to the first (t(23) = 2.00, p < 0.03, Hedges’s g_av = 0.53), but there was not a significant difference between the participants’ average difficulty ratings on the second and third training sessions (t(21) = 0.87). Similarly, the participants’ ratings of the effectiveness of the memory strategy (where 1 was not effective and 5 was very effective) decreased from an average of 3.77 in the first training session to 3.02 in the second training session, and 2.33 in the third training session. A repeated measures ANOVA showed that there was a significant effect of test number on the participants’ effectiveness ratings (F(13, 273) = 5.35, p < 0.001, ηp² = 0.20). Paired t-tests showed that participants’ ratings of the effectiveness of the imagery strategy were significantly lower during the second training session than the first (t(23) = 3.59, p < 0.001, Hedges’s g_av = 0.86) and significantly lower during the third training session than during the second training session (t(21) = 2.02, p < 0.03, Hedges’s g_av = 0.37). Therefore, any improvement in imagery proficiency was countered by heightened task difficulty, such that participants did not feel that mental imagery was as efficacious at the end of training (for difficult study lists) as it was at the beginning (for easy study lists).

References

Balota, D. A., Duchek, J. M., & Logan, J. M. (2007). Is expanded retrieval practice a superior form of spaced retrieval? A critical review of the extant literature. In J. S. Nairne (Ed.), The foundations of remembering: Essays in honor of Henry L. Roediger, III (pp. 83-105). New York: Psychology Press.

Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354-380.

Delaney, P. F., & Knowles, M. E. (2005). Encoding strategy changes and spacing effects in the free recall of unmixed lists. Journal of Memory and Language, 52, 120-130.

Karpicke, J. D., & Roediger, H. L. (2007). Expanding retrieval practice promotes short-term retention, but equally spaced retrieval enhances long-term retention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 704-719.

Table S1.

Design of recall tests used in mental imagery training sessions.

Imagery Training Study List  Training Session  Total Number of Study Words  Number of Study Words With Low Imageability  Encoding Time Per Word
1                            1                 10                           1                                            Self-paced
2                            1                 10                           2                                            3 seconds
3                            2                 10                           2                                            3 seconds
4                            2                 10                           3                                            3 seconds
5                            2                 10                           3                                            2 seconds
6                            2                 10                           3                                            2 seconds
7                            2                 12                           3                                            2 seconds
8                            2                 12                           4                                            2 seconds
9                            3                 12                           4                                            3 seconds
10                           3                 12                           4                                            2 seconds
11                           3                 14                           4                                            2 seconds
12                           3                 14                           5                                            2 seconds
13                           3                 16                           5                                            2 seconds
14                           3                 16                           6                                            2 seconds

Table S2.

Mean and Standard Deviation Proportion Correct for Each Condition of the Recognition

Memory Task for Each Training Group

Condition              Training Group  Pre-training Proportion Correct  Post-training Proportion Correct  Change in Performance (Post – Pre)
Short-lag quizzes      Control Group   0.92 (0.08)                      0.93 (0.09)                       0.01 (0.07)
Short-lag quizzes      Mental Imagery  0.96 (0.06)                      0.98 (0.03)                       0.02 (0.06)
Short-lag quizzes      Working Memory  0.92 (0.01)                      0.91 (0.01)                       -0.01 (0.08)
Long-lag quizzes       Control Group   0.69 (0.17)                      0.69 (0.18)                       0.00 (0.16)
Long-lag quizzes       Mental Imagery  0.74 (0.24)                      0.80 (0.24)                       0.06 (0.23)
Long-lag quizzes       Working Memory  0.69 (0.20)                      0.62 (0.24)                       -0.07 (0.15)*
New words quizzed      Control Group   0.88 (0.20)                      0.92 (0.09)                       0.04 (0.15)
New words quizzed      Mental Imagery  0.91 (0.10)                      0.93 (0.11)                       0.02 (0.09)
New words quizzed      Working Memory  0.93 (0.05)                      0.93 (0.07)                       0.00 (0.06)
Once-studied words     Control Group   0.47 (0.16)                      0.49 (0.18)                       0.02 (0.13)
Once-studied words     Mental Imagery  0.57 (0.15)                      0.64 (0.20)                       0.07 (0.19)*
Once-studied words     Working Memory  0.46 (0.17)                      0.42 (0.20)                       -0.05 (0.15)*
Short-lag repetitions  Control Group   0.59 (0.15)                      0.60 (0.20)                       0.01 (0.16)
Short-lag repetitions  Mental Imagery  0.68 (0.15)                      0.76 (0.17)                       0.08 (0.16)*
Short-lag repetitions  Working Memory  0.60 (0.16)                      0.57 (0.19)                       -0.03 (0.12)
Long-lag repetitions   Control Group   0.60 (0.16)                      0.60 (0.22)                       0.00 (0.20)
Long-lag repetitions   Mental Imagery  0.75 (0.14)                      0.78 (0.19)                       0.03 (0.16)
Long-lag repetitions   Working Memory  0.63 (0.16)                      0.54 (0.22)                       -0.09 (0.17)*
Short-lag quizzes      Control Group   0.64 (0.15)                      0.63 (0.18)                       -0.01 (0.15)
Short-lag quizzes      Mental Imagery  0.72 (0.13)                      0.77 (0.15)                       0.05 (0.12)*
Short-lag quizzes      Working Memory  0.66 (0.16)                      0.59 (0.18)                       -0.07 (0.14)*
Long-lag quizzes       Control Group   0.67 (0.15)                      0.65 (0.21)                       -0.02 (0.14)
Long-lag quizzes       Mental Imagery  0.78 (0.14)                      0.81 (0.20)                       0.03 (0.14)
Long-lag quizzes       Working Memory  0.76 (0.15)                      0.63 (0.20)                       -0.13 (0.12)*
New words              Control Group   0.83 (0.14)                      0.82 (0.15)                       -0.01 (0.09)
New words              Mental Imagery  0.83 (0.10)                      0.84 (0.11)                       0.01 (0.13)
New words              Working Memory  0.84 (0.09)                      0.83 (0.11)                       -0.01 (0.08)
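The d’ scores analyzed in the text combine performance on studied words with performance on new words. The supplement does not describe the exact computation, so the following Python sketch only illustrates the standard signal detection formula, d’ = z(hit rate) - z(false alarm rate), under the assumption that the false alarm rate equals one minus the proportion correct on new words. The function name is hypothetical, and no correction is applied for rates of exactly 0 or 1, for which the inverse normal function is undefined.

```python
from statistics import NormalDist


def d_prime(hit_rate, new_word_accuracy):
    """Standard signal detection d' from two proportions.

    hit_rate: proportion of studied words correctly recognized.
    new_word_accuracy: proportion of new words correctly rejected
    (assumption: false alarm rate = 1 - new_word_accuracy).
    Both rates must lie strictly between 0 and 1.
    """
    false_alarm_rate = 1.0 - new_word_accuracy
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Applied to group means from the table (control group, pre-training):
# once-studied words 0.47 correct, new words 0.83 correct.
control_once_studied = d_prime(0.47, 0.83)
```

A d’ computed from group mean proportions, as in this usage line, will generally differ from the mean of individual participants’ d’ scores, so it should not be expected to reproduce the values analyzed in the text.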

Table S3.

Representative Examples of Each Memory Strategy

Category Participants’ Descriptions

Imagery “Tried to create a mental image that represented the word.”

“Tried to create a vivid image of the word. If it was an object it was much easier than something abstract.”

“String them together in groups that form a little story while thinking of pictures to go with the story.”

“I tried to put pictures with the words or associate them with other words or feelings.”

Semantic “Associated them with something I know or experienced.”

“I made sentences in my head.”

“Tried to blend them together in a story.”

“I tried to associate the words with each other.”

Shallow “Saying the words in my head.”

“Repetition.”

“Just tried to remember.”

“No real strategy just like absorbing them and add them to a list.”

“Not much of a strategy… I let the words sort of float to the back of my mind.”

“I would remember the first letters of the words. Certain words would have meaning and were easily remembered without trying.”

Figure Captions

Figure S1. Breakdown of age, gender and education levels of study participants.

Figure S2. Average d’ score on each condition of the post-training recognition memory test for

participants who reported using imagery, semantic, and shallow memory strategies on the post-

training recognition test. Error bars show standard error.

Figure S3. Average spacing effect size (A) and average testing effect size (B) on the pre-training and

post-training recognition memory task for participants in each training group.

Figure S1. [Figure not reproduced. Panels: Age Range of Participants (18-20 through 60-63, or not reported); gender (Female, Male); Highest Level of Education Completed (High School through Ph.D., or not reported).]

Figure S2. [Figure not reproduced. Groups: Imagery Strategy, Semantic Strategy, Shallow Strategy. Vertical axis: Average d' Score on Post-training Recognition Memory Test. Series: Studied Once; Studied Twice, Short Lag; Studied Twice, Long Lag; Quizzed, Short Lag; Quizzed, Long Lag.]

Figure S3. [Figure not reproduced. Two panels with vertical axes Average Spacing Effect Size and Average Testing Effect Size. Groups in each panel: Control Group, Imagery Training Group, WM Training Group. Series: Pre-training, Post-training.]
