burying the dodo: why the common factors debate is not over yet · 2018. 4. 1. · the dodo is a...

Burying the Dodo: Why the Common

Factors Debate is Not Over Yet

Robert J. DeRubeis

Department of Psychology, University of Pennsylvania

Australian Regional Group Meeting, Society for Psychotherapy Research

Brisbane, 1 December, 2009

Key Collaborators

• Dan Strunk (Assistant Professor, the Ohio

State University)

• Jay Fournier (on clinical internship at the

Western Psychiatric Institute and Clinics)

The Dodo Verdict

• A meme that thrives in the psychotherapy

research community

• What does it convey?

• Why does it convey “it” so effectively?

• If we could agree on a clear assertion related

to the Dodo, or the Verdict, or its use in

psychotherapy rhetoric, what could available

findings tell us about the assertion?

From the Oxford University

Museum of Natural History

Meet the Dodo

From Wikipedia’s “Dodo” entry

• Raphus cucullatus, a flightless bird endemic to the island of

Mauritius.

• Has been extinct since the mid-to-late 17th century. Commonly used as the archetype of an extinct species because its extinction occurred during recorded human history, and was directly attributable to human activity.

• The word is probably related to the Dutch word “dodaars”

("knot-arse"), referring to the knot of feathers on the Dodo’s hind end.

Why the Dodo?

• Because it’s extinct?

• No. Because of what the Dodo famously said

in Lewis Carroll’s “Alice in Wonderland,” and

how that saying has been applied to

psychotherapy research findings.

The Dodo, presenting Alice with a thimble to honor

his proclamation that, after the Caucus race,

“Everyone has won, and all must have prizes.”

The connection between

psychotherapy and the Dodo

• Saul Rosenzweig (1936)

– Invoked the Dodo in reference to psychotherapy

– First to conjecture that there are/were implicit common factors in

diverse methods of psychotherapy

– Made the conjecture absent empirical data

• In 1975, Lester Luborsky revived the Dodo/psychotherapy

connection, in his influential paper in the Archives of General

Psychiatry, “Comparative studies of psychotherapies: Is it true

that "Everybody has won and all must have prizes"?”

• Is now a common meme used to express (or deny) the view

that “all psychotherapies are equally effective”

A bit of irony

• Carroll used the scene to mock the futility of the UK’s

political caucuses.

• The “Caucus Race” was run helter-skelter, with no

rules and no finish line (and Alice’s own thimble

was returned to her by the Dodo as her “prize”).

• The saying might apply better, then, to reflect a

belief that there have been few if any comparative

psychotherapy studies with agreed-upon rules –

not that we have identified any winners.

Two contrasting views

The Dodo is a wise old bird

• Psychotherapy works (thus,

there cannot only be losers),

and

• There are enough comparative

(and other) data to tell us that

the type of therapy does not

matter, so all are winners

• Insofar as traditional therapies

have less evidence behind

them, absence of evidence does

not equal evidence of absence

The Dodo is an anachronism

• The effectiveness of a psychotherapy

must be established by research findings

• For many problems, some

treatment(s) have been shown

to be superior to other

treatments

• In the absence of comparative

evidence, prefer the therapy

that has been shown to work

What is at stake?

• Curricula in training programs

• Funding of treatment by insurance

companies and governments

• The direction of psychotherapy research

What kind of evidence might (or should)

strengthen or bury the Dodo?

Strengthen

• Strong evidence that

variation in factors common

to all treatments accounts

for a whole bunch of the

variance in outcome

• Repeated demonstrations

in comparative studies that

the differences are

negligible

Bury

• Strong evidence that

variation in amount/quality

of technique accounts for

outcome, over and above

common factors

• Some replicable, trustworthy

evidence that

differences in important

outcomes result from two

different treatments

We haven’t all agreed on

the ground rules for the race(s)

• What do we make of the correlations between

outcome and measures of common factors (e.g., the

alliance)?

• What size of effect (differences between treatments)

is large enough for us to care about?

• What kinds of studies need to be done before

researchers will agree about whether Treatment A >

Treatment B (under certain or all circumstances)?

• How do we take into account the wisdom of the

therapist?



• What do we make of the correlations

between outcome and measures of common

factors (e.g., the alliance)?

Model of Change Process

[Diagram: boxes labeled Treatment Manipulation, Active Components, Competencies, Application of Components, Prognostic Indices, Extra-therapy Factors, Patient Processes, Acute Outcome, and Long-term Outcome]

Proposed contributors to the process of change

• Therapeutic Alliance

– Meta-analysis, r = .22 (Martin et al. 2000)

– Temporal confound in most studies

• Adherence to Methods of Cognitive Therapy

– Two published studies from our research group

– “Concrete” methods of CT predict subsequent change

Strunk et al. (2009)

– Participants: 60 moderately to severely depressed adults

– Symptom Measures

• Beck Depression Inventory (BDI-II)

• Hamilton Rating Scale for Depression

Observer-Rated Measures of Process

– Concrete and Abstract Adherence

– Working Alliance Inventory (WAI)

Intraclass correlation coefficients: .59 to .77

Sample Concrete Adherence Item

Examine Available Evidence: Did the therapist help the client to use currently available evidence or information (including the client’s prior experiences) to test the validity of the client’s beliefs?

Rated on a 1 to 7 scale, with anchors: not at all, some, considerably, extensively

Session-to-Session Change

[Figure: timeline of Sessions 1 through 5, with rows for Process Measure and Symptoms]

Process measure                  r     p
Cognitive Therapy -- Concrete   .41   .001
Cognitive Therapy -- Abstract   .27   .04
Working Alliance                .15   .96

Summary

• Adherence, especially concrete adherence,

predicted session-to-session symptom

change

• Therapeutic Alliance did not predict

symptom change

• Is the null alliance finding an anomaly?

Previous Studies with Forward Predictions

Study                              n     Correlation of Alliance and Outcome   Statistically Significant?
DeRubeis & Feeley, 1990            25    r = .10                               No
Feeley, DeRubeis & Gelfand, 1999   25    r = -.27                              No
Barber et al., 1999                252   r = .01*                              No
Barber et al., 2000                86    r = .30*                              Yes
Klein et al., 2003                 367   r = .14                               Yes
Strunk et al., 2009                60    r = .15                               No

* Indicates an average correlation when multiple outcome measures were used
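As a rough check on the significance column, the sketch below applies the Fisher z approximation to each reported r and n. It is not the analysis used in the original studies: the asterisked values are averages over several outcome measures, and the studies may have used other models. The function name and the two-tailed .05 threshold are assumptions made for illustration.

```python
import math


def fisher_z_p_value(r: float, n: int) -> float:
    """Two-tailed p-value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# (study, n, r) exactly as reported in the table; no new data.
studies = [
    ("DeRubeis & Feeley, 1990", 25, 0.10),
    ("Feeley, DeRubeis & Gelfand, 1999", 25, -0.27),
    ("Barber et al., 1999", 252, 0.01),
    ("Barber et al., 2000", 86, 0.30),
    ("Klein et al., 2003", 367, 0.14),
    ("Strunk et al., 2009", 60, 0.15),
]

# Assumed threshold: two-tailed alpha of .05.
significant = [fisher_z_p_value(r, n) < 0.05 for _, n, r in studies]

for (name, n, r), sig in zip(studies, significant):
    print(f"{name:35s} n={n:3d} r={r:+.2f} significant={sig}")
```

Under this approximation, only the two studies marked "Yes" in the table fall below the .05 threshold.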



• What size of effect is big enough for us to care

about?

How large was the observed

drug vs. placebo advantage, in ES terms?

• For patients in the mild-to-moderate range,

d = .11

• For patients in the severe range,

d = .17

• For patients in the very-severe group,

d = .47
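To put these d values on the same scale as the process-outcome correlations discussed earlier, one standard conversion is r = d / sqrt(d² + 4). The sketch below applies it to the three values on the slide. The conversion assumes two groups of equal size, which may not hold for the underlying trials, and the function name is illustrative.

```python
import math


def d_to_r(d: float) -> float:
    """Convert Cohen's d to a point-biserial r, assuming equal group sizes."""
    return d / math.sqrt(d * d + 4)


# d values as given on the slide, keyed by baseline severity.
severity_d = {"mild-to-moderate": 0.11, "severe": 0.17, "very severe": 0.47}
severity_r = {group: d_to_r(d) for group, d in severity_d.items()}

for group, r in severity_r.items():
    print(f"{group:17s} d={severity_d[group]:.2f}  r={r:.3f}")
```

Under that assumption, even the largest of the three effects (d = .47) corresponds to an r of roughly .23.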

Why we can’t expect big effects in

psychotherapy research

Potentially measurable aspects of

the therapeutic process:

• Fit – the correspondence between what the client most needs in order to thrive and change, and the therapeutic

plan, based on the therapist’s judgment of the client’s

needs.

– It could – but need not – refer to the fit between a

brand name therapy and a client’s needs.

– Could refer to the aggregation of judgments and plans

made by the therapist, in relation to the client’s needs at each moment.

• Implementation – the degree to which the therapist delivers on his or her plan. Skill is another word for this.

• Relationship – the connection between the therapist and the client, such that the client engages the process of therapy. (Controversial point #1: Strategic uses of the relationship fall best under “fit” and “implementation” in this nosology.)






[Figure series: % Improvement by Post-treatment (0 to 100) plotted against Quality of Therapy (aFit + bSkill + cRelationship), rated Absent, Minimal, Creditable, Very Good, or Exquisite. Successive panels add patient types: Responsive; Spontaneous Remitter and Not Amenable to Change; Needs Very Little and Demanding.]

[Figure: Ideal Distribution in a Study Relating Therapy Quality to Outcome. x-axis: Quality of Therapy; y-axis: % of Sample (0% to 50%).]

Upper bound on correlation between Quality and % Improvement = .44*

* Assumes patients are distributed evenly across the five groups (Spontaneous Remitter, Needs Very Little, Responsive, Demanding, Not Amenable to Change)

[Figure: Realistic Distribution in a Study. x-axis: Quality of Therapy; y-axis: % of Sample (0% to 50%).]

Upper bound on correlation between Quality and % Improvement = .27*

* Assumes patients are distributed evenly across the five groups (Spontaneous Remitter, Needs Very Little, Responsive, Demanding, Not Amenable to Change)
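The logic behind these upper bounds can be sketched numerically. The response values and the "realistic" quality distribution below are hypothetical, chosen only to mimic the shape of the argument (five patient types, five quality levels); they are not the values behind the slides, so the output is not expected to reproduce .44 or .27.

```python
import math

QUALITY_LEVELS = ["Absent", "Minimal", "Creditable", "Very Good", "Exquisite"]

# HYPOTHETICAL % improvement at each quality level (index 0 to 4) per patient type.
RESPONSE = {
    "Spontaneous Remitter":   [90, 90, 90, 90, 90],
    "Needs Very Little":      [20, 80, 85, 90, 90],
    "Responsive":             [10, 30, 50, 70, 90],
    "Demanding":              [5, 5, 10, 40, 80],
    "Not Amenable to Change": [5, 5, 5, 5, 5],
}


def quality_outcome_correlation(quality_weights):
    """Pearson r between quality level (0 to 4) and % improvement.

    Patients are spread evenly over the five types; quality_weights gives
    the share of the sample that receives each quality level.
    """
    type_weight = 1.0 / len(RESPONSE)
    cells = [
        (type_weight * qw, q, curve[q])
        for curve in RESPONSE.values()
        for q, qw in enumerate(quality_weights)
    ]
    mean_q = sum(w * q for w, q, _ in cells)
    mean_y = sum(w * y for w, _, y in cells)
    cov = sum(w * (q - mean_q) * (y - mean_y) for w, q, y in cells)
    var_q = sum(w * (q - mean_q) ** 2 for w, q, _ in cells)
    var_y = sum(w * (y - mean_y) ** 2 for w, _, y in cells)
    return cov / math.sqrt(var_q * var_y)


ideal = [0.2, 0.2, 0.2, 0.2, 0.2]      # even spread across quality levels
realistic = [0.0, 0.1, 0.4, 0.4, 0.1]  # HYPOTHETICAL: most therapy is mid-range

r_ideal = quality_outcome_correlation(ideal)
r_realistic = quality_outcome_correlation(realistic)
print(f"ideal: {r_ideal:.2f}, realistic: {r_realistic:.2f}")
```

With these made-up inputs the correlation comes out near .40 for the even spread and near .23 for the concentrated one. The qualitative point matches the slides: patient heterogeneity caps the correlation well below 1, and a restricted range of therapy quality lowers it further.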

Is .27 Good or Bad?

• Recall that the .27 correlation assumes perfect measurement of Quality and of % Improvement.

• If we’re lucky, the validity coefficient for the % Improvement variable would be, say, 0.80. That reduces the upper bound only a little, from .27 to about .23.
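The slide does not state which correction yields "about .23". As a hedged illustration, the sketch below shows two standard readings under classical test theory: treating 0.80 as a validity coefficient (the correlation of the measure with the true outcome), or treating it as a reliability (whose square root attenuates the correlation). Function and variable names are illustrative.

```python
import math


def attenuate_by_validity(r_true: float, validity: float) -> float:
    """Observed r if the outcome measure correlates `validity` with the true outcome."""
    return r_true * validity


def attenuate_by_reliability(r_true: float, reliability: float) -> float:
    """Observed r if the outcome measure has the given reliability."""
    return r_true * math.sqrt(reliability)


upper_bound = 0.27   # from the slide
coefficient = 0.80   # from the slide; its exact interpretation is an assumption

by_validity = attenuate_by_validity(upper_bound, coefficient)
by_reliability = attenuate_by_reliability(upper_bound, coefficient)
print(f"as validity: {by_validity:.3f}, as reliability: {by_reliability:.3f}")
```

The two readings give about .22 and .24, on either side of the slide's "about .23".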

Is .23 Good or Bad?

• The .23 assumes perfect measurement of

Quality

• But if Quality is composed of Fit,

Implementation, and Relationship, then we

need to:

– Know what Fit is, and measure it.

– Construct an index of Implementation.

– Apply a measure of Relationship

Now Comes the Hard Part

• Need to combine Fit, Implementation, and Relationship Measures optimally

– 1/3(Fit) + 1/3(Implementation) + 1/3 (Relationship)

might be a good start

– 1/6(Fit) + 1/6(Implementation) + 2/3 (Relationship) could work

– Could be nonlinear:

• (Fit+Implementation) X Relationship

• etc.
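The candidate combinations above can be written out as small functions. This is a minimal sketch: the 0-to-1 rating scale, the example scores, and the function names are all hypothetical, and nothing here says which weighting is correct.

```python
def linear_quality(fit, implementation, relationship, weights=(1/3, 1/3, 1/3)):
    """Weighted sum a*Fit + b*Implementation + c*Relationship."""
    a, b, c = weights
    return a * fit + b * implementation + c * relationship


def interactive_quality(fit, implementation, relationship):
    """Nonlinear option from the slide: (Fit + Implementation) x Relationship."""
    return (fit + implementation) * relationship


# HYPOTHETICAL ratings for one therapy case, each on a 0-to-1 scale.
case = dict(fit=0.8, implementation=0.6, relationship=0.4)

equal_weights = linear_quality(**case)
relationship_heavy = linear_quality(**case, weights=(1/6, 1/6, 2/3))
interactive = interactive_quality(**case)
print(equal_weights, relationship_heavy, interactive)
```

The same case receives a different Quality score under each rule, which is why the choice of combination has to be settled before Quality can be related to outcome.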

What if we examine only one of the factors?

If we look only at technique, the model we’re testing is:

Outcome = 0(Relationship) + 0(Fit) + b(Implementation), or

Outcome = Implementation

Likewise, investigations of the relationship test this model:

Outcome = a(Relationship) + 0(Fit) + 0(Implementation), or

Outcome = Relationship
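A rough way to see the cost of examining one factor alone: if Quality is an equal-weight sum of three standardized factors with a common pairwise correlation rho, a single factor correlates sqrt((1 + 2·rho) / 3) with Quality. The sketch below computes that share and applies it to the .23 upper bound from the earlier slide. The equal weights, unit variances, common rho, and the premise that a factor relates to outcome only through Quality are all assumptions made for illustration.

```python
import math


def single_factor_share(rho: float = 0.0, k: int = 3) -> float:
    """Correlation between one factor and the equal-weight sum of k factors,
    assuming unit variances and a common pairwise correlation rho."""
    cov = 1 + (k - 1) * rho           # Cov(factor, sum of all k factors)
    var_sum = k + k * (k - 1) * rho   # Var(sum of all k factors)
    return cov / math.sqrt(var_sum)


share_independent = single_factor_share(0.0)
share_correlated = single_factor_share(0.5)

# Ceiling on a single-factor process-outcome correlation, given the
# .23 bound for the full Quality composite.
ceiling_independent = 0.23 * share_independent
print(f"{share_independent:.3f} {share_correlated:.3f} {ceiling_independent:.3f}")
```

With uncorrelated factors the share is about .58, which would put the single-factor ceiling near .13; with factors correlated at .5 the share rises to about .82.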

Question: If my analysis is even close to being

correct, how is it possible to obtain process-outcome

correlations in the .20 to .40 range?

• Fit, implementation, and relationship phenomena are probably correlated, so we get 3 for the price of 1

• We mis-specify the model

– Attribute causal status to the effect (measure the process during or after outcome, then infer that the process caused the outcome)

– Attribute process-outcome correlation to process, when both are caused by a third variable (the client)

What to do?

• First take care of the confounds (reverse causality & 3rd variables)

• Include a range of therapy quality

• Conduct training experiments

• Examine critical events

• Identify “responsive” patients

• Recognize that our favorite pieces are probably just that: pieces

– Combine variables

– Examine interactions (but don’t count on them)

What size effects do the

meta-analysts find for therapy type?

• Smith, Glass, and Miller (1980) – choose your number (and very few of them are very small)

• Wampold et al. (1997) – .19

• Meta-Analyses by Weisz, Weiss, and colleagues re

adolescent treatments – typical result is a difference in

ES between behavioral and non-behavioral treatments of approximately .50

• Shadish, Matt, Navarro, and Phillips (2000) – behavioral

vs. nonbehavioral mean ES = .41

The most heated

battles in the Dodo Wars

• How many (kinds of) studies can we lump together,

and who gets to do the lumping?

• Bona-fide vs. non-bona-fide

• Allegiance effects

• Primary vs. secondary measures

• How large is large?

What we need

• A few key studies, with sufficient power, that compare two or three very different psychotherapeutic approaches to each other

• Adversarial collaboration

• Agreement in advance from key advocates about how the data will be interpreted (should be applied to meta-analyses in the meanwhile)
