Advising on Test Validity: Comments on Denny Borsboom Neil K. Aaronson The Netherlands Cancer Institute KNAW Colloquium on Advising on Research Methods Amsterdam, March 29, 2007


Page 1

Advising on Test Validity: Comments on Denny Borsboom

Neil K. Aaronson

The Netherlands Cancer Institute

KNAW Colloquium on

Advising on Research Methods

Amsterdam, March 29, 2007

Page 2

The way to capture an audience’s attention is with a demonstration where there is a possibility the speaker may die.

Jearl Walker, Cleveland State University

Page 3

It usually takes more than three weeks to prepare a good impromptu speech.

Mark Twain

Page 5

Who am I?

• Health outcomes researcher

• Clinical oncology

• Develop questionnaires to assess patients’ illness and treatment experience from their own perspective

• For use in observational and evaluative studies in clinical research and practice

Page 6

What are we attempting to measure?

• Health outcomes

• Health status

• Quality of life

• Health-related quality of life

• Patient-reported outcomes (PROs)

Page 7

State of affairs in defining QL

• “Quality of life is a vague and ethereal entity, something that many people talk about, but which nobody clearly knows what to do about.” Campbell et al., 1976

• “The idea has become a kind of umbrella under which are placed many different indexes dealing with whatever the user wants to focus on.” Feinstein, 1987

• “Quality of life is an ill-defined term…it means different things to different people, and takes on different meanings according to the area of application.” Fayers & Machin, 2000

Page 9

Key dimensions of quality of life as defined by David Karnofsky (1949), the WHO (1949) and ASCO (1995)

• Physical: Symptoms commonly caused by cancer and the toxicities of treatment

• Psychological: Effects of cancer and its treatment on cognitive function and emotional state

• Social: Effects of cancer and its treatment on interpersonal relationships, school, work and recreation

Page 11

Attributes of QL definitions

• Non-specific versus health-related

• Health states (or status) versus personal evaluation of those states (e.g., expectations, discrepancies, satisfaction)

• Scope of concerns (e.g., spirituality or existential issues)

• Polarity of concerns (dysfunction and its resolution vs. positive well-being)

Page 13

Does it matter?

• Yes, because the content of QL questionnaires reflects the underlying definition.

• It may be less important in clinical trials, where group comparisons will be internally valid, regardless of the definition used.

• It is more important in comparing results across trials and in observational (e.g., prevalence) studies.

Page 14

Examples of QL definitions

“The difference between the hopes and expectations of the individual and the individual’s present experience.” Calman, 1987

“The functional effect of an illness and its consequent therapy upon a patient, as perceived by the patient.” Schipper et al. 1996

Page 15

Covinsky et al. Am J Med 1999; 106:435-440

• 493 elderly patients rated their physical functioning, psychological distress and overall QL

• More than 40% of those who reported the worst physical functioning and/or the highest levels of psychological distress rated their QL as “good or excellent”

• Approximately 20% of those with the best physical functioning and lowest levels of distress rated their QL as “poor”

Page 16

Generic HRQL instruments

• Sickness Impact Profile (SIP)

• Nottingham Health Profile (NHP)

• Spitzer QL Index

• COOP/WONCA Charts

• MOS 36-Item Health Survey (SF-36)

• World Health Organization (WHOQoL)

Page 17

Cancer-specific QL questionnaires

• Functional Living Index – Cancer (FLIC)

• Cancer Rehabilitation Evaluation System (CARES)

• Rotterdam Symptom Checklist (RSCL)

• EORTC QLQ-C30

• Functional Assessment of Cancer Therapy (FACT-G)

Page 18

Key psychometric attributes of HRQL instruments

• measurement model

• reliability

• validity

• responsiveness

• interpretability

• cultural adaptability

• burden

Page 19

Assessing validity of HRQL instruments: classical approaches (SAC/MOT 2001)

Content-related

• evidence that the content domain of an instrument is appropriate relative to its intended use

• the use of lay and expert panel (clinician) judgments

• complete the questionnaire(s) yourself

Page 20

Future perspective items

SF-36 “I expect my health to get worse.”

FACT-G “I worry about dying.”

CARES-SF “I worry about whether the cancer will progress.”

QLQ-C30 --

Page 21

Assessing validity of HRQL instruments: classical approaches (SAC/MOT 2001)

Construct-related

• evidence that supports a proposed interpretation of scores based on theoretical implications associated with the constructs being measured

• examine interscale correlations

• examine patterns of scores for groups known to differ on relevant variables (disease stage, treatment status, response to treatment, etc.)
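The two construct-related checks named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the scores are invented, the scale names and the helpers `cohens_d` and `pearson_r` are hypothetical, and a standardized mean difference is just one of several ways to summarize a known-groups comparison.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def pearson_r(x, y):
    """Pearson correlation between two scales scored on the same patients."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Invented physical-functioning scores (0-100, higher = better functioning)
# for two groups expected to differ: local versus metastatic disease.
local_disease = [80, 75, 90, 85, 70]
metastatic_disease = [60, 55, 70, 65, 50]

# Invented fatigue scores (0-100, higher = more fatigue) for the same ten
# patients, listed in the same order as the two groups above.
physical = local_disease + metastatic_disease
fatigue = [20, 30, 15, 20, 35, 45, 50, 30, 40, 55]

print("known-groups effect size:", round(cohens_d(local_disease, metastatic_disease), 2))
print("physical vs. fatigue correlation:", round(pearson_r(physical, fatigue), 2))
```

A large difference in the expected direction and a negative physical-fatigue correlation would be read as supporting evidence in the classical approach; whether such evidence really bears on validity is the question taken up on the following slides.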

Page 22

Questions for Denny and audience (1)

• Examining correlations between measures purported to assess the same concept indeed tends to yield little useful information for instrument developers or for end-users – the exercise is theoretically and empirically anemic

• However, the “known groups” comparison approach is intuitively appealing and tends to be well-understood and accepted by end-users

• Is this latter approach equally “suspect”; i.e. does it also fail to truly address the validity of a measure?

Page 23

Questions for Denny and audience (2)

• Item response theory (IRT) approaches are quickly coming to dominate the field of HRQL instrument development (NIH PROMIS initiative)

• Generating large item banks for each domain of interest, primarily based on existing literature (e.g., depression, pain, fatigue)

• Collecting large datasets to model item and scale information curves

• Generating computer-adaptive versions of measures

• Will this approach really yield theoretically grounded and valid measures, or is it yet another example of “dustbowl empiricism”?
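For readers unfamiliar with the IRT workflow listed above, the following is a minimal sketch of an item information curve and of the item-selection step in a computer-adaptive test. It assumes a two-parameter logistic (2PL) model for yes/no items, which is a simplification chosen for brevity and not necessarily the model used by any particular initiative. The item bank, its parameter values, and the function names are all hypothetical.

```python
from math import exp


def p_endorse(theta, a, b):
    """2PL probability that a person at trait level theta endorses an item
    with discrimination a and difficulty (location) b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical fatigue item bank: item name -> (discrimination a, difficulty b).
ITEM_BANK = {
    "fatigue_mild": (1.2, -1.0),
    "fatigue_moderate": (1.5, 0.0),
    "fatigue_severe": (1.0, 1.5),
}


def most_informative_item(theta, bank, administered=()):
    """Pick the not-yet-administered item with maximum information at the
    current trait estimate, the usual selection rule in adaptive testing."""
    candidates = {name: params for name, params in bank.items() if name not in administered}
    return max(candidates, key=lambda name: item_information(theta, *candidates[name]))


# Which item would be asked next at a low, average, and high fatigue level?
for theta in (-1.5, 0.0, 2.0):
    print(theta, most_informative_item(theta, ITEM_BANK))
```

The sketch makes the slide's concern concrete: the selection rule optimizes statistical precision given the bank, and says nothing about whether the items in the bank are theoretically grounded.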

Page 24

Suggested reading

• Fayers P, Hays R (eds.). Assessing Quality of Life in Clinical Trials: Methods and Practice. Oxford: Oxford University Press, 2005.

• Lipscomb J, Gotay CC, Snyder CF (eds.). Outcomes Assessment in Cancer. Cambridge: Cambridge University Press, 2005.