outcome-measures-lll event-london region-2009-kennedy.pdf

11/17/2009


An Evidence-Based Approach to

the Selection of Outcome Measures for AHPs

Donna Kennedy, BSc OT, MSc, CHT

Clinical Specialist Hand Therapy

Honorary Research Associate

What are outcome measures?

Any measurement of a patient’s health

status that can change as a result of time,

treatment or disease (MacDermid J 2002)

How many outcome measures are you

aware of?

How many outcome measures do you

use?

PubMed, Nov 2009

Outcome measures - 486,379

Outcome measures and OT - 2,169

Standardised Outcome Measures

• Published

• Detailed instructions for administration,

scoring and interpreting the test

• Defined purpose

• Population specific

• Published data indicating acceptable

reliability and validity (MacDermid J 2002)

How can we use outcome measures?

• To determine if treatment is causing a change

• To demonstrate to others that treatment has

resulted in clinically important change

• To evaluate programs of care

• To identify subgroups of patients who most

benefit from care

• To evaluate quality improvements

• Clinical research

The International Classification of

Functioning, Disability and Health (ICF) (WHO 2002)

• Impairments - loss or abnormality of psychologic, physiologic, or anatomic structure or function

• Activity limitations - difficulties in performing activities in a manner or within a range that is considered normal

• Participation restriction - a disadvantage resulting from impairment or activity limitation that limits or prevents fulfilment of a role that is normal for the individual

Measuring for Quality Improvement in

the NHS

"We can only be sure to improve what we

can actually measure"

Lord Darzi, High Quality Care for All, June 2008

Barriers

• Time

• Cost

• Training requirements

Helpful hints…

• Be organised

• Keep notes

• Date your work

PI(C)O Questions

• Patient group

• Intervention

• (Control)

• Outcome

(CEBM 2009)

PI (C) O

Element: Patient
Define: “How would I succinctly describe these patients?”
Example: Do adults with traumatic lower limb amputation…

Element: Intervention
Define: “What is the main action I am considering?”
Example: …who receive OT in the acute care setting

Element: (Control)
Define: “What is (are) the other option(s)?”
Example: compared with patients who do not receive OT

Element: Outcome
Define: “What do I/the patient want to happen/not happen?”
Example: demonstrate greater independence in ADL at discharge?

Practical Example

“Do patients with rheumatoid arthritis demonstrate improved hand function following occupational therapy?”

Ask a PICO Question

P: Adults with RA

I: OT

C: (no OT)

O: Hand function (Activity limitation)

When searching, throw a big net!

Planning your search

Inclusions

(P) Adults, RA

(I) OT, Hand therapy

Exclusions

(P) Paediatrics, rheumatologic disease other

than RA

(I) Hand Surgery, rheumatologic medication

(O) Impairment (grip strength, ROM)

Participation restriction (quality of life)

1: rheumatoid adj arthritis

2: adults

3: (1 and 2)

4: occupational therapy

5: hand therapy

6: (4 or 5)

7: activities adj of adj daily adj living

8: hand adj function

9: (7 or 8)

10: assessment

11: evaluation

12: outcome measure

13: (10 or 11 or 12)

14: (3 and 6 and 9 and 13)
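
The numbered search lines combine like set operations: "or" widens a concept and "and" narrows across concepts. The sketch below illustrates this in Python. The record IDs are invented for illustration and do not come from any database; only the line numbers and search terms follow the slide.

```python
# Hypothetical record IDs returned by each search line. The numbers are
# invented purely to show how the Boolean combinations narrow the results.
hits = {
    1: {101, 102, 103, 104, 105},   # rheumatoid adj arthritis
    2: {101, 102, 103, 104, 106},   # adults
    4: {101, 102, 107},             # occupational therapy
    5: {103, 108},                  # hand therapy
    7: {101, 109},                  # activities adj of adj daily adj living
    8: {102, 103, 110},             # hand adj function
    10: {101, 111},                 # assessment
    11: {102, 112},                 # evaluation
    12: {103, 104},                 # outcome measure
}

hits[3] = hits[1] & hits[2]                  # (1 and 2): population
hits[6] = hits[4] | hits[5]                  # (4 or 5): intervention
hits[9] = hits[7] | hits[8]                  # (7 or 8): outcome
hits[13] = hits[10] | hits[11] | hits[12]    # (10 or 11 or 12): measurement terms
hits[14] = hits[3] & hits[6] & hits[9] & hits[13]

print(sorted(hits[14]))
```

Each "or" line is a wide net for one concept; the final "and" keeps only records that match every concept.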

Literature Searching

1. Conduct electronic search

2. Apply inclusion/exclusion criteria to titles, abstracts

3. Hand search reference lists for additional tools

Psychometric Properties

Reliability

Is the measurement consistent and free from error?

Validity

Does the test measure

what it is intended to

measure?

Responsiveness

Is the measure able to

detect change over

time?

Search 2: Psychometric Properties

1: Michigan adj Hand adj Outcomes adj Measure

2: MHQ

3: (1 or 2)

4: Patient adj Evaluation adj Measure

5: PEM

6: (4 or 5)

7: reliability

8: validity

9: responsiveness

10: (7 or 8 or 9)

11: ((3 or 6) and 10)
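
How the final line is grouped changes what comes back. The sketch below compares the two readings using invented record IDs. It assumes an interface that, like Python's set operators, evaluates "and" before "or" when brackets are missing; real search interfaces differ, so check the one you use.

```python
# Hypothetical record IDs; only the line numbers follow the slide.
mhq = {1, 2, 3}            # line 3: records naming the MHQ
pem = {4, 5}               # line 6: records naming the PEM
psychometric = {2, 5, 9}   # line 10: reliability or validity or responsiveness

grouped = (mhq | pem) & psychometric     # (3 or 6) and 10
ungrouped = mhq | (pem & psychometric)   # 3 or (6 and 10)

print(sorted(grouped))
print(sorted(ungrouped))
```

In the ungrouped reading every MHQ record is returned, whether or not it mentions a psychometric property.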

Hierarchy of Evidence

Level 1a Systematic reviews & meta-analysis

Level 1b Randomized controlled trial (RCT)

Level 2a Systematic reviews & meta-analysis of randomized & non-randomized controlled trials

Level 2b Controlled trials, cohort & poor quality RCTs

Level 4 Case series

Level 5 Expert opinion including literature/narrative reviews, consensus statements, descriptive studies & individual case studies

Level ? What someone told me once or I learnt 15 years ago

Types of Reliability

Intrarater

Interrater

Test-retest

Scales of Measurement (from Portney and Watkins 2000)

Ratio - Units with equal intervals, measured from true zero. Examples: distance, age, time, weight

Interval - Equal intervals between numbers, but not related to true zero. Examples: calendar years, IQ, degrees centigrade

Ordinal - Rank order of observations. Examples: MMT, functional status, pain

Nominal - Category labels or classification. Examples: sex, nationality, blood type

Statistical Analysis of Reliability

Interval or ratio data (age, time, weight, grip strength, IQ) - Intraclass correlation coefficients (ICC)

Interpretation

< .50 - poor

.50 to .75 - moderate

> .75 - good

> .90 - suggested for clinical measurements

(Portney and Watkins 2000)
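
The slides give the interpretation bands but no formula. As a rough illustration, the sketch below computes one common form, ICC (2,1) (two-way random effects, absolute agreement, single measurement), from a subjects-by-trials table. The function names and the grip values are invented for this example; only the bands come from the slide.

```python
def icc_2_1(scores):
    """ICC (2,1) from a list of rows: one row per subject, one column per rater or trial."""
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of raters or trials
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                 # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Bands as listed on the slide (Portney and Watkins 2000)."""
    if icc > 0.90:
        return "good, and above the .90 suggested for clinical measurements"
    if icc > 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Invented grip strengths (kg) for five patients on two occasions.
grip = [[20.0, 21.0], [25.0, 24.5], [30.0, 31.0], [18.0, 18.5], [27.0, 26.0]]
value = icc_2_1(grip)
print(round(value, 2), interpret_icc(value))
```

In practice a statistics package would normally be used; the hand calculation is only meant to show what goes into the coefficient.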

Statistical Analysis of Reliability

Nominal data (sex, blood type, diagnosis) - Kappa statistic

Interpretation

< 40% - poor to fair agreement

40 - 60% - moderate agreement

> 60% - substantial agreement

> 80% - excellent agreement

(Landis and Koch 1977)
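
The slide does not say which kappa is meant. The sketch below assumes Cohen's kappa for two raters, which corrects the observed agreement for the agreement expected by chance. The labels are invented, and the bands are the slide's percentages written as proportions (40% as 0.40).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same cases."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Bands as listed on the slide, expressed as proportions."""
    if kappa > 0.80:
        return "excellent agreement"
    if kappa > 0.60:
        return "substantial agreement"
    if kappa >= 0.40:
        return "moderate agreement"
    return "poor to fair agreement"


# Invented diagnoses assigned by two raters to the same ten patients.
rater_a = ["RA", "RA", "RA", "RA", "RA", "RA", "OA", "OA", "OA", "OA"]
rater_b = ["RA", "RA", "RA", "RA", "RA", "OA", "OA", "OA", "OA", "RA"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2), interpret_kappa(kappa))
```

Here the raters agree on 8 of 10 cases, but about half of that agreement would be expected by chance, so kappa is noticeably lower than 80%.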

Standard Error of Measurement (SEM)

[Figure: distribution of grip measurements around the mean grip; axis marked at 7.6, 8.4, 9.2, 10, 10.8, 11.6 and 12.4]

Test-retest reliability of pain-free grip strength for one trial, left and right hands (Kennedy D 2008)

                       ICC (2,1)   SEM (kg)
One grip, left hand    0.96        0.8
One grip, right hand   0.92        1.2

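68% chance the true grip lies within +/- 1 SEM of the measured grip G, i.e. G +/- 0.8 kg (left hand)

95% chance the true grip lies within +/- 2 SEM of the measured grip G, i.e. G +/- 1.6 kg (left hand)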
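
The bands above can be worked through in a few lines. The sketch below takes the SEM values from the slide and a hypothetical measured grip of 10 kg. It also shows the commonly used relation SEM = SD x sqrt(1 - ICC); the SD of 4.0 kg is an assumption chosen only to illustrate the formula, since the slides do not report one.

```python
from math import sqrt


def sem_from_sd(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * sqrt(1 - icc)


def grip_band(measured, sem, n_sem):
    """Range of measured +/- n_sem standard errors, rounded to 0.1 kg."""
    half_width = n_sem * sem
    return (round(measured - half_width, 1), round(measured + half_width, 1))


# SEM values taken from the slide (kg).
SEM_LEFT = 0.8
SEM_RIGHT = 1.2

measured = 10.0  # hypothetical measured grip in kg

print("left hand, 68% band:", grip_band(measured, SEM_LEFT, 1))
print("left hand, 95% band:", grip_band(measured, SEM_LEFT, 2))
print("right hand, 95% band:", grip_band(measured, SEM_RIGHT, 2))

# Hypothetical SD of 4.0 kg with the left-hand ICC from the slide.
print("SEM from SD:", round(sem_from_sd(4.0, 0.96), 2))
```

A change between two sessions that falls inside the 95% band may be measurement error rather than real change.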

Reliability

• Reliability estimates, standard errors reported?

• Are methods of collecting reliability data clear?

• Might reliability estimates or standard errors of measurement differ substantially for various populations?

• Rationale for time elapsed between tests and in study design to ensure changes in health status were minimal?

Validity

• Face validity (weakest form) - indicates a tool appears to test what it is supposed to test

• Content validity - indicates that the items in a tool adequately sample the content that defines the variable being measured

• Construct validity - ability to measure an abstract concept

• Criterion-related validity (most practical and most objective) - indicates that the outcomes of one tool, the tool being assessed, can be used as a substitute measure for a gold standard

(Portney and Watkins 2000, pg 82)

Criterion-related and predictive validity

• Statistics - Spearman’s rank or Pearson’s correlation

• Coefficient magnitude runs from 0 to 1.0 - values closer to 1 indicate stronger correlation
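
As an illustration of the two statistics named above, the sketch below computes Pearson's correlation directly and Spearman's rank correlation as Pearson's correlation of the ranks. The function names and scores are invented for this example; a statistics package would normally do this work.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = shared_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented scores: a new tool against a gold standard for six patients.
new_tool = [12, 15, 20, 22, 30, 35]
gold_standard = [10, 18, 19, 25, 28, 40]

print(round(pearson(new_tool, gold_standard), 2))
print(round(spearman(new_tool, gold_standard), 2))
```

Pearson's suits interval or ratio scores; Spearman's uses only rank order, so it also suits ordinal scores.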

Validity

• Clear description of methods to collect validity data?

• Is validation sample described in enough detail (gender, age, ethnicity, and language)?

• Is there reason to believe validity will differ substantially for various populations?

• Is evidence of content validity presented?

• Is evidence of construct validity presented for each proposed use?

• Are criterion validity data presented with a clear rationale and support for the choice of criteria measure?

Responsiveness

• The ability to detect change over time

• If testing effectiveness, then score must change in proportion to the patient’s status change, and remain the same when the patient has not changed

• For research - the change must be large enough to be statistically significant

• For clinical purposes- the change must be precise enough to show increments of meaningful change

(Portney and Watkins 2000)

Analysis of Responsiveness

• Independent samples t-test - compares the mean scores of two different groups of people or conditions

• Paired-samples t-test - compares mean scores for the same group of people on two different occasions

• Analysis of variance - used with 3 or more conditions or groups

(Pallant 2005)
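
To make the difference between the two t-tests concrete, the sketch below computes each t statistic by hand. The scores are invented, and the independent version assumes equal variances (pooled). Turning t into a p-value needs the t distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(first, second):
    """t statistic for the same people measured on two occasions."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(group_a, group_b):
    """t statistic for two different groups, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Invented function scores for five patients before and after treatment.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 14.0, 10.0, 13.0, 16.0]
print(round(paired_t(before, after), 2))

# Invented scores for two separate groups.
treated = [14.0, 15.0, 16.0]
untreated = [11.0, 12.0, 13.0]
print(round(independent_t(treated, untreated), 2))
```

The paired version works on each person's change score, which is why it suits responsiveness studies that follow the same patients over time.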

Effect Size

• T-test tells us if the difference between groups is

statistically significant

• Effect size indicates the relative magnitude of

the differences between the means

• Interpretation:

< .4 - small

.5 - moderate

.8 - large

(Cohen 1988)
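
The slide does not name a specific effect size index. The sketch below assumes Cohen's d for two groups: the difference between the means divided by the pooled standard deviation. The slide's bands leave a gap between .4 and .5, so this sketch treats anything below .5 as small; that cut-off is an assumption. The scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def cohen_d(group_a, group_b):
    """Difference between two group means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
        / (n_a + n_b - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


def interpret_effect_size(d):
    """Bands from the slide; values below .5 are treated as small (assumption)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    return "small"


# Invented function scores for a treated and an untreated group.
treated = [22.0, 25.0, 27.0, 30.0, 31.0]
untreated = [20.0, 23.0, 26.0, 27.0, 29.0]

d = cohen_d(treated, untreated)
print(round(d, 2), interpret_effect_size(d))
```

Unlike the t statistic, the effect size does not grow with sample size, so it can be compared across studies of different sizes.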

Responsiveness

• Is information provided on change scores?

• Is effect size reported with information on

methods used in calculation?

• Are responsiveness claims derived from

longitudinal data?

• Is the population being tested clearly identified?


Interpretability

• Is information provided on the relationship of

scores to clinically recognised conditions or

need for specific treatments?

• Is information provided on the relationship of

scores or changes in scores to commonly

recognised life events?

• Is information provided on how well scores

predict known relevant events?

Respondent Burden

• Does the instrument place undue strain on the respondent?

• Information provided on time needed to complete the instrument?

• Information provided about the reading level assumed?

• Information provided about special requirements or requests placed on subjects?

• Information provided on the acceptability of the instrument?

Administrative Burden

• Information provided on

amount of training/

education/expertise needed

by staff to administer, score

or use instrument?

• Information provided about

any resources required for

administration of instrument,

such as computer hardware?

What do we do now?

Ask yourself…

Can you demonstrate that

your treatment is causing

a change?

Can you demonstrate to

others that your treatment

has resulted in clinically

important change?

Next Steps

Identify and implement outcome measures…

• In your setting

• Locally

• Nationally

• Internationally

• Andresen EM (2000) “Criteria for Assessing the Tools of Disability Outcomes Research”, Archives of Physical Medicine and Rehabilitation, 81:2, S15-S20.

• Brettle A, Grant MJ (2003) Finding Evidence for Practice: a workbook for health professionals. Edinburgh: Churchill Livingstone.

• Cohen J (1988) Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates.

• Jerosch-Herold C (2005) “An Evidence-Based Approach to Choosing Outcome Measures: a Checklist for the Critical Appraisal of Validity, Reliability and Responsiveness Studies”, British Journal of Occupational Therapy, 68:8, 347-353.

• Kendall N (1997) “Developing outcome assessments: a step by step approach”, New Zealand Journal of Physiotherapy, Dec, 11-17.

• Landis JR, Koch GG (1977) “The measurement of observer agreement for categorical data”, Biometrics, 33: 159-74.

• Lohr KN, Aaronson NK, Alonso J, Burnam MA, Patrick DL (1996) “Evaluating Quality-of-Life and Health Status Instruments: Development of Scientific Review Criteria”, Clinical Therapeutics, 18:5, 979-992.

• MacDermid J (2002) “Outcome Measurement in the Upper Extremity” in Rehabilitation of the Hand and Upper Extremity, 5th edition, Mosby, St Louis.

• Oxford Centre for Evidence-Based Medicine (2009) Focusing clinical questions. www.cebm.net/index.aspx?o=1036

• Pallant J (2005) SPSS Survival Manual, 2nd ed. Open University Press, Berkshire.

• Portney LG, Watkins MP (2000) Foundations of Clinical Research, Prentice Hall Health, New Jersey.

• World Health Organisation (2002) “Towards a Common Language for Functioning, Disability and Health: ICF”, Geneva, http://www.who.int/classification/icf
