
Designs for eHealth impact studies

Jeremy Wyatt DM FRCP ACMI Fellow
Professor of eHealth Innovation & Director
Institute for Digital Healthcare, Warwick University

What is an impact study?

“A study of health technology in clinical use to determine its effects on the health problem it is designed to solve and on the patients, health professionals and health system”

Includes studies of benefits, side effects and changes in health-related costs attributable to the innovation

Goes beyond measuring feasibility, deployment, attitudes, usage or perceived impact

Quantifies changes caused by the technology in healthcare structures, processes or patient outcomes

Why carry out impact studies?

Help manufacturer:

– Improve the technology

– Write persuasive marketing material for evidence-aware health professionals and purchasers

Help policy maker: results feed into technology appraisals; if cost effective, leads to reimbursement (EU “111” proposal – 1 website, 1 minute, 1 euro)

Help professional bodies: results feed into systematic reviews & practice guidelines

– Recommendation to use eHealth prompts clinical use

– Will also lend greater support via press, public and courts

How impact studies can promote use

[Diagram: impact studies feed technology appraisals (leading to reimbursement) and practice guidelines (making professionals keen to use eHealth), and generate press, public & legal support; lessons learned by the developer / supplier yield improved technology and persuasive marketing materials; together these get eHealth used, so efficiency, quality & safety improve.]

“We know it works” – motorbike paramedics

“… Full advanced life-support did not decrease mortality or morbidity... mortality was greater among patients with Glasgow Coma Scale scores < 9” Stiell IG et al. CMAJ 2008; 178: 1141-52

Solution: do a trial – Liu & Wyatt, JAMIA 2011

Plausible eH technologies that failed

Diagnostic decision support (Wyatt, MedInfo ‘89)

Integrated medicines management for a children’s hospital (Koppel, JAMA 2005)

MSN messenger triage (Eminovic, JTT 2006)

Smart home applications (Martin, Cochrane 2008):

“The effects of smart technologies to support people in their homes are not known. Better quality research is needed.”

Possible impact study designs

1. Before-after studies with external or internal controls – or both

2. Interrupted time series with at least 6 data points

3. “Evaluation machine”

4. Mendelian randomisation / instrumental variable methods

5. Data mining:

a) Exploring special cause variation using statistical process control

b) Forum / text mining – “Dar-Wiki-nism”

6. Randomised controlled trials:

a) Step wedge, switchback designs (multiple on / off system phases)

Controlled before-after design

[Chart: % drug toxicity (scale 0-50) before vs. after the intervention, plotted for lymphoma site A, leukaemia site A and lymphoma site B.]

Controlled before-after design

External control:
– same practice in one or more matched external groups of practitioners
– subject to same secular trends & confounders
– not exposed to the intervention

Internal control:
– similar practice in the same target practitioners
– subject to same secular trends & confounders
– not susceptible to the intervention
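Analytically, a controlled before-after study reduces to a difference-in-differences: the control group's before-to-after change estimates the secular trend, and subtracting it from the intervention group's change isolates the intervention effect. A minimal sketch in Python; all the toxicity figures are invented for illustration:

```python
# Difference-in-differences for a controlled before-after study.
# All % drug-toxicity figures below are invented for illustration.

def diff_in_diff(int_before, int_after, ctrl_before, ctrl_after):
    """Change in the intervention group minus change in the control
    group; the control change estimates the secular trend."""
    return (int_after - int_before) - (ctrl_after - ctrl_before)

# Suppose lymphoma site A receives the eHealth intervention and
# lymphoma site B serves as the external control:
effect = diff_in_diff(int_before=40, int_after=20,
                      ctrl_before=35, ctrl_after=30)
print(f"Estimated effect: {effect} percentage points")  # -> -15
```

The same arithmetic applies to an internal control, substituting the non-susceptible practice in the same practitioners for the external site.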

Interrupted time series design

[Chart: % drug toxicity (scale 0-50) over time, with the intervention point marked.]

Interrupted time series design

At least 3 pre- and 3 post-intervention measurements

Aim to demonstrate regression discontinuity

Problems:

– Cost of making repeat measurements - use routine data

– Difficulty separating intervention from baseline drift, seasonal effects...
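A common analysis for this design is segmented regression, which estimates the immediate level change and the slope change at the intervention point; modelling the pre-intervention trend explicitly is what helps separate the intervention from baseline drift. A minimal sketch using numpy least squares, on an invented toxicity series:

```python
import numpy as np

# Segmented regression for an interrupted time series:
#   y = b0 + b1*t + b2*post + b3*t_since
# b1 = baseline trend, b2 = immediate level change at the
# intervention, b3 = change in slope afterwards. Data invented.
toxicity = np.array([42.0, 41, 43, 40, 42, 41,   # 6 pre-intervention points
                     30, 28, 27, 25, 24, 22])    # 6 post-intervention points
t = np.arange(len(toxicity), dtype=float)
post = (t >= 6).astype(float)                 # 1 after the intervention
t_since = np.where(post == 1, t - 6, 0.0)     # time since intervention

X = np.column_stack([np.ones_like(t), t, post, t_since])
b0, b1, b2, b3 = np.linalg.lstsq(X, toxicity, rcond=None)[0]
print(f"Baseline slope {b1:+.2f}, level change {b2:+.2f}, slope change {b3:+.2f}")
```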

The evaluation machine

Compare what actually happened with eHealth against what would have happened without eHealth.

Eg. compare actual TB sputum conversion rates with conversion rates for those patients predicted by a model, based on data at presentation.

Now, just give me the model… www.instructables.com/id/How-to-Build-a-Time-Machine-Vortex-Distortion-Spa/
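Short of a time machine, the "evaluation machine" amounts to scoring each patient with a prognostic model fitted to pre-eHealth data and comparing the mean predicted outcome with what was actually observed. A minimal sketch; the model, its coefficients and all the TB figures are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prognostic model fitted to pre-eHealth data: probability
# of TB sputum conversion from data at presentation. Coefficients invented.
def predicted_conversion(smear_grade, age):
    logit = 2.0 - 0.6 * smear_grade - 0.01 * age
    return 1 / (1 + np.exp(-logit))

smear = rng.integers(1, 4, size=200)     # smear grades 1-3 at presentation
age = rng.integers(18, 80, size=200)     # invented patient ages
without_eh = predicted_conversion(smear, age).mean()  # modelled counterfactual
with_eh = 0.78                           # invented observed rate with eHealth

print(f"Predicted without eHealth: {without_eh:.2f}")
print(f"Observed with eHealth:     {with_eh:.2f}")
```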

Instrumental variable approach

[Diagram: the intervention improves the outcome; an instrumental variable (usually) determines the availability of the intervention; other factors also improve or worsen the outcome.]

Also called Mendelian Randomisation. See: Davey Smith G, James Lind Library.
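The usual estimator for this design is two-stage least squares: regress the intervention on the instrument, then regress the outcome on the predicted intervention, which strips out confounding that affects both intervention and outcome. A minimal sketch on simulated data, with all effect sizes invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
instrument = rng.integers(0, 2, n).astype(float)  # e.g. NGO funding awarded or not
confounder = rng.normal(size=n)                   # unobserved local conditions
# The instrument drives availability of the intervention:
intervention = 0.5 * instrument + 0.3 * confounder + rng.normal(size=n)
# True effect of the intervention on the outcome is 2.0 (invented):
outcome = 2.0 * intervention - 1.0 * confounder + rng.normal(size=n)

# Stage 1: predict the intervention from the instrument alone.
Z = np.column_stack([np.ones(n), instrument])
fitted = Z @ np.linalg.lstsq(Z, intervention, rcond=None)[0]
# Stage 2: regress the outcome on the predicted intervention.
X = np.column_stack([np.ones(n), fitted])
effect = np.linalg.lstsq(X, outcome, rcond=None)[0][1]
print(f"2SLS estimate of the effect: {effect:.2f}")   # close to 2.0
```

With a valid instrument the estimate recovers the true effect despite the unmeasured confounder, which a naive regression would not.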

Example of IV approach

[Diagram: the eHealth system improves health status; NGO funding (usually) determines the availability of the eHealth system; existing infrastructure and local interest in eHealth also improve or worsen health status.]

Mining of retrospective data

Great for generating hypotheses

Very tricky for causation:
– Data problems: changes in quality, drifting definitions, recall bias, social response bias…
– Simpson’s Paradox [unmeasured change in case mix] – see the sketch below
– Confounding by indication [drug choice is a marker of, not a cause of, the outcome]; limits of propensity scoring
– Immortal time bias…

See: Byar D. Why databases should not replace trials. Biometrics 1980
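Simpson's Paradox deserves a worked example: with an unmeasured shift in case mix, an eHealth service can look worse overall yet better within every severity stratum, because sicker patients were preferentially routed to it. A minimal sketch with invented counts:

```python
# Simpson's Paradox with invented counts: eHealth triage looks worse
# overall but better within each severity stratum, because sicker
# patients were preferentially routed to it (confounding by indication).
strata = {
    "mild":   {"eHealth": (8, 10),  "usual": (70, 100)},  # (recovered, n)
    "severe": {"eHealth": (30, 90), "usual": (2, 10)},
}
totals = {"eHealth": [0, 0], "usual": [0, 0]}
for stratum, arms in strata.items():
    for arm, (recovered, n) in arms.items():
        totals[arm][0] += recovered
        totals[arm][1] += n
        print(f"{stratum:6s} {arm:7s}: {recovered / n:.0%} recovered")
print("Overall eHealth: {:.0%}, usual care: {:.0%}".format(
    totals["eHealth"][0] / totals["eHealth"][1],
    totals["usual"][0] / totals["usual"][1]))
# eHealth wins in both strata (80% vs 70%, 33% vs 20%) yet loses
# overall (38% vs 65%) - the aggregate comparison misleads.
```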

Statistical Process Control

Origin in manufacturing, production lines

Core tool of continuous quality improvement

Focus on detecting real variation in apparently stable processes

If variation is real, search for causes (ask “Why?” six times!)

Associated methods: Six sigma, kaizen

Run charts, control limits and signal data points

Plot your data

Calculate mean (m), item-to-item differences

Calculate mean item-to-item difference (mid)

Subtract 2.66 × mid from the mean to give the lower control limit

Add 2.66 × mid to the mean to give the upper control limit

Identify signal data points outside limits

Ask why they occurred – and why again – six times
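These steps describe the classic XmR (individuals and moving range) control chart; a minimal sketch in Python, with invented weekly counts:

```python
import numpy as np

# XmR control chart, following the steps above. Weekly data invented.
data = np.array([6, 7, 6, 6, 7, 6, 7, 13, 6, 7, 6, 6])

mean = data.mean()                       # process mean (m)
mid = np.abs(np.diff(data)).mean()       # mean item-to-item difference
lower = mean - 2.66 * mid                # lower control limit
upper = mean + 2.66 * mid                # upper control limit

print(f"Mean {mean:.2f}, control limits [{lower:.2f}, {upper:.2f}]")
for week, value in enumerate(data, start=1):
    if not lower <= value <= upper:
        print(f"Week {week}: {value} is a signal point - ask why, six times")
```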

SPC illustration

[Run chart: units generated each week, weeks 1-49 – “Weekly units generated, Stirton 2009/10” – with the upper control limit marked.]

Reasons why wind generators rarely deliver on expectations

Power output is proportional to wind speed cubed – eg. at 12 m/s (25 mph) a turbine generates 8 times as much as at 6 m/s, since 2³ = 8

However, wind speed follows a Weibull (highly positively skewed) distribution – ie. zero or low speeds are far more common than high speeds
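A short simulation makes the point: a modest mean wind speed combined with a right-skewed speed distribution leaves mean output at a small fraction of the rated figure. The Weibull shape and scale below are invented site parameters:

```python
import numpy as np

rng = np.random.default_rng(2)

# Doubling wind speed gives 2**3 = 8x the power: (12 / 6) ** 3 == 8.
# Wind speeds follow a right-skewed Weibull distribution, so high
# speeds are rare. Shape and scale below are invented site parameters.
shape, scale = 2.0, 6.0                      # scale in m/s
speeds = scale * rng.weibull(shape, size=100_000)

def relative_power(v, rated_speed=12.0):
    """Output as a fraction of rated: cubic below rated speed, flat above."""
    return np.minimum(v, rated_speed) ** 3 / rated_speed ** 3

print(f"Mean wind speed: {speeds.mean():.1f} m/s")
print(f"Time at/above rated speed: {(speeds >= 12.0).mean():.1%}")
print(f"Mean output: {relative_power(speeds).mean():.1%} of rated")
# Low speeds dominate, so mean output is a small fraction of the
# rated (nameplate) figure that sets buyers' expectations.
```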

Forum mining and Dar-Wiki-nism

Wisdom of crowds – but also positive feedback loops, mass hysteria

Media & covert industry influence - Carl, the “community manager”

Mismatch in self- vs. clinician-reported symptoms

Very little clinical exam data

Who contributes to forums?

RCTs – some unusual trials

Intervention – Measure:
Butter vs. margarine – Plasma lipids
Service dogs – Self esteem, community integration, school attendance
Psychological counselling – Anxiety in road accident victims
Prayer on behalf of others – Recovery of patients in cardiac care unit
Educational visits – Use of evidence in 27 obstetric units
Behavioural therapy – Success in job finding in unemployed
Insecticide spraying – Control of bed bugs and hepatitis B
Breast self exam – Death rates in 266,000 Chinese women

Liu JLY, Wyatt JC. The case for randomized controlled trials to assess the impact of clinical information systems. JAMIA 2011

Step wedge design

RCT in which each unit is randomly allocated to cross over to intervention early or late (or at random time)

Only fair way to allocate scarce resources – a lottery!

Eg. impact study of HIS in 28 hospitals in Limpopo province, South Africa:

– Half randomised to early implementation [but the copper cable linking some to the data centre was stolen – six times!]

– Half randomised to late implementation [but chief execs of some persuaded HIS team to implement earlier]

Littlejohns & Wyatt. Evaluating computerised health information systems: hard lessons still to be learnt. BMJ 2003
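The allocation itself is simple: every unit eventually implements, and randomisation only decides when. A minimal sketch for the Limpopo example, with placeholder hospital labels:

```python
import random

# Step-wedge allocation for the Limpopo example: all 28 hospitals get
# the HIS eventually; randomisation decides only who implements early.
# Hospital labels here are placeholders.
hospitals = [f"Hospital {i:02d}" for i in range(1, 29)]
random.seed(42)
random.shuffle(hospitals)

early, late = hospitals[:14], hospitals[14:]
print("Early implementation:", ", ".join(early))
print("Late implementation: ", ", ".join(late))
```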

Impact study trade-offs

Study dimension – Scientifically useful – Locally useful, acted on:
Stakeholders – Not involved – Closely engaged
Question – Generic, abstract – Specific, concrete
Study setting – Stripped of local context – As is, warts and all
Participants – Carefully selected subset – As generic as possible
Intervention – Generic system, no changes allowed – System as installed and tailored
Co-interventions – Not allowed – As required by implementers
Outcome data definitions – Internationally recognised – Locally used definitions
Data capture – Validated instruments – Locally used forms
Study design – Factorial explanatory trial – Pragmatic trial
Study funder – Remote, disinterested – Local decision maker (ownership)

What makes an impact study “high quality”?

Reliable: internally valid (free from bias & confounding; large enough for results not to be due to chance)

Relevant: externally valid (measures of interest to others, covering structure, processes and outcomes; technology readily available)

Specific: states the type of patients & professionals & context (eg. primary / secondary care; fee-for-service or salaried) using eg. STARE-HI

Accessible: study design / results intelligible, in time to inform decisions of citizens, professionals, policy makers

Multi-faceted: reveals likely impact in other settings; how to implement / improve it; which patients / professionals are likely to benefit

More rigour in systematic reviews

• There are over 100 systematic reviews covering telehealth & telemedicine

• Only 15% of these SRs were eligible for the Medicaid Evidence-Based Decisions reports 2009

• Half of those eligible still fell below acceptable standards of rigour, as assessed by the AMSTAR reporting quality checklist

Better quality reviews are badly needed…

Conclusions

More, better designed impact studies are needed, to:

1. Satisfy funders, patients, clinicians that time and money spent on eHealth systems was used wisely

2. Build evidence base to allow national health systems & professional bodies to invest in eHealth

3. Learn which eHealth systems work where, for whom – and which do not, and how to improve them

4. Transform eHealth activity to one where clinical, social and economic benefits lead, not technology

Barriers to impact evaluation

System developers:
– Unaware of importance of performing impact study
– Think opinion survey is enough
– Scared that result might be disappointing or negative
– Commercial interest in concealing negative results
– No access to expertise to design impact study
– Insufficient funds to carry out impact study

Evaluators:
– Worry that a controlled impact study is unethical
– Worry that RCT results will only apply to the trial setting
– Study was done but results negative, so not published
– Study was done but design does not allow clear interpretation
– Lack of capacity across EU to design and carry out eHealth impact studies

Users of results:
– Study methods / results poorly written
– Unclear about the need for impact studies, what makes a high quality study

Policy actions to encourage impact studies and global eH uptake

[Diagram: goals – more high quality impact studies, more interest in study results, improved study quality, more capacity to do studies – linked to actions: education; funding; “112” (cf. “111”) reimbursement; register, certify & label eH services with risks & benefits; shared US / EU definitions & metrics; register evaluators & competing interests; an evaluation code of practice; register studies & results; a global network of eH innovation centres; work with OECD, WHO, WEF, EU, ONC, HTAi, AMIA…]

Consequences of impact focus

Need to increase evaluation capacity

More chance of engaging clinicians

Focus on reinventing care pathways, not new technologies

eHealth industry based in healthcare, not technology

Ineffective technologies will disappear

Risks of not doing impact studies

Lack of impact studies means we waste resources on ineffective or harmful eHealth

Deluge of impact studies of conventional health technologies swamps the few eHealth studies

Excessive industry pressure, with no independent studies to refute its claims

eHealth backlash (telehealth in heart failure)

Conclusions 2

To encourage impact studies, we must:

Work with national and global agencies

Agree language and quality criteria

Develop evaluation capacity

Educate evaluators, users, industry, purchasers

Register systems, studies, evaluators

Agree a code of practice (eg. competing interests)

Towards an evaluation framework

Appropriate evaluation methods: just enough extra data, rigour, validity…

How much rigour is needed to:

– Persuade funders to extend / renew, assure auditors of probity

– Convince users of benefits

– Persuade developers to change system

– Attract evaluators, convince editor to accept article

– Write guideline recommendation on system benefits & risks, when to use it

The evaluation mindset

Knowledgeable about outcomes, study designs

Eclectic – whatever method suits the problem

Flexible, opportunistic

Honest, independent

Respectful of local issues, stakeholder dynamics

Understanding of context, timescales, what matters, the nature of evidence

Challenges: funding and feasibility

Cost of studies in relation to implementation costs

Access to clinical settings, high quality data in time to be useful

Ethics, data protection / information governance issues

Getting attention, input of experienced evaluators

Study workforce / training issues

Challenges: relevance

Matching study aims to real stakeholder questions

Using methods that deliver results within the timescale in which the answers are needed

Fitting in homework / study planning / execution / analysis / reporting before the committee meets

Challenges: validity

Sufficiently rigorous for findings to be reliable

Sufficiently generic questions & context for others to learn from study results

Issue of sponsorship and “fear of the clear” [Chuck Friedman]

Who should do the studies:

– Formative stage: stakeholders, developer

– Summative stage: funder, problem owner, others wanting the system ?

Who benefits from eHealth systems, and how?

[Diagram: the eHealth system gives direct benefits to the system user; system-generated reports and more complete, structured data flow to a better managed healthcare system and, via NGOs, a better funded healthcare system, which return indirect benefits.]

TeleHealth in diabetes, bronchitis & heart failure

Diabetes (Farmer et al SR, 2005):

– Slight reduction of HbA1c by 0.1% (95% CI −0.4% to 0.04%)

– Use of services no different or increased with telehealth

Bronchitis (Polisena et al SR, 2010):

– Mortality may be greater in telephone-support group (RR = 1.2; 95% CI 0.84 to 1.75)

– Reduced hospitalization and A&E visits, but impact on hospital bed days varied

Heart failure (Inglis et al, CDSR 2010):

– Reduced mortality by 34% (RR 0.66, 95% CI 0.54-0.81, p < 0.001)

– Reduced CHF-related admissions by 23% (RR 0.77)

– However, a recent large RCT was negative (Chaudhry, NEJM, Dec 2010)

Factors promoting eHealth uptake?

1. High quality eH systems – functionality, flexibility, resilience, interoperability

2. Political will and leadership - funding

3. Incentives for professionals – direct benefits (EM Rogers), reimbursement

4. Transparent market – certification, labelling

5. Evidence from independent impact studies – clarity about which patients benefit, when

Evaluation studies in system lifecycle

[Diagram: an information / communication problem in the health system (stakeholders, context, user needs) leads to system requirements, then a system prototype (usability studies), the eHealth system in the lab (function studies) and the eHealth system in the field (impact studies), with qualitative & quantitative studies throughout – the types of study found by Eminovic.]

Current evidence base for eHealth

Number of impact studies varies:

Circa 100 reliable studies: decision support, order communications, telehealth, web for health behaviour change

c. 5-10 reliable studies: electronic health records, virtual reality for training…

1 study or less: serious games, smart homes…

When do decision support systems work?

Success rates across trials:

Target clinical practice – Improved clinical practice – Improved patient outcomes
Diagnosis – 40% (4/10) – 0% (0/5)
Disease management – 62% (23/37) – 18% (5/27)
Single drug prescribing, dosing – 62% (15/24) – 11% (2/18)
Prevention – 76% (16/21) – 0% (0/1)
Multi-drug prescribing – 80% (4/5) – 0% (0/4)
Overall – 64% (62/97) – 13% (7/52)

Garg et al, JAMA 2005; 293: 1223-38