
Page 1: Bad science (2015)

“Torture numbers and they will tell you anything”*

Peter Kamerman Brain Function Research Group, University of the Witwatersrand, South Africa

Bad science

* Greg Easterbrook

Page 2: Bad science (2015)

Bad science Science under threat

UNIVERSITY OF THE WITWATERSRAND

Page 3: Bad science (2015)

Bad science Paper retractions are on the rise

[Figure: number of retracted articles by year of retraction, shown separately for biomedical research and for other scientific fields]

Grieneisen & Zhang, 2012 UNIVERSITY OF THE WITWATERSRAND

Page 4: Bad science (2015)

Bad science Almost half of retractions are for scientific misconduct

Van Noorden, 2011; Wagner & Williams, 2008 UNIVERSITY OF THE WITWATERSRAND

Page 5: Bad science (2015)

Bad science Biomedical publications are more likely to be retracted

Grieneisen & Zhang, 2012 UNIVERSITY OF THE WITWATERSRAND

[Figure: percent of all articles (%) vs percent of all retractions (%) by field; medicine accounts for a larger share of retractions than of articles]

Page 6: Bad science (2015)

Bad science Fortunately, retractions are rare

Grieneisen & Zhang, 2012 UNIVERSITY OF THE WITWATERSRAND

[Figure: percentage of records retracted per year, for biomedical research and for other scientific fields]

Page 7: Bad science (2015)

“80% of non-randomized studies turn out to be wrong, as do 25% of supposedly gold-standard randomized trials, and as much as 10% of the platinum-standard large randomized trials”

John Ioannidis (Health Research and Policy, Stanford School of Medicine)

UNIVERSITY OF THE WITWATERSRAND

Page 8: Bad science (2015)

Bad science Where is it going wrong?

Two broad categories:

•  Publication bias

•  Poor study design, execution and analysis

UNIVERSITY OF THE WITWATERSRAND

Page 9: Bad science (2015)

Publication bias Vanishing studies

UNIVERSITY OF THE WITWATERSRAND Hopewell et al., 2009

[Figure: proportion of trials published; negative trials (median: 0.4) are published less often than positive trials (median: 0.7)]

Page 10: Bad science (2015)

Publication bias Inflated estimates of effect size

UNIVERSITY OF THE WITWATERSRAND Finnerup et al., 2015

[Figure: funnel plots of trial precision vs effect size; a trim-and-fill analysis suggests published effect sizes are inflated by roughly 10%]
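To see how selective publication inflates pooled estimates, here is a minimal simulation sketch in Python (illustrative numbers only; this is not the Finnerup et al. method). The publication probabilities loosely echo the Hopewell et al. medians on the "Vanishing studies" slide.

```python
import numpy as np

rng = np.random.default_rng(42)

true_effect = 0.3  # assumed true standardized effect (illustrative)
n_per_arm = 50     # participants per arm in each simulated trial
n_trials = 2000    # number of simulated trials

# Observed effect of each trial (normal approximation: the SE of a
# standardized mean difference is roughly sqrt(2 / n_per_arm)).
se = np.sqrt(2 / n_per_arm)
observed = rng.normal(true_effect, se, n_trials)

# Call a trial "positive" if its effect is statistically significant.
positive = observed / se > 1.96

# Selective publication: positive trials are published more often
# (probabilities echo the medians reported by Hopewell et al., 2009).
publish_prob = np.where(positive, 0.7, 0.4)
published = rng.random(n_trials) < publish_prob

print(f"True effect:             {true_effect:.2f}")
print(f"Mean effect, all trials: {observed.mean():.2f}")
print(f"Mean effect, published:  {observed[published].mean():.2f}")
```

In this toy setup the published trials overestimate the true effect by roughly 10-15%, the same order of magnitude as the trim-and-fill estimate above.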

Page 11: Bad science (2015)

Publication bias Drugs susceptible to bias

UNIVERSITY OF THE WITWATERSRAND Finnerup et al., 2015

* Number of participants in a negative trial needed to increase the NNT to 11


Page 12: Bad science (2015)

Poor study design, execution and analysis The experimental method

[Diagram: the data analysis pipeline: experimental design → data collection (raw data) → data cleaning (tidy data) → basic data analysis (summary statistics) → hypothesis testing (P value)]

UNIVERSITY OF THE WITWATERSRAND Leek & Peng, 2015

Page 13: Bad science (2015)

Poor study design, execution and analysis The experimental method

[Diagram: the same pipeline; the steps before the final P value receive little scrutiny, while the P value itself receives lots of scrutiny]

UNIVERSITY OF THE WITWATERSRAND Leek & Peng, 2015

Page 14: Bad science (2015)

Poor study design, execution and analysis

The P value: Statistical Hypothesis Inference Testing

The P value has been likened to:

•  A mosquito (annoying and impossible to swat away);

•  The emperor's new clothes (fraught with obvious problems that everyone ignores);

•  A “sterile intellectual rake” (ravishes science, but leaves it with no progeny)

UNIVERSITY OF THE WITWATERSRAND Nuzzo, 2014; Lambdin, 2012

Page 15: Bad science (2015)

“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital”

Aaron Levenstein

(Baruch College, CUNY)

UNIVERSITY OF THE WITWATERSRAND

Page 16: Bad science (2015)

Poor study design, execution and analysis

The experimental method

[Diagram: the data analysis pipeline again; poor decisions in data analysis arise in the steps leading up to the P value]

UNIVERSITY OF THE WITWATERSRAND Leek & Peng, 2015

Page 17: Bad science (2015)

“The vast majority of data analysis is not performed by people properly trained to perform data analysis…[there is] a fundamental shortage of data analytic skill”

Jeff Leek (Johns Hopkins Bloomberg School of Public Health)

UNIVERSITY OF THE WITWATERSRAND

Page 18: Bad science (2015)

Poor analysis

Common errors in data analysis:

•  Reactive rather than prospective analysis plan;

•  Not understanding the basic principles underlying the choice of statistical test;

•  Not viewing the data;

•  Not assessing, or hiding, variance and error estimates;

•  Not understanding what a P value means;

•  Not correcting for multiple comparisons (see the sketch below);

•  Over-fitting models

UNIVERSITY OF THE WITWATERSRAND Nuzzo, 2014; Lambdin, 2012
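The multiple-comparisons error is easy to demonstrate by simulation. The sketch below (illustrative, not from the talk) runs 20 t-tests on data for which the null hypothesis is true in every case, so any "significant" result is a false positive.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_tests = 20  # e.g., 20 outcomes or subgroups tested in one study
alpha = 0.05

# Both groups are drawn from the same distribution: every null is true.
p_values = np.array([
    stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
    for _ in range(n_tests)
])

print(f"Uncorrected 'significant' results: {(p_values < alpha).sum()}")

# Bonferroni correction: compare each p value against alpha / n_tests.
print(f"Bonferroni 'significant' results:  {(p_values < alpha / n_tests).sum()}")

# With 20 independent tests at alpha = 0.05, the chance of at least
# one false positive is 1 - 0.95**20, about 64%.
print(f"P(>=1 false positive) = {1 - (1 - alpha) ** n_tests:.2f}")
```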

Page 19: Bad science (2015)

Poor analysis

What should you look out for?

•  Retrospective registration of a trial on a trials database;

•  Primary end-points not clearly stated;

•  Analyses do not directly address the primary end-point(s)

UNIVERSITY OF THE WITWATERSRAND Nuzzo, 2014; Lambdin, 2012

Page 20: Bad science (2015)

Poor analysis

What should you look out for?

•  No CONSORT flow diagram;

•  Analysis of per protocol vs intention-to-treat population;

•  Method of imputation not specified (e.g., LOCF, BOCF);

•  No correction for multiple comparisons

UNIVERSITY OF THE WITWATERSRAND Nuzzo, 2014; Lambdin, 2012

Page 21: Bad science (2015)

Poor study design, execution and analysis

The experimental method

[Diagram: the data analysis pipeline again, highlighting poor design and execution at the experimental design and data collection steps]

UNIVERSITY OF THE WITWATERSRAND Leek & Peng, 2015

Page 22: Bad science (2015)

Poor design and execution

Common errors in study design:

•  No sample size calculation (see the sketch below);

•  No or inappropriate randomization;

•  No concealment of allocation;

•  Study too short;

•  Biased sampling;

•  Biased/inappropriate measurements;

•  Not assessing potential confounders

UNIVERSITY OF THE WITWATERSRAND
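A sample size calculation is straightforward once you specify the smallest difference worth detecting, the expected variability, and the desired power. A minimal sketch using statsmodels, with illustrative inputs (a 1-point pain reduction with SD 2.2, echoing the modelled parameters on the precision slides later in the talk):

```python
from statsmodels.stats.power import TTestIndPower

delta = 1.0               # smallest difference worth detecting
sd = 2.2                  # expected standard deviation of the outcome
effect_size = delta / sd  # standardized effect (Cohen's d)

n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,               # two-sided type I error rate
    power=0.9,                # 90% chance of detecting delta if real
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.0f}")
```

For these inputs the answer is roughly 100 per group; smaller deltas or noisier outcomes push it sharply higher.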

Page 23: Bad science (2015)

Bad science Interpreting the data

Filters to apply:

Filter I: Are the methods valid?

Filter II: Are the results clinically important?

Filter III: Are the results important for my practice?

UNIVERSITY OF THE WITWATERSRAND American Society for Reproductive Medicine, 2008

Page 24: Bad science (2015)

Bad science Interpreting the data

Filters to apply:

Filter I: Are the methods valid?

•  Was the assignment of patients randomized?

•  Was the randomization concealed?

•  Was follow-up sufficiently long and complete?

•  Were all patients analyzed in the groups they were allocated to?

UNIVERSITY OF THE WITWATERSRAND American Society for Reproductive Medicine, 2008

Page 25: Bad science (2015)

Bad science Interpreting the data

Filters to apply:

Filter I: Are the methods valid? Filter II: Are the results clinically important?

•  Was the treatment effect large enough to be clinically relevant?

•  Was the treatment effect precise?

•  Are the conclusions based on the question posed and the results obtained?

UNIVERSITY OF THE WITWATERSRAND American Society for Reproductive Medicine, 2008

Page 26: Bad science (2015)

Bad science Interpreting the data

Is it clinically important?

•  Effect size (minimal clinically important difference)

•  Direction of change

•  Precision

UNIVERSITY OF THE WITWATERSRAND

Page 27: Bad science (2015)

Bad science Typical measures of effect size in pain studies

Absolute measures

•  Absolute change from baseline

•  Numbers needed to treat (NNT)

Relative measures

•  Percentage change from baseline

•  Risk ratio / relative risk (RR)

•  Odds ratio (OR)

UNIVERSITY OF THE WITWATERSRAND
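All of these measures can be derived from the same 2 x 2 table of responders by treatment arm. A minimal sketch with illustrative counts (not data from the talk):

```python
# Responder = patient achieving a clinically meaningful outcome,
# e.g., >= 50% pain relief (illustrative counts throughout).
drug_responders, drug_n = 30, 100
placebo_responders, placebo_n = 20, 100

p_drug = drug_responders / drug_n           # 0.30
p_placebo = placebo_responders / placebo_n  # 0.20

# Absolute risk difference and number needed to treat (NNT):
arr = p_drug - p_placebo  # 0.10
nnt = 1 / arr             # treat 10 patients for one extra responder

# Relative measures:
rr = p_drug / p_placebo   # 1.50
odds_ratio = (p_drug / (1 - p_drug)) / (p_placebo / (1 - p_placebo))

print(f"NNT = {nnt:.0f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

Note that the OR (1.71) looks larger than the RR (1.50) for the same data, and the gap widens as the outcome becomes more common; quoting the more flattering relative measure is one way to torture numbers.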

Page 28: Bad science (2015)

Bad science Precision of the estimate

UNIVERSITY OF THE WITWATERSRAND

Trial | Mean pain difference: Drug - Placebo | P value | Change from baseline: Drug | 95% CI of change from baseline: Drug
1     | -1.7                                 | < 0.001 | -2.1                       | -2.4 to -1.8
2     | -0.5                                 | 0.2     | -1.5                       | -1.8 to -1.2
3     | -2.3                                 | < 0.001 | -3.6                       | -3.8 to -3.3
4     | -0.3                                 | 0.1     | -3.4                       | -3.7 to -3.2

Modelled: delta = 1, n = 234 per group, common SD = 2.2, power = 0.9
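The confidence intervals in the table can be reproduced from the summary statistics alone. A minimal sketch, assuming the intervals are normal-approximation CIs on the within-arm mean change:

```python
import math

# Reproduce trial 1's CI for change from baseline in the drug arm:
mean_change = -2.1
sd = 2.2  # common SD from the modelled parameters
n = 234   # participants per group

se = sd / math.sqrt(n)  # standard error of the mean, about 0.14
half_width = 1.96 * se  # about 0.28
print(f"95% CI: {mean_change - half_width:.1f} "
      f"to {mean_change + half_width:.1f}")
# -> 95% CI: -2.4 to -1.8, matching the table
```

Also note trials 2 and 4: the within-arm change from baseline is large and precisely estimated, yet the drug-placebo difference is small and non-significant. A tight CI around the wrong comparison is exactly what the bikini quote warns about.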


Page 30: Bad science (2015)

Bad science Interpreting the data

Filters to apply:

Filter I: Are the methods valid? Filter II: Are the results clinically important? Filter III: Are the results important for your practice?

•  Is the study population similar to the patients in your practice?

•  Is the intervention feasible in your own clinical setting?

•  What are your patient’s personal risks and potential benefits from the therapy?

•  What alternative treatments are available?

UNIVERSITY OF THE WITWATERSRAND American Society for Reproductive Medicine, 2008

Page 31: Bad science (2015)

“The average human has one breast and one testicle”

Desmond McHale (School of Mathematical Sciences, University College Cork, Ireland)