Post on 22-Dec-2015

Planning the Data Analysis: Statistical Power, Mixed Effects Linear Models, Moderator and Mediator Models and Related Issues

Katie Witkiewitz, PhD
Department of Psychology
University of New Mexico

2012 NIH Summer Institute on Social and Behavioral Intervention Research

July 12, 2012

Ten questions…

1. What is the type of study design?
- Number of treatment groups/arms
- Randomization procedures
- Any clustering?
2. How many subjects can be recruited/observed in a study period?
3. Will there be an equal number of participants in each group?
4. What are the research hypotheses?
5. What do you hope to achieve or learn from each aim?
6. What are the primary outcome measures?
7. What types of variables will be included?
- Unit of measurement for DVs, IVs, covariates
- Issues with measures (e.g., skewness, counts, inflated zeroes)
8. How many total evaluations and measurements?
9. For repeated measurements, what is the measurement interval?
10. What kinds of missing data patterns do you expect and how much?

Other questions depend on design

• Qualitative research
– If focus groups, how many groups and group composition?
– How will data be transcribed and coded?
– Which software, if any, do you plan to use?
– What are the research questions?
– Are you comparing? Aggregating? Contrasting? Sorting?
• Secondary data analyses
– What is the available data?
– Return to questions #4-10

Components of the data analysis plan

• Study design
– Brief overview of design
– Sampling plan
– Randomization plan, if applicable
– Precision/power analysis and sample size
• Data management
• Statistical analyses
– Proposed analyses for each primary aim
– Secondary or exploratory analyses
– Interim analyses, if applicable
– Missing data methodology

Study design issues for data analysis

• Design issues specific to data analyses
– Sampling procedures
• Stratified sampling
• Clustering
– Randomization procedures
– # of treatment conditions
• Realistically achievable n within each condition
• Allocation ratio
– Measurement
• Unit of measurement for outcome(s) and any covariates
• # of measures/constructs
• Duration and intervals of assessment
– Sample size

What sample size do I need?

• How many subjects can you realistically recruit?

• Precision – how precise do you want to be in measuring the effect?

• Power – powering your study to detect a significant effect

Precision analyses for pilot studies

• Sim & Lewis (2012) recommend n>50
• Precision increases with greater n
• In order to carry out any precision-based sample size calculation you need:
– Desired width of the confidence interval at a chosen confidence level (e.g., 95%); narrower interval = more precise estimate
– Formula for the relevant standard error

Effect of Sample Size on Precision

• Estimating a percentage of 30%

Sample size    95% CI
50             ± 12.7% (17.3% - 42.7%)
100            ± 9.0% (21.0% - 39.0%)
500            ± 4.0% (26.0% - 34.0%)
1000           ± 2.8% (27.2% - 32.8%)
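The half-widths in this table can be reproduced with the usual normal-approximation interval for a proportion. The following is a minimal sketch under that assumption; the function name `proportion_ci` is hypothetical and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p, n, level=0.95):
    """Normal-approximation CI for a proportion p estimated from n subjects.

    Returns (half_width, lower, upper). This is the simple Wald interval,
    assumed here because it matches the tabled values.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    return half_width, p - half_width, p + half_width


for n in (50, 100, 500, 1000):
    hw, lo, hi = proportion_ci(0.30, n)
    print(f"n={n:5d}  ± {hw:.1%}  ({lo:.1%} - {hi:.1%})")
```

Because the half-width shrinks with the square root of n, halving it takes roughly four times as many subjects.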

Precision Analysis

• The sample size required to achieve the desired maximum error E can be chosen as

$$n = \frac{z_{1-\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$$

• Example: suppose that we wish to have a 95% assurance that the error in the estimated mean is less than 10% of the standard deviation (i.e., 0.1). The required sample size is

$$n = \frac{z_{1-\alpha/2}^{2}\,\sigma^{2}}{E^{2}} = \frac{(1.96)^{2}\,\sigma^{2}}{(0.1\,\sigma)^{2}} = 384.2 \approx 385$$

Precision Analysis

• Suppose that we wish to have a 95% assurance that the error in the estimated mean is less than 25% of the standard deviation (i.e., 0.25). The required sample size is

$$n = \frac{z_{1-\alpha/2}^{2}\,\sigma^{2}}{E^{2}} = \frac{(1.96)^{2}\,\sigma^{2}}{(0.25\,\sigma)^{2}} = 61.5 \approx 62$$
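The two worked examples follow the same arithmetic, which can be sketched as a small helper. The function name `n_for_precision` is hypothetical, and the error is expressed in standard deviation units so that σ cancels, as in the slides.

```python
from math import ceil
from statistics import NormalDist


def n_for_precision(error_in_sd_units, level=0.95):
    """Sample size so the error in an estimated mean stays below
    error_in_sd_units * SD with the given assurance level.

    Implements n = z^2 * sigma^2 / E^2 with E = error_in_sd_units * sigma,
    rounded up to the next whole subject.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z / error_in_sd_units) ** 2)


print(n_for_precision(0.10))  # error below 10% of the SD
print(n_for_precision(0.25))  # error below 25% of the SD
```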

Insert Successful, Carefully Controlled Pilot Study Here

I have estimates of the effect from my pilot study, now what do I do?

Effect size has been defined in various ways

• Cohen (1988): “degree to which the phenomenon is present in the population” (p. 9).

• Kelley & Preacher (2012): “a quantitative reflection of the magnitude of some phenomenon that is used for the purpose of addressing a question of interest.” (p. 140).

• Context of social and behavioral interventions: magnitude of the detectable, minimally expected difference between intervention conditions on outcome of interest.
– Standardized difference between two means (Cohen’s d):

$$d = \frac{\bar{x}_{\text{treatment}} - \bar{x}_{\text{control}}}{S_{\text{pooled}}}$$

Importance of variability in the achieved power of an intervention

Do achieved effect sizes from a prior study = the probable effect size in a subsequent study?

$$d = \frac{\bar{x}_{\text{treatment}} - \bar{x}_{\text{control}}}{S_{\text{pooled}}}$$

Study 1 – Pilot trial:

$$d = \frac{10.5 - 9.5}{0.5} = 2.0$$

Study 2 – Main RCT:

$$d = \frac{10.5 - 9.5}{1.5} = 0.67$$
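The comparison between the pilot trial and the main RCT can be checked numerically: the mean difference is identical, and only the pooled standard deviation changes. This is a minimal sketch; both function names are hypothetical, and the pooled SD uses the standard formula for two independent groups.

```python
from math import sqrt
from statistics import mean, variance


def d_from_summary(mean_treatment, mean_control, sd_pooled):
    """Cohen's d from summary statistics."""
    return (mean_treatment - mean_control) / sd_pooled


def cohens_d(treatment, control):
    """Cohen's d from raw scores, using the pooled sample variance."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Same 1-point mean difference, different variability.
print(d_from_summary(10.5, 9.5, 0.5))            # pilot trial
print(round(d_from_summary(10.5, 9.5, 1.5), 2))  # main RCT
```

A tightly controlled pilot tends to have a smaller SD than a larger trial, so its d can overstate the effect to expect later.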

Can I adjust my SD from my pilot to estimate SD for RCT?

• If you have pilot data, then you can use the precision analysis logic to “inflate” your observed pilot SD.

• Sim & Lewis (2012) provide formulas and calculations

Statistical power and sample size

• How many subjects do you need to have sufficient statistical power to detect the hypothesized effect?
– What level of power is desirable for your design?
– What level for statistical significance?
– What are your statistical tests?
– What effect size do you expect to detect?
– What is the probable variation in the sample?

Step-by-step approach to calculating effect sizes.

• Decide on the question of interest.
– Group differences (Cohen’s d/f²; Hedges’ g)
– Strength of association (r, partial r, β, η²)
– Difference in proportion (risk ratio (RR), odds ratio (OR), number needed to treat (NNT))
• Examine prior studies in existing literature to obtain estimates of parameters in the effect size equation.
• Find the appropriate formula (or an online calculator) and input the parameters.

Online effect size calculators

• www.campbellcollaboration.org/resources/
• www.danielsoper.com/statcalc3/
• www.divms.uiowa.edu/~rlenth/Power/
• http://statpages.org/#Power
• + many more…

Step-by-step approach to power analysis.

• Decide on α and β (power = 1 − β).
• Using effect size (or range of effect sizes), estimate how many subjects are needed to obtain the desired power (1 − β), given α, for a particular test.
• Use existing software or simulation study.
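For one common case, a two-sided comparison of two independent group means, these steps can be sketched with the large-sample normal approximation. This is an assumption made for illustration: dedicated power software uses the t distribution and usually returns a slightly larger n. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group to detect a standardized difference d
    in a two-sided, two-group comparison of means.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2, the normal
    approximation to the two-sample t-test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)  # power = 1 - beta
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# A range of plausible effect sizes rather than a single guess.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Running the calculation over a range of effect sizes shows how sensitive the required n is to the assumed d.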

Software for estimating power

• Within statistics programs:
– SAS, R, Stata, SPSS – SamplePower
• Stand-alone and online programs:
– http://statpages.org/#Power
– www.divms.uiowa.edu/~rlenth/Power/
– Optimal Design - http://sitemaker.umich.edu/group-based/home
– G*power – available free at http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/
– PASS – http://www.ncss.com/pass.html
– Power & Precision – http://www.power-analysis.com/
– nQuery – www.statistical-solutions-software.com

Simulation studies for estimating power

• Generate multiple datasets that mimic the design with sample size n and hypothesized parameter estimates as starting values with set effect size
– SAS, Mplus, R, Stata, Matlab
• More complicated model? Difficulty in determining parameter estimates
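The simulation idea can be sketched for the simplest design, two groups compared on one normally distributed outcome. Everything here is an illustrative assumption: the function name `simulate_power`, the unit-variance normal outcomes, and the use of a normal critical value in place of the exact t critical value. A real study would simulate from its actual analysis model.

```python
import random
from math import sqrt
from statistics import NormalDist


def _mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def simulate_power(n_per_group, d, alpha=0.05, n_sims=2000, seed=1):
    """Proportion of simulated trials in which the group difference is
    declared significant, for a true standardized difference d."""
    rng = random.Random(seed)
    # Normal approximation to the t critical value (assumption).
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treatment = [rng.gauss(d, 1.0) for _ in range(n_per_group)]
        m_c, v_c = _mean_and_var(control)
        m_t, v_t = _mean_and_var(treatment)
        pooled_var = (v_c + v_t) / 2  # equal group sizes
        t_stat = (m_t - m_c) / sqrt(pooled_var * 2 / n_per_group)
        if abs(t_stat) > crit:
            rejections += 1
    return rejections / n_sims


print(simulate_power(64, 0.5))  # estimated power for d = 0.5
print(simulate_power(64, 0.0))  # estimated type I error rate
```

Setting d to zero is a useful check: the rejection rate should then sit near α.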

Data management

• Briefly describe your plan for data entry and management.
• Overview of preliminary data checking.
– Check distributions of primary measures
– Examine randomization failures
– Test for systematic attrition biases
• Clarify how you will handle data issues that may arise.

Statistical analyses

– Types of Analyses
– Proposed analyses for each primary aim
– Secondary or exploratory analyses
• Mediation models
• Moderation models
• Mediated-moderation and moderated-mediation
• Multiple mediator models
– Missing data methodology

Types of Analyses

• Intent-to-treat – includes all randomized subjects, whether or not the subjects were compliant or completed the study

• Full analysis set – excludes those who are missing all data by completing no assessments after randomization. May also exclude those who never took a single treatment dose

• Per protocol – subset of subjects who complied with the protocol to ensure that the data would be likely to exhibit the effects of the treatment, according to the underlying scientific model

Which statistical test?

• What are your research hypotheses?
• What is the level of measurement for outcome measure(s) and IVs?

Outcomes, IVs, and statistical test:

• Interval/scale, 1 DV at 1 time
– IVs: 1 with 2 levels. Statistical test: t-test or regression (interval/scale), Mann Whitney (ordinal), χ² (categorical)
– IVs: 1 with 2+ levels. Statistical test: ANOVA or regression (interval/scale), Kruskal-Wallis (ordinal), χ² (categorical)
• Interval/scale, 2+ DVs at 1 time
– IVs: 1+ with 2+ levels. Statistical test: MANOVA, multivariate linear regression, latent variable model
• Interval/scale, repeated measures
– IVs: 1+ with 2+ levels. Statistical test: Mixed-effects model (aka random effects regression; multilevel model; latent growth model), repeated measures ANOVA, survival model
• Categorical DV
– IVs: 1+ with 2+ levels. Statistical test: Logistic regression (multinomial if 2+ categories of DV), binary classification test
• Count DV
– IVs: 1+ with 2+ levels. Statistical test: Poisson or negative binomial regression models, generalized linear models

Latent Variable Models

[Diagram: a latent variable (unobserved; unmeasured; based on relationships between observed variables) points to three observed, measured variables (x1, x2, x3). Each observed variable has its own error or “residual” term (E1, E2, E3), which is not explained by shared variance.]

Latent variables can be continuous or categorical; two representations of the same reality

[Figure: two plots of Y against X, one for each representation.]

• Continuous latent variable – correlation explained by underlying factor
– Ex. structural equation models, factor models, growth curve models, multilevel models
• Categorical latent variable – correlation reflects difference between discrete groups on mean levels of observed variables
– Ex. latent class analysis, mixture analyses, latent transition analysis, latent profile analysis

Types of latent variable models

• Cross-sectional
– Continuous latent variable: Factor analysis*
– Categorical latent variable: Latent class/profile analysis
– Categorical & continuous latent variable: Factor mixture model
• Longitudinal
– Continuous latent variable: Latent growth curve (i.e., mixed effects*, multilevel, HLM)
– Categorical latent variable: Latent Markov model (i.e., latent transition analysis)
– Categorical & continuous latent variable: Growth mixture model (i.e., latent class growth, semi-parametric group)

Factor Analysis

• Common tool for examining constructs and creating a measure of related constructs.

• Models should be driven by theory and guided by best practices for model selection and evaluation.

Aim #1: Develop a multidimensional measure of alcohol treatment outcome.

Analysis plan text for factor analysis model:

“Measurement models will be estimated for each construct across studies using a moderated nonlinear factor analysis (MNLFA) approach (Bauer & Hussong, 2009). MNLFA is a novel approach for examining measure structure that allows for items of mixed scale types (i.e., binary, count, continuous) and allows parameters of the factor model to vary as a function of moderator variables (e.g., source of the data, gender).”

Mixed Effects Models
(aka linear mixed models, random effects regression models, multilevel models, hierarchical linear models, latent growth curve models)

• Mixed = “fixed” and “random” effects.
• Fixed effects – group level, no variability within individuals (or groups)
• Random effects – individual level, variability within individuals (or groups)

Mixed effects models in pictures: Fixed effects model

[Figure: y plotted against time (1, 2, 3) for five individuals: Bill, Jane, Joe, Gordan, and Sue.]

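Mixed effects models in pictures: Random-intercept model

[Figure: y plotted against time (1, 2, 3), alongside a path diagram in which Intercept and Slope factors predict X1, X2, and X3 at Time 1, Time 2, and Time 3, with residuals ε1, ε2, and ε3.]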

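Mixed effects models in pictures: Random intercept and random slope model

[Figure: y plotted against time (1, 2, 3), alongside a path diagram in which Intercept and Slope factors predict X1, X2, and X3 at Time 1, Time 2, and Time 3, with residuals ε1, ε2, and ε3.]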
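The random intercept and random slope model pictured here can also be written as an equation. The notation below is a generic sketch and an assumption of this note, not taken from the slides: i indexes individuals, j indexes measurement occasions, and t is time.

```latex
y_{ij} = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}
\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right),
\qquad
\varepsilon_{ij} \sim N(0, \sigma^{2})
```

Here β0 and β1 are the fixed intercept and slope shared by everyone, while u0i and u1i are each individual's deviations from them. Setting u1i to zero gives the random-intercept model, and setting both to zero gives the fixed effects model.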

Aim #2: To examine the effect of treatment in reducing heavy drinking days among help-seeking alcohol dependent patients.

Analysis plan text for mixed effects model:

“The primary outcome measure of percent heavy drinking days assessed weekly across 14 weeks of treatment will be examined using a mixed effects model with fixed effects of treatment and random effects of time (week since randomization).”

Survival Models

• Modeling the amount of time (t) to an event (T), where Survival(t) = Pr(T > t)
– Comparison in survival rate across groups
– Incorporate time-invariant or time-varying covariates
• Cox Proportional Hazards Model
– Underlying hazard function, h(t) = −[dS(t)/dt]/S(t), describes how the hazard (risk) changes over time in response to explanatory covariates.

Aim #3: To examine the effect of treatment on the amount of time to first drinking or drug use lapse following release from jail.

Analysis plan text for survival model:

“We will estimate the time to first lapse using a Cox proportional hazards model, where time (t) will be evaluated weekly over the 6-month time interval. The hazard probability for a given week (t) is estimated by the proportion of individuals under observation who are known to have not experienced any drinking or substance use lapses prior to week t that then experienced their first drinking or substance use lapse during week t, conditional on treatment group, gender, and dependence severity.”
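The weekly hazard probability described in this plan text can be illustrated with a small calculation. This is a simplified sketch, not the Cox model itself: it ignores covariates and assumes nobody drops out before the end of follow-up. The function name `weekly_hazard` and the example data are hypothetical.

```python
def weekly_hazard(first_lapse_week, n_weeks):
    """Discrete-time hazard and survival by week.

    first_lapse_week: one entry per person, the week of first lapse,
    or None if no lapse was observed by the end of follow-up.
    Returns rows of (week, at_risk, lapses, hazard, survival).
    """
    at_risk = len(first_lapse_week)
    survival = 1.0
    rows = []
    for week in range(1, n_weeks + 1):
        lapses = sum(1 for w in first_lapse_week if w == week)
        # Proportion of those still lapse-free who lapse this week.
        hazard = lapses / at_risk if at_risk else 0.0
        survival *= 1.0 - hazard
        rows.append((week, at_risk, lapses, hazard, survival))
        at_risk -= lapses  # people who lapsed leave the risk set
    return rows


# Hypothetical data for 10 people followed for 3 weeks.
data = [1, 1, 2, None, None, 3, None, 2, None, None]
for week, at_risk, lapses, hazard, survival in weekly_hazard(data, 3):
    print(week, at_risk, lapses, round(hazard, 3), round(survival, 3))
```

The running product of (1 − hazard) gives the estimated probability of remaining lapse-free through each week.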

Testing Mediation

• Purpose: To statistically test whether the association between the IV and DV is explained by some other variable(s).
• An illustration of mediation: a, b, and c are path coefficients. Variables in parentheses are standard errors of those path coefficients.

Comparison of Mediation Tests
MacKinnon et al 2002

• Baron and Kenny’s approach criticized for having low power - the product of the regression coefficients α and β is often skewed and highly kurtotic
• Product of coefficients approach more powerful

$$z' = \frac{\alpha\beta}{\sqrt{\beta^{2} s_{\alpha}^{2} + \alpha^{2} s_{\beta}^{2}}}$$

• Bootstrapping to get range of CI for indirect effect:
– ProdClin: http://www.public.asu.edu/~davidpm/ripl/Prodclin/
– Macros created for SPSS and SAS: http://www.afhayes.com/spss-sas-and-mplus-macros-and-code.html#sobel
– Mplus code: www.statmodel.com
– R code generator: http://www.quantpsy.org/medmc/medmc.htm
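Both approaches can be sketched in a few lines: the product-of-coefficients z′ from the formula above, and a percentile bootstrap interval for the indirect effect αβ. This is an illustrative sketch only; the function names and simulated data are hypothetical, and it is not the code provided by any of the tools listed. The β path (mediator to outcome, adjusting for the IV) is obtained by regressing residuals on residuals, which gives the same coefficient as the two-predictor regression.

```python
import random
from math import sqrt


def sobel_z(alpha, beta, se_alpha, se_beta):
    """z' = alpha*beta / sqrt(beta^2 * s_alpha^2 + alpha^2 * s_beta^2)."""
    return (alpha * beta) / sqrt(beta**2 * se_alpha**2 + alpha**2 * se_beta**2)


def _slope(x, y):
    """Ordinary least squares slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def _residuals(x, y):
    """Residuals of y after regressing it on x."""
    b = _slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def indirect_effect(x, m, y):
    """alpha (X -> M) times beta (M -> Y, adjusting for X)."""
    alpha = _slope(x, m)
    beta = _slope(_residuals(x, m), _residuals(x, y))
    return alpha * beta


def bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=1):
    """Percentile bootstrap CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases
        estimates.append(indirect_effect(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated example: treatment (0/1) -> mediator -> outcome.
rng = random.Random(0)
x = [i % 2 for i in range(300)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + rng.gauss(0, 1) for mi in m]
print(indirect_effect(x, m, y), bootstrap_ci(x, m, y))
```

The bootstrap interval does not assume that αβ is normally distributed, which is why it is often preferred when the product is skewed.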

Designing your study to test mediation hypotheses

• Key to testing mediation is measurement
– Temporal precedence of hypothesized variables in the causal chain
• Experimental manipulation, if possible
• Mechanism of change?
– Mediator ≠ Mechanism
– Other steps needed to establish mechanism, see Kazdin (2007)

What is Moderation?

• Variable that affects the direction and/or strength of the relationship between a predictor and a criterion variable
– Categorical (e.g., males vs. females)
– Continuous (e.g., level of moderator)
• Designing your study to test moderation – need a larger sample size
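One common way to express moderation is as an interaction term in a regression model. The notation below is an assumption made for illustration and is not from the slides: X is the predictor, Z the moderator, and Y the criterion.

```latex
Y = b_0 + b_1 X + b_2 Z + b_3 XZ + e,
\qquad
\frac{\partial Y}{\partial X} = b_1 + b_3 Z
```

The effect of X on Y is b1 + b3Z, so it changes in strength (and possibly direction) across levels of Z, and the test of moderation is the test of b3. Interaction terms are typically estimated less precisely than main effects, which is consistent with the point that moderation tests call for a larger sample.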

More Complex Examples

• “Conditional indirect effects” see Preacher, Rucker, & Hayes (2007)
– Mediated moderation
– Moderated mediation
• Latent mediators/moderators
• Multiple mediators and/or moderators
• Mediation/moderation of growth process

Missing Data Issues and Methodology

• Design the study to minimize missing data

• Acknowledge how you will examine missing data and test missing data assumptions

• Missing data methodologies

Examining Missing Data and Missing Data Assumptions

• Missing data pattern – which values are missing and which are observed
– Univariate missing – confined to a single variable
– Monotone missing – once a case has a missing value, all of its later values are also missing, e.g., attrition
• Missing data mechanism – why values are missing and the association between missing values and treatment outcomes
– Missing completely at random (MCAR)
– Missing at random (MAR)
– Missing not at random (MNAR)
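The three mechanisms can be made concrete with a small simulation that deletes outcome values in different ways and compares the mean of the remaining cases with the true mean of zero. All names, probabilities, and the data-generating model here are hypothetical choices for illustration.

```python
import random


def complete_case_means(n=20000, seed=1):
    """Mean of the observed outcomes under MCAR, MAR, and MNAR deletion."""
    rng = random.Random(seed)
    observed = {"MCAR": [], "MAR": [], "MNAR": []}
    all_y = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)            # baseline covariate, always observed
        y = 0.5 * x + rng.gauss(0.0, 1.0)  # outcome, true mean 0
        all_y.append(y)
        # MCAR: probability of missingness is unrelated to any data.
        if rng.random() >= 0.3:
            observed["MCAR"].append(y)
        # MAR: missingness depends only on the observed covariate x.
        if rng.random() >= (0.5 if x > 0 else 0.1):
            observed["MAR"].append(y)
        # MNAR: missingness depends on the (unobserved) outcome itself.
        if rng.random() >= (0.5 if y > 0 else 0.1):
            observed["MNAR"].append(y)
    means = {"full data": sum(all_y) / n}
    for name, ys in observed.items():
        means[name] = sum(ys) / len(ys)
    return means


for name, value in complete_case_means().items():
    print(f"{name:10s} {value: .3f}")
```

In this setup the complete-case mean is close to zero only under MCAR. Under MAR it is biased as well, but the bias can be removed by methods that use x, such as maximum likelihood or multiple imputation. Under MNAR the observed data alone cannot correct it.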

Missing Data Methodologies

• Commonly used methods under MAR (or MCAR)
1. Complete case - discard incomplete cases
2. Imputation – fill-in missing values
- Single imputation (e.g., mean, baseline, LOCF, BOCF): underestimate standard errors and yield biased estimates
- Multiple imputation – takes into account uncertainty in imputations
3. Analyze incomplete data using method that does not require complete set
- Maximum likelihood
- Bayesian methods

Problem with single imputation methods

Analytic Methods under MNAR

• Sensitivity analyses to estimate degree of bias (see Enders, 2010)

• Pattern Mixture Models – assume that the substantive data are conditional on the missing data mechanism. The conditional model of the observed data is estimated separately for each missing data pattern.

• Selection models – assume that the missing data mechanism is conditional on the substantive data. Observed data are used to predict the probability of missing data.

Describing missing data procedures:

“For the proposed study we will use maximum likelihood estimation for all analyses, which provides the variance-covariance matrix for all available data and is a preferred method for estimation when some data are missing (Schafer & Graham, 2002). Sensitivity analyses will be used to test the influence of missing data and missing data assumptions (see Witkiewitz et al., 2012).”

Common pitfalls

1. Data analysis plan does not match rest of proposal.
- Cannot address aims or answer hypotheses
- Not consistent with measures/research design
2. Ignores critical issues or makes unrealistic assumptions.
- Effect size over-estimated or not included
- Assumes data will be normally distributed, when unlikely
- Does not address missing data and attrition biases
- Measurement invariance when comparing across groups
3. Proposes complicated models without statistical expertise in the research team.
4. Proposes to test models without clear hypothesis or rationale.
5. Includes complex measures (e.g., time varying covariates, genetic information, imaging data), without clear description of how the measures will be analyzed or included in the analyses.

References

• Sim & Lewis (2012). The size of a pilot study for a clinical trial should be calculated in relation to considerations of precision and efficiency. Journal of Clinical Epidemiology, 65, 301-308.
• Vickers, A. J. (2005). Parametric versus non-parametric statistics in the analysis of randomized trials with non-normally distributed data. BMC Medical Research Methodology, 5, 35.
• Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
• Kelley, K., & Preacher, K. J. (2012). On effect size. Psychological Methods, 17, 137-152.
• MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test the significance of the mediated effect. Psychological Methods, 7(1), 83-104.
• Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate Behavioral Research, 42, 185-227.
• Kazdin, A. E. (2007). Mediators and mechanisms of change in psychotherapy research. Annual Review of Clinical Psychology, 3, 1-27.
• National Research Council. (2010). The Prevention and Treatment of Missing Data in Clinical Trials. Panel on Handling Missing Data in Clinical Trials. Washington, DC: The National Academies Press.
• Enders, C. K. (2010). Applied missing data analysis. New York: Guilford Press.
• Schafer, J. L., & Graham, J. W. (2002). Missing Data: Our View of the State of the Art. Psychological Methods, 7, 147-177.
• Witkiewitz, K., Bush, T., Magnusson, L. B., Carlini, B. H., & Zbikowski, S. M. (2012). Trajectories of cigarettes per day during the course of telephone tobacco cessation counseling services: A comparison of missing data models. Nicotine and Tobacco Research. doi: 10.1093/ntr/ntr291
