Data Analysis Workshop
Chuck Spiekerman (cspieker@u)
Karl Kaiyala (kkaiyala@u)
Course Outline
• February 20
– How to describe your study
– Choosing an Analysis method
• March 13
– Student presentations of study designs and data-analysis plans
• March 20
– Student presentations of data analyses
Describing your study
• Next session (3/13) we are asking you to present a description of your planned study
• The next few slides give an outline of suggested components of this description
• Attention to all these components should help you (and/or a consultant) decide on appropriate methods of statistical analysis
Study Design Description
• Specific Aims (what?)
• Background (why?)
• Previous work (who?) *
• Study methods (how?)
– several components
*optional for student presentations
Specific Aims
• Describe the scientific question(s)
• Be specific and precise
• Stick to the study at hand
Background and Motivation
• Relevance of this research
– Existing knowledge
– Identify gap this research will fill
• Relate to specific aims
• If part of a larger study, where does this study fit?
Study Methods Components
• Primary outcomes
• Study population
• Methods and procedures *
• Data analysis plan
*optional for student presentations
Primary Outcomes
• Precise definition of key measurement (individual data item) of interest
• Justify why this outcome and not something else.
– Relate to specific aim
• Details of collection can be left to methods and procedures section
Study population
• How were the subjects selected?
– Exclusion and inclusion criteria
– Group classification?
– Matching?
– Randomization?
Data analysis plan
• Outline data analysis for each specific aim
• Make clear which procedures are being used toward which aim
• Usually some simple tables and plots should be sufficient
• Keep it simple
Forming an analysis plan
Two important questions
1. What do you want to do/show?
2. What kind of data …
i. … will answer your question best?
ii. … can you get?
iii. … do you have?
Types of data
• Continuous
– Differences between values have meaning, and are interpretable independent of the values themselves
– E.g. difference between 8 and 9 basically the same as difference between 1 and 2.
• Ordinal
– Values have an order, but differences are not easily interpretable (e.g. good, fair, poor)
Types of data (cont.)
• Categorical
– Values are descriptive but do not have any obvious ordering. E.g. tx A, tx B, tx C.
• Binary, Dichotomous
– Fancy names for categorical variables with only two possible values.
Types of data (sampling)
• one-sample
– Refers to situation when values of interest all come from one group and will be compared to a known quantity (e.g. “change greater than zero”)
• two-sample
– When data are divided/sampled in two groups and observed values compared between groups.
What do you want to do?
• Show evidence of differences
• Estimate population parameters
• Demonstrate equivalence
• Show evidence of association
• Create/validate a predictive model
• Assess agreement or reliability
• Other?
Showing evidence of differences
• Standard hypothesis testing procedures, usually comparing means or proportions
• Which test will depend on type of data. Usual suspects (YMMV)
– T-test or ANOVA for Continuous data
– Chi-square test for Categorical data
– Rank-based tests (e.g. Wilcoxon) for Ordinal data
• Use Rosner flowchart for guidance
• Supplement p-value with estimate of difference (with confidence interval)
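As one illustration of pairing a p-value with an estimate, the sketch below runs a chi-square test on a 2x2 table and reports the difference in proportions with a Wald confidence interval. The function name `compare_proportions` and the counts in the usage lines are hypothetical, and the test is computed without a continuity correction; treat it as a sketch using only the Python standard library, not as the workshop's prescribed method.

```python
import math
from statistics import NormalDist


def compare_proportions(events_a, n_a, events_b, n_b, level=0.95):
    """Chi-square test for a 2x2 table plus the difference in proportions.

    Assumes every expected cell count is positive; no continuity correction.
    """
    observed = [[events_a, n_a - events_a], [events_b, n_b - events_b]]
    row_totals = [n_a, n_b]
    col_totals = [events_a + events_b, (n_a - events_a) + (n_b - events_b)]
    total = n_a + n_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Estimate of the difference with a Wald (normal-approximation) interval.
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "chi2": chi2,
        "p_value": p_value,
        "difference": diff,
        "ci": (diff - z * se, diff + z * se),
    }


# Hypothetical counts: 30 of 50 successes on tx A, 18 of 50 on tx B.
result = compare_proportions(30, 50, 18, 50)
print(result)
```

Reporting the interval alongside the p-value shows how large the difference plausibly is, not just whether it is distinguishable from zero.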
Estimate Population Parameters
• P-values and hypothesis tests aren’t always necessary
• Sometimes you don’t really want to compare things but only estimate values
• Estimate parameters of interest and supplement with confidence intervals (IMPORTANT!).
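For example, when the parameter of interest is a single proportion, an estimate with a confidence interval can be computed as follows. The Wilson score interval is one common choice among several and is an assumption here; the function name `wilson_interval` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Point estimate and Wilson score interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return p_hat, (center - half_width, center + half_width)


# Hypothetical data: 40 of 100 subjects show the outcome.
estimate, (lower, upper) = wilson_interval(40, 100)
print(f"estimate = {estimate:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```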
Demonstrate equivalence
• In some instances the goal is to show equivalence of, say, two treatments.
• Failing to show a difference using a standard hypothesis test is usually not sufficient evidence of equivalence
• Two strategies
– Estimate difference and show ‘worst cases’ with confidence interval
– Compute a standard hypothesis test with very good power (> 95%)
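The first strategy can be sketched as follows: estimate the difference in means, form a confidence interval, and check whether even the "worst case" ends of the interval stay inside a pre-specified equivalence margin. The margin, the data, and the function name `equivalence_check` are hypothetical, and the interval uses a large-sample normal approximation; with small samples a t-based interval would be wider.

```python
import math
from statistics import NormalDist, mean, variance


def equivalence_check(a, b, margin, level=0.95):
    """Difference in means with a normal-approximation CI, compared to +/- margin."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = diff - z * se, diff + z * se
    return {
        "difference": diff,
        "ci": (lower, upper),
        # Equivalence is supported only if the whole interval is inside the margin.
        "within_margin": -margin < lower and upper < margin,
    }


# Hypothetical measurements under two treatments.
tx_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
tx_b = [10.0, 10.1, 9.9, 10.2, 9.8, 10.1]
print(equivalence_check(tx_a, tx_b, margin=0.5))
```

The margin has to come from subject-matter judgment about what difference would be too small to matter; it should be chosen before looking at the data.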
Demonstrate associations
• Categorical independent variable
– Dichotomous outcome: Chi-square, logistic regression
– Continuous outcome: T-test/ANOVA, linear regression
• Continuous independent variable
– Dichotomous outcome: Logistic regression, T-test/ANOVA (backwards)
– Continuous outcome: Correlation, linear regression, scatterplots
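For the continuous-by-continuous case, the correlation coefficient and the least-squares line can be computed directly from sums of squares. This is a minimal sketch with hypothetical data and a hypothetical function name, `correlation_and_line`; it does not produce p-values or confidence intervals.

```python
import math
from statistics import mean


def correlation_and_line(x, y):
    """Pearson correlation plus least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, slope, intercept


# Hypothetical paired measurements.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]
r, slope, intercept = correlation_and_line(dose, response)
print(f"r = {r:.3f}, fitted line: y = {intercept:.2f} + {slope:.2f} x")
```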
Prediction
• Dichotomous outcome
– Logistic regression*
– Sensitivities, specificities†
– ROC curves† (continuous predictor)
• Continuous outcome
– Linear regression*
– “Leave one out” statistics or cross validation†
* Predictive model building
† Assessing predictive model
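To make the assessment tools for a dichotomous outcome concrete, the sketch below computes sensitivity and specificity at one cutoff, and the area under the ROC curve as the proportion of (positive, negative) pairs that the predictor ranks correctly. The function names, scores, labels, and cutoff are all hypothetical, and labels are assumed to be coded 1 (outcome present) and 0 (absent).

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive; labels are 1 (present) or 0 (absent)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical predictor values and true outcomes.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(sensitivity_specificity(scores, labels, threshold=0.5))
print(roc_auc(scores, labels))
```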
Reliability/Agreement
• Kappa statistic is commonly used for categorical data and two raters.
• Intra-class correlation coefficient for multiple raters
• If you have a ‘gold standard’ it makes the most sense to tabulate percent correct or average distance from correct.
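For two raters and categorical ratings, the kappa statistic compares the observed agreement with the agreement expected by chance from each rater's marginal frequencies. A minimal sketch of Cohen's kappa follows; the function name `cohens_kappa` and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    # Chance agreement: product of the two raters' marginal proportions, summed.
    expected = sum(counts1[c] * counts2[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no ratings of ten items by two raters.
rater1 = ["y", "y", "n", "n", "y", "n", "y", "n", "y", "y"]
rater2 = ["y", "n", "n", "n", "y", "n", "y", "y", "y", "y"]
print(round(cohens_kappa(rater1, rater2), 3))
```

Here the raters agree on 8 of 10 items, but because chance alone would produce 52% agreement with these marginals, kappa is noticeably lower than 0.8.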
more Reliability/Agreement
• If trying to demonstrate agreement between two continuous measures the correlation coefficient is tangential at best
• Better to tabulate statistics related to mean pairwise differences between judges
• See
– Bland JM, Altman DG. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307-310.
– Available at http://www-users.york.ac.uk/~mb55/meas//ba.htm
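In the spirit of the Bland and Altman approach, the pairwise differences can be summarized by their mean (the bias) and by limits of agreement, taken here as the mean difference plus or minus 1.96 standard deviations. The function name `bland_altman` and the measurements are hypothetical; this is a sketch of the summary statistics only, without the accompanying plot.

```python
from statistics import mean, stdev


def bland_altman(method1, method2):
    """Mean pairwise difference and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(method1, method2)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical paired measurements of the same subjects by two methods.
method1 = [10, 12, 14, 16]
method2 = [9, 12, 15, 15]
bias, (lower, upper) = bland_altman(method1, method2)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Unlike a correlation coefficient, these numbers are in the units of the measurement, so a reader can judge directly whether the disagreement is small enough to be acceptable.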
Other?
• Time-to-event data
– Kaplan-Meier survival estimate
– Cox regression
• Other other?
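As a rough illustration of the Kaplan-Meier estimate, the sketch below multiplies, at each distinct event time, the running survival probability by the fraction of at-risk subjects who did not have the event. The function name `kaplan_meier` and the follow-up data are hypothetical; events are assumed to be coded 1 (event observed) and 0 (censored), and no confidence bands are computed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```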
Correlated Data Issues
• Data consist of “clusters” of correlated observations. This is common in dental studies (many teeth from same mouth)
• Common Solutions?
– Collapse data to independent units (patient-level averages)
– Adjust for correlation using generalized estimating equations (GEE) or mixed model regression approaches
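The first solution, collapsing to independent units, amounts to averaging the tooth-level measurements within each patient before any analysis. A minimal sketch with hypothetical patient identifiers and values follows; the function name `patient_level_means` is also hypothetical. The GEE and mixed-model options need a statistics package and are not shown.

```python
from collections import defaultdict
from statistics import mean


def patient_level_means(records):
    """Collapse (patient_id, value) tooth-level records to one mean per patient."""
    by_patient = defaultdict(list)
    for patient_id, value in records:
        by_patient[patient_id].append(value)
    return {pid: mean(values) for pid, values in by_patient.items()}


# Hypothetical pocket depths (mm) for several teeth in each of three patients.
records = [
    ("p1", 2.0), ("p1", 4.0),
    ("p2", 3.0), ("p2", 5.0), ("p2", 4.0),
    ("p3", 1.0),
]
print(patient_level_means(records))
```

After collapsing, the sample size for the analysis is the number of patients, not the number of teeth.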
Homework for March 13
• Following the guidelines presented in class today, present a concise description of your study and planned data analysis to the class.
• Plan to keep your talk under ____ minutes
• Limited office hours with Dr. Kaiyala and me will be available to help. Call or email us for appointments.