advanced statistics presentation outline
Post on 02-Dec-2021
Advanced Statistics
Presentation outline:
Regression
Risks & Ratios
Survival Analysis
Survival Curves
Sensitivity & Specificity
Regression
Allows you to use change in independent variables (IVs) to predict
changes in a dependent variable (DV)
– Accomplished through a regression equation
• Uses y-intercept and slope to create straight line through
scatter plot
– IVs must have a linear relationship (i.e., must be correlated) with
the DV
– IVs are not manipulated, but may have multiple levels
Example of Regression analysis:
• Suicide rate regressed on antidepressant prescription rate,
controlling for factors like age, sex, race, and income.
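The y-intercept and slope described above can be sketched for a single IV with ordinary least squares. This is a minimal illustration only: the data points and the helper name `fit_line` are hypothetical and do not come from the presentation.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one IV: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = covariation of IV and DV divided by variation of the IV
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data (assumed for illustration)
iv = [1, 2, 3, 4, 5]
dv = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(iv, dv)
predicted_at_6 = intercept + slope * 6  # predicted DV for a new IV value
print(f"DV = {intercept:.2f} + {slope:.2f} * IV")
```

With several IVs, as in the suicide-rate example, the same idea extends to one slope per IV, which is usually fitted with a statistics package rather than by hand.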
Interpreting a Multiple Regression
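R (Measure of the relationship between variables)
– Simple correlation r ranges from -1 to +1; multiple R ranges from 0 to 1
R² (coefficient of determination)
– Proportion of variability in the DV explained by the IVs (1 minus the ratio of residual to total variability)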
B and β
– Unstandardized and standardized regression coefficients
– Associated t and p
Can include interaction terms
ANOVA used to determine overall fit
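As a rough sketch of how the coefficient of determination is obtained, the code below computes it as 1 minus the ratio of residual to total variability. The observed values are hypothetical, and the predictions are assumed to come from the least-squares line DV = 0.05 + 1.99 * IV fitted to those same points.

```python
from statistics import mean

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = mean(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - y_bar) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Hypothetical observed DV values and least-squares predictions for IV = 1..5
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
predicted = [0.05 + 1.99 * x for x in [1, 2, 3, 4, 5]]

r2 = r_squared(observed, predicted)
multiple_r = r2 ** 0.5  # valid here because the predictions are a least-squares fit
print(f"R squared = {r2:.3f}, R = {multiple_r:.3f}")
```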
Factors That Create Misleading Results
Susceptible to all the same misinterpretations as correlations
• Linear relationships
• Restricted range, Skewed distribution, Outliers & Extreme
groups
Other considerations:
• Correlation between IVs should not be too large
(multicollinearity)
Logistic Regression
Used for prediction of dichotomous DV
Assess impact of IV and interactions on DV similar to linear
regression
Example of Logistic Regression
• Physician’s referral for cardiac catheterization was regressed on patient’s sex and race after controlling for other confounders.
• African Americans had 40% lower odds of being recommended for catheterization than Whites.
Interpreting Logistic Regression
Each IV will have a Wald chi-square statistic and associated p-value
Uses Hosmer-Lemeshow Goodness of Fit Test to evaluate overall
model; here a non-significant (ns) result indicates adequate fit
Provides Odds Ratio and corresponding CI for each IV in the equation
Classification table for sensitivity and specificity calculation
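The link between a logistic regression coefficient, its Wald statistic, and the reported odds ratio can be sketched as follows. The coefficient `b` and standard error `se` are hypothetical values chosen so that the odds ratio is about 0.60 (40% lower odds); they are not the estimates from the catheterization study.

```python
import math

def wald_chi_square(b, se):
    """Wald chi-square for one coefficient: (B / SE) squared, 1 df."""
    return (b / se) ** 2

def odds_ratio_with_ci(b, se, z=1.96):
    """Odds ratio exp(B) with an approximate 95% CI exp(B +/- 1.96 * SE)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient and standard error (assumed for illustration)
b, se = -0.51, 0.20

wald = wald_chi_square(b, se)
or_estimate, ci_low, ci_high = odds_ratio_with_ci(b, se)
print(f"Wald = {wald:.2f}, OR = {or_estimate:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A CI that excludes 1.0 corresponds to a significant Wald test at roughly the 0.05 level.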
Measures of Risk
Absolute risk
– Probability of developing a disease within a specified time frame
Relative risk (RR)
– the probability that a member of an exposed group (or group with
a specific behavior, gene etc.) will develop a disease relative to
the probability that a member of an unexposed group will
develop that same disease
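Absolute and relative risk follow directly from event counts in each group. A minimal sketch, with made-up counts that are not from the presentation:

```python
def absolute_risk(events, total):
    """Probability of developing the disease within the follow-up period."""
    return events / total

def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (absolute_risk(exposed_events, exposed_total)
            / absolute_risk(unexposed_events, unexposed_total))

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop the disease
risk_exposed = absolute_risk(30, 100)
risk_unexposed = absolute_risk(10, 100)
rr = relative_risk(30, 100, 10, 100)
print(f"Risk exposed = {risk_exposed:.2f}, unexposed = {risk_unexposed:.2f}, RR = {rr:.1f}")
```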
Measures of Ratio
Odds ratio
– Compares the odds of an event (likelihood of it occurring versus not occurring) in one group with the odds in another group
– Example: If the odds ratio (smokers vs. non-smokers) for developing lung cancer is 2.16, smokers have 116% higher odds of developing lung cancer than non-smokers.
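The arithmetic behind an odds ratio can be sketched as below. An odds ratio of 2.16 corresponds to (2.16 - 1) * 100 = 116% higher odds. The 2x2 counts are hypothetical and only illustrate the calculation.

```python
def odds(probability):
    """Odds = probability of the event / probability of no event."""
    return probability / (1 - probability)

def odds_ratio(exposed_events, exposed_non_events, unexposed_events, unexposed_non_events):
    """Odds of the event in the exposed group divided by odds in the unexposed group."""
    return (exposed_events / exposed_non_events) / (unexposed_events / unexposed_non_events)

def percent_change_in_odds(ratio):
    """Express an odds ratio as a percent increase (or decrease, if negative)."""
    return (ratio - 1) * 100

# Hypothetical 2x2 counts (assumed): 20 events / 80 non-events vs 10 events / 90 non-events
example_or = odds_ratio(20, 80, 10, 90)

# The odds ratio quoted for smokers vs non-smokers
smoker_change = percent_change_in_odds(2.16)
print(f"OR = {example_or:.2f}; OR of 2.16 means {smoker_change:.0f}% higher odds")
```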
Hazard ratio
– Measure of how often a particular event happens in one group
compared to how often it happens in another group, over time.
– Example: A hazard ratio (Group 1 vs. Group 2) of 1.6 implies that, at any given time,
Group 1 has a 60% higher rate (hazard) of developing the outcome than Group 2.
Survival Analysis
Uses
– Duration – time from randomization to relapse
– Time to development of a condition
– Survival – time from randomization until death
Censors data
– the critical event has not yet occurred
– lost to follow-up
– other interventions offered
– event occurred but unrelated cause
Survival Analysis: the tests
Kaplan-Meier analysis provides censoring-adjusted estimates of mean and median time to
event, with CIs, for each group
Tests for equality of survival distributions between groups (Log Rank
test and associated p-value)
Can use Cox regression to control for covariates
Provides hazard ratios with CI
Survival Curves: The Kaplan-Meier Estimate
Nonparametric estimate of the survivor function
Accommodates missing data such as censoring
– Censored Data:
• Mathematically removing a patient at the end of their follow-up time
(usually denoted by a vertical tick mark)
• When a patient is censored it reduces the number at risk for the next
interval
Estimate of absolute risk
Will always have a “staircase” appearance
Example of survival analysis:
• Kaplan-Meier survival curves were constructed to compare
survival probability between the treatment and control group.
• Cox regression analysis was performed to identify associations
between patient characteristics and survival days.
Survival Curves: The Kaplan-Meier Estimate
Helps you clarify your thinking about
– Treatment
– Prognosis
Think of Kaplan-Meier curve as a “movie” rather than a “snapshot”
Avoid focusing on one point on the curve - it’s the entire curve that tells the
story
Survival Curves: The Kaplan-Meier Estimate
Small N can be very misleading
An experience of 50 – 100 can give you the lay of the land
> 100 will most adequately represent the true range of possible
experience
Typical endpoint is survival but alternative endpoints can be:
– Disease Free Survival
– Progression Free Survival
– Response Duration
Survival Curves: The Kaplan-Meier Estimate
Log-rank Test
– Typical test to compare two groups of Kaplan-Meier Survival Curves
• Log-Rank statistic: variety of names, similar results
– Mantel-Haenszel Chi Square Statistic
– Cox-Mantel log-rank statistic
– Mantel log-rank statistic
– or simply the SES: approximate Chi-square test for significance
between observed vs. expected number of events
• Hazard Ratio with 95% CI
– Similar to Odds ratio
– Demonstrates numerically, differences in curves on a plot
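One way to sketch the log-rank comparison of observed vs. expected events is shown below. This is a simplified two-group version, not a replacement for a statistics package: the function name `log_rank` and both data sets are hypothetical, and the p-value assumes the statistic follows a chi-square distribution with 1 degree of freedom.

```python
import math

def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events: 1 = event observed, 0 = censored.

    Returns (chi_square, p_value)."""
    data = ([(t, e, "a") for t, e in zip(times_a, events_a)]
            + [(t, e, "b") for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in data if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still under observation just before time t
        n_a = sum(1 for ti, _, g in data if ti >= t and g == "a")
        n_b = sum(1 for ti, _, g in data if ti >= t and g == "b")
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == "a")
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == "b")
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # events expected in group a if the curves were equal
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # upper tail of chi-square, 1 df
    return chi_square, p_value

# Hypothetical follow-up times; every subject has the event (no censoring)
chi_sq, p = log_rank([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi-square = {chi_sq:.2f}, p = {p:.4f}")
```

When the two groups have identical follow-up, observed and expected events match and the statistic is 0.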
Survival Curves: The Kaplan-Meier Estimate
Interval (Start-End) | # At Risk at Start of Interval | # Censored During Interval | # At Risk at End of Interval | # Who Died at End of Interval | Proportion Surviving This Interval | Cumulative Survival at End of Interval
0-1 | 7 | 0 | 7 | 1 | 6/7 = 0.86 | 0.86
1-4 | 6 | 2 | 4 | 1 | 3/4 = 0.75 | 0.86 * 0.75 = 0.64
4-10 | 3 | 1 | 2 | 1 | 1/2 = 0.5 | 0.86 * 0.75 * 0.5 = 0.32
10-12 | 1 | 0 | 1 | 0 | 1/1 = 1.0 | 0.86 * 0.75 * 0.5 * 1.0 = 0.32
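The interval-by-interval calculation in the table can be sketched as a small Kaplan-Meier routine. The table gives only counts per interval, so the exact censoring times used here (2, 3, 7 and 12) are assumptions chosen to match those counts. Carrying full precision, the cumulative survival after the third death is about 0.32.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate. events: 1 = died, 0 = censored.

    Returns [(time, n_at_risk, n_deaths, cumulative_survival)] for each death time."""
    survival = 1.0
    rows = []
    for t in sorted({ti for ti, e in zip(times, events) if e == 1}):
        # Censored patients drop out of the risk set for later intervals
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= (at_risk - deaths) / at_risk
        rows.append((t, at_risk, deaths, survival))
    return rows

# Seven patients; censoring times are assumed, deaths occur at t = 1, 4 and 10
times = [1, 2, 3, 4, 7, 10, 12]
events = [1, 0, 0, 1, 0, 1, 0]

km_rows = kaplan_meier(times, events)
for t, at_risk, deaths, surv in km_rows:
    print(f"t={t}: at risk={at_risk}, died={deaths}, cumulative survival={surv:.2f}")
```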
Survival Curves: The Kaplan-Meier Estimate
Life Tables
Expression of death rates of a particular population during a particular
time
– Probability of death within certain age ranges or counts of events
within a time period
– Types:
• Population (based on census or large scale survey)
• Cohort (longitudinal follow-up with a specific group)
• Clinical
Example of survival analysis:
• Time to cardiovascular death and other fatal cardiovascular outcomes were
compared for the two groups in terms of relative risk and hazard ratio.
Hazard Function Curves
Sensitivity and Specificity
Terms used to evaluate a clinical test
Independent of the prevalence of disease in the population subjected to the test
Positive and negative predictive values are useful when considering the
value of a test to a clinician
– They are dependent on the prevalence of the disease in the population of
interest
The sensitivity and specificity of a quantitative test are dependent on the
cut-off value above or below which the test is positive
– In general, the higher the sensitivity, the lower the specificity, and vice versa
Receiver operating characteristic (ROC) curves plot the true positive rate
(sensitivity) against the false positive rate (1 - specificity) for all cut-off values
– The area under the curve of a perfect test is 1.0 and that of a useless test,
no better than tossing a coin, is 0.5
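A minimal sketch of how an ROC curve and its area can be built from a quantitative test is given below. It assumes the test is called positive when the score is at or above the cut-off, and it uses the trapezoidal rule for the area. The scores and disease labels are hypothetical.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) at every cut-off.

    labels: 1 = has the disease, 0 = does not. Positive if score >= cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-off above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical test results
perfect_auc = auc(roc_points([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # groups fully separated
useless_auc = auc(roc_points([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0]))  # score carries no information
mixed_auc = auc(roc_points([0.9, 0.7, 0.6, 0.4], [1, 0, 1, 0]))    # partial overlap
print(perfect_auc, useless_auc, mixed_auc)
```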
Sensitivity and Specificity
Sensitivity: If a person has a disease, how often will the
test be positive (true positive rate)?
– If the test is highly sensitive and the test result is negative you
can be nearly certain that they don’t have disease
– Rules out disease (when the result is negative)
Specificity: If a person does not have the disease how
often will the test be negative (true negative rate)?
– If the test result for a highly specific test is positive you can be
nearly certain that they actually have the disease
– rules in disease with a high degree of confidence
Fundamental terms for understanding Sensitivity & Specificity
True positive: the patient has the disease and the test is positive.
False positive: the patient does not have the disease but the test is positive.
True negative: the patient does not have the disease and the test is negative.
False negative: the patient has the disease but the test is negative.
Sensitivity = true positives / (true positives + false negatives)
Specificity = true negatives / (true negatives + false positives)
Predictive value for a positive result (PV+):
PV+ asks "If the test result is positive, what is the probability that the patient actually has the disease?"
PV+ = true positives / (true positives + false positives)
Predictive value for a negative result (PV-):
PV- asks "If the test result is negative, what is the probability that the patient does not have the disease?"
PV- = true negatives / (true negatives + false negatives)
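The four formulas above can be sketched from the cells of a 2x2 table, together with a helper showing how PV+ shifts with prevalence while sensitivity and specificity stay fixed. The counts and the function names are hypothetical, chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PV+ and PV- from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pv_pos": tp / (tp + fp),
        "pv_neg": tn / (tn + fn),
    }

def pv_pos_from_prevalence(sensitivity, specificity, prevalence):
    """PV+ for a given disease prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical counts: 100 diseased (90 test positive), 100 healthy (80 test negative)
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)

# Same test applied where the disease is rare vs common
pv_rare = pv_pos_from_prevalence(0.90, 0.80, prevalence=0.01)
pv_common = pv_pos_from_prevalence(0.90, 0.80, prevalence=0.50)
print(metrics)
print(f"PV+ at 1% prevalence = {pv_rare:.3f}, at 50% prevalence = {pv_common:.3f}")
```

With these assumed numbers, a positive result is far less convincing when the disease is rare, even though the test itself has not changed.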
Diagnostic Test Design: sensitivity & specificity
Example for ROC:
• Sensitivity, specificity, positive and negative predictive values, and area under the
receiver operating characteristic curve were compared for procalcitonin, C-reactive protein and
leucocyte count.
Thank You