Lecture 5 – Categorical Data and Survival Analyses

OUTLINE

• Definition
• Common CDA
  – Descriptive summaries
  – Tests of Association
  – Modeling
• Extensions
• Other examples in CDA

What is Categorical Data Analysis?

• Statistical analysis of data that are categorical (cannot be summarized with mean +/- SD)

• Includes dichotomous, ordinal, nominal outcomes

• Examples: Disease prevalence, Discharge location, Treatment adherence (yes/no)

Examples of Studies with CDA

• MI after CABG

• Diagnostic studies looking at the sensitivity and specificity of a new test/procedure

• Discharge location after new surgical intervention.

How to analyze words?

• Order vs. no order

• Break down mean +/- SD for two groups

• Do the same: break down outcome %’s for two groups

How to analyze words?

• Comparing length of stay after CABG:
  – New Trt = 19.2 +/- 2.7
  – SOC = 21.3 +/- 3.3

• Comparing prevalence of MI:
  – New Trt = 16%
  – SOC = 24%

• Are these differences statistically significant? clinically significant?

Choice of End Point

• Some designs have a binary response variable
  – MI after 3 years

• Others have a time-to-event response variable
  – Overall Survival
  – Time to CVD
  – Time to recurrent MI

• Can Dichotomize as 1 year rate (Yes/No)

What is Categorical Data Analysis?

• paper example

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WHH-4J8CX9N-3&_user=582433&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_version=1&_urlVersion=0&_userid=582433&md5=82dd961fea3908fa6bf1d6eb3852149a

Common CDA

• Descriptive summaries

• Tests for association

• Modeling

Descriptive summaries

Let’s Talk Data…

Descriptive Summaries in CDA

Nominal – Categorical data measured in unordered categories (summarize with %’s)

Ordinal – Categorical data measured in ordered categories (summarize with %’s)

Continuous – Quantitative data measured on a continuum (summarize with many measures)

Types of Data

Nominal – Categorical data measured in unordered categories
  – Race
  – Blood Type
  – Gender

Ordinal – Categorical data measured in ordered categories
  – Cancer Stages
  – Socio-economic Status (low, medium, high)
  – Likert (unlikely, neutral, likely)

Continuous – Quantitative data measured on a continuum
  – Serum Creatinine
  – Height/Weight/BMI
  – Diastolic Blood Pressure
  – Tumor measurements

What the data might look like…

Compare Categorical Outcomes between groups

• How to assess if a predictor is associated with a categorical outcome?

• Intuitive?: Get the %’s of the outcome prevalence within each predictor group.

• Example: New Trt and MI.
  – New Trt response rate = 16%
  – SOC response rate = 24%

Contingency Tables

                  MI
Group      No      Yes     Total
New        a       b       a+b
Old        c       d       c+d
Total      a+c     b+d     n=a+b+c+d

                  MI
Group      No      Yes     Total
New        12      8       20
Old        4       16      20
Total      16      24      N=40

CDA Summary with Contingency Table

• Research question

Is there a relationship between Group and MI (heart attack)?

• Better to convert the table into percentages (easier to see)

What the data might look like…

            No MI    MI
New TRT     12       8
Old TRT     4        16

Step 1. Breakdown the frequencies

Each cell shows: count (Cell %), Row %, Col %

            No MI                 MI                    Total
New TRT     12 (30%), 60%, 75%    8 (20%), 40%, 33%     20
Old TRT     4 (10%), 20%, 25%     16 (40%), 80%, 67%    20
Total       16                    24                    40

Step 2. Get the different %’s

Row vs. Column %’s: It’s your choice

• Row %’s:
  – 40% of New trt patients had MI vs. 80% of Old trt patients had MI

• Col %’s:
  – 75% of No MI were in the New trt group vs. 33% of MI were in New trt group

• P-value for test of association is the same!
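As a rough illustration of the cell, row, and column percentages discussed here, the following sketch recomputes them from the 2x2 counts (New TRT: 12 No MI, 8 MI; Old TRT: 4 No MI, 16 MI). The function name `percentage_tables` and the dictionary layout are assumptions made for this example only.

```python
# Observed counts: rows = treatment group, columns = MI status.
counts = {
    "New TRT": {"No MI": 12, "MI": 8},
    "Old TRT": {"No MI": 4, "MI": 16},
}

def percentage_tables(counts):
    """Return cell, row, and column percentages for a two-way table of counts."""
    grand_total = sum(sum(row.values()) for row in counts.values())
    row_totals = {r: sum(row.values()) for r, row in counts.items()}
    col_totals = {}
    for row in counts.values():
        for c, n in row.items():
            col_totals[c] = col_totals.get(c, 0) + n
    cell_pct, row_pct, col_pct = {}, {}, {}
    for r, row in counts.items():
        for c, n in row.items():
            cell_pct[(r, c)] = 100 * n / grand_total    # % of all patients
            row_pct[(r, c)] = 100 * n / row_totals[r]   # % within treatment group
            col_pct[(r, c)] = 100 * n / col_totals[c]   # % within outcome group
    return cell_pct, row_pct, col_pct

cell_pct, row_pct, col_pct = percentage_tables(counts)
print(round(row_pct[("New TRT", "MI")]), round(row_pct[("Old TRT", "MI")]))     # 40 80
print(round(col_pct[("New TRT", "No MI")]), round(col_pct[("New TRT", "MI")]))  # 75 33
```

Row percentages answer "what share of each treatment group had an MI?", while column percentages answer "what share of each outcome group received the new treatment?".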

Tests for Association

CDA tests for Association

• Is there a significant association between Group and MI?

• What is a good way to test for an association between the two?

Test for significant differences

• The most common tests are the Chi-square test and Fisher’s Exact test.

• Research question: Is there an association between treatment group and MI?

• To answer this: Compare what you would expect if there was no association to what you observed

            No MI    MI     Total
New TRT                     20
Old TRT                     20
Total                       40

Expect if no relationship?

            No MI    MI     Total
New TRT                     20
Old TRT                     20
Total       16       24     40

Expect if no relationship?

Each cell shows: count (Cell %), Row %, Col %

            No MI                MI                    Total
New TRT     8 (20%), 40%, 50%    12 (30%), 60%, 50%    20
Old TRT     8 (20%), 40%, 50%    12 (30%), 60%, 50%    20
Total       16                   24                    40

Same % with MI by Group

Test for significant differences

• Having the exact same response % in each group would favor “no association”

• There is another general way to calculate what you “expect”

• Use Row totals, Column totals, Grand total to calculate “Expected” frequencies

Observed vs. Expected Frequencies

• Observed frequencies = actual counts

• “Expected” frequencies:

= Row total x Column total / Grand total (why?)
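One way to answer the "why?" (a short derivation added for illustration; the symbols R_i, C_j, and n for the row total, column total, and grand total are notation assumed here): if row and column are unassociated, the probability of landing in cell (i, j) is the product of the marginal probabilities, so

\[
E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i \, C_j}{n},
\qquad \text{e.g. } E_{\text{New TRT, No MI}} = \frac{20 \times 16}{40} = 8 .
\]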

Actual      No MI    MI     Total
New TRT     12       8      20
Old TRT     4        16     20
Total       16       24     40

What you actually observed in Study

Each cell shows: Actual / Expected

            No MI     MI         Total
New TRT     12 / 8    8 / 12     20
Old TRT     4 / 8     16 / 12    20
Total       16        24         40

“Expected” frequencies

Chi-square test

• Quantify if the actual frequencies are far enough away from the Expected (assuming no association)

• We can quantify using the Chi-square test statistic

• We can get the p-value to determine if there is a significant association.

Chi-square test for association in RxC table

• H0: There is no association between row and columns

• The classic Pearson’s chi-squared test of independence

\[ \text{Expected}_{ij} = \frac{\text{Row Total}_i \times \text{Col Total}_j}{\text{Overall Total}} \]

\[ \chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(\text{Observed}_{ij} - \text{Expected}_{ij})^2}{\text{Expected}_{ij}} \sim \chi^2_1 \text{ dist.} \]

• For a 2x2 table, df = (2-1) x (2-1) = 1

• Conservatively, we require Expected_ij ≥ 5 for all i, j

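Chi-square Test

\[ \chi^2_{TS} = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(\text{Observed}_{ij} - \text{Expected}_{ij})^2}{\text{Expected}_{ij}} = \frac{(12-8)^2}{8} + \frac{(8-12)^2}{12} + \frac{(4-8)^2}{8} + \frac{(16-12)^2}{12} = 6.67 \]

• Associated P-value for this Chi-square value is p=0.0098.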

Thus, we conclude group and MI are significantly associated (given α = 0.05).
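The calculation can be sketched in a few lines of standard-library Python. The function names are assumptions for this example. The p-value uses the fact that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x/2)), so it applies only to 2x2 tables.

```python
import math

def pearson_chi_square(observed):
    """Pearson chi-square statistic and expected counts for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total x column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected

def chi_square_p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

observed = [[12, 8], [4, 16]]  # rows: New TRT, Old TRT; columns: No MI, MI
statistic, expected = pearson_chi_square(observed)
p_value = chi_square_p_value_df1(statistic)
print(expected)                                # [[8.0, 12.0], [8.0, 12.0]]
print(round(statistic, 2), round(p_value, 4))  # 6.67 0.0098
```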

Fisher’s Exact Test

• Fisher’s Exact test will test similar hypotheses as the Chi-square test.

• Use Fisher’s Exact test when assumptions of Chi-square test are not satisfied.

• That is, when you have Expected < 5 (basically implying when cell sample size is small).
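For intuition, here is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table. It sums hypergeometric probabilities of all tables (with the same margins) that are no more likely than the observed one. The function name is an assumption for this example, and this "sum of small probabilities" definition of two-sided is one common convention, not the only one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is no more likely than the observed table.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

p_fisher = fisher_exact_two_sided(12, 8, 4, 16)
print(round(p_fisher, 4))
```

For the MI example the exact p-value comes out somewhat larger than the chi-square p-value, which is typical: the exact test tends to be more conservative.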

Confidence Intervals for %’s

Confidence Interval for %’s

• You conduct your follow-up after CABG study and accrue 40 patients.

• After 3 years 20 out of all 40 patients have had a MI.

• Q1. What is your best guess at the true (population) MI rate at 3 years? A. Based on your sample, 20/40 = 50%

Sampling Variability

[Figure: inference from a sample to the population. The population MI rate at 3 yrs is unknown (“?”); one sample gives MI = 50%, another sample gives MI = 44%.]

• A good way to make inference about the range of plausible values of the population % is to calculate a Confidence Interval (CI).

• Q2. How much precision do you have in terms of estimating the MI rate at 3 yrs. in the population based on your sample?

95% Confidence Intervals

• 95% Confidence Interval for Mean:

\[ \bar{X} \pm 2 \, \frac{sd}{\sqrt{n}} \]

• 95% Confidence Interval for Proportion (Standard “Wald” CI):

\[ \hat{p} \pm 2 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]


• Q2. How much precision do you have in terms of estimating the MI rate in the population based on your sample? (Remember, 20 of 40 total had MI)

A. A 95% Wilson CI for population MI rate is (35.2%, 64.8%).

Thus, if we repeated our study over and over again, each time drawing a sample of 40 patients and computing a 95% CI, approximately 95% of those intervals would contain the true population MI rate at 3 yrs.
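The two intervals can be compared with a short sketch. The function names are assumptions for this example; the multiplier z = 1.96 is used in place of the rounded value 2, and the Wilson formula is the standard score interval.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Standard Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

print([round(100 * x, 1) for x in wald_ci(20, 40)])    # [34.5, 65.5]
print([round(100 * x, 1) for x in wilson_ci(20, 40)])  # [35.2, 64.8]
```

With 20 MIs out of 40 the two intervals are close. They differ more when the proportion is near 0 or 1 or when n is small.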


• What’s interesting is that there are “lucky” and “unlucky” combinations of p (response rate) and N (sample size)

• That is, for a given sample size:
  * for some p you may have higher ability to make inference
  * for some p you may have less ability!

• Not to scald the Wald, but not all CI’s are created equal

• Paper

http://www-stat.wharton.upenn.edu/~tcai/paper/Binomial-StatSci.pdf

Modeling in CDA

Modeling in CDA

• Modeling is done with variations of Logistic Regression:
  • Dichotomous
  • Ordinal (Proportional odds)
  • Nominal (Generalized logit)
  • Conditional (Matched-pairs)
  • Exact (small sample size/rare outcome)
  • Longitudinal (GEE, GLMM)

• Simple (1 predictor) vs. Multivariable (>1 predictor/adjusted)

Why use adjusted analysis?

• Do you think patient demographics or clinical characteristics at baseline would affect MI?

• What if half of the patients are all <30 yrs. old and half are all >80 yrs. old?

• What are some possible confounders of response? Effect modifiers?

• These are testable in adjusted analyses.

You may not need adjusted analysis.

• Typically have well-defined specific patient populations of interest.

• Thus, inclusion/exclusion criteria might have removed variability from potential confounders

• A well designed, well executed trial usually does not require intensive and complex analysis.

What is Logistic Regression?

• In a nutshell:

A statistical method used to model dichotomous (binary) outcomes, though it is not limited to them, using predictor variables.

Used when the research question is focused on whether or not an event occurred, rather than when it occurred (time course information is not used).

• What is the “Logistic” component?

Instead of modeling the outcome, Y, directly, the method models Pr(Y) using the logistic function.


• What is the “Regression” component?

Methods used to quantify association between an outcome and predictor variables. Could be used to build predictive models as a function of predictors.

What can we use Logistic Regression for?

• To estimate adjusted prevalence rates, adjusted for potential confounders

(sociodemographic or clinical characteristics)

• To estimate the effect of a treatment on a dichotomous outcome, adjusted for other covariates

• Explore how well characteristics predict a categorical outcome

Fig 1. Logistic regression curves for the three drug combinations. The dashed reference line represents the probability of DLT of .33. The estimated MTD can be obtained as the value on the horizontal axis that coincides with a vertical line drawn through the point where the dashed line intersects the logistic curve. Taken from “Parallel Phase I Studies of Daunorubicin Given With Cytarabine and Etoposide With or Without the Multidrug Resistance Modulator PSC-833 in Previously Untreated Patients 60 Years of Age or Older With Acute Myeloid Leukemia: Results of Cancer and Leukemia Group B Study 9420” Journal of Clinical Oncology, Vol 17, Issue 9 (September), 1999: 283. http://www.jco.org/cgi/content/full/17/9/2831

Logistic Regression quantifies “effects” using Odds Ratios

• Logistic regression does not model the outcome directly; modeling the outcome directly (as in linear regression) leads to effect estimates quantified by means (i.e., differences in means)

• Estimates of effect are instead quantified by “Odds Ratios”

Logistic Regression & Odds Ratio (OR)

• The odds ratio is equally valid for retrospective, prospective, or cross-sectional sampling designs

• That is, regardless of the design it estimates the same population parameter

(not true for Relative Risk)

Relationship between Odds & Probability

\[ \text{Odds}(\text{event}) = \frac{\text{Probability}(\text{event})}{1 - \text{Probability}(\text{event})} \]

\[ \text{Probability}(\text{event}) = \frac{\text{Odds}(\text{event})}{1 + \text{Odds}(\text{event})} \]

The Odds Ratio

Definition of Odds Ratio: Ratio of two odds estimates.

Example:

Suppose 16 out of 40 people in the trt group had a MI and only 5 out of 25 in the placebo group had a MI.

\[ \Pr(\text{MI} \mid \text{trt group}) = \frac{16}{40} = 0.40 \]

\[ \Pr(\text{MI} \mid \text{placebo group}) = \frac{5}{25} = 0.20 \]

The Odds Ratio

Example Cont’d:

So, if Pr(MI | trt) = 0.40 and Pr(MI | placebo) = 0.20

Then:

\[ \text{Odds}(\text{MI} \mid \text{trt group}) = \frac{0.40}{1 - 0.40} = 0.667 \]

\[ \text{Odds}(\text{MI} \mid \text{placebo group}) = \frac{0.20}{1 - 0.20} = 0.25 \]

\[ \text{OR}(\text{Trt vs. Placebo}) = \frac{0.667}{0.25} = 2.67 \]

Interpretation of the Odds Ratio

• Example cont’d:

Outcome = MI, OR(Trt vs. Plb) = 2.67

Then, the odds of a MI in the treatment group were estimated to be 2.67 times the odds of having a MI in the placebo group.

Alternatively, the odds of having a MI were 167% higher in the treatment group than in the placebo group.

Odds Ratio vs. Relative Risk

• An Odds Ratio of 2.67 for trt. vs. placebo does NOT mean that MI is 2.67 times as LIKELY to occur.

• It DOES mean that the ODDS of MI are 2.67 times as high for trt. vs. placebo.

Odds Ratio vs. Relative Risk

• The Odds Ratio is NOT mathematically equivalent to the Relative Risk (Risk Ratio)

• However, for “rare” events, the Odds ratio can approximate the Relative risk (RR)

\[ \text{OR} = \text{RR} \times \frac{1 - P(\text{MI} \mid \text{plb})}{1 - P(\text{MI} \mid \text{trt})} \]
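A small sketch of these quantities, using the 16/40 vs. 5/25 example. The helper names are assumptions for this example, and the "rare event" rates at the end are hypothetical numbers chosen only to show the approximation.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

p_trt = 16 / 40   # Pr(MI | trt group)
p_plb = 5 / 25    # Pr(MI | placebo group)

odds_ratio = odds(p_trt) / odds(p_plb)
relative_risk = p_trt / p_plb
# Identity: OR = RR * (1 - P(MI | plb)) / (1 - P(MI | trt))
or_from_rr = relative_risk * (1 - p_plb) / (1 - p_trt)
print(round(odds_ratio, 2), round(relative_risk, 2))  # 2.67 2.0

# Hypothetical rare-event rates (not from the lecture): OR and RR nearly coincide.
rare_trt, rare_plb = 0.004, 0.002
rare_or = odds(rare_trt) / odds(rare_plb)
rare_rr = rare_trt / rare_plb
print(round(rare_or, 3), round(rare_rr, 3))
```

With a common outcome (40% vs. 20%) the OR of 2.67 clearly overstates the RR of 2.0; with the rare hypothetical rates the correction factor is close to 1.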

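The Logistic Regression Model

Logistic Regression:

\[ \ln\left(\frac{P(Y)}{1-P(Y)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K \]

Linear Regression:

\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K \]

In the logistic regression model:

• Y is the dichotomous outcome

• X_1, ..., X_K are the predictor variables

• \(\ln\left(\frac{P(Y)}{1-P(Y)}\right)\) is the log(odds) of the outcome

• \(\beta_0\) is the intercept

• \(\beta_1, \ldots, \beta_K\) are the model coefficients

Equivalently:

\[ \ln\left(\frac{P(Y)}{1-P(Y)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K \]

\[ P(Y) = \frac{\exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K)}{1 + \exp(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K)} \]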
In this latter form, the logistic regression model directly relates the probability of Y to the predictor variables.
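To make the link between the two forms concrete, here is a minimal sketch with a single binary predictor. The coefficients are not fitted from data: they are chosen by hand (an assumption for illustration) so that the model reproduces the earlier odds-ratio example, with Pr(MI) = 0.20 for placebo and 0.40 for treatment.

```python
import math

def logit(p):
    """Log(odds) of a probability."""
    return math.log(p / (1 - p))

def predicted_probability(intercept, coefs, x):
    """P(Y) = exp(b0 + b1*x1 + ...) / (1 + exp(b0 + b1*x1 + ...))."""
    linear_predictor = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return math.exp(linear_predictor) / (1 + math.exp(linear_predictor))

# Hand-picked (not estimated) coefficients matching the odds-ratio example.
beta0 = logit(0.20)                # log(odds) of MI in the placebo group (x = 0)
beta1 = logit(0.40) - logit(0.20)  # log(odds ratio), treatment vs. placebo

print(round(predicted_probability(beta0, [beta1], [0]), 2))  # 0.2
print(round(predicted_probability(beta0, [beta1], [1]), 2))  # 0.4
print(round(math.exp(beta1), 2))                             # 2.67
```

Exponentiating a coefficient gives the odds ratio for a one-unit change in that predictor, which is why logistic regression results are reported as ORs.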

Application of Logistic Regression:

• paper example

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WHH-4J8CX9N-3&_user=582433&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_version=1&_urlVersion=0&_userid=582433&md5=82dd961fea3908fa6bf1d6eb3852149a

Extensions of Logistic Regression

• Outcomes with more than 2 categories (polytomous or polychotomous)

• Cumulative logit model – Proportional odds model for ordinal outcomes (ordered categories)

• Generalized logit model for nominal outcomes or non-proportional odds models (unordered categories)


• Ordinal Logistic Regression model:

– Fits a logistic regression model with g-1 intercepts for a g category outcome and one model coefficient for each predictor

– Models cumulative probability of being in a “higher” category

Discharge Location as Ordinal (Died, Assisted, Home)

• There is no law that says you can’t model all categories of Discharge Location

• Ordinal logistic regression example:

Predictor          OR      P-value
Trt vs. Control    1.24    0.036
M vs. F            0.87    0.163

Ordinal Outcome (Died, Assisted, Home)

OR(Trt vs C) = 1.24 means the odds of being in a higher DL category were 24% higher for Treatment vs. Control (adjusting for gender).

OR(M vs F) = 0.87 means the odds of being in a higher DL category were 13% lower for Males vs. Females (adjusting for Trt group).

• Nominal Logistic Regression Model:

– Fits a logistic regression model with g-1 intercepts and g-1 model coefficients for a g category outcome

– Model captures the multinomial probability of being in a particular category using generalized logits

Nominal Logistic Regression

• Doesn’t make “Proportional odds” assumption

• Separate OR’s for C-1 categories of C category outcome (get OR for every group except Referent)

• Example:

Predictor: Trt vs. Control (Overall P-value = 0.048)

Comparison           OR      P-value
Home vs. Died        1.22    0.236
Assisted vs. Died    1.56    0.034

Nominal Logistic Regression

• Thus, the odds of being discharged to Assisted Living compared to Dying were 56% higher for Trt. vs. Control.


• Longitudinal data / repeated measures data / Clustered data with binary outcomes

• Multilevel models (nested data structures)

GEE (Generalized Estimating Equations)

GLMM (Generalized Linear Mixed Models)

Repeated Measures / Longitudinal data

• Longitudinal data = data on subjects over time

• Repeated measures need to be taken into account when testing for differences

• Need to investigate correlation of repeated measures

Extensions to Logistic Regression

• Exact Logistic Regression

• Small Sample Size

• Adequate sample size but rare event (sparse data)

Questions?

Part II. Analysis of Time-to-event Data

(A.K.A., Survival Analysis)

What do we mean by Time?

• Length of follow-up till the event of interest occurs

• Follow-up can start at (for example)
  1. Randomization into a clinical trial
  2. Time of employment
  3. First contact on record in retrospective cohort

• Age of the individual at the time of the event

What is Survival Analysis?

• Survival analysis is a collection of statistical analysis techniques where the outcome is time to an event.

• Survival or time-to-event outcomes are defined by the pair of random variables (ti, δi) that give the observation time and an indicator of whether or not the event occurred

What do we mean by Event?

• Usually we mean death – thus the name “survival” analysis

• Other examples:
  – Cancer relapse or recurrence
  – Disease incidence

• Can also be a positive outcome:
  – Discharge from psychiatric counseling
  – Normalization of WBC count
  (in these examples, death would be a censored outcome)

Censoring

• In the pair of random variables (ti, δi) that constitute survival outcomes:
  – ti is an observed variable representing time (e.g., actual time until death or time until last follow-up)
  – δi is a Bernoulli random variable (0,1) or indicator of whether the observation is censored or not – 1 if we observed a failure, 0 if we have a censored observation
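As a sketch of how the pair (ti, δi) arises, the snippet below takes a true event time and a censoring time and returns what would actually be recorded. In real data the true event time of a censored subject is unknown; the values here are hypothetical and are used only to show the bookkeeping.

```python
def observe(event_time, censoring_time):
    """Return (t, delta): the observed time and the event indicator."""
    if event_time <= censoring_time:
        return event_time, 1      # failure observed
    return censoring_time, 0      # follow-up ended first: censored

# Hypothetical (event time, censoring time) pairs in months.
subjects = [(41.0, 60.0), (95.0, 77.1), (38.1, 50.0), (130.0, 112.3)]
observed = [observe(t_event, t_censor) for t_event, t_censor in subjects]
print(observed)  # [(41.0, 1), (77.1, 0), (38.1, 1), (112.3, 0)]
```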

Censoring Occurs

• When we have incomplete information about the exact survival time due to a random factor
  – Non-informative censoring – whether an observation is censored or not is independent of the value of the observation.
  – Informative censoring – whether an observation is censored or not is dependent on the value of the observation
  – E.g., we use dates seen in clinic to provide censoring times (without attempted phone contact to verify vital status)

• We will require non-informative censoring mechanisms. If censoring is informative, then these methods will generate biased results.

Types of censoring

• Right censoring – true survival time is greater than what we observed

• Left censoring – true survival time is less than what we observed (less common)

• Interval censoring – subjects are not observed continuously and we only know the event happened between time A and time B (e.g., annual testing of partner of an HIV+ individual)

Three common reasons for right censoring

• Person does not experience the event before the study ends

• Person is lost to follow-up during the study period

• Person withdraws from the study because of death (if death is not the outcome of interest) or some other reason (e.g., adverse drug reaction)

What do the data look like?

[Figure: follow-up timelines for subjects 1 to 5 on a time axis from 0 to 20, with markers for “event occurred”, “drop out”, and “end of study”.]

Example: Survival of Patients With Renal Resistive Index ≥ 0.8

• 86 hypertensive patients with open or percutaneous repair of RVD

• RA resistive index (RI) defined as 1-EDV/PSV in the D-segment

• RI dichotomized: <0.8 or ≥0.8

• Outcome of interest is time to death (any cause)

Example: Survival of Patients With Renal Resistive Index ≥ 0.8

Hosp UNITNO    Preop RI (<0.8 or ≥0.8)    Time to Event    Death Indicator
006-52-90      1                          41.0             1
013-39-46      1                          77.1             0
014-57-80      0                          112.3            0
022-39-81      0                          90.3             0
023-80-78      0                          88.3             0
026-66-88      0                          104.3            0
028-75-46      0                          57.8             0
030-10-56      0                          26.8             0
033-83-68      0                          38.1             1
034-30-42      1                          90.9             0

δi = 1 if death, δi = 0 if censored

RI = 1 if ≥ 0.8, RI = 0 if < 0.8

Survival Distribution

• Distribution of times to event – called “survival times,” even when the “event” is not “death”

• Let T = survival time (T ≥ 0) and t = a specified value for T

• Survival times follow a continuous distribution with times ranging from zero to infinity

• Ordinary methods for estimating and comparing continuous distributions cannot be used with survival data due to the presence of censoring

Probability Density Function f(t)

\[ f(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \, P[t \le T < t + \Delta t] \]

Difficult to estimate density directly because of censoring – histogram not direct estimate of f(t)

Cumulative Distribution Function F(t)

\[ F(t) = P[T \le t] = \int_0^t f(s)\,ds \]

Defined in the same way we would any CDF

Survival Function S(t)

\[ S(t) = \Pr[T > t] = \int_t^{\infty} f(u)\,du = 1 - \int_0^t f(u)\,du = 1 - F(t) \]

• Monotone non-increasing function
• S(0) = 1
• S(+∞) = 0

Hazard Function λ(t)

Instantaneous death rate at time t, given alive at time t

\[ \lambda(t) = \lim_{\Delta t \to 0} \frac{\text{Prob}\{\text{event in } (t, t+\Delta t) \text{ given survived to } t\}}{\Delta t} = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \]

Hazard Function λ(t)

• So, you survived to time t; what is the probability that you fail within the next increment of time Δt?

• Standardize this conditional probability to a per unit of time.

• As the increment of time gets very small (i.e., goes to 0) this conditional probability becomes an instantaneous rate.

• Some simple features of λ(t)
  – λ(t) takes on values in the interval (0, ∞)
  – λ(t) could be instantaneously increasing, decreasing, or constant

Survival Distribution

• Any one of these four functions is enough to specify the survival distribution. There exists an equivalence relationship among them.

• Survival analysis techniques focus on survival distribution S(t) and hazard rate λ(t)
  – When λ(t) is high, S(t) decreases faster.
  – When λ(t) is low, S(t) decreases slower.
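The equivalence can be written out explicitly. These are standard identities, added here for reference, using the same symbols f, F, S, and λ:

\[
\lambda(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\ln S(t),
\qquad
S(t) = \exp\left(-\int_0^t \lambda(u)\,du\right),
\qquad
f(t) = \lambda(t)\,S(t),
\qquad
F(t) = 1 - S(t).
\]

For example, a constant hazard \(\lambda(t) = \lambda\) gives \(S(t) = e^{-\lambda t}\), so a larger \(\lambda\) makes the survival curve fall faster.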

How do censored cases affect survival estimation?

• Censored patients do not make the survival curve drop in steps

• Censored cases do reduce the number of patients left who are contributing to the survival curve

• Thus every event after that censored case will result in a “larger” step down than it would have been without the censored case

• The reduction in sample size due to amount of censored cases present will result in reduced reliability of the estimates of survival.

• That is, larger amount of censored cases make wider CI’s about survival estimates.

• End of the survival curve is most affected yet is of great interest

Survival Estimation: The Kaplan-Meier (K-M) Method

• Also known as “Product-limit” Method

• Most popular method of estimating survival / time-to-event

• Good statistical properties: estimates converge to true survival distribution as sample size grows

• Nonparametric - does not require knowledge of the underlying distribution

K-M Estimation: How it Works

• Order death/censoring times from smallest to largest

• Update survival estimate at each distinct failure time

\[ \hat{S}(t_{(j)}) = \prod_{i=1}^{j} \hat{P}[T > t_{(i)} \mid T \ge t_{(i)}] \]

\[ \hat{S}(t_{(j)}) = \hat{S}(t_{(j-1)}) \, \hat{P}(T > t_{(j)} \mid T \ge t_{(j)}) \]

K-M Estimation: RI Example

Product-Limit Survival Estimates

Time (months)    Censored (*)    Survival    Survival SE    Number Failed    Number Left
0.000                            1.0000      0              0                27
2.628            *               .           .              0                26
7.392                            0.9615      0.0377         1                25
13.470                           0.9231      0.0523         2                24
14.587           *               .           .              2                23
16.164                           0.8829      0.0636         3                22
16.296                           0.8428      0.0722         4                21

\[ \hat{S}(t) = (\text{prop. alive after this death}) \times (\text{surv. estimate at prior death}) = \frac{21}{22} \times 0.8829 = 0.8428 \]
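The product-limit update can be sketched as follows. The function name is an assumption for this example, and the standard error uses Greenwood's formula, which appears to match the SE column shown in the table. Only the first six observation times come from the table. The remaining 21 subjects are entered as censored at a placeholder time of 96 months (an assumption), so only the first four estimates are meaningful.

```python
import math

def kaplan_meier(times, events):
    """Return (time, survival, greenwood_se) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    results = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the proportion surviving this death time.
            survival *= (n_at_risk - deaths) / n_at_risk
            if n_at_risk > deaths:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            results.append((t, survival, survival * math.sqrt(greenwood_sum)))
        # Deaths and censored subjects both leave the risk set.
        n_at_risk -= removed
    return results

# First six rows of the table; 1 = death, 0 = censored.
times = [2.628, 7.392, 13.470, 14.587, 16.164, 16.296] + [96.0] * 21  # 96.0 is a placeholder
events = [0, 1, 1, 0, 1, 1] + [0] * 21

km = kaplan_meier(times, events)
for t, s, se in km:
    print(t, round(s, 4), round(se, 4))
```

Censored subjects never trigger a step, but they shrink the risk set, so later deaths produce larger drops.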

[Figure: Plot of K-M Survival Estimate, Group with RI≥0.8. Y-axis: Proportion Alive (0.4 to 1.0); X-axis: Months Post-surgery (0 to 96). Survival estimate updated at each death time; open diamonds mark censoring times.]

Assumptions of Kaplan-Meier

• Non-informative Censoring: The probability of being censored does not depend upon a patient’s prognosis for the event.

• Deaths of patients in a sample occur independently of each other

• Does not make assumptions about the distribution of survival times

Before Kaplan-Meier

• Life-table (“actuarial”) method of estimating time to death

• Break follow-up time into pre-defined intervals

• Number of subjects alive at beginning of interval

• Number of subjects dying during interval

• Estimate survival in similar fashion to Kaplan-Meier

Testing for difference between two survival curves: Log-rank test

• Are two survivor curves the same?

• Use the times of events: t1, t2, ... (do not include censoring times)

• Treat each event and its “set of persons still at risk” (i.e., risk set) at each time tj as an independent table

• Make a 2×2 table at each tj (i.e., each distinct death time)

           Event    No Event    Total
Group A    aj       njA - aj    njA
Group B    cj       njB - cj    njB
Total      dj       nj - dj     nj

Log-rank test for comparing survivor curves

• At each event time tj, under assumption of equal survival (i.e., SA(t) = SB(t) ), the expected number of events in Group A out of the total events (dj = aj + cj) is proportional to the ratio of the number at risk in group A to the total number at risk at time tj:

E(aj)= dj x njA / nj

• Differences between aj and E(aj) represent evidence against the null hypothesis of equal survival in the two groups


• Use the Cochran Mantel-Haenszel idea of pooling over events j to get the log-rank chi-squared statistic with one degree of freedom

\[ \chi^2 = \frac{\left[\sum_j \left(a_j - E(a_j)\right)\right]^2}{\sum_j \widehat{\text{Var}}(a_j)} \sim \chi^2_1 \]


• Idea summary:
  – Create a 2x2 table at each uncensored failure time
  – The construct of each 2x2 table is based on the corresponding risk set
  – Combine information from all the tables

• The null hypothesis is SA(t) = SB(t) for all times t (i.e., tests for differences across entire distribution)
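A compact sketch of the log-rank computation follows. The function name and the toy data are assumptions made for illustration. The variance term at each event time is the usual hypergeometric variance, d (nA/n)(nB/n)(n - d)/(n - 1), which is the standard choice for this statistic.

```python
import math

def log_rank_test(times_a, events_a, times_b, events_b):
    """Log-rank chi-square statistic (1 df) comparing groups A and B."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e == 1}
        | {t for t, e in zip(times_b, events_b) if e == 1}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in A just before t
        n_b = sum(1 for x in times_b if x >= t)  # at risk in B just before t
        a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        c = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, a + c
        observed_minus_expected += a - d * n_a / n   # a_j - E(a_j)
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance

# Hypothetical toy data (months; 1 = event, 0 = censored), not from the RI study.
times_a, events_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
times_b, events_b = [3, 5, 8, 9, 12, 14], [1, 1, 1, 0, 1, 1]

chi_sq = log_rank_test(times_a, events_a, times_b, events_b)
p_value = math.erfc(math.sqrt(chi_sq / 2))  # chi-square upper tail, 1 df
print(round(chi_sq, 3), round(p_value, 4))
```

Censoring times never create a table of their own; they only change who is counted in later risk sets.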

[Figure: Resistive Index example. K-M survival curves for RI < 0.8 vs. RI ≥ 0.8. Y-axis: Proportion Alive (0.0 to 1.0); X-axis: Months Post-surgery (0 to 120). Numbers at risk shown along the axis: N=84, 76, 63, 48, 37, 30, 15, 8, 3. Log-rank Χ2 = 15.2, p-value < 0.0001.]

Other Tests to Compare Survival Curves You May Encounter

• Wilcoxon (a.k.a., Peto) Test
  – Weights analysis by the number of subjects at risk at each distinct death time
  – More sensitive than log-rank to early differences in survival curves; log-rank is more sensitive to late differences in curves

• Likelihood Ratio Test
  – Assumes exponential distribution
  – Optimal if survival is, in fact, exponentially distributed

From Stratification to Modeling

• Goal: extend survival analysis to an approach that allows for multiple covariates of mixed forms (i.e., continuous, ordinal and nominal categorical)

• We have two options for our expansion
  – Model the survival function or time
  – Model the hazard function (between 0 and ∞)

We will model the hazard function

What are Proportional Hazards?

• The constant C does not depend on time

• The model is multiplicative

\[ \frac{\lambda(t \mid x_1)}{\lambda(t \mid x_2)} = C \quad \Longrightarrow \quad S(t \mid x_1) = S(t \mid x_2)^{C} \]

Cox Proportional Hazards Model

• D.R. Cox assumed proportionality was constant across time and proposed the following model:

\[ \lambda(t; X) = \lambda_0(t) \exp\left(\sum_{i=1}^{p} \beta_i x_i\right) \]

where λ0(t) is the baseline hazard and involves t but not X

• \(\exp\left(\sum_{i=1}^{p} \beta_i x_i\right)\) is the exponential function; involves X’s but not t (as long as the X’s are time independent)

Cox Proportional Hazards Model

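• The regression model for the hazard function as a function of p explanatory (X) variables is specified as follows:

log hazard:

\[ \log \lambda(t; X) = \log \lambda_0(t) + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p \]

hazard:

\[ \lambda(t; X) = \lambda_0(t) \, e^{\beta_1 X_1} e^{\beta_2 X_2} \cdots e^{\beta_p X_p} \]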

Cox Proportional Hazards Model

• Interpretation of \(e^{\beta_1}\)
  – The relative hazard (i.e., hazard ratio) associated with a 1 unit change in X1 (i.e., X1+1 vs. X1), holding other Xs constant, independent of time
  – The relative (instantaneous) risk for X1+1 vs. X1, holding other Xs constant, independent of time

• Other β’s have similar interpretations

Cox Proportional Hazards Model

• \(e^{\beta_1}\) “multiplies” the baseline hazard λ0(t) by the same amount regardless of the time t, thus we have a “proportional hazards” model – the effect of any (fixed) X is the same at any time during follow-up

• β is the focus whereas λ0(t) is a nuisance variable

• Cox (1972) showed how to estimate β without having to assume a model for λ0(t) (e.g., estimating λ0(t) with a step function)

• Let # steps get large: the partial likelihood for β depends on β, not λ0(t)

Partial likelihood

• The likelihood function used in Cox PH models is called a partial likelihood

• We use only the part of the likelihood function that contains the β’s

• It depends only on the ranks of the data and not the actual time values.

Partial likelihood

• Let the survival times (times to failure) be: t1 < t2 < ... < tk

• And let the “risk sets” corresponding to these times be: R1, R2, ..., Rk where Rj = list of persons at risk just before tj

• The “partial likelihood” for β is

\[ L(\beta) = \prod_{i=1}^{k} \frac{e^{\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}}}{\sum_{j \in R_i} e^{\beta_1 X_{j1} + \beta_2 X_{j2} + \cdots + \beta_p X_{jp}}} \]

(Assumes no ties in event times)

• To estimate β, find the values of βs that maximize L(β) above.

Partial likelihood

• Why does the partial likelihood make sense?

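• Choose β so that the one who failed at each time was most likely - relative to others who might have failed!

\[ \frac{\text{hazard of failed person}}{\text{hazards of ones who could have failed at } t_i} = \frac{\lambda_0(t_i) \, e^{\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}}}{\sum_{j \in R_i} \lambda_0(t_i) \, e^{\beta_1 X_{j1} + \beta_2 X_{j2} + \cdots + \beta_p X_{jp}}} = \frac{e^{\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}}}{\sum_{j \in R_i} e^{\beta_1 X_{j1} + \beta_2 X_{j2} + \cdots + \beta_p X_{jp}}} \]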
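To see the partial likelihood in action, here is a minimal sketch for a single covariate with no tied event times. The function name and toy data are hypothetical. The crude grid search stands in for the Newton-type optimization that statistical software would normally use, so treat it as an illustration rather than a recommended fitting method.

```python
import math

def log_partial_likelihood(beta, times, events, x):
    """Cox log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored subjects contribute only through the risk sets
        # Risk set: everyone still under observation just before t_i.
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        total += beta * x[i] - math.log(sum(math.exp(beta * x[j]) for j in risk_set))
    return total

# Hypothetical toy data: time, event indicator (1 = failed), binary covariate.
times = [5, 8, 12, 15, 20, 22]
events = [1, 1, 0, 1, 1, 0]
x = [1, 0, 1, 1, 0, 0]

# At beta = 0 every subject in a risk set is equally likely to be the one who fails.
print(round(log_partial_likelihood(0.0, times, events, x), 4))

# Crude grid search for the maximizing beta (illustration only).
grid = [k / 100 for k in range(-300, 301)]
beta_hat = max(grid, key=lambda b: log_partial_likelihood(b, times, events, x))
print(beta_hat, round(math.exp(beta_hat), 2))  # estimate and its hazard ratio
```

Note that only the ordering of the times enters the calculation: rescaling all times by a constant leaves the risk sets, and therefore the estimate, unchanged.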

Some General Comments/Thoughts

• Similar to logistic regression, a simple function of the β̂ has a particularly nice interpretation

• \(e^{\hat{\beta}}\) can be interpreted as a hazard ratio (read as a relative instantaneous risk) for a one unit change in the predictor

\[ \hat{\beta} = -0.60 \;\Rightarrow\; e^{-0.60} = 0.55 \text{ (protective effect)} \]

\[ \hat{\beta} = 0.60 \;\Rightarrow\; e^{0.60} = 1.82 \text{ (increased risk)} \]

Some General Comments/Thoughts

• Estimates of βs are asymptotically normal (i.e., approximately normally distributed in large samples)

• Two important implications of asymptotic normality
  – We can use the likelihood ratio, score, and Wald tests to make inference about our data
  – Wald test: “thing/SE(thing)”
  – We can use the usual method to construct a 95% confidence interval

\[ e^{\hat{\beta} \pm 1.96 \, SE(\hat{\beta})} \]

Resistive Index: Univariable PH Regression

Summary of the Number of Event and Censored Values

Total    Event    Censored    Percent Censored
86       22       64          74.42

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    12.6328       1     0.0004
Score               15.2341       1     <.0001
Wald                12.6195       1     0.0004

“Score” test is equivalent to log-rank

Analysis of Maximum Likelihood Estimates

Parameter    DF    Parameter Estimate    Standard Error    Chi-Square    Pr > ChiSq    Hazard Ratio    95% Hazard Ratio Confidence Limits
PRERIGEP8    1     1.56144               0.43954           12.6195       0.0004        4.766           2.014    11.279

Parameter Estimate = \(\hat{\beta}\); Hazard Ratio = \(e^{\hat{\beta}}\); Confidence Limits = \(e^{\hat{\beta} \pm 1.96 \, SE(\hat{\beta})}\)
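The hazard ratio, its confidence limits, and the Wald chi-square in this output can be reproduced from the parameter estimate and standard error alone. The sketch below assumes the multiplier 1.96 and uses the 1-df identity p = erfc(sqrt(x/2)) for the chi-square tail. The variable names are chosen for this example.

```python
import math

beta_hat = 1.56144   # parameter estimate for RI >= 0.8 (from the output above)
se = 0.43954         # its standard error

hazard_ratio = math.exp(beta_hat)
ci_lower = math.exp(beta_hat - 1.96 * se)
ci_upper = math.exp(beta_hat + 1.96 * se)
wald_chi_square = (beta_hat / se) ** 2          # "thing / SE(thing)", squared
wald_p_value = math.erfc(math.sqrt(wald_chi_square / 2))

print(round(hazard_ratio, 3), round(ci_lower, 3), round(ci_upper, 3))  # 4.766 2.014 11.279
print(round(wald_chi_square, 2), round(wald_p_value, 4))               # 12.62 0.0004
```

The interval is built on the log scale and then exponentiated, which is why it is not symmetric around the hazard ratio of 4.766.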

Resistive Index: Multivariable PH Regression

Analysis of Maximum Likelihood Estimates

Parameter                      DF    Parameter Estimate    Standard Error    Chi-Square    Pr > ChiSq    Hazard Ratio    95% Hazard Ratio Confidence Limits
RI >=0.8                       1     1.89910               0.48164           15.5471       <.0001        6.680           2.60    17.17
History of Coronary Disease    1     1.65099               0.75558           4.7745        0.0289        5.212           1.19    22.92
History of PVD                 1     0.78338               0.44869           3.0483        0.0808        2.189           0.91    5.27
Preop - Postop Num Meds        1     -0.46046              0.20628           4.9830        0.0256        0.631           0.42    0.94
Failed BP Response             1     1.43046               0.57694           6.1474        0.0132        4.181           1.35    12.95

• Each effect is estimated controlling for other effects in model

• Note: increase in hazard ratio for RI in multivariable model

• Proportional hazards assumption should be verified
  – Examine survival plots by covariate values
  – Plot –log(–log(S(t))) vs. time, should be parallel if hazards are proportional

[Figure: Resistive Index example. K-M survival curves for RI < 0.8 vs. RI ≥ 0.8. Y-axis: Proportion Alive (0.0 to 1.0); X-axis: Months Post-surgery (0 to 120). Numbers at risk shown along the axis: N=84, 76, 63, 48, 37, 30, 15, 8, 3. Log-rank Χ2 = 15.2, p-value < 0.0001.]

Example of proportional hazards violation

Unadjusted all-cause mortality survival curve, by annual hospital volume of Medicare breast cancer cases: United States, 1994–1996

Remedial Measures for Non-proportionality

• Stratified analysis
  – Uses stratification to control for the non-proportional factor
  – Removes factor as covariate (i.e., get no effect estimate)

• Add time dependent covariate
  – Time dependent covariates change value over time
  – Can use indicator to fit new effect after change point (e.g., at 40 months in previous plot)
