Detecting Spatial Clustering in Matched Case-Control Studies

Andrea Cook, MS

Collaboration with: Dr. Yi Li

November 4, 2004

Outline

1. Motivation
   • Petrochemical exposure in relation to childhood leukemia and brain cancer
2. Cumulative Geographic Residuals
   • Unconditional
   • Conditional
3. Simulation Results
   • Type I error
   • Power Calculations
4. Application
   • Childhood Leukemia
   • Childhood Brain Cancer
5. Software
6. Discussion
   • Limitations
   • Future Research

Taiwan Petrochemical Study

Matched Case-Control Study
• 3 controls per case
• Matched on age and gender
• Resided in one of 26 of the 38 administrative districts of Kaohsiung County, Taiwan
• Controls selected using national identity numbers (not dependent on location)

Study Population

Due to dropout, approximately 50% of matched sets have 3-to-1 matching, 40% have 2-to-1 matching, and 10% have 1-to-1 matching.

            Leukemia   Brain Cancer
Cases          121         111
Controls       287         259

Map of Kaohsiung

[Figure: map of Kaohsiung marking study participants (#) and petrochemical plants ($), with the labels Nantze, Jenwu, Linyuan, and Tsoying.]

Cumulative Residuals

Unconditional (Independence)
• Model definition using logistic regression
• Extension to Cluster Detection

Conditional (Matched Design)
• Model definition using conditional logistic regression
• Extension to Cluster Detection

Logistic Model

Assume the logistic model where,

$$L(Y_i \mid p_i) = p_i^{Y_i} (1 - p_i)^{1 - Y_i}$$

and the link function,

$$g(p_i) = \operatorname{logit}(p_i) = \beta^{T} X_i .$$

Therefore the likelihood score function for $\beta$ is

$$U(\beta) = \sum_{i=1}^{n} X_i \left\{ Y_i - \frac{\exp(\beta^{T} X_i)}{1 + \exp(\beta^{T} X_i)} \right\}$$

with information matrix

$$I(\beta) = \sum_{i=1}^{n} \frac{\exp(\beta^{T} X_i)}{\left\{1 + \exp(\beta^{T} X_i)\right\}^{2}} \, X_i X_i^{T} .$$
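The score and information matrix of the logistic model can be evaluated directly from their sums. The following is a minimal Python sketch, not the authors' software; the function names and the toy data (an intercept plus one covariate) are hypothetical and chosen only for illustration.

```python
import math

def expit(z):
    """Logistic function exp(z) / (1 + exp(z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score_and_information(X, y, beta):
    """Return U(beta) and I(beta) for covariate rows X[i] and binary outcomes y[i]."""
    k = len(beta)
    U = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = expit(sum(b * x for b, x in zip(beta, xi)))
        for row in range(k):
            U[row] += xi[row] * (yi - p)                          # X_i {Y_i - p_i}
            for col in range(k):
                info[row][col] += p * (1.0 - p) * xi[row] * xi[col]  # p_i (1 - p_i) X_i X_i^T
    return U, info

# Hypothetical toy data: first column is the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 0, 1, 1]
U, info = score_and_information(X, y, beta=[0.0, 0.0])
print(U)     # score at beta = 0
print(info)  # information at beta = 0
```

At beta = 0 every fitted probability is 1/2, so the information is simply one quarter of the cross-product matrix of the covariates.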

Residual Formulation

Then define a residual as,

$$e_i = Y_i - \frac{\exp(\hat{\beta}^{T} X_i)}{1 + \exp(\hat{\beta}^{T} X_i)},$$

where $\hat{\beta}$ is the solution to $U(\beta) = 0$.

If the model is correctly specified, there should be no pattern in the residuals.

=> Use residuals to test for misspecification.

Cumulative Residuals for Model Checking; Lin, Wei, Ying 2002
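As an illustration of the residual definition, the sketch below solves the score equation by Newton-Raphson and then forms the residuals. It is a hedged example in plain Python, not the authors' code; `solve`, `fit_logistic`, `residuals`, and the toy data are assumptions made for this illustration.

```python
import math

def expit(z):
    """Logistic function, evaluated in a numerically stable way."""
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_logistic(X, y, iters=25):
    """Newton-Raphson iterations: beta <- beta + I(beta)^{-1} U(beta)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        U = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = expit(sum(b * x for b, x in zip(beta, xi)))
            for row in range(k):
                U[row] += xi[row] * (yi - p)
                for col in range(k):
                    info[row][col] += p * (1.0 - p) * xi[row] * xi[col]
        beta = [b + s for b, s in zip(beta, solve(info, U))]
    return beta

def residuals(X, y, beta):
    """e_i = Y_i - exp(beta'X_i) / (1 + exp(beta'X_i))."""
    return [yi - expit(sum(b * x for b, x in zip(beta, xi))) for xi, yi in zip(X, y)]

# Hypothetical, non-separable toy data: intercept plus one covariate.
X = [[1.0, float(v)] for v in range(6)]
y = [0, 0, 1, 0, 1, 1]
beta_hat = fit_logistic(X, y)
e = residuals(X, y, beta_hat)
print(beta_hat)
print(sum(e))  # close to zero because the model includes an intercept
```

Because the fitted coefficients solve the score equation, the residuals are orthogonal to every covariate column, which is why their plain sum vanishes when an intercept is present.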

Hypothesis Test

Hypothesis of interest:

Geographic location, $(r_i, t_i)$, is independent of the outcome, $Y_i \mid X_i$

=> The Cumulative Geographic Residual Moving Block Process is patternless.

Unconditional Cluster Detection

Define the Cumulative Geographic Residual Moving Block Process as,

$$W_{loc}(x_1, x_2 \mid b_1, b_2) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} I\left(x_1 - b_1 < r_i \le x_1,\; x_2 - b_2 < t_i \le x_2\right) e_i .$$
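The moving block process sums the residuals of the participants located in a block of width b1 and height b2 whose upper-right corner is (x1, x2), scaled by the square root of n. A minimal Python sketch follows; the half-open block boundaries, the function name `w_loc`, and the toy residuals are assumptions made for illustration.

```python
import math

def w_loc(x1, x2, b1, b2, r, t, e):
    """Scaled sum of residuals e[i] with x1 - b1 < r[i] <= x1 and x2 - b2 < t[i] <= x2."""
    n = len(e)
    block_sum = sum(
        ei for ri, ti, ei in zip(r, t, e)
        if x1 - b1 < ri <= x1 and x2 - b2 < ti <= x2
    )
    return block_sum / math.sqrt(n)

# Hypothetical locations (r, t) and residuals e for four participants.
r = [1.0, 2.0, 3.0, 4.0]
t = [1.0, 1.0, 3.0, 3.0]
e = [0.5, 0.5, -0.5, -0.5]

print(w_loc(2.0, 1.0, 2.0, 2.0, r, t, e))  # block covering the first two participants
print(w_loc(4.0, 3.0, 2.0, 2.0, r, t, e))  # block covering the last two participants
```

A block with an excess of cases relative to the fitted model produces a large positive value, and a block with a deficit produces a large negative value.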

Asymptotic Distribution

The asymptotic distribution of $W_{loc}(\cdot, \cdot \mid b_1, b_2)$ is difficult to simulate directly, but it has been shown to be equivalent to the distribution, conditional on the observed data, of

$$\hat{W}_{loc}(x_1, x_2 \mid b_1, b_2) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} I\left(x_1 - b_1 < r_i \le x_1,\; x_2 - b_2 < t_i \le x_2\right) e_i G_i \;-\; \frac{1}{\sqrt{n}}\, \eta^{T}(x_1, x_2 \mid \hat{\beta})\, I^{-1}(\hat{\beta}) \sum_{i=1}^{n} X_i e_i G_i ,$$

where

$$\eta(x_1, x_2 \mid \hat{\beta}) = \sum_{i=1}^{n} I\left(x_1 - b_1 < r_i \le x_1,\; x_2 - b_2 < t_i \le x_2\right) \frac{\exp(\hat{\beta}^{T} X_i)}{\left[1 + \exp(\hat{\beta}^{T} X_i)\right]^{2}}\, X_i$$

and $G_1, \ldots, G_n \overset{iid}{\sim} N(0, 1)$, independent of the observed data.

Significance Test

Testing the null hypothesis

$$H_0: Y_i \mid X_i \;\perp\; (r_i, t_i)$$

• Simulate N realizations of $\hat{W}_{loc}(\cdot, \cdot \mid b_1, b_2)$,

$$\hat{W}_{loc,1}(\cdot, \cdot \mid b_1, b_2), \ldots, \hat{W}_{loc,N}(\cdot, \cdot \mid b_1, b_2),$$

by repeatedly simulating $(G_1, \ldots, G_n)$, while fixing the data at their observed values.

• Calculate the P-value

$$\text{P-value} = \frac{1}{N} \sum_{j=1}^{N} I\left\{ \hat{S}_{loc,j}(b_1, b_2) \ge S_{loc}(b_1, b_2) \right\},$$

where

$$S_{loc}(b_1, b_2) = \sup_{x_1, x_2} \left| W_{loc}(x_1, x_2 \mid b_1, b_2) \right| \quad \text{and} \quad \hat{S}_{loc,j}(b_1, b_2) = \sup_{x_1, x_2} \left| \hat{W}_{loc,j}(x_1, x_2 \mid b_1, b_2) \right| .$$
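The resampling test can be sketched end to end for the simplest case. The Python example below assumes an intercept-only logistic model, for which the fitted probability is the case fraction and the covariate correction term reduces to the share of participants in the block times the total of the perturbed residuals. The function names, block sizes, and toy data are hypothetical; this is an illustration of the procedure, not the authors' R macro.

```python
import math
import random

def block_members(x1, x2, b1, b2, r, t):
    """Indices i with x1 - b1 < r[i] <= x1 and x2 - b2 < t[i] <= x2."""
    return tuple(i for i in range(len(r))
                 if x1 - b1 < r[i] <= x1 and x2 - b2 < t[i] <= x2)

def resampling_p_value(r, t, y, b1, b2, n_sim=200, seed=0):
    """Multiplier-resampling P-value under an intercept-only logistic model."""
    rng = random.Random(seed)
    n = len(y)
    p_hat = sum(y) / n                  # fitted probability (intercept only)
    e = [yi - p_hat for yi in y]        # residuals e_i
    # The process is piecewise constant in (x1, x2) and can only change
    # when a corner coordinate passes r_i, r_i + b1, t_i, or t_i + b2.
    xs1 = sorted(set(r) | {ri + b1 for ri in r})
    xs2 = sorted(set(t) | {ti + b2 for ti in t})
    blocks = {block_members(x1, x2, b1, b2, r, t) for x1 in xs1 for x2 in xs2}
    blocks.discard(())
    s_obs = max(abs(sum(e[i] for i in blk)) for blk in blocks) / math.sqrt(n)
    exceed = 0
    for _ in range(n_sim):
        eg = [ei * rng.gauss(0.0, 1.0) for ei in e]   # e_i * G_i with G_i ~ N(0, 1)
        total = sum(eg)
        # Intercept-only case: eta^T I^{-1} equals (number in block) / n.
        s_sim = max(abs(sum(eg[i] for i in blk) - len(blk) / n * total)
                    for blk in blocks) / math.sqrt(n)
        exceed += s_sim >= s_obs
    return exceed / n_sim

# Hypothetical toy data: 30 random locations on a 10 x 10 square.
rng = random.Random(1)
r = [rng.uniform(0, 10) for _ in range(30)]
t = [rng.uniform(0, 10) for _ in range(30)]
y = [1 if rng.random() < 0.3 else 0 for _ in range(30)]
print(resampling_p_value(r, t, y, b1=3.0, b2=3.0))
```

With covariates in the model, the correction term would instead use the full information matrix and the covariate-weighted residual sums, at the cost of a matrix solve per block.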

Conditional Logistic Model

Type of Matching: 1 case to $M_s$ controls

Data Structure:

$$Y_{1s} = 1,\; Y_{2s} = 0,\; \ldots,\; Y_{(M_s+1)s} = 0$$

Assume that, conditional on an unobserved stratum-specific intercept and given the logit link,

$$E(Y_{is} \mid s) = \frac{\exp(\beta^{T} X_{is})}{\sum_{j=1}^{M_s+1} \exp(\beta^{T} X_{js})} .$$

The conditional likelihood, conditioning on $Y_{1s} + \cdots + Y_{(M_s+1)s} = 1$, is

$$L(\beta) = \prod_{s=1}^{N_1} \prod_{i=1}^{M_s+1} \left[ \frac{\exp(\beta^{T} X_{is})}{\sum_{j=1}^{M_s+1} \exp(\beta^{T} X_{js})} \right]^{Y_{is}},$$

where $N_1$ is the number of strata (one case per stratum).

Score and Information

Denote the conditional likelihood score as,

$$U(\beta) = \sum_{s=1}^{N_1} U_s(\beta) = \sum_{s=1}^{N_1} \left\{ X_{1s} - \frac{\sum_{j=1}^{M_s+1} X_{js} \exp(\beta^{T} X_{js})}{\sum_{j=1}^{M_s+1} \exp(\beta^{T} X_{js})} \right\},$$

with information matrix,

$$I(\beta) = \sum_{s=1}^{N_1} \left\{ \frac{\sum_{j=1}^{M_s+1} X_{js} X_{js}^{T} \exp(\beta^{T} X_{js})}{\sum_{j=1}^{M_s+1} \exp(\beta^{T} X_{js})} - \frac{\left[\sum_{j=1}^{M_s+1} X_{js} \exp(\beta^{T} X_{js})\right] \left[\sum_{j=1}^{M_s+1} X_{js} \exp(\beta^{T} X_{js})\right]^{T}}{\left[\sum_{j=1}^{M_s+1} \exp(\beta^{T} X_{js})\right]^{2}} \right\} .$$
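Each stratum's contribution to the conditional score is the case's covariate vector minus a weighted average of the covariates in that stratum, with weights proportional to exp(beta'X). A small Python sketch under that reading follows; the function names and toy strata are hypothetical, and the case is assumed to be stored first in each stratum.

```python
import math

def stratum_score(xs, beta):
    """U_s(beta) = X_1s - sum_j X_js w_js, where xs[0] is the case."""
    weights = [math.exp(sum(b * v for b, v in zip(beta, x))) for x in xs]
    total = sum(weights)
    k = len(beta)
    weighted_mean = [sum(w * x[a] for w, x in zip(weights, xs)) / total
                     for a in range(k)]
    return [xs[0][a] - weighted_mean[a] for a in range(k)]

def conditional_score(strata, beta):
    """U(beta) as the sum of the stratum contributions U_s(beta)."""
    U = [0.0] * len(beta)
    for xs in strata:
        for a, u in enumerate(stratum_score(xs, beta)):
            U[a] += u
    return U

# Hypothetical strata with one scalar covariate; the case is listed first.
strata = [
    [[2.0], [0.0], [1.0]],   # 2-to-1 matching
    [[1.0], [3.0]],          # 1-to-1 matching
]
print(stratum_score(strata[0], [0.0]))
print(conditional_score(strata, [0.0]))
```

At beta = 0 the weights are equal, so each contribution is the case's covariate minus the plain stratum mean.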

Conditional Residual

Then define a residual as,

$$e_{is} = Y_{is} - \frac{\exp(\hat{\beta}^{T} X_{is})}{\sum_{j=1}^{M_s+1} \exp(\hat{\beta}^{T} X_{js})},$$

where $\hat{\beta}$ is the solution to $U(\beta) = 0$.

=> Use these correlated residuals to test for patterns based on location.
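The correlation among the conditional residuals comes from a constraint: within a stratum the outcomes sum to one and so do the fitted probabilities, so the residuals sum to zero. The Python sketch below illustrates this; the function name, the toy strata, and the coefficient value 0.5 (standing in for a fitted value obtained elsewhere) are assumptions.

```python
import math

def conditional_residuals(strata, beta):
    """e_is = Y_is - exp(beta'X_is) / sum_j exp(beta'X_js), stratum by stratum."""
    out = []
    for members in strata:
        w = [math.exp(sum(b * v for b, v in zip(beta, x))) for _, x in members]
        total = sum(w)
        out.append([yv - wi / total for (yv, _), wi in zip(members, w)])
    return out

# Hypothetical strata: each member is (Y_is, X_is), with one case per stratum.
strata = [
    [(1, [2.0]), (0, [0.0]), (0, [1.0])],   # 2-to-1 matching
    [(1, [1.0]), (0, [3.0])],               # 1-to-1 matching
]
for e_s in conditional_residuals(strata, beta=[0.5]):
    print(e_s, sum(e_s))   # each stratum's residuals sum to (numerically) zero
```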

Conditional Cumulative Residual

Define the Conditional Cumulative Residual Moving Block Process as,

$$W_{loc}(x_1, x_2 \mid b_1, b_2) = \frac{1}{\sqrt{N_1}} \sum_{s=1}^{N_1} \sum_{i=1}^{M_s+1} I\left(x_1 - b_1 < r_{is} \le x_1,\; x_2 - b_2 < t_{is} \le x_2\right) e_{is},$$

which has been shown to be asymptotically equivalent to,

$$\hat{W}_{loc}(x_1, x_2 \mid b_1, b_2) = \frac{1}{\sqrt{N_1}} \sum_{s=1}^{N_1} \left\{ \sum_{i=1}^{M_s+1} I\left(x_1 - b_1 < r_{is} \le x_1,\; x_2 - b_2 < t_{is} \le x_2\right) e_{is} - \eta^{T}(x_1, x_2 \mid \hat{\beta})\, I^{-1}(\hat{\beta})\, U_s(\hat{\beta}) \right\} G_s,$$

where

$$\eta(x_1, x_2 \mid \beta) = -\sum_{s=1}^{N_1} \sum_{i=1}^{M_s+1} I\left(x_1 - b_1 < r_{is} \le x_1,\; x_2 - b_2 < t_{is} \le x_2\right) \frac{\partial e_{is}}{\partial \beta}$$

and $G_1, \ldots, G_{N_1} \overset{iid}{\sim} N(0, 1)$ are independent of the observed data.

Significance Test

Testing the null hypothesis

$$H_0: Y_{is} \mid X_{is} \;\perp\; (r_{is}, t_{is})$$

• Simulate N realizations of $\hat{W}_{loc}(\cdot, \cdot \mid b_1, b_2)$,

$$\hat{W}_{loc,1}(\cdot, \cdot \mid b_1, b_2), \ldots, \hat{W}_{loc,N}(\cdot, \cdot \mid b_1, b_2),$$

by repeatedly simulating $(G_1, \ldots, G_{N_1})$, while fixing the data at their observed values.

• Calculate the P-value

$$\text{P-value} = \frac{1}{N} \sum_{j=1}^{N} I\left\{ \hat{S}_{loc,j}(b_1, b_2) \ge S_{loc}(b_1, b_2) \right\},$$

where

$$S_{loc}(b_1, b_2) = \sup_{x_1, x_2} \left| W_{loc}(x_1, x_2 \mid b_1, b_2) \right| \quad \text{and} \quad \hat{S}_{loc,j}(b_1, b_2) = \sup_{x_1, x_2} \left| \hat{W}_{loc,j}(x_1, x_2 \mid b_1, b_2) \right| .$$

Simulation

Choice of $G_i$ or $G_s$

Unconditional
• Normal: $G_i \sim N(0, 1)$
• Discrete: $G_i = 1$ w.p. 1/2, $-1$ w.p. 1/2

Conditional
• Normal: $G_s \sim N(0, 1)$
• Discrete, by type of matching (two-point distributions whose two values have opposite signs, giving mean 0 and variance 1):
  1 to 1: $|G_s| = 1$ w.p. 1/2, $1$ w.p. 1/2
  2 to 1: $|G_s| = 1/\sqrt{2}$ w.p. 2/3, $2/\sqrt{2}$ w.p. 1/3
  3 to 1: $|G_s| = 1/\sqrt{3}$ w.p. 3/4, $3/\sqrt{3}$ w.p. 1/4

• Type I error
• Power Calculations
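A quick numerical check shows that two-point multipliers of this form have mean 0 and variance 1 for each type of matching. In the Python sketch below, the sign assignment (smaller value positive, larger value negative) is an assumption made for illustration; flipping both signs gives the same two moments. The function name is hypothetical.

```python
import math

def two_point_moments(m):
    """Mean and variance of a two-point multiplier for m-to-1 matching.

    Assumed form: 1/sqrt(m) with probability m/(m+1) and
    -m/sqrt(m) with probability 1/(m+1).
    """
    small, p_small = 1.0 / math.sqrt(m), m / (m + 1.0)
    large, p_large = -m / math.sqrt(m), 1.0 / (m + 1.0)
    mean = small * p_small + large * p_large
    var = small ** 2 * p_small + large ** 2 * p_large - mean ** 2
    return mean, var

for m in (1, 2, 3):
    print(m, two_point_moments(m))
```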

Type I Error

Unconditional
Generate N coordinates $x_i$ and $y_i$ from Unif(0,10).
Type I error is the percentage of simulated datasets in which a significant cluster is found.

Conditional
Generate N coordinates $x_{is}$ and $y_{is}$ from Unif(0,10).
Type I error is the percentage of simulated datasets in which a significant cluster is found.

Type I Error

Unconditional (columns: number of observations)

                        Normal                     Discrete
Percent of Cases    300     500     1000      300     500     1000
20%                0.016   0.036   0.054     0.146   0.172   0.168
30%                0.024   0.044   0.054     0.136   0.154   0.138

Conditional (columns: type of matching)

                        Normal                     Discrete
Number of Cases     1:1     2:1     3:1       1:1     2:1     3:1
100                0.010   0.080   0.148     0.020   0.074   0.036
200                0.012   0.088   0.162     0.030   0.084   0.046

Power Calculations

Two Power Calculations

[Figure: 4 x 4 grid of numbered cells]

13  14  15  16
 9  10  11  12
 5   6   7   8
 1   2   3   4

Power Calculations

Single Hotspot

[Figure: 4 x 4 grid of numbered cells]

13  14  15  16
 9  10  11  12
 5   6   7   8
 1   2   3   4

Power Calculations

Multiple Hotspots

[Figure: 4 x 4 grid of numbered cells]

13  14  15  16
 9  10  11  12
 5   6   7   8
 1   2   3   4

Power Calculations

Unconditional

           Spatial Scan   Normal   Discrete
Single        0.958       0.964     0.976
Multi         0.852       0.916     0.932

Conditional (columns: type of matching)

                                       Normal                     Discrete
Cluster   Number of Cases      1:1     2:1     3:1       1:1     2:1     3:1
Single         100            0.606   0.766   0.828     0.706   0.758   0.750
Single         200            0.886   0.964   0.990     0.908   0.950   0.982
Multi          100            0.464   0.704   0.774     0.490   0.672   0.704
Multi          200            0.844   0.946   0.974     0.854   0.932   0.948

Application

Study:

Kaohsiung, Taiwan Matched Case-Control Study

Method:

Conditional Cumulative Geographic Residual Test (Normal and Mixed Discrete)

Results

Odds Ratio (p-values)

                   Leukemia                        Brain Cancer
            Unadjusted     Adjusted          Unadjusted     Adjusted
Discrete    2.10 (0.055)   2.19 (0.143)      1.97 (0.058)   2.08 (0.104)
Normal      2.10 (0.050)   2.19 (0.122)      1.97 (0.052)   2.08 (0.104)

Marginally significant clustering for both outcomes without adjusting for smoking history.

Childhood Leukemia

[Figure: cumulative residuals plotted over location, with axes X1 (165000 to 190000) and X2 (2490000 to 2540000); cases, controls, and plants are marked.]
(a) Unadjusted. P-Values: Discrete = 0.055, Normal = 0.050
(b) Adjusted. P-Values: Discrete = 0.143, Normal = 0.122

Childhood Brain Cancer

[Figure: cumulative residuals plotted over location, with axes X1 (165000 to 190000) and X2 (2490000 to 2540000); cases, controls, and plants are marked.]
(a) Unadjusted. P-Values: Discrete = 0.052, Normal = 0.058
(b) Adjusted. P-Values: Discrete = 0.104, Normal = 0.104

Software

R macro to handle both unconditional and conditional data

Dataset:
• X and Y coordinates of each participant
• Case/control variable
• Covariate matrix
• Stratum variable for conditional data

Takes just a few minutes to run!

Discussion

Cumulative Geographic Residuals
• Unconditional and conditional methods for binary outcomes
• Can find multiple significant hotspots while holding type I error at appropriate levels
• Not computer intensive compared to other cluster detection methods

Taiwan Study
• Found a possible relationship between childhood leukemia and petrochemical exposure, but not with the outcome childhood brain cancer

Discussion

Future Research
• Failure Time Data
• Recurrent Events
• Relocation of Study Participants
• Surveillance