model comparison for tree resin dose effect on termites lianfen qian florida atlantic university...

Post on 13-Jan-2016


Model Comparison for Tree Resin Dose Effect on Termites

Lianfen Qian

Florida Atlantic University

Co-author: Soyoung Ryu, University of Washington

Outline

Introduction

Longitudinal Data: Termites Data Set

Model Comparison

– Partially Linear Model

– Piecewise Linear Models

– Nonparametric Smoothing Methods

Conclusions

Introduction

Termite destruction in Florida is a serious problem.

Each year wood termites bore into thousands of homes and businesses causing millions of dollars of damage.

Current chemical pesticides that are used in the control of termites and protection from their damage are potentially harmful to Florida’s delicate environment.

Goal of study

To determine the effectiveness of a natural tropical tree resin in controlling termites, thus providing protection from their destruction.

Longitudinal Data

Definition: Longitudinal data is characterized by repeated measures over time on the same set of units.

Incomplete data: one or more of the sequences of measurements from units are incomplete.

– Unbalanced data: the measurement was never intended to be taken.

– Missing data: the measurement was intended to be taken but was not obtained.

Longitudinal Data, Cont.

Benefits:

– Distinguishes changes over time within units from the differences among units.

– Uses units efficiently once they are enrolled in a study.

Issue: Repeated observations on the same subject tend to be correlated, so the statistical analysis must account for this correlation.

Termites Data Set

The resin was derived from the bark of tropical trees, dissolved in a solvent, and placed on filter paper at one of two dosage levels: 5mg or 10mg.

There were eight dishes for each dose. Twenty-five live termites were placed in each dish.

Each dish was observed on 13 specific days over a 15-day period. No observation was made on day 3 or day 9.
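The design described above can be laid out in long format, one record per dish per observation day. The following is a minimal sketch only: the field names (`dose_mg`, `dish`, `day`, `count`) are hypothetical labels chosen here, and the `count` slot is left empty because the recorded values are not reproduced in this transcript.

```python
from itertools import product

DOSES_MG = (5, 10)          # two dosage levels
DISHES_PER_DOSE = 8         # eight dishes per dose
INITIAL_TERMITES = 25       # termites placed in each dish at the start
# Days 1..15, with no observation on days 3 and 9.
OBSERVED_DAYS = [day for day in range(1, 16) if day not in (3, 9)]


def make_design():
    """Return one record per (dose, dish, day); 'count' is an empty placeholder."""
    dishes = range(1, DISHES_PER_DOSE + 1)
    return [
        {"dose_mg": dose, "dish": dish, "day": day, "count": None}
        for dose, dish, day in product(DOSES_MG, dishes, OBSERVED_DAYS)
    ]


design = make_design()
```

Under these assumptions the layout has 2 x 8 = 16 units with 13 observations each. Because days 3 and 9 were never scheduled, the data are unbalanced rather than missing in the sense defined earlier.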

[Figure: a dish holding 25 termites, treated with either 5mg or 10mg of resin.]

Termites Data, Cont.

[Data table: dishes under the 5mg and 10mg doses, with columns for days 1 to 15; every dish starts with 25 termites.]

Scatter Plot

Longitudinal Plots

(a) 5mg dose (b) 10mg dose

Partially Linear Model

[Flowchart]

Data Set → EDA → Strange behavior of dishes 1 & 2 for the 10mg dose is found.

Mistake?

– YES: Remove dishes 1 & 2 for 10mg.

– NO: Add an additional unknown level of dosage: *mg, 5mg, 10mg.

Add random effect of dish.

Time effect: common time effect for the different doses, or different time effect for the different doses.

Are error terms correlated?

– YES: Add an additional correlation term to capture the correlation.

– NO: End.

Partially Linear Model

Benefits:

It is more efficient than the standard linear regression model when the response variable depends linearly on some covariates but nonlinearly on others.

It can provide a parsimonious description of the relationship between the response variable and the explanatory variables.

It has the flexibility of a nonparametric model.

Partially Linear Model

$Y_{ij} = x_{ij}^T \beta + g(t_{ij}) + \varepsilon_{ij}$, $i = 1, \dots, m$, $j = 1, \dots, n_i$

– $m$ is the number of units.

– $n_i$ is the number of observations for unit $i$.

– $(x_{ij}, t_{ij})$ are either independent and identically distributed random design points or fixed design points.

– $g$ is an unknown nonparametric function.

– $\varepsilon_{ij}$ are a set of $N$ random variables, each with zero mean and finite variance, where $N = n_1 + \dots + n_m$.

Back-fitting Algorithm

1. Given the current estimate $\hat{\beta}$, calculate residuals $r_{ij} = Y_{ij} - x_{ij}^T \hat{\beta}$ and use these in place of $Y_{ij}$ to calculate a cubic spline estimate, $\hat{g}(t)$.

2. Given $\hat{g}$, calculate residuals $r_{ij} = Y_{ij} - \hat{g}(t_{ij})$, and update the estimate $\hat{\beta}$ using generalized least squares,

$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} r$,

where $X$ is the matrix with rows $x_{ij}^T$, $V$ is the assumed block diagonal covariance matrix of the data and $r$ is the vector of residuals.

3. Repeat steps 1 and 2 until convergence.
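The alternation between the two steps can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the authors' implementation: the covariate is a single scalar (for example a dose code), V is taken to be the identity so the generalized least squares update reduces to ordinary least squares, and a Gaussian kernel smoother stands in for the cubic spline. The function names and the `bandwidth` parameter are hypothetical.

```python
import math


def kernel_smooth(t, values, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) smoother evaluated at each t[i].

    Stand-in for the cubic spline smoother named in the slides.
    """
    fitted = []
    for ti in t:
        weights = [math.exp(-0.5 * ((ti - tj) / bandwidth) ** 2) for tj in t]
        fitted.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return fitted


def backfit(y, x, t, bandwidth=1.5, tol=1e-10, max_iter=200):
    """Back-fitting for y = x * beta + g(t) + error with a scalar covariate x."""
    beta = 0.0
    g = [0.0] * len(y)
    for _ in range(max_iter):
        # Step 1: smooth the partial residuals y - x * beta against time.
        g = kernel_smooth(t, [yi - xi * beta for yi, xi in zip(y, x)], bandwidth)
        # Step 2: regress y - g on x (least squares, i.e. GLS with V = I).
        r = [yi - gi for yi, gi in zip(y, g)]
        new_beta = sum(xi * ri for xi, ri in zip(x, r)) / sum(xi * xi for xi in x)
        # Step 3: stop once the estimate of beta no longer changes.
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, g
```

If the dose is coded 0/1 and g carries the intercept, smoothing bias in g can leak into the estimate of beta. A centered code such as -0.5 and +0.5 avoids this in a balanced design, because the smoother then maps the covariate to zero.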

Spline Estimator of g

Among all functions $g$ with two continuous derivatives, find the one that minimizes the penalized residual sum of squares:

$\sum_{i=1}^{m} \sum_{j=1}^{n_i} \{r_{ij} - g(t_{ij})\}^2 + \lambda \int \{g''(t)\}^2 \, dt$

λ controls the smoothness of the fitted curve:

Larger λ => Smaller variance => Smoother curve => Larger bias

Trade-off between bias and variance.

The Generalized Cross-Validation function (Rice & Silverman, 1991) is used to choose λ: Minimize

$S(\lambda) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \{r_{ij} - \hat{g}^{(-ij)}(t_{ij})\}^2$,

where $\hat{g}^{(-ij)}$ is the cubic spline estimator computed without the $j$th observation of the $i$th unit.
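The role of λ and its selection by leaving out one observation at a time can be illustrated with a small sketch. It makes several assumptions that are not in the slides: a single series on the regular day grid 1 to 15 is used instead of the double sum over units, a discrete second-difference penalty stands in for the integral of the squared second derivative, and unobserved days get weight zero. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def smooth(y, w, lam):
    """Minimise sum_k w[k]*(y[k]-g[k])**2 + lam * sum_k (g[k-1]-2*g[k]+g[k+1])**2.

    Requires lam > 0 and at least two grid points with positive weight.
    """
    n = len(y)
    a = [[0.0] * n for _ in range(n)]
    for k in range(n):
        a[k][k] += w[k]
    for k in range(1, n - 1):
        idx, coef = (k - 1, k, k + 1), (1.0, -2.0, 1.0)
        for p in range(3):
            for q in range(3):
                a[idx[p]][idx[q]] += lam * coef[p] * coef[q]
    return solve(a, [w[k] * y[k] for k in range(n)])


def loo_cv(y, w, lam):
    """Sum of squared leave-one-out prediction errors over the observed points."""
    score = 0.0
    for k in range(len(y)):
        if w[k] == 0:
            continue  # nothing was observed here, so nothing to predict
        w_minus = w[:]
        w_minus[k] = 0.0  # refit without observation k
        score += (y[k] - smooth(y, w_minus, lam)[k]) ** 2
    return score


def choose_lambda(y, w, grid):
    """Return the value in grid with the smallest leave-one-out score."""
    return min(grid, key=lambda lam: loo_cv(y, w, lam))
```

In this sketch a larger `lam` pulls the fit toward a straight line, whose second differences are zero, while a small `lam` lets the fit follow the observations closely. This is the bias and variance trade-off stated above.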

Original Data Set with Common Time Effect

Removing Outliers (dishes 1 & 2)

Add Additional Dose

Different Time Effect for Dose

Piecewise Linear Regression Model

For 5mg, the data do not show a change point. For 10mg, the data show a change point. Use the following piecewise linear model:

$E(y \mid x) = \alpha_0 + \alpha_1 x$ if $x < \tau$, and $E(y \mid x) = \gamma_0 + \gamma_1 x$ otherwise,

where $\tau$ is the change point.

• Change point estimated using M-estimation (Koul, Qian & Surgailis, 2003)
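A two-phase fit of this kind can be sketched with a plain grid search. Note the substitution: the slides estimate the change point by M-estimation, whereas this sketch uses ordinary least squares on each side and picks the candidate change point with the smallest total residual sum of squares. The function names and the `min_points` parameter are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope, sse)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def two_phase_fit(xs, ys, min_points=2):
    """Try every observed x as change point tau; phase 1 is x < tau."""
    best = None
    for tau in sorted(set(xs)):
        left = [(x, y) for x, y in zip(xs, ys) if x < tau]
        right = [(x, y) for x, y in zip(xs, ys) if x >= tau]
        # Each phase needs enough distinct x values to fit a line.
        if len({x for x, _ in left}) < min_points:
            continue
        if len({x for x, _ in right}) < min_points:
            continue
        a0, a1, sse_left = fit_line(*zip(*left))
        c0, c1, sse_right = fit_line(*zip(*right))
        total = sse_left + sse_right
        if best is None or total < best["sse"]:
            best = {"tau": tau, "phase1": (a0, a1), "phase2": (c0, c1), "sse": total}
    return best
```

The result holds the selected change point together with the intercept and slope of each phase. A least-squares criterion is sensitive to outlying dishes, which is one reason a robust M-estimator may be preferred in practice.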

Two-Phase Linear Regression

Piecewise Linear Regression

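Cubic Splines Smoothing

Cubic Splines

$E(y \mid x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 (x - 7)_+^3$

where $(u)_+ = \max(u, 0)$, so the last term places a knot at day 7.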
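The mean function above can be evaluated directly, which makes the effect of the knot term concrete. In this minimal sketch the coefficient names `theta[0]` to `theta[4]` and the function name are labels chosen here, not the authors' notation.

```python
def cubic_spline_mean(x, theta, knot=7.0):
    """Cubic polynomial in x plus a truncated cubic term that starts at the knot."""
    plus = max(x - knot, 0.0)  # (x - knot)_+ is zero up to and including the knot
    return (
        theta[0]
        + theta[1] * x
        + theta[2] * x ** 2
        + theta[3] * x ** 3
        + theta[4] * plus ** 3
    )
```

Before day 7 the truncated term is zero and the curve is an ordinary cubic. After day 7 the extra coefficient changes the cubic while keeping the curve and its first two derivatives continuous at the knot.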

Cubic Spline Method

(a) 5 mg dose (b) 10 mg dose (c) Unknown dose

No significant difference between the cubic spline smoothing and piecewise linear models.

Model Comparisons

The Partially Linear Model gives significant dose effects and a non-linear time trend.

The 10 mg dose kills termites about 1.5 times faster than the 5 mg dose.

The time trend levels off by the end of the experiment. Possibly few termites remain in the dishes, or the termites build up resistance to the tree resin.

Piecewise Linear Models

The models show a dramatic effect in the first seven days under the 10 mg dosage.

There is a linear trend and a dose effect under the 5 mg dosage.

For the two strange dishes under the 10 mg dosage, the effect in the first seven days is not significantly different from that of the 5 mg dose, while after seven days it is worse than that of the 5 mg dose. This suggests that there were recording or operating mistakes in those two dishes' records.

Cubic Spline

It shows results similar to those of the piecewise linear models. One knot is identified at the seventh day for the 10 mg dosage, but none for the 5 mg dosage.

Conclusions

Overall, the 10 mg dose is significantly more effective than the 5 mg dose.

For the 10 mg dose, both the piecewise linear model and cubic spline smoothing show that the termites are killed in about 7 days.

Conclusions, Cont.

For the 5 mg dose, all methods (linear, partially linear, piecewise linear and cubic spline smoothing) show that the effect is linear. It takes more than twice as long to kill the termites as with the 10 mg dose.

Two dishes recorded under the 10 mg dose do not differ significantly from the 5 mg dose during the first 7 days. After seven days, they show no significant effectiveness in killing termites.

Conclusions, Cont.

The estimated treatment effect is time-varying, with a change point at day 7.

The final piecewise model fits the data with adjusted $R^2 = 93.7\%$.

On average, 10mg is 68.9% more efficient than 5mg in killing termites during the first week.

Thank you!

Please contact at

E-MAIL: lqian@fau.edu

PHONE: 561-297-2486

http://www.math.fau.edu/qian

Department of Mathematical Sciences

Florida Atlantic University

Boca Raton, FL 33431
