semiparametric methods for colonic crypt signaling raymond j. carroll department of statistics...

35
Semiparametric Methods for Colonic Crypt Signaling Raymond J. Carroll Department of Statistics Faculty of Nutrition and Toxicology Texas A&M University http://stat.tamu.edu/~carroll

Post on 22-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Semiparametric Methods for Colonic Crypt Signaling

Raymond J. CarrollDepartment of Statistics

Faculty of Nutrition and Toxicology

Texas A&M Universityhttp://stat.tamu.edu/~carroll

Outline

• Problem: Modeling correlations among functions and apply to a colon carcinogenesis experiment

• Biological background• Semiparametric framework• Nonparametric Methods• Asymptotic summary• Analysis• Summary

Biologist Who Did The Work

Meeyoung Hong

Postdoc in the lab of Joanne Lupton and Nancy Turner

Data collection represents a year of work

Basic Background

• Apoptosis: Programmed cell death• Cell Proliferation: Effectively the

opposite• p27: Differences in this marker are

thought to stimulate and be predictive of apoptosis and cell proliferation

• Our experiment: understand some of the structure of p27 in the colon when animals are exposed to a carcinogen

Data Collection

• Structure of Colon• Note the finger-like

projections• These are colonic

crypts• We measure

expression of cells within colonic crypts

Data Collection

• p27 expression: Measured by staining techniques

• Brighter intensity = higher expression

• Done on a cell by cell basis within selected colonic crypts

• Very time intensive

Data Collection

• Animals sacrificed at 4 times: 0 = control, 12hr, 24hr and 48hr after exposure

• Rats: 12 at each time period• Crypts: 20 are selected• Cells: all cells collected, about 30 per crypt• p27: measured on each cell, with

logarithmic transformation

Nominal Cell Position

• X = nominal cell position

• Differentiated cells: at top, X = 1.0

• Proliferating cells: in middle, X=0.5

• Stem cells: at bottom, X=0

Standard Model

• Hierarchical structure: cells within crypts within rats within times

rcrc

rc

r rc

r

Y (x) = η(x)+γ (x)+ +ε (x)

η(x)+γ (x)=

= crypt-level functions,

typ

rat-lev

ically

θ (

as

el fun

sumed

x)

θ (

inde

cti

pen

x

o

)

n

dent

Standard Model

• Hierarchical structure: cell locations of cells within crypts within rats within times

• In our experiment, the residuals from fits at the crypt level are essentially white noise

• However, we also measured the location of the colonic crypts

Crypt Locations at 24 Hours, Nominal zero

Scale: 1000’s of microns

Our interest: relationships at between 25-200 microns

Standard Model

• Hypothesis: it is biologically plausible that the nearer the crypts to one another, the greater the relationship of overall p27 expression.

• Expectation: The effect of the carcinogen might well alter the relationship over time

• Technically: Functional data where the functions are themselves correlated

Basic Model

• Two-Levels: Rat and crypt level functions

• Rat-Level: Modeled either nonparametrically or semiparametrically

• Semiparametrically: low-order regression spline

• Nonparametrically: some version of a kernel fit

Basic Model

• Crypt-Level: A regression spline, with few knots, in a parametric mixed-model formulation

rc,I rc,L rc,S

p

rc S 2p

rc = + x +C(x) ;

C(x) spline basis functions

0cov( )

0

θ (x)

I

Linear spline:

In practice, we used a quadratic spline.

The covariance matrix of the quadratic part used the Pourahmadi-Daniels construction

Advertisement

http://stat.tamu.edu/~carroll/semiregbook/

Please buy our book! Or steal it

David Ruppert Matt Wand

Basic Model

• Crypt-Level: We modeled the functions across crypts semiparametrically as regression splines

• The covariance matrix of the parameters of the spline modeled as separable

• Matern family used to model correlations

m

mm

m

1

Matern correlation : (d, , )

1 2d 2d(d, , ) K

2 ( )

K Modified Bessel functio

m

n

m

m mm

Basic Model

• Crypt-Level: regression spline, few knots• Separable covariance structure

rc,I rc,L rc,Sr

j S

c

ri r

= + x +C(x) ;

1 d(i, j), ,cov( , )

d(i, j), , 1

d(i, j) dis tance between crypts (

m

θ (x

m

i, j

)

)

Xihong Lin

Theory for Parametric Version

• The semiparametric approach we used fits formally into a new general theory

• Recall that we fit the marginal function at the rat level “nonparametrically”.

• At the crypt level, we used a parametric mixed-model representation of low-order regression splines

General Formulation

• Yij = Response

• Xij, Zij = cell and crypt locations

• Likelihood (or criterion function)

• The key is that the function is evaluated multiple times for each rat

• This distinguishes it from standard semiparametric models

• The goal is to estimate and efficiently

θ( )

i i i1 imY ,Z , , (X ),..., (θ )θβ X

θ( ) β

General Formulation: Overview

• Likelihood (or criterion function)

• For iid versions of these problems, we have developed constructive kernel-based methods of estimation with• Asymptotic expansions and inference available

• If the criterion function is a likelihood function, then the methods are semiparametric efficient.• Methods avoid solving integral equations

i i i1 imY ,Z , , (X ),..., (θ )θβ X

General Formulation: Overview

• We also show • The equivalence of profiling and backfitting• Pseudo-likelihood calculations

• The results were submitted 7 months ago, and we confidently expect 1st reviews within the next year or two , depending on when the referees open their mail

General Formulation: Overview

• In the application, modifications necessary both for theoretical and practical purposes

• Techniques described in the talk of Tanya Apanasovich can be used here as well to cut down on the computational burden due to the correlated functions.

Nonparametric Fits

• Equal Spacing: Assume cell locations are equally spaced

• Define = covariance between crypt-level functions that are apart, one at location x1 and the other at location x2

• Assume separable covariance structure

1 2 1 2V(x ,x ,Δ)=G(x ,x ) ρ(Δ)

1 2V(x ,x ,Δ)

Nonparametric Fits

• Discrete Version: Pretend x1 and x2 take on a small discrete set of values (we actually use a kernel-version of this idea)

• Form the sample covariance matrix per rat at x1 and x2 , then average across rats.

• Call this estimate

1 2V̂(x ,x ,Δ)

Nonparametric Fits

• Separability: Now use the separability to get a rough estimate of the correlation surface.

1 2

1 2

1 2 1 2

1 2x ,x

1 2x ,x

V(x ,x ,Δ)=G(x ,x )

V̂(x ,x ,Δ)

=V̂(x ,x ,0)

ρ(Δ)

ρ(Δ)

Nonparametric Fits

• The estimate is not a proper correlation function

• We fixed it up using a trick due to Peter Hall (1994, Annals), thus forming , a real correlation function

• Basic idea is to do a Fourier transform, force it to be non-negative, then invert

• Slower rates of convergence than the parametric fit, more variability, etc.

• Asymptotics worked out (non-trivial)

ρ(Δ)

ρ̂(Δ)

Semiparametric Method Details

• In the example, a cubic polynomial actually suffices for the rat-level functions

• Maximize in the covariance structure of the crypt-level splines, including Matern-order

• The covariance structure includes• Matern order m (gridded)• Matern parameter • Spline smoothing parameter• Quadratic part’s covariance matrix

(Pourahmadi-Daniels)• AR(1) for residuals

Results: Part I

• Matern Order: Matern order of 0.5 is the classic autoregressive model

• Our Finding: For all times, an order of about 0.15 was the maximizer

• Simulations: Can distinguish from autocorrelation

• Different Matern orders lead to different interpretations about the extent of the correlations

Marginal over time Spline Fits

Note• time

effects• Location

effects

24-Hour Fits: The Matern order matters

Nonparametric and Semiparametric Fits at 24 Hours

Comparison of Fits

• The semiparametric and nonparametric fits are roughly similar

• At 200 microns, quite far apart, the correlations in the functions are approximately 0.40 for both

• Surprising degree of correlation

Summary

• We have studied the problem of crypt-signaling in colon carcinogenesis experiments

• Technically, this is a problem of hierarchical functional data where the functions are not independent in the standard manner

• We developed (efficient, constructive) semiparametric and nonparametric methods, with asymptotic theory

• The correlations we see in the functions are surprisingly large.

Statistical Collaborators

Yehua Li Naisyin Wang

Summary

Insiration for this work:

Water goanna, Kimberley Region, Australia