university of california at irvine department of statisticshombao/ombao_uplb_ibs_dec_2012.pdf ·...

35
S PATIO-T EMPORAL MODELS WITH APPLICATIONS TO HEALTH DATA Hernando Ombao University of California at Irvine Department of Statistics December 11, 2012

Upload: ngodien

Post on 07-Oct-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

SPATIO-TEMPORAL MODELS WITH

APPLICATIONS TO HEALTH DATA

Hernando Ombao

University of California at IrvineDepartment of Statistics

December 11, 2012

OUTLINE OF TALK

RESEARCH MOTIVATION

Visual-motor electroencephalogram (HAND EEG)

Motor-decision (fMRI)

LA county environmental data (mortality, pollution,temperature)

NEUROSCIENCE DATA AND STATISTICAL GOALS

Electrophysiologic data: multi-channel EEG, local fieldpotentialsHemodynamic data: fMRI time series at several ROIs

NEUROSCIENCE DATA AND STATISTICAL GOALS

StimulusNeuronal

Response

Brain

SignalsBehavior

NEUROSCIENCE DATA AND STATISTICAL GOALS

StimulusNeuronal

Response

Brain

SignalsBehavior

Moderators

Modifiers

Genes

Trait

Socio-Environment

NEUROSCIENCE DATA AND STATISTICAL GOALS

Changes in

the mean

Changes in

variance

Changes in

Cross-Dependence

OUR RESEARCH GOALS

Characterize dependence in a brain network

Temporal: Y1(t) ∼ [Y1(t − 1),Y2(t − 1), . . .]′

Spectral: interactions between oscillatory activities at Y1,Y2

Develop estimation and inference methods for connectivity

Investigate potential for connectivity as a biomarker

Predicting behavior

Motor intent (left vs. right movement)[Brain-Computer-Interface]State of learningLevel of mental fatigue

Differentiating patient groups (bipolar vs. healthy children)

Connectivity between left DLPFC ⇆ right STG is greater forbipolar than healthy

OUR RESEARCH GOALS

Selected Contributions to the Time Series Literature

Automatic methods:SLEX Transform (Smooth Localized Complex EXponnetials)

Ombao et al. (2001, JASA)Ombao et al. (2001, Biometrika)Ombao et al. (2002, Ann Inst Stat Math)Huang, Ombao and Stoffer (2004, JASA)Ombao et al. (2005, JASA)Böhm, Ombao et al. (2010, JSPI)

Massive data; Complex-dependence; Mixed Effects

Bunea, Ombao and Auguste (2006, IEEE Trans Sig Proc)Ombao and Van Bellegem (2008, IEEE Trans Sig Proc)Freyermuth, Ombao, von Sachs (2009, JASA)Fiecas and Ombao (2011, Annals of Applied Statistics)Gorrostieta, Ombao et al. (2012, NeuroImage)Kang, Ombao et al. (2012, JASA)

VAR MODELS AND APPLICATION TO THE LA COUNTY

MORTALITY DATA

Cardiovascular Mortality

1970 1972 1974 1976 1978 1980

70

10

01

30

Temperature

1970 1972 1974 1976 1978 1980

50

70

90

Particulates

1970 1972 1974 1976 1978 1980

20

60

10

0

VAR MODELS AND APPLICATION TO THE LA COUNTY

MORTALITY DATA

Shumway, Azari and Pawitan (1988); Shumway and Stoffer(2010)

LA County

Weekly data on mortality, temperature and pollution levels

Model mortality ∼ (temperature + pollution)

Granger causality: does past knowledge of temperatureand pollution help improve prediction for mortality?

Causation vs Association

Practical issues

Hospitalization (rather than mortality)Effect of pollution might be long term (rather than shortterm)

VAR MODELS AND APPLICATION TO THE LA COUNTY

MORTALITY DATA

The VAR Model

Y(t) = [Y1(t),Y2(t),Y3(t)]′

Y(t) = [Mort(t),Temp(t),Part(t)]′

VAR(1) Model

Y1(t) = φ11Y1(t − 1) + φ12Y2(t − 1) + φ13Y3(t − 1) + ǫ1(t)

Y2(t) = φ21Y1(t − 1) + φ22Y2(t − 1) + φ23Y3(t − 1) + ǫ2(t)

Y3(t) = φ31Y1(t − 1) + φ32Y2(t − 1) + φ33Y3(t − 1) + ǫ3(t)

In matrix notation VAR(1)

Y(t) = ΦY(t − 1) + Z(t)

VAR MODELS AND APPLICATION TO THE LA COUNTY

MORTALITY DATA

Focus only on the dynamics of cardiac mortality

Trend µ(t) - linear trend + seasonality

The lag-1 model

Mort(t) = µ(t) + φ11Mort(t − 1) +

φ12Temp(t − 1) + φ13Part(t − 1) + Z1(t)

Lagged dependence parameters

φ11: Mort(t − 1) −→ Mort(t)φ12: Temp(t − 1) −→ Mort(t)φ13: Part(t − 1) −→ Mort(t)

VAR MODELS AND APPLICATION TO THE LA COUNTY

MORTALITY DATA

Results.

Bayesian information criterion chose L = 2 as the optimallag

Predicted cardiac mortality at time t

M(t) = 56 − 0.01t +

0.30M(t − 1)− 0.20T (t − 1) + 0.04P(t − 1) +

0.28M(t − 2)− 0.08T (t − 2) + 0.07P(t − 2)

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

PM

PFC

PMv

SMA

PMd

IPS

SPL

0.22%0

L R

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

Motor-Decision Experiment

N = 15 right-handed college students

Experiment: subjects see visual targets and must movejoystick

Two Conditions

Free choice - subject freely chooses any targetInstructed - subject must choose the specified target

Regions of interest (7 areas that show highest differentialactivation)

PFC, SMA, etc.

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

Limitations of the classical VAR model

Dependence and connectivity identical for allparticipants/subjectsDependence and connectivity identical across allexperimental conditions

Our novel contribution: mixed effects VAR model

Gorrostieta, Ombao, et al. (2012, NeuroImage)Subjects allowed to have a unique brain networkModel captures the effect of an experimental condition onconnectivity

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

Total of R regions of interest (ROIs)

Y nr (t) the fMRI time series at the ROI r for subject n

Entire network: Yn(t) = [Y n1 (t), . . . ,Y

nR(t)]

′.

General additive model

Yn(t) = Fn(t) + En(t)

The componentsFn(t) - deterministic componentEn(t) - stochastic component

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

The deterministic component Fn(t)

Decomposition

Fn(t) = Dn(t) + Mn(t) + βn1 ⊗ X1(t) + . . .+ βn

C ⊗ XC(t),

The mean component includes systematic changes in theBOLD signal that is due to

Scanner driftPhysiological signals of non-interest (e.g., cardiac andrespiratory)Experimental conditions

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

The stochastic component En(t)

En(t) captures between-ROI connectivity

Cov[Yn(t + h),Yn(t)] = Cov[Fn(t + h) + En(t + h),Fn(t) + En(t)]

= Cov[En(t + h),En(t)]

En(t) cannot be observed directly; we use the residuals:

En(t) = Yn(t)− Fn(t)

Rn(t) = Yn(t)− Fn(t)

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

The ME-VAR(1) Model

E(n)(t) =[Φ1,kW1(t) + Φ2,kW2(t) + b(n)

1

]E(n)(t − 1) + e(n)(t)

W�(t) is the indicator functionWhen condition 1 is active then W1(t) = 1 and W2(t) = 0When condition 2 is active then W1(t) = 0 and W2(t) = 1

b(n)1 models between-subject variation in connectivity

b(n)1 ∼ (0, σ2

1)

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

Subject-specific connectivity matrix (condition on b(n)1 )

When W1(t) = 1, the connectivity matrix for subject n is

Φ1,1 + b(n)1

When W2(t) = 1, the connectivity matrix for subject n is

Φ1,2 + b(n)1

The model can be utilized to test forLagged dependence between each pair of ROIs

H0 : Φ1,1 = 0,Φ1,2 = 0

Granger causality in each experimental conditionTesting for differences between conditions

H0 : ∆1 = Φ1,1 −Φ1,2 = 0

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

SMA

IPS

SPL

PFC

PMd PMd

PMv

L R

MIXED EFFECTS VAR AND APPLICATIONS TO BRAIN

SIGNALS

SMA

IPS

SPL

PFC

PMd PMd

PMv

L R

SPECTRAL MEASURES OF DEPENDENCE: PARTIAL

COHERENCE

Time series: X,Y,Z

Cross-correlation ρ(X,Y) = Cov(X,Y)√Var XVar Y

Partial cross-correlation between X and Y given Z

Remove Z from X: ǫX = X − βX ZRemove Z from Y: ǫY = Y − βY Zρ(X,Y|Z) = Cov(ǫX ,ǫY )√

Var ǫX Var ǫY

SPECTRAL MEASURES OF DEPENDENCE: PARTIAL

COHERENCE

Model A Model BCross-Corr Yes YesPartial CC NO Yes

SPECTRAL MEASURES OF DEPENDENCE: PARTIAL

COHERENCE

Let U(t) = [X (t),Y (t),others(t)]′.

Estimate partial coherence when dimU(t) is large

Characterization of partial coherence (Dahlhaus, 1996)

Spectral matrix f (ω)Λ(ω) = H(ω)f−1(ω)H(ω)Partial coherence between X and Y is | Λ12(ω) |

2

A Problem: bias in sample eigenvalues leads to poorcondition number

Sample max eigenvalue over-estimatesSample min eigenvalue under-estimates

SPECTRAL MEASURES OF DEPENDENCE: PARTIAL

COHERENCE

Standard methods (Welch’s periodogram; multi-taper) tendto produce highly erratic resultsOur novel contribution: Shrinkage MethodFiecas, M. and Ombao H. (2011, Annals of AppliedStatistics)

Target: highly structured spectral matrix V (ω)

Vector auto-regressive; Vector ARCHDiagonal matrix

Initial estimator I(ω) (non-parametric)Generalized shrinkage estimator

f (ω) = (1 − W (ω)) I(ω) + W (ω)V (ω)

W (ω) ∝ E[I(ω)− f (ω)]2

BIOMARKER SELECTION

Outcome of interest

Behavioral measures (cognitive assessments, etc)Response to treatment

Potential predictors

Neuroimaging-derived measuresClinicalDemographicGenetic

BIG Problem: The number of potential predictors exceedthe number of subjects

“Large P - small N problem"

BIOMARKER SELECTION

Regression Model

Yn = β0 + β1x1n + β2x2n + . . .+ βPxPn + ǫn

Least Squares Criterion

C(β) =

N∑

n=1

[Yn − (β0 + β1x1n + . . .+ βPxPn)]2

Penalty for complexityL1(β) =

∑Pp=1 |βp|

L2(β) =∑P

p=1 |βp|2

Complexity-penalized least squares criterion

PC(β) = C(β) + λ1L1(β) + λ2L2(β)

Result: many β estimates will be forced to 0 (consideredirrelevant!)

BIOMARKER SELECTION

Complexity-penalized methods

LASSO, elastic netLimitation: does not assess uncertainty in biomarkerselection

Our novel contribution: Bootstrap-enhanced elastic netmethod

Bunea, She, Ombao et al. (2011, NeuroImage)Obtain B bootstrap datasetsFor each b = 1 : B bootstrap data, record the predictorsthat were selected

BIOMARKER SELECTION

age

km

sk_

cocopi

Education

hcv_

curr

ent

km

sk_

alc

fa_

ic2

fa_

cc1

fa_

ic2.fa_

ic2

md_

cc234.m

d_

cc234

fa_

ic4

Predictors (partial)

HVLT_sum_T

HVLT_delay_T

BVMT_sum_T

BVMT_delay_T

WAIS_LNS_T

WAIS_DigSym_T

WAIS_SymSrch_T

GPeg_dom_T

GPeg_nondom_T

Trail_A_T

Trail_B_T

COWAT_T

Animal_T

NC

Is

Variable Selection

SUMMARY

Vector (multivariate) Autoregressive Models

Mixed Effects VAR

Spectral Dependence: Shrinkage procedure

Biomarker selection: bootstrap enhanced elastic net

US-BASED STATISTICIANS

Name Institution ExpertiseChi-chi Aban Univ Alabama - Survival, Trials

Birmingham CardiologyDexter Cahoy Louisiana Tech Applied StatsMark Fiecas Univ California Applied Stats

San Diego ImagingHernando Ombao Univ California - Time Series

Irvine Brain SignalsEdsel Pena Univ South Carolina Survival, NonparJohn Yap FDA Genetics

ACKNOWLEDGEMENT

INSTAT, UP Los Baños (Prof Reaño)

Institute of Biological Sciences, UP Los Baños