immune response studies
BISP.6 - Bressanone - June 18th-20th 2009
Mixtures, clustering, spatial [ & dynamic ] point processes and big data sets
Mike West
Department of Statistical Science
Duke University
cellular phenotypes in vaccine adjuvant studies
Immune response studies
Lymphocyte differentiation: Multiple cell types ~ 15 cell surface marker proteins (+)
[Figure: flow cytometry scatter plot, <V705-A>: CD8 Q705 vs. <G710-A>: CD4 CY55PE; gate percentages 57.8, 36.3, 0.79]
i. cell subtyping
ii. spatio-temporal response
i. Cell subtyping: Flow cytometry data
[Figure: flow cytometer components: laser, optics, electronics, fluidics]
Modest p, large n
[Figure: scatter plot, <V705-A>: CD8 Q705 vs. <G710-A>: CD4 CY55PE; gate percentages 57.8, 36.3, 0.79]
Cell subset identification
Comparison across times, interventions
Comparison across patients
Comparison across treatments
[Diagram labels: Treatment; Patient 1 … Patient n; Dataset 1 … Dataset n; Cell subset 1 … Cell subset n]
Multiple experiments …
- really big data
- characterise data distributions
- comparisons
Mixtures for flow cytometry data
Mixture models (TDP version): Chan et al 2008, 2009
MCMC, Bayesian EM
[Figure: gating plots. FSC-W vs. FSC-H, gate 88.3; <Violet G-A>: CD3 Amcyan vs. <Violet H-A>: vAmine CD14PB CD19 PB, gate 41.4; live T-cells]
Non-Gaussian clusters/cell subtypes
Flexible mixture model:
Subtypes: groups of components
Modal grouping
Modal clustering for non-Gaussian mixtures
Mode trace: fast iterative identification of modes
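A minimal sketch of modal grouping, assuming a one-dimensional Gaussian mixture with known weights, means and variances. The slides do not give the mode trace algorithm itself; the fixed-point (mean-shift style) iteration below is one standard way to climb to a mode of a Gaussian mixture density, and all function names are hypothetical. Components whose means lead to the same mode are grouped into one subtype.

```python
import math

def normal_pdf(x, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def trace_mode(x, weights, means, variances, tol=1e-8, max_iter=1000):
    """Fixed-point iteration from x towards a local mode of the mixture density."""
    for _ in range(max_iter):
        # Unnormalised responsibility of each component at the current point.
        resp = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        num = sum(r * m / v for r, m, v in zip(resp, means, variances))
        den = sum(r / v for r, v in zip(resp, variances))
        x_new = num / den
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def group_components_by_mode(weights, means, variances, merge_tol=1e-3):
    """Start a trace at every component mean; components sharing a mode share a label."""
    modes, labels = [], []
    for m in means:
        mode = trace_mode(m, weights, means, variances)
        for j, known in enumerate(modes):
            if abs(mode - known) < merge_tol:
                labels.append(j)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return modes, labels

# Illustrative values only: two overlapping components and one well-separated one.
modes, labels = group_components_by_mode([0.3, 0.3, 0.4], [0.0, 0.8, 6.0], [1.0, 1.0, 1.0])
```

Here the first two components overlap enough to form a single mode, so they receive the same label, while the third forms its own group.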
Mixtures of mixtures
Cluster mixture models (TDP version): Cao & West 1993; Merl et al 2009
Cluster “anchors”
[Figure: CFSE data, 3 of 7 dimensions, MCMC snapshot. Panels: cluster locations; data by cluster; components. Labelled clusters: helper Ts, cytotoxic Ts, dead cells, other Ts]
Specification & computation
MCMC iterates
a. Reallocate data to components: one “big mixture of normals”
b. Sufficient statistics: resample normal parameters
c. Probabilities: - Counts of data in clusters
- Counts in components within clusters
BIG data, many components: Exploit parallelisation in modules a, b, c
shared memory multi-threading in multi-core, multi-cpu computer
cluster: MPI interface
Prior control: Anchor cluster locations
Tie component means “close” to anchors
Stickiness: New MCMC - Split/merge? Component swapping between clusters
MAP/Bayesian EM
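The three MCMC modules a, b, c can be sketched for a much simpler model than the one in the talk. The code below assumes a finite mixture of one-dimensional normals with a known common variance, a normal prior on each mean and a symmetric Dirichlet prior on the weights; it has no cluster level, no anchors and no TDP prior. The function name and all default values are hypothetical.

```python
import math
import random

def gibbs_sweep(data, means, weights, var=1.0, prior_var=100.0, alpha=1.0, rng=random):
    """One Gibbs sweep for a simplified finite normal mixture (known variance)."""
    k = len(means)
    counts, sums, alloc = [0] * k, [0.0] * k, []

    # a. Reallocate each datum to a component, given current means and weights.
    for y in data:
        logp = [math.log(w) - 0.5 * (y - m) ** 2 / var for w, m in zip(weights, means)]
        top = max(logp)
        probs = [math.exp(lp - top) for lp in logp]
        z = rng.choices(range(k), weights=probs)[0]
        alloc.append(z)
        counts[z] += 1   # sufficient statistics: count ...
        sums[z] += y     # ... and sum per component

    # b. Resample each mean from its conjugate normal posterior (prior N(0, prior_var)).
    new_means = []
    for j in range(k):
        post_var = 1.0 / (1.0 / prior_var + counts[j] / var)
        post_mean = post_var * sums[j] / var
        new_means.append(rng.gauss(post_mean, math.sqrt(post_var)))

    # c. Resample weights from Dirichlet(alpha / k + counts) via normalised gammas.
    gammas = [max(rng.gammavariate(alpha / k + c, 1.0), 1e-300) for c in counts]
    total = sum(gammas)
    new_weights = [g / total for g in gammas]
    return alloc, new_means, new_weights
```

Step a is independent across data points and step b is independent across components, which is what makes the modules natural targets for multi-threading or MPI.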
Inferences: Comparisons
Common interest: rare cell subsets
(e.g. antigen-specific cells << 1%)
Changes in relative abundance
Changes in marker levels
Mouse cell line: HIV adjuvants
[Figure: scatter plot, Marker 1 vs. Marker 2]
Variable selection: Discriminative information
Measure fewer variables? Subtype characterising variables?
Redundant variables? Discrimination confusing variables?
discriminators:
discriminatory information:
- high is good
- finds useful & useless variables
- ranks subsets
- involves “concordances”:
CFSE discriminative information analysis
Change in information by subtype: Drop one marker
Lose irrelevant markers: no loss in false pos/neg rates
Simpler, efficient marker subset analysis
Technology adoption: Many routine analyses
Computation
Implementation
HIV/AIDS
Cancer vaccines
Tropical diseases
ii. Spatial responses: Fluorescent histology/microscopy
Example: Mice lymph nodes: Compare immune response to various treatments
3 or 4 fluorescent tags – stain cell types: e.g. B220, IgM, GL-7
Many exploratory questions: Regional concentrations of types?
Overall levels of types?
Interactions?
Germinal centres: relative concentrations of GL7/B220
Etc
Different time points
PA+Alum, day 1
4 cell types/4 colour channels: several treatments, several days
pixels: grid to small pixel regions
[Figure: PA alone; panels B220, IgM, CD4, GL7]
Immunofluorescent histology: BIG data
10^6 to 10^8
Cells: model 2D (3D) spatial intensity hugely inhomogeneous
Noisy fluorescence
Flexible model to characterise
… intensity surfaces,
… uncertain overall levels,
… noise & signal fluorescence,
… compare cell types
Inhomogeneous Poisson process model
Point process
Measured fluorescence levels
B cells: GFP/B220, day 1
Intensity function
Latent
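To make the inhomogeneous Poisson process concrete, here is a small simulation sketch using thinning: draw a homogeneous process at a bounding rate, then keep each point with probability equal to the intensity at that point divided by the bound. This is a generic textbook construction, not the inference method of the talk; the Gaussian bump intensity, the function names and the unit-square window are all assumptions made for illustration.

```python
import math
import random

def poisson_count(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals falling in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_by_thinning(intensity, lam_max, width, height, rng):
    """Simulate points on [0, width] x [0, height] with the given intensity function."""
    n_candidates = poisson_count(lam_max * width * height, rng)
    points = []
    for _ in range(n_candidates):
        x, y = rng.uniform(0.0, width), rng.uniform(0.0, height)
        ratio = intensity(x, y) / lam_max
        if ratio > 1.0:
            raise ValueError("lam_max must bound the intensity everywhere")
        if rng.random() < ratio:   # keep candidate with probability intensity / lam_max
            points.append((x, y))
    return points

def bump(x, y):
    """Hypothetical intensity: a single Gaussian-shaped hot spot, peak value 200."""
    return 200.0 * math.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / (2.0 * 0.1 ** 2))

cells = simulate_by_thinning(bump, 200.0, 1.0, 1.0, random.Random(1))
```

The expected number of points equals the integral of the intensity over the window, which for this bump is roughly 200 x 2 x pi x 0.01, about 12.6.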
Spatial mixture & measurement model
Truncated Dirichlet process mixture [ Kottas & Sanso 07; Ji et al 09 ]
Data: noise/background vs. signal
Extend “usual” priors:
- random effects
- Pareto tails
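As a reminder of what "truncated" means for the Dirichlet process mixture, the sketch below draws mixture weights by stick-breaking with the final stick fraction fixed at 1, so that exactly K weights sum to one. The function name and the values of alpha and K are illustrative assumptions; the extended priors listed above (random effects, Pareto tails) are not represented.

```python
import random

def truncated_stick_breaking(alpha, k, rng):
    """Draw K mixture weights from a truncated stick-breaking prior."""
    weights, remaining = [], 1.0
    for _ in range(k - 1):
        v = rng.betavariate(1.0, alpha)   # stick fraction V_j ~ Beta(1, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)             # truncation: V_K = 1 takes what is left
    return weights

w = truncated_stick_breaking(alpha=2.0, k=20, rng=random.Random(0))
```

Smaller alpha concentrates mass on the first few components; larger alpha spreads it over more of the K available.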
Fluorescence intensity signal & noise model
Fluorescence intensity data
Mixture model - noise vs. signal
signal
noise
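A minimal sketch of the noise vs. signal classification, assuming a two-component Gaussian mixture on the (possibly transformed) fluorescence level. The slides do not state the component distributions, so the Gaussian form, the function names and every parameter value here are assumptions. The quantity computed is the posterior probability that an observed level comes from the signal component.

```python
import math

def normal_pdf(x, mean, sd):
    """Univariate normal density."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def prob_signal(y, pi_signal, noise_mean, noise_sd, signal_mean, signal_sd):
    """Pr(signal | y) under a two-component mixture with prior weight pi_signal."""
    s = pi_signal * normal_pdf(y, signal_mean, signal_sd)
    n = (1.0 - pi_signal) * normal_pdf(y, noise_mean, noise_sd)
    return s / (s + n)

# Hypothetical parameters: background near 1, signal near 4.
p_low = prob_signal(1.0, 0.3, 1.0, 0.5, 4.0, 1.0)    # level typical of background
p_high = prob_signal(4.0, 0.3, 1.0, 0.5, 4.0, 1.0)   # level typical of signal
```

Within MCMC, a signal/noise indicator for each pixel would be sampled with this probability given the current parameter values.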
Components of posterior
Grid: (small) pixel regions: area
MCMC: conditionals
Gaussian mixture: signal-only observations
Large K, large N
Block Gibbs sampler for TDP mixtures
MCMC progression & inferences
Signal/noise events? Pr(Signal/noise events)?
Intensity function …
… estimate …
Intensity function …
B220/day 1
B220/day 11
Posterior summaries and explorations
[Figure panels: (a) B220; (b) IgM; (c) CD4; (*) B220/(B220+IgM)]
Quantified germinal centres
Computation: Multi-core, multi-thread; cluster
Large K, large N mixture model
Heavy computation: Configuration indicators,
Gaussian component parameter updates
Parallelizable steps within MCMC
Parallel sub-images: conditional mixture in sub-image
a) allocate pixel to sub-image
b) … then to component in sub-mixture
… use a) only for pixels “near boundaries” - reduces computation
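One possible reading of the two-stage allocation is sketched below. It assumes rectangular sub-images ("tiles"), each carrying a weight and its own sub-mixture of isotropic bivariate normals, and a fixed boundary margin. The data layout, function names and margin rule are hypothetical and are not taken from the talk. Pixels well inside a tile skip step a) and go straight to that tile's sub-mixture; only pixels near a tile edge sample their sub-image.

```python
import math
import random

def gauss2d(x, y, mx, my, var):
    """Isotropic bivariate normal density."""
    return math.exp(-0.5 * ((x - mx) ** 2 + (y - my) ** 2) / var) / (2.0 * math.pi * var)

def containing_tile(x, y, tiles):
    """Index of the tile whose bounds (x0, x1, y0, y1) contain the pixel."""
    for i, tile in enumerate(tiles):
        x0, x1, y0, y1 = tile["bounds"]
        if x0 <= x < x1 and y0 <= y < y1:
            return i
    raise ValueError("pixel outside all tiles")

def near_boundary(x, y, bounds, margin):
    x0, x1, y0, y1 = bounds
    return min(x - x0, x1 - x, y - y0, y1 - y) < margin

def allocate(x, y, tiles, margin, rng):
    """Return (tile index, component index) for one pixel."""
    home = containing_tile(x, y, tiles)
    if near_boundary(x, y, tiles[home]["bounds"], margin):
        # a) sample the sub-image: tile weight times sub-mixture density at the pixel
        scores = [t["weight"] * sum(w * gauss2d(x, y, mx, my, v)
                                    for w, mx, my, v in t["components"]) for t in tiles]
        tile = rng.choices(range(len(tiles)), weights=scores)[0]
    else:
        tile = home   # interior pixel: step a) is skipped
    # b) sample the component within the chosen sub-mixture
    comps = tiles[tile]["components"]
    probs = [w * gauss2d(x, y, mx, my, v) for w, mx, my, v in comps]
    return tile, rng.choices(range(len(comps)), weights=probs)[0]

# Hypothetical layout: two side-by-side tiles; components are (weight, mean x, mean y, variance).
tiles = [
    {"bounds": (0.0, 10.0, 0.0, 10.0), "weight": 0.5, "components": [(1.0, 5.0, 5.0, 9.0)]},
    {"bounds": (10.0, 20.0, 0.0, 10.0), "weight": 0.5,
     "components": [(0.5, 13.0, 5.0, 4.0), (0.5, 17.0, 5.0, 4.0)]},
]
```

Because interior pixels depend only on their own tile's sub-mixture, each tile's interior can be processed by a separate thread or node.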
3D
Dynamic spatial process
Confocal microscopy: Imaging fluorescence in situ
Model: quantify directional(?) drifts in intensity
Above model at each time: intensity dynamic
- Dynamic models for Gaussian parameters
- Generalized Polya urn scheme for random partitions/pixel-component configurations
[ Matt Taddy’s talk; Caron, Davy, Doucet, 07, UAI; Ji et al, 09 forthcoming ]
Sequential MC: Particle filtering
Dynamic spatial process
Team & Links
Lynn Lin, PhD student
Quanli Wang, comp. guru
Dan Merl, postdoc > Livermore
Cliburn Chan, Immunology & Comp Bio
Chunlin Ji, PhD student
Tom Kepler, Immunology & Comp Bio
Ioanna Manolopoulou, postdoc
Chan et al, Cytometry A, 2008
Ji et al, BA 2009
New & software: www.stat.duke.edu/~mw