fred fu masc, veronica cojocari msc, zaldy balde, albert nguyen, … · 2020. 11. 10. · fred fu...

1
Development and Validation of an Imaging Mass Cytometry and Multiplex Immunofluorescence Biologically-guided Cellular Segmentation Strategy STTARR Innovation Centre, MaRS Discovery District, Princess Margaret Cancer Research Tower, 101 College Street, 7 th floor, Toronto, ON, M5G 1L7, Canada University of Toronto, Department of Immunology, Toronto ON Canada Fred Fu MASc, Veronica Cojocari MSc, Zaldy Balde, Albert Nguyen, Valeria Ramaglia PhD, Salma Sheikh- Mohamed, Mark Zaidi, Jade Bilkey, Susan Done MD, Jen Gommerman PhD, Trevor D. McKee PhD MS: Manual Classifier Spatio-Temporal Targeting and Amplification of Radiation Response Program http://www.sttarr.com [email protected] Multiplex IF/IMC Analysis Imaging Mass Cytometry and Multiplex Fluorescence staining approaches, require more stringent approaches for immune cell image segmentation & classification Large panels of antibodies have been common in flow cytometry staining of cells in suspension, allowing for deep cell surface marker interrogation of the immune system. Novel techniques (stain / strip immuno- fluorescence, or heavy metal-based antibody tagging, followed by imaging mass cytometry) allow highly multiplexed staining to be applied to the analysis of tissue sections. Segmentation of these high multiplexed image sets to produce single cell counts for downstream analysis is more challenging than single marker studies, due to a strong desire to avoid mixing of signals from neighboring cells, and obtain “pure” readouts of biomarker abundance on a single cell level. We have adopted a hybrid segmentation approach that utilizes biological domain knowledge to assist in the segmentation and classification of immune cell types in IMC and MxIF images Two example methodologies are described here, first to adjust segmentation to account for biological differences, and the second to adjust the classification strategy used to identify cell types, overcoming bleedthrough of neighboring cells using a supervised feature-based random forest machine learning approach. Selection of manually annotated cells to help guide supervised training allows for the reporting of quality metrics with respect to both segmentation (overall number of manually-scored cells versus segmented cells), and classification (per-class proportion of machine-scored immune cell subsets relative to gold standard of manual annotation) Future work could utilize this approach in a generalized segmentation and classification schema to improve analysis of immune subsets in highly multiplexed immunostained biopsies or tissue samples. bands Image Segmentation Cell Segmentation in IMC assists direct evaluation of platinum deposition in cisplatin treated pancreatic tumor xenografts Imaging mass cytometry with heavy metal tagged antibodies allows for: highly multiplexed tissue immunostaining Integrated tissue & cell segmentation utilizing epithelial, stromal markers Highly quantitative single cell analytics: intensity, spatial relationships Finding: Majority of injected platinum actually binds collagen, not tumor cells Published: Chang et al, Scientific Reports 2016; 6:36641 γH2AX PanCK EF - 5 Collagen DNA IdU EF - 5 cisPt Histone-H3 E-Cadherin cisPt Histone - H3 α-SMA Epithelial nuclei Stromal nuclei EF-5 (Hypoxia) intensity per cell Pt deposition intensity per-cell Multiple views of marker combinations in 28 marker panel. Segmentation & Intensity Quantitation of per-cell proliferation (IdU), DNA Damage (gH2AX), and cisPt uptake Laboratory of Dr. David Hedley, Princess Margaret Cancer Centre. Performed in collaboration with Fluidigm Canada Inc., manufacturers of Hyperion IMC system. Segmenting into bands of equal distance from blood vessels / stromal regions in pancreatic cancer xenografts highlights tumor microenvironment biomarker relationships Epithelium Stroma Lumen Distance from stroma EF5 (Hypoxia) per-cell EF5 in stromal dtstance bands 0 0.5 1 1.5 2 2.5 3 0 50 100 150 200 250 300 EF5 Signal Distance from Stroma Hypoxia Distance Map 0 0.5 1 1.5 2 2.5 3 0 50 100 150 200 250 300 Proliferation Signal Distance from Stroma Proliferation Distance Map IdU Intensity KI67 Intensity of Mean IdUU_I127Di of Mean CK7_Dy164Di of Mean Ki67_Er168Di of Mean p53_Nd150Di f Mean Pt196_Pt196Di of Mean Vimentin_Gd156Di Mean Pan_keratin_Dy162Di Mean DNA191_Ir191Di of Mean HH3_Yb176Di Beta_catenin_Ho165Di n E_cadherin_Gd158Di of Mean EF5_Tb159Di 0 0.5 1 1.5 2 2.5 3 3.5 4 Closest to stroma furthest from stroma Stromal Distance Analysis of Hypoxia This segmentation method allows for comparison amongst multiple biomarkers to highlight any putative microenvironmental relationships. Combining distance with cellular segmentation can also allow for multidimensional interrogation of biomarker relationships with distance, such as the “tissue cytometry” plot showing individual cell hypoxia and proliferative markers, with color scale denoting distance from each cell to nearest vessel. Multi-marker stromal distance relationship Tissue Cytometry scatterplot Zaidi et al 2019 Pure DNA based segmentation can result in combination of biomarkers from neighboring cell types, resulting in implausible immune cell types (e.g. mixed T and B cell markers). By adopting an approach to combine DNA segmentation with biological domain knowledge (i.e. immune surface marker subtypes), we can improve segmentation. Utilizing Immune Markers to guide Segmentation Random Forest Classifier CD3 CD8 DNA IgK IgL DNA HLA CD68 DNA Putative T Cell Putative T Cell Putative B Cell Putative T Cell Putative B Cell Putative Macrophage / Microglia / Other Cell Biologically guided segmentation Ramaglia et al, Elife 2019 Immune Non-Immune T Cytotoxic T Helper B Cell T Cell A B C A B C Adapted from Cudrici et al JRRD 2006 MS: Immune cell blood vessel distance Putative T Cell Putative Macrophage Putative B Cell DNA / Other Cell CD31+ Vessel Distance from Vessel Immune influx from blood vessels plays an important role in progression of multiple sclerosis. Creation of a distance map utilizing CD31+ vessels permits direct measurement of distance to nearest (in-plane) vessel on a per-cell basis. Gating strategy using manual annotations Classification of distinct immune cell subsets was performed by first manually labeling these subsets, then determining manually identified “gates” / thresholds by inspection of 2D scatterplots of markers of interest, to identify positivity cutoff Training data: Manual labeling 2D Tissue Cytometry plot: CD8+ T cells Proliferative T Cells (manual labels) Pandas dataframe logical gating strategy to classify T Cell subsets MS Lesion Classification Schema: Discrete Biological Activity within Lesions The resulting heatmap was obtained by first segmenting every cell within specific lesions (corresponding to regions of interest with discrete biological activity), then applying the gating strategy determined by manual labeling of discrete immune cell types. The cell subtype count is reported in units of cells / mm2 across each region of interest. In addition to the immune cell density, the average distance to nearest blood vessel for each immune cell subset is reported here. Immune Cell Density Heatmap Ramaglia et al, Elife 2019 Breast cancer: Immune infiltrate classification Knowledge of the immune content of the tumor microenvironment, and the proportions of distinct regulatory and cytotoxic immune cell components surrounding and within the tumor, is critical for the delivery of optimal therapy. Multiplexed immunofluorescence, using sequential staining, stripping and re- staining, can highlight immune cell subsets. However, challenges remain with segmentation and classification of these cell types. The higher resolution of these images, and greater density of immune cells, made segmentation adjustments more challenging; we optimized the segmentation to balance under- and over-segmented cells; and utilized a Random Forest machine learning strategy to “train” an immune classifier using supervised labeled training data. Red – CD8 Intensity (THelper Cells) Green – CD4 intensity (TCyt Cells) B cells based on threshold Actual B cells (based on manual investigation) Actual TC cells (based on manual CD8 investigation) Actual TH cells (based on manual CD4 investigation and Non-immune cells Red – CD3 Intensity (T Cells) Green – CD20 intensity (B Cells) Yellow – Segmented Cell Border B cells (based on gated CD20 threshold) T cells (based on gated CD3 threshold) Poorly segmented cells (double positive for both markers) Non-immune cells (double negative / below both thresholds) DNA (Nuclear) and Na-K ATPase (Membrane) Classical Computer Vision Segmentation Original Image Segmented Image Grey – DAPI (DNA Intercalator) Green Na-K ATPase Membrane Cytoplasm Nucleus Poorly Segmented Cells B Cell / T Cell Scatterplot Thelper vs TCytotoxic Scatterplot Original Image Manually Labeled Cells Random Forest-based Immune Cell Classifier Computer vision-based segmentation was tuned to balance under- and over- segmented cells (utilizing thresholds for DAPI and membrane markers). For each detected cell, ~50 features were exported, corresponding to pixel intensities for sub-regions and histogram-based intensity subsets. These were used with manually labeled data to train a Random Forest classifier to properly identify immune cell subsets. Trained Random Forest Cell Classifier Manually Labeled Cells Relative Feature Importance for Classification Scatterplot of Classified Cells Validation: Confusion Matrix Trained classifier achieves an accuracy of 0.87, f1-score (macro) of 0.84; Precision of: [0.86, 0.88, 0.74, 0.89], and recall of [0.88 0.88 0.74 0.88] when trained on labeled cells from representative tumor microarray cores. This approach is useful when segmentation is challenging; allowing for extraction of immune cell density data for a number of classes, for better evaluation of the tumor microenvironment. Red – CD3 Intensity (T Cells) Green – CD20 intensity (B Cells) Manual Classifier B Cell 145 165 T Helper 102 126 T Cytotoxic 11 2 Non-Immune 175 141 Total Counts in this ROI Classifier per-class confidence

Upload: others

Post on 21-Mar-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Fred Fu MASc, Veronica Cojocari MSc, Zaldy Balde, Albert Nguyen, … · 2020. 11. 10. · Fred Fu MASc, Veronica Cojocari MSc, Zaldy Balde, Albert Nguyen, Valeria Ramaglia PhD, Salma

Development and Validation of an Imaging Mass Cytometry and Multiplex Immunofluorescence Biologically-guided Cellular Segmentation Strategy

STTARR Innovation Centre, MaRS Discovery District, Princess Margaret Cancer Research Tower, 101 College Street, 7th floor, Toronto, ON, M5G 1L7, Canada

University of Toronto, Department of Immunology, Toronto ON Canada

Fred Fu MASc, Veronica Cojocari MSc, Zaldy Balde, Albert Nguyen, Valeria Ramaglia PhD, Salma Sheikh-Mohamed, Mark Zaidi, Jade Bilkey, Susan Done MD, Jen Gommerman PhD, Trevor D. McKee PhD

MS: Manual Classifier

Spatio-Temporal Targeting and Amplification of Radiation Response Program

http://www.sttarr.com [email protected]

Multiplex IF/IMC AnalysisImaging Mass Cytometry and Multiplex

Fluorescence staining approaches, require more stringent approaches for immune

cell image segmentation & classificationLarge panels of antibodies have been common in flow cytometry staining of cells in suspension, allowing for deep cell surface marker interrogation of the immune system.

Novel techniques (stain / strip immuno-fluorescence, or heavy metal-based antibody tagging, followed by imaging mass cytometry) allow highly multiplexed staining to be applied to the analysis of tissue sections.

Segmentation of these high multiplexed image sets to produce single cell counts for downstream analysis is more challenging than single marker studies, due to a strong desire to avoid mixing of signals from neighboring cells, and obtain “pure” readouts of biomarker abundance on a single cell level.

We have adopted a hybrid segmentation approach that utilizes biological domain knowledge to assist in the segmentation and classification of immune cell types in IMC and MxIF images

Two example methodologies are described here, first to adjust segmentation to account for biological differences, and the second to adjust the classification strategy used to identify cell types, overcoming bleedthroughof neighboring cells using a supervised feature-based random forest machine learning approach.

Selection of manually annotated cells to help guide supervised training allows for the reporting of quality metrics with respect to both segmentation (overall number of manually-scored cells versus segmented cells), and classification (per-class proportion of machine-scored immune cell subsets relative to gold standard of manual annotation)

Future work could utilize this approach in a generalized segmentation and classification schema to improve analysis of immune subsets in highly multiplexed immunostainedbiopsies or tissue samples.

bands

Image SegmentationCell Segmentation in IMC assists direct

evaluation of platinum deposition in cisplatin treated pancreatic tumor xenografts

• Imaging mass cytometry with heavy metal tagged antibodies allows for:• highly multiplexed tissue immunostaining • Integrated tissue & cell segmentation utilizing epithelial, stromal markers• Highly quantitative single cell analytics: intensity, spatial relationships

• Finding: Majority of injected platinum actually binds collagen, not tumor cellsPublished: Chang et al, Scientific Reports 2016; 6:36641

γH2AX PanCK EF-5 Collagen DNA IdU EF-5 cisPt Histone-H3E-Cadherin cisPt Histone-H3 α-SMA Epithelial nuclei Stromal nuclei EF-5 (Hypoxia) intensity per cell

Pt deposition intensity per-cell

Multiple views of marker combinations in 28 marker panel. Segmentation & Intensity

Quantitation of per-cell proliferation (IdU), DNA Damage (gH2AX), and cisPt uptake

Laboratory of Dr. David Hedley, Princess Margaret Cancer Centre. Performed in collaboration with Fluidigm Canada Inc., manufacturers of Hyperion IMC system.

Segmenting into bands of equal distance from blood vessels / stromal regions in pancreatic cancer xenografts highlights tumor microenvironment biomarker relationships

Epithelium Stroma Lumen Distance from stroma EF5 (Hypoxia) per-cell EF5 in stromal dtstance bands

0

0.5

1

1.5

2

2.5

3

0 50 100 150 200 250 300

EF5

Sign

al

Distance from Stroma

Hypoxia Distance Map

0

0.5

1

1.5

2

2.5

3

0 50 100 150 200 250 300

Prol

ifera

tion

Sign

al

Distance from Stroma

Proliferation Distance Map

IdU Intensity KI67 Intensity

of Mean IdUU_I127Di of Mean CK7_Dy164Di

of Mean Ki67_Er168Di of Mean p53_Nd150Di

f Mean Pt196_Pt196Di of Mean Vimentin_Gd156Di

Mean Pan_keratin_Dy162Di Mean DNA191_Ir191Di of Mean HH3_Yb176DiBeta_catenin_Ho165Din E_cadherin_Gd158Di of Mean EF5_Tb159Di

0

0.5

1

1.5

2

2.5

3

3.5

4

Closest to stromafurthest from

stroma

Stromal Distance Analysis of Hypoxia

This segmentation method allows for comparison amongst multiple biomarkers to highlight any putative microenvironmental relationships. Combining distance with cellular segmentation can also allow for multidimensional interrogation of biomarker relationships with distance, such as the “tissue cytometry” plot showing individual cell hypoxia and proliferative markers, with color scale denoting distance from each cell to nearest vessel.

Multi-marker stromal distance relationship Tissue Cytometry scatterplot

Zaidi et al 2019

Pure DNA based segmentation can result in combination of biomarkers from neighboring cell types, resulting in implausible immune cell types (e.g. mixed T and B cell markers). By adopting an approach to combine DNA segmentation with biological domain knowledge (i.e. immune surface marker subtypes), we can improve segmentation.

Utilizing Immune Markers to guide Segmentation

Random Forest Classifier

CD3 CD8DNA

IgKIgLDNA

HLA CD68DNA

Putative T Cell

Putative T Cell Putative B Cell

Putative T Cell Putative B CellPutative Macrophage/ Microglia / Other Cell

Biologically guided segmentation

Ramaglia et al, Elife 2019

Immune Non-Immune

T Cytotoxic T Helper

B Cell T Cell

A

B

C

A

B

C

Adapted from Cudrici et al JRRD 2006

MS: Immune cell blood vessel distance

Putative T CellPutative Macrophage

Putative B CellDNA / Other Cell

CD31+ Vessel Distance from Vessel

Immune influx from blood vessels plays an important role in progression of multiple sclerosis. Creation of a distance map utilizing CD31+ vessels permits direct measurement of distance to nearest (in-plane) vessel on a per-cell basis.

Gating strategy using manual annotationsClassification of distinct immune cell subsets was performed by first manually labeling these subsets, then determining manually identified “gates” / thresholds by inspection of 2D scatterplots of markers of interest, to identify positivity cutoffTraining data: Manual labeling 2D Tissue Cytometry plot: CD8+ T cells Proliferative T Cells (manual labels)

Pandas dataframe logical gating strategy to classify T Cell subsets

MS Lesion Classification Schema: Discrete Biological Activity within Lesions

The resulting heatmap was obtained by first segmenting every cell within specific lesions (corresponding to regions of interest with discrete biological activity), then applying the gating strategy determined by manual labeling of discrete immune cell types. The cell subtype count is reported in units of cells / mm2 across each region of interest.

In addition to the immune cell density, the average distance to nearest blood vessel for each immune cell subset is reported here.

Immune Cell Density Heatmap

Ramaglia et al, Elife 2019

Breast cancer: Immune infiltrate classificationKnowledge of the immune content of the tumor microenvironment, and the proportions of distinct regulatory and cytotoxic immune cell components surrounding and within the tumor, is critical for the delivery of optimal therapy. Multiplexed immunofluorescence, using sequential staining, stripping and re-staining, can highlight immune cell subsets. However, challenges remain with segmentation and classification of these cell types. The higher resolution of these images, and greater density of immune cells, made segmentation adjustments more challenging; we optimized the segmentation to balance under- and over-segmented cells; and utilized a Random Forest machine learning strategy to “train” an immune classifier using supervised labeled training data.

Red – CD8 Intensity (THelper Cells)Green – CD4 intensity (TCyt Cells)

■ B cells based on threshold

■Actual B cells (based on manual investigation)

■Actual TC cells (based on manual CD8 investigation)

■ Actual TH cells (based on manual CD4 investigation

■ and ■ Non-immune cells

Red – CD3 Intensity (T Cells)Green – CD20 intensity (B Cells)Yellow – Segmented Cell Border

■B cells (based on gated CD20threshold)

■T cells (based on gated CD3threshold)

■Poorly segmented cells (double positive for both markers)

■Non-immune cells (double negative / below both thresholds)

DNA (Nuclear) and Na-K ATPase (Membrane) Classical Computer Vision Segmentation

Original Image Segmented ImageGrey – DAPI (DNA Intercalator)Green – Na-K ATPase

■Membrane

■ Cytoplasm

■ Nucleus

Poorly Segmented Cells B Cell / T Cell Scatterplot Thelper vs TCytotoxic Scatterplot

Original Image Manually Labeled Cells

Random Forest-based Immune Cell Classifier Computer vision-based segmentation was tuned to balance under- and over-segmented cells (utilizing thresholds for DAPI and membrane markers). For each detected cell, ~50 features were exported, corresponding to pixel intensities for sub-regions and histogram-based intensity subsets. These were used with manually labeled data to train a Random Forest classifier to properly identify immune cell subsets.

Trained Random Forest Cell Classifier Manually Labeled Cells

Relative Feature Importance for Classification Scatterplot of Classified Cells Validation: Confusion Matrix

Trained classifier achieves an accuracy of 0.87, f1-score (macro) of 0.84; Precision of: [0.86, 0.88, 0.74, 0.89], and recall of [0.88 0.88 0.74 0.88] when trained on labeled cells from representative tumor microarray cores. This approach is useful when segmentation is challenging; allowing for extraction of immune cell density data for a number of classes, for better evaluation of the tumor microenvironment.

Red – CD3 Intensity (T Cells) Green – CD20 intensity (B Cells)

Manual Classifier

B Cell 145 165

T Helper 102 126

T Cytotoxic 11 2

Non-Immune 175 141

Total Counts in this ROI

Classifier per-class confidence