Single-cell Transcriptomics
and Flow Cytometry Reveal
Disease-associated Fibroblast Subsets
in Rheumatoid Arthritis
Kamil Slowikowski
Soumya Raychaudhuri Laboratory
January 27, 2016
Strategy to define fibroblast subsets
Fresh tissue from joint replacement surgery
Isolated synovial fibroblasts
• Transcriptomics of sorted bulk populations
– PRO: robust gene expression signal
– CON: bias due to marker selection
• Transcriptomics of unsorted single cells
– PRO: unbiased, no marker selection
– CON: noisy gene expression signal
Study design
• Microarray gene expression profiling
– Donors: 3 OA and 3 RA
– Gates: 5-8 sorted subsets per donor
• Low-input RNA-seq
– Donors: 4 RA
– Gates: 5-8 sorted subsets per donor
• Single-cell RNA-seq
– Donors: 2 OA and 2 RA
– Cells: 384
DISCOVERY
VALIDATION
Sorting fibroblast subsets by surface markers
Figure 1. Gating strategy for MSCs with heterogeneous expression of surface proteins. (A) We obtained fresh synovial tissue from surgery and enriched the sample for MSCs by depleting hematopoietic and endothelial cells (see Methods for details). (B) We separated CD34– and CD34+ cells, then (C) selected PDPN+ cells, then (D) gated 7 subpopulations of cells. (E) For each subpopulation, we tested for a difference in the proportion of total fibroblasts between OA and RA (rank sum test) and found two populations that are significantly different.
[Figure 1, panels A to E. Workflow: fresh synovial tissue from surgery; remove hematopoietic cells, endothelial cells, and RBCs. Gating plots with axes CD34, CD146, PDPN, CDH11, and THY1, shown for CD34– and CD34+ cells. Panel E: proportion of total fibroblasts (0% to 80%) in OA (n = 26) and RA (n = 17); P = 0.141, 0.003 *, 0.033, 0.005 *, 0.045, 0.632, 0.241.]
Figure credit: Fumitaka Mizoguchi
Up to 8 subsets by surface protein phenotype.
How many subsets by gene expression?
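Panel E of Figure 1 compares the proportion of each gated subpopulation between OA and RA with a rank sum test. A minimal sketch of that kind of test follows, using only the Python standard library and the normal approximation (no tie or continuity correction). The proportions are made-up toy values, not data from this study, and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p_value). Suitable as an illustration only; small samples
    or many ties call for an exact or tie-corrected method.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Toy proportions of one subpopulation among total fibroblasts (made up).
oa_proportions = [0.10, 0.12, 0.15, 0.18, 0.20]
ra_proportions = [0.25, 0.30, 0.35, 0.40, 0.45]
z, p_value = rank_sum_test(oa_proportions, ra_proportions)
print(f"z = {z:.2f}, P = {p_value:.4f}")
```

In practice the test would be run once per gated subpopulation, which is why the figure reports seven P values.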
Sorting fibroblast subsets by surface markers
436 genes have significant variation (ANOVA 1% FDR)
across 7 gated subsets in microarray data and also in
independent RNA-seq data.
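The slide reports genes that pass a 1% FDR threshold in both the microarray and the RNA-seq data. The sketch below illustrates that selection step only: it assumes per-gene ANOVA P values are already available and that the FDR is controlled with the Benjamini-Hochberg procedure (the slide does not name the procedure). Gene names and P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_genes(p_by_gene, fdr=0.01):
    """Return the set of genes whose adjusted P value is at most `fdr`."""
    genes = list(p_by_gene)
    adjusted = benjamini_hochberg([p_by_gene[g] for g in genes])
    return {g for g, q in zip(genes, adjusted) if q <= fdr}

# Hypothetical per-gene ANOVA P values from two independent datasets.
microarray_p = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.5}
rnaseq_p = {"GENE_A": 0.002, "GENE_B": 0.2, "GENE_C": 0.001, "GENE_D": 0.9}

# Keep only genes that are significant in both datasets.
replicated = sorted(significant_genes(microarray_p) & significant_genes(rnaseq_p))
print(replicated)
```

Requiring significance in both datasets is stricter than either test alone, since a gene must replicate across platforms and donors.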
3 major populations of synovial fibroblasts
Pairwise correlation between samples and principal components analysis (PCA) both
suggest 3 major populations.
[Figure panels: PCA; pairwise correlation (Pearson's r).]
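As an illustration of the pairwise correlation step, here is a minimal standard-library sketch that computes Pearson's r between every pair of samples. The sample names and expression values are made up, and the sketch covers only the correlation matrix, not the PCA.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy expression profiles (four genes per sorted sample); values are made up.
samples = {
    "CD34-THY1-_donor1": [8.0, 2.0, 1.0, 5.0],
    "CD34-THY1-_donor2": [7.5, 2.5, 1.2, 5.5],
    "CD34+_donor1": [1.0, 9.0, 6.0, 2.0],
}

names = list(samples)
# Full symmetric correlation matrix, stored as a dict keyed by sample pair.
corr = {(s, t): pearson_r(samples[s], samples[t]) for s in names for t in names}

for s in names:
    print(s, " ".join(f"{corr[(s, t)]:+.2f}" for t in names))
```

Samples from the same population should correlate strongly with each other and weakly or negatively with other populations, which is the block structure the correlation panel shows.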
Gene expression suggests different functions
Columns: CD34–THY1–, CD34–THY1+, CD34+
MITOTIC_CELL_CYCLE_CHECKPOINT
MITOTIC_CELL_CYCLE
CELL_CYCLE_PROCESS
SPINDLE
COLLAGEN
REGULATION_OF_MITOSIS
CELL_CYCLE_PHASE
M_PHASE
MITOSIS
M_PHASE_OF_MITOTIC_CELL_CYCLE
GROWTH_FACTOR_ACTIVITY
INFLAMMATORY_RESPONSE
AMINE_METABOLIC_PROCESS
PURINE_RIBONUCLEOTIDE_METABOLIC_PROCESS
NUCLEOTIDE_SUGAR_METABOLIC_PROCESS
LOCOMOTORY_BEHAVIOR
PATTERN_BINDING
RESPONSE_TO_WOUNDING
BEHAVIOR
CYTOKINE_ACTIVITY
EXTRACELLULAR_MATRIX_STRUCTURAL_CONSTITUENT
CELL_SUBSTRATE_ADHESION
CELL_MATRIX_ADHESION
CHEMOKINE_RECEPTOR_BINDING
CHEMOKINE_ACTIVITY
STRUCTURAL_CONSTITUENT_OF_RIBOSOME
SKELETAL_DEVELOPMENT
EXTRACELLULAR_MATRIX
PROTEINACEOUS_EXTRACELLULAR_MATRIX
[Figure: Gene Set Enrichment Analysis with Gene Ontology gene sets (MSigDB C5). Scale: -log10(P), from 0 to 8.]
[Figure panel: Selected Genes]
Proportions of fibroblast populations are altered in RA
Populations: CD34–THY1–, CD34–THY1+, CD34+
OA (n=26) and RA (n=16) joints have different abundances of fibroblast populations.
Fibroblast abundances differ between swollen (n=7) and non-swollen (n=5) joints.
Strategy to define fibroblast subsets
Fresh tissue from joint replacement surgery
Isolated synovial fibroblasts
• Transcriptomics of sorted bulk populations
– PRO: robust gene expression signal
– CON: bias due to marker selection
• Transcriptomics of unsorted single cells
– PRO: unbiased, no marker selection
– CON: noisy gene expression signal
Single cell RNA-seq at Broad Technology Labs
Chad Nussbaum, PhD
Director of Broad Technology Labs
Nir Hacohen, PhD
Associate Professor of Medicine
Massachusetts General Hospital
1. Each cell labeled with surface protein markers: PDPN, CD34, THY1, CDH11
2. Cells sorted into 96-well plates
3. Illumina Smart-Seq2
4. NextSeq500, ~1M reads per cell
Single cell data quality
Donors: 2 OA, 2 RA
Cells: 384
[Figure: Single-cell RNA-seq quality. Axes: fragments assigned to transcripts (20% to 60%) and genes detected (2,500 to 12,500); points colored by donor (OA1, OA2, RA1, RA2). 337 cells (88%) retained; 47 cells (12%) excluded.]
We excluded 47 cells with fewer than 5,000 genes
detected at ≥1 transcript per million (TPM).
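The exclusion rule above can be sketched as a simple filter: count the genes at or above 1 TPM in each cell and keep cells with at least 5,000 such genes. The function names and the tiny example data are hypothetical, and the toy call lowers the gene threshold to 2 so that three genes are enough to show the behavior.

```python
def genes_detected(tpm_by_gene, min_tpm=1.0):
    """Count genes whose expression is at least `min_tpm` TPM."""
    return sum(1 for tpm in tpm_by_gene.values() if tpm >= min_tpm)

def filter_cells(cells, min_genes=5000, min_tpm=1.0):
    """Split cells into (kept, excluded) by the number of detected genes."""
    kept, excluded = {}, {}
    for cell_id, tpm_by_gene in cells.items():
        passes = genes_detected(tpm_by_gene, min_tpm) >= min_genes
        (kept if passes else excluded)[cell_id] = tpm_by_gene
    return kept, excluded

# Toy TPM values for three cells and three genes (made up).
cells = {
    "cell_1": {"g1": 12.0, "g2": 3.5, "g3": 0.0},
    "cell_2": {"g1": 0.4, "g2": 0.0, "g3": 7.1},
    "cell_3": {"g1": 1.0, "g2": 1.0, "g3": 1.0},
}

# min_genes=2 is only for this toy example; the slide uses 5,000.
kept, excluded = filter_cells(cells, min_genes=2)
print(sorted(kept), sorted(excluded))
```

With the thresholds from the slide, this rule retained 337 of 384 cells (88%) and excluded 47 (12%).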
We used SCDE to:
• compute robust expression estimates
• correct for batch effects
Reference: Kharchenko, Silberstein, & Scadden, Nat. Methods (2014).
Jean Fan
Bioinformatics and Integrative
Genomics PhD Candidate
Department of Biomedical Informatics
Genes with high mean and variance across cells
We measured protein fluorescence on each cell
[Figure: protein fluorescence for each cell, one panel per donor: OA1 (n=83), OA2 (n=86), RA1 (n=87), RA2 (n=81). Axes: THY1 and CDH11; rows: CD34– and CD34+. Cells are labeled CD34–THY1–, CD34–THY1+, or CD34+.]
We measured CD34, THY1, and CDH11 protein markers on each cell.
Genes with high mean and variance across cells
Classification of each single cell by gene expression
1. Select ~900 genes that distinguish 3 populations in the bulk RNA-seq data.
2. Build a Linear Discriminant Analysis (LDA) model with these genes.
3. Input each single cell expression profile into the model.
4. Get the posterior probability that each cell belongs to each of the 3 populations.
[Figure: Bulk RNA-seq, LDA Model, Single Cell. Example cell labels: "Unclassified"; "95% probability CD34-THY1+".]
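The four classification steps can be sketched with a small, standard-library-only model. This is a simplified stand-in rather than the model used in the talk: it assumes a shared diagonal covariance (one pooled variance per gene) instead of a full LDA covariance matrix, and it treats the "95% probability" label as a 0.95 cutoff below which a cell is left unclassified. All function names and expression values are hypothetical.

```python
import math

def fit_diagonal_lda(profiles, labels):
    """Fit class means, pooled per-gene variances, and class priors."""
    classes = sorted(set(labels))
    n_genes = len(profiles[0])
    means, priors = {}, {}
    for c in classes:
        rows = [p for p, l in zip(profiles, labels) if l == c]
        means[c] = [sum(r[g] for r in rows) / len(rows) for g in range(n_genes)]
        priors[c] = len(rows) / len(profiles)
    # Pooled within-class variance per gene, shared by all classes.
    dof = len(profiles) - len(classes)
    variances = [
        sum((p[g] - means[l][g]) ** 2 for p, l in zip(profiles, labels)) / dof
        for g in range(n_genes)
    ]
    return means, variances, priors

def posterior(model, x):
    """Posterior probability of each class for one expression profile."""
    means, variances, priors = model
    log_scores = {
        c: math.log(priors[c])
        - 0.5 * sum((xg - mg) ** 2 / v for xg, mg, v in zip(x, means[c], variances))
        for c in means
    }
    top = max(log_scores.values())  # subtract the max for numerical stability
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def classify(model, x, threshold=0.95):
    """Return (label, posteriors); label is 'Unclassified' below threshold."""
    post = posterior(model, x)
    best = max(post, key=post.get)
    return (best if post[best] >= threshold else "Unclassified"), post

# Toy bulk profiles: two genes, two samples per population (made up).
bulk_profiles = [[9.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 8.0], [5.0, 5.0], [6.0, 6.0]]
bulk_labels = ["CD34-THY1-", "CD34-THY1-", "CD34-THY1+", "CD34-THY1+", "CD34+", "CD34+"]
model = fit_diagonal_lda(bulk_profiles, bulk_labels)

clear_label, clear_post = classify(model, [8.7, 1.2])      # close to one class mean
unclear_label, unclear_post = classify(model, [7.0, 3.5])  # between two class means
print(clear_label, unclear_label)
```

The model is trained on bulk profiles and applied to single cells, so a cell is assigned to a population only when its profile clearly resembles one bulk signature; ambiguous cells stay unclassified.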
Unbiased scRNA-seq validates discovery of 3 subsets
Fibroblast subsets in microanatomical structures
CD34-THY1-
CD34+
CD34-THY1+
Lining
Sublining
Conclusions
1. Synovial fibroblasts have distinct cellular subsets:
– Protein surface marker expression by FACS
– mRNA expression by microarrays and RNA-seq
– Microanatomical localization by histology
2. Proportions of subsets differ between OA and RA
3. scRNA-seq on unbiased samples of fibroblasts confirms our findings:
– Single cells are classified with high probability into 3 defined classes
– Number of classified cells matches expectation by FACS gating
This work was a collaboration with the Brenner Lab
Michael B. Brenner, MD
Theodore Bevier Bayles Professor of Medicine
Harvard Medical School
Chief, Rheumatology, Immunology and Allergy
Brigham and Women's Hospital
Fumitaka Mizoguchi, MD, PhD
Assistant Professor, Department of Rheumatology
Tokyo Medical and Dental University
Acknowledgements
Brigham and Women’s Hospital Brenner Lab
Michael Brenner
Fumitaka Mizoguchi
Erika Noss
Sook Kyung Chang
Deepak Rao
Kevin Wei
Hung Nguyen
Patrick Brennan
Nigrovic Lab
Peter Nigrovic
Sarah Ameri
Allyn Morris
Raychaudhuri Lab
Soumya Raychaudhuri
Department of Orthopedic Surgery
John Wright
Barry Simmons
Scott Martin
Philip Blazar
Brandon Earp
Broad Institute
Nir Hacohen
Chad Nussbaum
Boston Children’s Hospital Immunology
Lauren Henderson
Harvard Medical School Department of Biomedical Informatics
Jean Fan