super-enhancer landscapes of ovarian cancer reveal novel ... · eaton ml, johnston b, piotrowska k,...
TRANSCRIPT
Super-enhancer landscapes of ovarian cancer reveal novel epigenomic subtypes and targetsEaton ML, Johnston B, Piotrowska K, Collins C, Lopez J, Knutson S, di Tomaso E, Carulli J, Olson E Syros Pharmaceuticals Inc, 620 Memorial Drive, Cambridge, MA, 02139, USA
Abstract
Background: There is a critical unmet need for targeted therapies in ovarian cancer, especially high-grade serous ovarian cancer (HGSOC). We used enhancer mapping combined with transcriptomics and mutations to identify novel ovarian cancer subtypes and associated targets.Material and methods: With ChIP-seq for H3K27Ac, we profiled the enhancer landscape of 101 primary tumor samples from 7 ovarian cancer subtypes with a focus on HGSOC, 29 cell line models, 3 PDXs, and 8 non-can-cerous samples of ovarian and fallopian tube tissue. We also profiled many of these samples through RNA-seq and a focused NGS-based mutational panel. We used matrix factorization methods to reveal novel sub-groups of ovarian cancer patients and predicted their associated transcriptional circuitry.Results: Through a computational deconvolution of enhancer maps, we identified novel enhancer defined pa-tient subtypes of ovarian cancer. While some known subtypes, such as granulosa cell, associated uniquely with their own enhancer profile, the majority of the primary tumor samples fell into 4 clusters that did not correlate with histological subtype or with known high-frequency ovarian cancer mutations. Each cluster was associated with its own unique super-enhancer (SE) signature, implying that each is driven by a unique transcriptional cir-cuitry. There was a striking cluster-specific patterning of many known ovarian cancer related genes such as FOXM1, CD47, and MYC, and genes linked to pathways known to be dysregulated in ovarian cancer, including an SE linked to the RB pathway gene Cyclin E1. Furthermore, many additional cluster-specific SEs were dis-covered representing novel potential therapeutic targets. Interestingly, while we could assign ovarian cancer cell line models to these novel subtypes, many cell lines’ enhancer landscapes appeared to be distinct from those of primary tumor cells.Conclusions: Together, our results comprise the largest ovarian cancer enhancer mapping effort to date, and demonstrate how an integrated analysis of enhancers, transcriptomes, and genotypes together can yield tran-scriptional circuitry that reinforces the role of known pathways associated with ovarian cancer progression and treatment, can be used to select cell models that best recapitulate the enhancer landscape of primary tumors, and can be mined to identify novel targets and biomarkers. Cluster-specific SEs at MYC and CCNE1 suggest that some tumors have increased transcriptional dependency on these loci as well as the components of the corresponding transcriptional machinery. The role of one such component, CDK7, is currently being evaluated in ovarian cancer patients with SY-1365, a first-in-class selective CDK7 inhibitor in Phase 1 clinical development (NCT 03134638).
ZBTB16ZKSCAN5LBX2 ZNF791 CUX1KLF2 RREB1
ZEB2 ZNF516NFIC STAT5B ZNF436CASZ1FOS
JUN
KLF6
ZNF558 TFAP2A
FOSL2
ZNF217 ZFP36L1SOX9SOX17MAFF NR2E3WT1 MEIS1 ZNF503 ETV6BHLHE41
PBX1JUND ZNF628MYPOP
PAX8 ZNF687ELF3
CTCFLESR2 THRA
NR1D1 RARA
ZNF703 RFX2SMAD3
ZNF764ZNF48
TCFL5FOXP1 HIC1 ZBTB42 HIF1AZFP36L2 ZFP36L1HIC1
ZNF16NR2F6 ZBTB38
ZNF646HES4ZNF707 ZNF7HSF1 ZNF598 PRRX2GATA4SKICEBPB
PEG3ZMIZ1ELMSAN1EHF
Ovarian Cancer subtype specific
Shared between normal and ovarian cancer sub-
Normal tissue specific
Transcription Factor Non-Transcription Factor GeneName
Method - Call SEs, NNMF, & cluster
0 2000 4000
050
000
2500
0035
0000
Tota
l H3K
27A
c O
ccup
ancy
(rpm
)
6000 8000
Global Distribution of H3K27Ac signal
2.5
rpm
/bp
H3K27Ac ChIP-seqProfile at Gene Level
Gene
Super-enhancer
H3K27Ac ChIP-seq ProfileGenome-wide
Map Enhancer Regions Identify Coordinated Patterns of Enhancer Activity
Using Non-Negative Matrix Factoriza-tion (NNMF, Brunet’s algorithm), de-compose transformed SE score matrix into 6 factors. The granulosa cell and yolk sack subtypes were shown to rep-resent their own subgroups of sam-ples, and so were held out along with the normals for de novo clustering.
SEt ≈ W Hx300
91 6
6
NNMF is well suited to the task since any enhancer activity in the cell popu-lation can only add to enhancer signal. The choice to use 6 factors was driven by a shoulder in the variance ex-plained by the factors in genome-wide SEs, as well as being a reasonable number of factors given the number of samples.
Find samples withsimilar
enhancer-patternutilization
K-medoids clustering of the column scaled NNMF coeffi-cient matrix (H) with euclide-an distance and k=4 reveals clear clusters of enhancer pattern utilization.
Using Pearson correlation, we link SEs to genes by searching in cis around SEs for gene expression that cor-relates with SE strength across samples. P-values must pass genome-wide mul-tiple hypothesis testing cor-rection.
Link SEs to genes
2.5
rpm
/bp
Gene
Super-enhancer
1500
00
H3K27Ac ChIP-seq reveals a small number of “super-en-hancer” regions con-taining orders of magnitude higher binding occupancy and spanning DNA domains
Across 109 primary OvCa samples, select the 300 SEs with the highest across-sample variability
Enhancers ranked by H3K27Ac signal
We profiled 146 SE maps across a range of sample types:• Primary: • High-grade serous (HGS): 40 • Low-grade serious (LGS): 16 • Unknown serous (SUn): 13 • Mucinous (Mu): 8 • Endometrioid (En): 7 • Yolk Sac (YS): 6 • Granulosa cell (GC): 4 • Unknown (Un): 4 • Clear cell (CC): 3• Cell lines: 29• PDX: 3• Normal (FT/Ov): 8• Normal-like cell lines: 5
TTot
al H
3K27
Ac
Sig
nal (rp
m)
Enhancers ranked by H3K27Ac Signal
Examples of SE-Proximal Genes
SEs from one of the HSOC tumors
Mapping H3K27Ac in primary ovarian cancer and normal tissue reveals tumor-specific Super-Enhancers (SEs)
Abstract
HOXBMIRLET7
HOXA
FOSB
Novel ovarian cancer patient subgroup identification through NNMF
His
tolo
gica
l Sub
type
0
0.2
0.4
0.6
0.8
1
Con
sens
us d
ist
Differential enhancer activity between tumor and normal
Differential super-enhancer landscapes across samples identifies normal- and tumor-specific transcriptional circuitry
Normal (mean transformed ChIP-seq signal)
Tum
or (m
ean
trans
form
ed C
hIP
-seq
sig
nal)
Top tumor GO categories:• Chromatin organization• Chromatin modifying en-zymes
Top normal GO categories:• Cell junction• Focal adhesion
Cluster 1 Cluster 2 Cluster 3 Cluster 4 GC YS Norm
101 Primary Tumor SamplesNormal
Primary tumor samples are characterized by SE gain and loss compared to normal tissue.
Gained SE
Unchanged
Lost SE
% E
nhan
cers
0
20
40
60
80
100
• Average gained SEs compared to normal: 226
• Average lost SEs: 174
EHF
Tumor
chr11
Normal
Tumor
chr21
Normal
RUNX1
AB
A
B
Tumor
chr11
Normal
C
CReceptor involved in cell motility
Tumor
chr4
Normal
Actin-binding & cell migration
DD
Comparing tumor vs normal with a generalized linear model, we find 348 regions of gained SE activity, and 211 with diminished SE activity. These SEs are linked to genes that are enriched for particular pathways, the gained SEs tend to be linked to genes enriched for the GO:BP categories chromatin organization and chromatin modifying enzymes. The diminished SEs tend to be linked to genes enriched for cell junction and focal adhesion categories.
Significant SE-SE mutual information
Clustering patient SE maps by NNMF consensus clustering reveals 4 patient subgroups
Holding out the GC and YS subtypes, we clustered the primary samples into 4 subgroups (left, top). We chose 4 as it yielded the desired complexity while maximizing the cophenetic correlation of the resulting clusters (right). These epigenome-defined subgroups did not segregate by histological subtype.
CCUnEnMuLGSSUnHGS
HistologicalSubtype
Number of clusters
SE
Stre
ngth
C1
C2
C3
C4
GC YS
Nor
m
0
5
10
15
C1
C2
C3
C4
GC YS
Nor
m
0
2
4
6
8
SE
Stre
ngth
C1
C2
C3
C4
GC YS
Nor
m
0.0
0.5
1.0
1.5
2.0
C1
C2
C3
C4
GC YS
Nor
m
0
1
2
3
4
5
6
SE
Stre
ngth
Clustering leads to subgroup-specific SE-tagged genes.The clustered samples can be queried to identify SEs that distinguish subsets of samples. The SEs (rows) are shown above, ordered by patient subgroup (columns). SEs are clustered by coordinated activity across patients (hierarchical tree, left). Three example subgroup-specific loci are shown: the HOXB locus (top), the HOXD locus (middle), and GATA4 (bottom), with lines indicating their position in the heatmap above.
SE
Stre
ngth
CCNE1 SE
GATA4 SE
HOXD SE
HOXB SE
Copyright 2018 Syros Pharmaceuticals, Inc
MSIGDB Hallmark Pathways upregulated in each subgroup
C1
C2
C3
C4
GC
YS
MITOTIC_SPINDLEMTORC1_SIGNALINGMYC_TARGETS_V2GLYCOLYSISG2M_CHECKPOINTE2F_TARGETSESTROGEN_RESPONSE_EARLYESTROGEN_RESPONSE_LATEMYC_TARGETS_V1OXIDATIVE_PHOSPHORYLATIONUV_RESPONSE_DNINTERFERON_GAMMA_RESPONSEINTERFERON_ALPHA_RESPONSEALLOGRAFT_REJECTIONANGIOGENESISINFLAMMATORY_RESPONSEIL2_STAT5_SIGNALINGP53_PATHWAYAPICAL_JUNCTIONTNFA_SIGNALING_VIA_NFKBKRAS_SIGNALING_UP
01234
-log10 (p-value)
C1
C2
C3
C4
GC YS
Nor
m
9
10
11
12
13
14
15
RN
A−se
q (v
st)
C1
C2
C3
C4
GC YS
Nor
m
0
1
2
3
4
CD47 SECD47 mRNA
Each patient subgroup exhibits its own representative biology.
Each cluster of primary tumor samples is upregulated for unique pathways, as shown by finding GSEA enrichments for the MSIGDB Hallmark set of meta-pathways (heatmap on right, where each column shows the enrichment of the given patient cluster for a hallmark gene set). Cluster 1 exhibits higher expression of immune pathways. As an example, it has the highest expression of and largest super-enhancer linked to the CD47 transcript (bottom right), involved in immune evasion. Cluster 2 has the strongest SE at the HOXB locus, while having the weakest at HOXD, implying a distinct developmental state. Cluster 3 shows upregulated pathways involved in MTOR signaling, MYC targets, and glycolysis. Cluster 4 is characterized by increased expression in inflammatory and cytokine response pathways, but is the most “normal-like” of the clusters, with the highest SE map correlation with normal samples, as well as lower expression of FOXM1, and pre-RC components such as ORC6 (right). Interestingly, cluster membership did not correlate with known ovarian somatic mutations, implying that epigenomic subtype is independent of mutational burden. C
1C
2C
3C
4G
C YSN
orm
6
8
10
12
RN
A−se
q (v
st)
FOXM1 mRNA
SE
Stre
ngth
C1
C2
C3
C4
GC YS
Nor
m
5
6
7
8
9
10
RN
A−se
q (v
st)
ORC6 mRNA
A mutual information network of subgroup- and normal-specific SEs was constructed to demonstrate the complex circuitry underlying ovarian cancer. Rectangles represent SEs linked to TF transcripts. Circles represent SEs linked to non-TF transcripts. Edges represent mutual information relationships stronger than a stringent cutoff based on a shuffled empirical background. All SEs depicted here were identified as specific to either a patient subgroup or normal tissue. The SEs present in both patient tumor subgroups and normal are depicted in gray, while those SEs present only in tumor samples are shown in red.
ConclusionsWe have assembled the largest H3K27Ac ChIP-seq database in ovarian cancer and related normal tissue to date, with the express purpose of profiling active enhancer activity. We complemented our ChIP-seq data with RNA-seq and mutational profiling. We were able to show large differences between the enhancer landscapes of tumor and normal in ovarian cancer. We were also able to identify 6 total patient subgroups, comprising granulosa cell, yolk sac, and 4 novel subgroups with samples from clear cell, high grade serous, low grade serous, mucinous, and endometrioid ovarian cancer. Each of these subgroups is driven by its own unique transcriptional circuitry, with associated transcription factors and potential druggable targets. Interestingly, while we can predict subgroup membership for cell lines, we find that many cell lines have a transcriptional signature distinct from patients (volcano plot below, showing thousands of genes significantly differentially expressed in cell lines). Many genes involved in ovarian cancer etiology are linked to cluster-specific SEs, such as CCNE1, MYC, and CD47 (above and left). Many other potential targets are revealed by our analysis.
The poster submission deadline is midnight Dublin time, which is 6PM here... David The poster submission deadline is midnight Dublin time, which is 6PM here... David The poster submission deadline is midnight Dublin time, which is 6PM here... David
p < 0.002 (ANOVA)
C1
C2
C3
C4
GC YS
Nor
m
0.0
0.5
1.0
1.5
2.0
2.5
3.0MYC SE 1
C1
C2
C3
C4
GC YS
Nor
m
0
2
4
6
8
10MYC SE 2
p < 0.032 (ANOVA) p < 0.002 (ANOVA)
708
SE
s w
ith c
lust
er-s
peci
fic a
ctiv
ityN
orm
aliz
ed S
E S
igna
l
High-ThroughputSequencing
20x106 Reads
Super-EnhancerClassification
Median 737 SEs
ChromatinFragmentation
101OvCa
Donors29 OvCaCell Lines
8 HealthyDonors
ChromatinImmunoprecipitation Peak calling
Median 18KH3K27Ac peaks
Human OvCa
SEs from one of the normal samplesExamples of SE-Proximal Genes
FHL2MALAT1
MIRLET7
KLF9
FOSB
IRF2BPL
MALAT1
KLF9FHL2
IRF2BPL
Cell lines are transcriptionally distinct from patients
log2 Fold Change (Patient / Cell Line)
p-va
lue
(-lo
g 10(p
))
Interestingly, we find that ovarian cell lines as a group are transcriptionally di-vergent from patients. As an example, we show an enhancer volcano plot on the left of patients vs. cell lines, revealing 1792 potential SE loci that diverge from patients with an FDR of at least 0.01 (~48% of loci).
Within the group of cell lines, we find that some cell lines are more represen-tative than others of the general patient enhancer landscape. Importantly, we can predict a subgroup (see next panel) for each cell line.
Some cluster-specific SEs suggest increased tumor dependency on transcriptional machinery
Cluster-specific SEs imply that the tumors falling into those clusters have an increased transcriptional dependency on active transcription initiation at these loci, as well as the components of the corresponding transcriptional machinery. Three examples shown here (right) demonstrate cluster-specific SEs at CCNE1, and two cluster-specific SEs at MYC.