Original Article

Genomic characteristics of triple-negative breast cancer nominate molecular subtypes that predict chemotherapy response

Jihyun Kim1†, Doyeong Yu1†, Youngmee Kwon2, Keun Seok Lee2, Sung Hoon Sim2,3, Sun-Young Kong3,4, Eun Sook Lee2,4, In Hae Park2,3*, and Charny Park1*

1 Bioinformatics Analysis Team, Research Institute, National Cancer Center, 323 Ilsan-ro, Ilsandong-gu, Goyang 10408, Republic of Korea

2 Center for Breast Cancer Hospital, National Cancer Center, 323 Ilsan-ro, Ilsandong-gu, Goyang 10408, Republic of Korea

3 Translational Cancer Research Branch, Division of Translational Science, National Cancer Center, 323 Ilsan-ro, Ilsandong-gu, Goyang 10408, Republic of Korea

4 Graduate School for Cancer Science and Policy, National Cancer Center, 323 Ilsan-ro, Ilsandong-gu, Goyang 10408, Republic of Korea

† These authors contributed equally to this work as first authors.

* To whom correspondence should be addressed.

Running Title: TNBC Subtypes and Chemotherapy Response

Keywords: triple-negative breast cancer (TNBC), consensus molecular subtype,

drug-response prediction, cisplatin resistance, prognosis

Corresponding authors:

Charny Park, 323 Ilsanro Ilsandonggu Goyangsi Gyeonggido 10408 Korea. Phone: +82-031-920-2581; E-mail: [email protected]; and In Hae Park; E-mail: [email protected]

Author Manuscript Published OnlineFirst on November 8, 2019; DOI: 10.1158/1541-7786.MCR-19-0453. Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Downloaded from mcr.aacrjournals.org on November 24, 2020. © 2019 American Association for Cancer Research.

Disclosure of Potential Conflicts of Interest:

The authors declare no conflicts of interest.

Grant Support:

This research was supported by the National Cancer Center Grant (NCC-1611150,

NCC-1910100); the National Research Foundation of Korea (NRF) (NRF-

2018R1C1B6001475).

Authors’ contributions:

Conception, design and study supervision: IHP, CP

Acquisition of patient data (acquired and managed patients): IHP, YMK, KSL, SHS,

SYG, ESL

Computational analysis: JK, DYY, CP

Writing, review and/or revision of the manuscript: JK, IHP, CP

Word count: 5858

Total number of figures and tables: 5

Abstract

The heterogeneity of triple-negative breast cancer (TNBC) makes it difficult to select suitable treatment and leads to poor outcomes. This study aimed to define consensus molecular subtypes (CMSs) of TNBC and thereby elucidate their genomic characteristics and relevant therapies. We integrated the expression profiles of 957 TNBC samples from published datasets. We identified the genomic characteristics of each subtype by exploring pathway activity, microenvironment, and clinical relevance. Additionally, drug response (DR) scores (n = 181) were computationally investigated using chemical perturbation gene signatures and validated in our own TNBC patients (n = 38) who received chemotherapy and in organoid biobank data (n = 64). Subsequently, cooperative functions with drugs were also explored. We classified TNBC into four CMSs: stem-like, mesenchymal-like, immunomodulatory, and luminal-androgen receptor (AR). The CMSs showed distinct tumor-associated microenvironments and pathway activities. Furthermore, by comparing metastatic with primary tumors, we discovered metastasis-promoting genes such as secreted phosphoprotein 1. Computational DR scores associated with CMS revealed drug candidates (n = 18), and the scores were successfully evaluated against the cisplatin response of both patients and organoids. Our CMSs recapitulated in-depth functional and cellular heterogeneity encompassing primary and metastatic TNBC. We propose DR scores, which we validated, for predicting CMS-specific drug responses. Finally, our approach systematically proposes a relevant therapeutic prediction model as well as prognostic markers for TNBC.

Implications: We delineated the genomic characteristics and computational drug-response predictions for TNBC CMSs from gene expression profiles. Our systematic approach provides diagnostic markers for subtype and metastasis, verified by machine learning, and novel therapeutic candidates for TNBC patients.

Introduction

Triple-negative breast cancer (TNBC) is known to have a poor prognosis and a tendency toward early metastasis (1). Although TNBC shows a higher response rate to standard chemotherapy than hormone receptor-positive breast cancer, its prognosis is poor because multi-drug resistance develops early (2). Recently, targeted therapy based on genomic characteristics has been introduced with promising results. For example, DNA-damaging agents such as platinum agents or poly(ADP-ribose) polymerase-1 inhibitors showed excellent efficacy in BRCA1/2 mutation carriers among heterogeneous TNBC groups (3). In addition, the immune-modulating anti-programmed death-ligand 1 (PD-L1) antibody atezolizumab demonstrated outstanding results in metastatic TNBC patients whose tumors had higher PD-L1 expression (4). To find the appropriate drug for each TNBC patient, a thorough understanding of the heterogeneity of TNBC and robust subtype identification are prerequisites.

Previous TNBC subtype identification studies have proposed various classifications comprising three to six subtypes. Lehmann’s classification was advanced to specify this heterogeneity and has been clinically evaluated in several independent cohorts (5–7). Further studies proposed three or four subtypes based on next-generation sequencing using a more robust algorithm (8,9). However, owing to restricted sample sizes (n < 200) originating from a single platform, or insufficient algorithm optimization for deriving a diagnostic gene set, a robust TNBC subtype classification has not been established in the absence of multiple validations. Additionally, previous classifications were restricted to primary tumors, even though poor outcomes are due to metastatic tumor development.

As an additional important factor of tumor heterogeneity, the tumor-associated microenvironment (TME) confers genetic diversity to tumor bulk cells. Cellular characteristics are determined by tumorigenic origin (10). As shown on single-cell platforms, the TME cell population cooperates with the chemotherapy response (11,12). Investigation of the cell abundance underlying this diversity is not restricted to single-cell platforms: the abundance of distinct cell types can be estimated from bulk tumor gene expression profiles using cell deconvolution technologies (13,14). Expanded cellular investigation in terms of the TME could reveal the origin of functional heterogeneity and specify therapeutic targets.

Due to the complex interactions between heterogeneous tumor cells and the TME, there have been few successes in attempts to search computationally for novel drug targets in TNBC. To reveal the mode of action and to identify novel drug-target relationships, pharmacogenomics approaches utilizing gene expression or mutation profiles have been suggested (15,16). However, most computational methods were performed using genetic profiles of cell lines, and validation against actual patients’ responses was insufficient. A systematic approach that interconnects molecular subtype diagnosis with therapeutic strategy is required to find effective treatment.

In the current study, utilizing gene expression profiles, we established several classification and prediction models. Machine-learning approaches based on related gene expression signatures were used for two classification tasks: consensus molecular subtype (CMS) by a support vector machine (SVM) and primary versus metastasis by a random-forest model. In-depth investigation using known gene signatures recapitulated the genomic characteristics and clinical relevance of CMS and metastasis. We also estimated the abundance of TME and mammary gland cells by TNBC bulk cell deconvolution and revealed the cellular heterogeneity of the CMSs. In addition, we established a drug response prediction model to compute drug response (DR) scores based on chemical perturbation gene signatures. We identified subtype-related drug responses and pathway interventions. The drug response predictions were validated against actual patients’ cisplatin responses with matched gene expression profiles, as well as against TNBC organoids (17).

Materials and Methods

CMS classification and functional investigation using gene expression profile

We collected 1,525 gene expression profiles from 14 high-throughput gene expression datasets related to metastasis and triple-negative breast cancer from the Gene Expression Omnibus (GEO) (Supplementary Table S1). To eliminate the batch effect across datasets, meta-analysis of the gene expression profiles was performed using ComBat (18). Next, we confirmed the removal of the batch effect using a principal component analysis (PCA) plot (Supplementary Figure S1A). We removed samples considered positive for estrogen receptor (ER+), progesterone receptor (PR+), or human epidermal growth factor receptor 2 (HER2+), following the filtering strategy of the Lehmann study based on bimodal distribution (5). Additionally, immunohistochemistry (IHC)-based diagnosis profiles for ER+, PR+, and HER2+ were considered for judicious sample elimination. As an exception, we treated metastatic cells isolated from primary breast pleural cells as metastasis samples (Supplementary Table S1). Finally, one aggregated expression profile was generated including 601 primary and 356 metastatic samples.
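The expression-based receptor filter described above can be illustrated with a small sketch. It is not the authors' code: it assumes a one-dimensional two-means split as a simple stand-in for whatever bimodal model the Lehmann strategy uses, and the function names, the toy expression values, and the use of ESR1, PGR, and ERBB2 as the receptor genes are assumptions made for illustration only.

```python
from statistics import mean


def bimodal_threshold(values, max_iter=100):
    """Split 1-D expression values into a low and a high mode (two-means).

    Returns the midpoint between the two mode centers, used here as a
    hypothetical cutoff for calling a receptor gene 'positive'.
    """
    lo_center, hi_center = min(values), max(values)
    for _ in range(max_iter):
        cutoff = (lo_center + hi_center) / 2
        low = [v for v in values if v <= cutoff]
        high = [v for v in values if v > cutoff]
        if not low or not high:
            break  # only one mode present; keep the current centers
        new_lo, new_hi = mean(low), mean(high)
        if new_lo == lo_center and new_hi == hi_center:
            break  # converged
        lo_center, hi_center = new_lo, new_hi
    return (lo_center + hi_center) / 2


def keep_receptor_negative(samples, genes=("ESR1", "PGR", "ERBB2")):
    """Keep samples that fall in the low mode for every receptor gene."""
    cutoffs = {g: bimodal_threshold([s[g] for s in samples.values()]) for g in genes}
    return [sid for sid, s in samples.items() if all(s[g] <= cutoffs[g] for g in genes)]


# Toy log-expression values, made up for illustration.
toy_samples = {
    "S1": {"ESR1": 2.1, "PGR": 1.9, "ERBB2": 3.0},
    "S2": {"ESR1": 9.8, "PGR": 8.7, "ERBB2": 3.2},   # ER/PR high
    "S3": {"ESR1": 2.4, "PGR": 2.2, "ERBB2": 11.5},  # HER2 high
    "S4": {"ESR1": 1.8, "PGR": 2.0, "ERBB2": 2.9},
}
triple_negative = keep_receptor_negative(toy_samples)
print(triple_negative)
```

In this toy input, only the samples that sit in the low mode for all three genes are retained. A real analysis would fit the cutoff on the full cohort rather than on four samples.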

We clustered samples using the non-negative matrix factorization method, and the number of clusters was verified by the silhouette width for each subtype (Supplementary Figure S1B). Next, we established a CMS classifier based on a support vector machine (SVM), using the expression profile of 1,227 genes (information gain ≥ 0.1; Supplementary Table S2) as a training dataset. The classification performance of the SVM was investigated by 5-fold cross-validation with 1,000 iterations. Accuracy, sensitivity, and specificity were calculated for each subtype. We compared our TNBC CMS with the Lehmann and Burstein expression profiles. Additionally, differences in 10-year metastasis-free survival (MFS) and overall survival (OS) among CMSs were investigated using Cox regression.
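As a minimal sketch of how per-subtype accuracy, sensitivity, and specificity can be derived from classifier predictions, the following computes one-vs-rest metrics for each label. The function name and the toy labels are hypothetical, and the cross-validation loop that would produce the predictions is omitted.

```python
def per_subtype_metrics(true_labels, predicted_labels):
    """One-vs-rest accuracy, sensitivity, and specificity for each subtype."""
    metrics = {}
    for subtype in sorted(set(true_labels)):
        tp = fp = tn = fn = 0
        for truth, pred in zip(true_labels, predicted_labels):
            if truth == subtype and pred == subtype:
                tp += 1
            elif truth == subtype:
                fn += 1
            elif pred == subtype:
                fp += 1
            else:
                tn += 1
        metrics[subtype] = {
            "accuracy": (tp + tn) / (tp + fp + tn + fn),
            "sensitivity": tp / (tp + fn),
            # Undefined when every sample belongs to this subtype.
            "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        }
    return metrics


# Toy held-out fold, made up for illustration.
truth = ["SL", "SL", "MSL", "MSL", "IM", "IM", "LAR", "LAR"]
predicted = ["SL", "SL", "MSL", "IM", "IM", "IM", "LAR", "SL"]
metrics = per_subtype_metrics(truth, predicted)
for subtype, values in metrics.items():
    print(subtype, values)
```

Averaging these values over repeated cross-validation folds would give per-subtype summaries of the kind reported in the text.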

Matched clinical information is based on GEO. Sample counts after filtering out ER,

PR, and HER2 positive samples are listed under primary and metastasis. A part of the

samples include long-distance metastasis described in tissue type. In total, 601

primary and 356 metastasis samples were chosen for functional analysis.

Pathway activity scores were derived from gene expression profiles using Z-score

transformation after gene set variation analysis (GSVA) enrichment scoring by

referring to Hallmark gene sets from MSigDB or additional gene signatures (19–21).

Key pathways inducing distinct CMS characteristics were chosen by limma test (P <

1e-05). To verify functionality for each CMS, we collected seven representative gene

signatures to extract subtype characteristics, namely epithelial-to-mesenchymal

transition (EMT), stromal, immune, TME, stemness, chromosomal instability (CIN),

and hormone. Next, enrichment scores were estimated using GSVA. The EMT score

was defined from normalized average expression values using a previously studied

EMT gene set (22). The abundance of stromal and immune cells was estimated by the

ESTIMATE tool (23). To perform cross-validation of immune infiltration, the TME

score was calculated by xCell (14). The stemness score was determined from The

Cancer Genome Atlas (TCGA) pan-cancer stemness index (24). Additionally, we

referred to embryonic stem (ES) cell-like gene expression signatures of Nanog

homeobox (NANOG), octamer-binding transcription factor 4 (OCT4), SRY-related

HMG-box 2 (SOX2), and MYC, which are known to be overexpressed in poorly

differentiated tumors (25). The CIN score was derived from the CIN70 gene

expression signature (26). Hormone response was summarized from the average score

of the ER and androgen receptor (AR) response pathway scores. We compared the

seven representative gene signature scores of our CMS with the Lehmann and

Burstein classification. The difference in scores among subtypes was tested by the

Mann-Whitney U test.
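The Z-score transformation of per-sample signature scores can be sketched as follows. This is a simplified stand-in, not GSVA: it assumes the raw score is the plain average expression of the signature genes, and the gene names, sample names, and values are hypothetical.

```python
from statistics import mean, pstdev


def signature_scores(expression, gene_set):
    """Average expression of the signature genes per sample (stand-in for GSVA)."""
    return {
        sample: mean(profile[g] for g in gene_set if g in profile)
        for sample, profile in expression.items()
    }


def z_transform(scores):
    """Center and scale the scores across samples."""
    mu = mean(scores.values())
    sd = pstdev(scores.values())
    return {s: (v - mu) / sd if sd > 0 else 0.0 for s, v in scores.items()}


# Toy expression matrix: sample -> gene -> value (made up for illustration).
toy_expression = {
    "A": {"G1": 1.0, "G2": 3.0, "G3": 5.0},
    "B": {"G1": 2.0, "G2": 4.0, "G3": 0.0},
    "C": {"G1": 3.0, "G2": 5.0, "G3": 1.0},
}
raw = signature_scores(toy_expression, ["G1", "G2"])
z_scores = z_transform(raw)
print(z_scores)
```

Because the transformed scores have mean zero and unit variance across samples, scores from different signatures become comparable on one scale.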

To reveal the TME heterogeneity, we estimated the abundance of non-immune stromal and immune cells using deconvolution methods. The abundance of mammary gland cells was computed with MCPcounter, using cell marker gene sets revealed by a single-cell study (27,28). Additionally, the proportion of immune cells was estimated using CIBERSORT (13).

To evaluate the association between BRCAness and the SL type, we classified TCGA BRCA TNBC samples into our CMSs and called germline mutations from the TCGA breast cancer TNBC samples (total n = 1,101; TNBC n = 120), following the GATK v3.3 germline short variant discovery pipeline (29). First, mutations were called using HaplotypeCaller, and low-quality variants were filtered out using VariantFiltration. To extract pathogenic and functional germline mutations, we chose BRCA1/2 variants registered as ‘Pathogenic’ or ‘Likely pathogenic’ in ClinVar, as well as INDEL or truncating mutations (30).
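The final selection rule can be expressed as a small filter over annotated variant records. This sketch assumes the three criteria are combined with a logical OR, which the text does not state explicitly. The record fields (`gene`, `type`, `clinvar`, `effect`) and the effect labels are hypothetical and do not correspond to any particular annotation tool.

```python
PATHOGENIC_LABELS = {"Pathogenic", "Likely pathogenic"}
TRUNCATING_EFFECTS = {"frameshift", "stop_gained"}  # hypothetical effect labels


def is_candidate(variant):
    """Return True if a BRCA1/2 variant meets any of the selection criteria."""
    if variant["gene"] not in {"BRCA1", "BRCA2"}:
        return False
    return (
        variant.get("clinvar") in PATHOGENIC_LABELS
        or variant.get("type") == "INDEL"
        or variant.get("effect") in TRUNCATING_EFFECTS
    )


# Toy annotated variants, made up for illustration.
variants = [
    {"gene": "BRCA1", "type": "SNV", "clinvar": "Pathogenic", "effect": "missense"},
    {"gene": "BRCA2", "type": "INDEL", "clinvar": None, "effect": "frameshift"},
    {"gene": "BRCA1", "type": "SNV", "clinvar": "Benign", "effect": "synonymous"},
    {"gene": "TP53", "type": "SNV", "clinvar": "Pathogenic", "effect": "stop_gained"},
]
selected = [v for v in variants if is_candidate(v)]
print([v["gene"] for v in selected])
```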

To infer the genetic characteristics of metastatic transition in our aggregated data, we

analyzed differentially expressed genes (DEGs) between primary tumor and

metastasis samples using limma (adjusted P < 0.001; Benjamini-Hochberg correction).

Next, significant pathways were inferred from the DEGs by gene set enrichment

analysis (GSEA) using ReactomeFI (31). Finally, the DEGs based on the GSEA test

(FDR < 0.05) were used to develop a gene-gene interaction network, visualized using

Cytoscape. In addition, we generated a metastasis prediction model based on these prognostic markers. First, we tested the DEGs identified from the comparison of primary with metastasis samples for association with MFS and computed their concordance index (CI). Tests were performed using the R package survcomp, and the identified gene set was defined as the metastasis prediction gene set (32). Next, to test the ability of this gene set to predict metastasis, we established a random-forest machine-learning model to classify primary and metastasis samples using the metastasis prediction gene set identified from the MFS test (P < 0.05). Finally, the random-forest model was evaluated by the area under the curve (AUC) metric using a 4-fold cross-validation procedure. Pathways of the metastasis prediction gene set were also investigated using GSEA.
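To make the concordance index concrete, here is a minimal pure-Python sketch of Harrell's C-index for right-censored data. It is not the survcomp implementation and may differ from it in tie handling. The function name and the toy follow-up data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when sample i has an observed event and a
    strictly shorter follow-up time than sample j. The pair is concordant
    when the sample that failed earlier also has the higher risk score;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored samples cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: time to metastasis, event indicator (1 = observed), gene-based risk.
times = [2, 5, 7, 10]
events = [1, 1, 0, 1]
risk = [0.9, 0.6, 0.7, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a gene whose expression yields a C-index well above 0.5 carries prognostic information.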

Computational drug-response prediction applying chemical perturbation gene

signature and cross-validation

To explore the relevant drug response status for each patient, we computed the drug

response (DR) scores in both aggregated data and biobank expression profiles (17).

DR scores were calculated by the GSVA method using the MSigDB chemical and

genetic perturbation signature gene set (17). Next, we selected drug signatures to

examine response differences among CMSs using analysis of variance (ANOVA). To

select reliable drug responses, we investigated drug names registered for clinical trials

(NIH ClinicalTrials.gov database), considering synonyms provided in Drugbank (33).

Meanwhile, paired Pearson correlation coefficients between pathway activities and

DR scores were calculated to explore the functional association for each drug. To

evaluate DR scores of aggregated data, we referred to biobank expression profiles. First,

organoid samples were classified into four subtypes using our CMS classifier, and DR

scores were calculated in the same way. Subtype specificity of biobank DR scores was

also tested by ANOVA.
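The ANOVA step compares one drug's DR scores across the four CMSs. A minimal sketch of the one-way ANOVA F statistic follows. It computes only F, since the P-value would additionally require the F distribution. The function name and the toy scores are hypothetical.

```python
from statistics import mean


def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a mapping of group name -> scores."""
    values = [v for scores in groups.values() for v in scores]
    grand_mean = mean(values)
    group_means = {name: mean(scores) for name, scores in groups.items()}
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(scores) * (group_means[name] - grand_mean) ** 2
        for name, scores in groups.items()
    )
    # Variation of samples around their own group mean.
    ss_within = sum(
        (v - group_means[name]) ** 2
        for name, scores in groups.items()
        for v in scores
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy DR scores for one drug signature, made up for illustration.
toy_dr_scores = {
    "SL": [1.0, 2.0, 3.0],
    "MSL": [2.0, 3.0, 4.0],
    "IM": [5.0, 6.0, 7.0],
    "LAR": [3.0, 4.0, 5.0],
}
f_value = anova_f_statistic(toy_dr_scores)
print(f_value)
```

A large F indicates that the between-subtype variation of the DR score dominates the within-subtype variation, which is the pattern expected for a subtype-specific drug signature.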

ROADMAP data analysis related to cisplatin response

Patients and sample preparation: In 2016, we introduced the ROADMAP program to implement genomic sequencing of tumors and blood coupled with clinical data in the management of advanced refractory breast cancer. As of June 2018, 38 female patients with metastatic TNBC were enrolled, and their genomic and clinical data were obtained after receiving written informed consent. RNA was extracted from fresh frozen or formalin-fixed, paraffin-embedded (FFPE) tissue samples using the RNeasy mini kit (QIAGEN, Hilden, Germany) according to the manufacturer’s protocol. Total cellular RNA (at least 100 ng) was purified from 1 µg total RNA from each sample, and cDNA was synthesized using SuperScript II Reverse Transcriptase (Thermo Fisher Scientific Inc., Waltham, MA, USA). Sequencing libraries were prepared using the TruSeq RNA Library preparation kit (Illumina, San Diego, CA, USA) and sequenced using HiSeq 2000.

Gene expression profile processing: After quality trimming using Trimmomatic, we aligned reads to hg19 using MapSplice2, following the TCGA RNASeqV2 pipeline (34,35). Next, we quantified the gene expression profile using RSEM v1.1.13 with reference to UCSC Known Genes (36). Patient subtypes were assigned using the SVM machine-learning model for TNBC CMS classification described in the first Methods section. The seven representative gene signature scores and DR scores were computed by the same methods used for the aggregated data. Cox regression survival analysis was performed to examine 5-year treatment-free survival.

Further investigation of cisplatin drug response: The reliability of DR scores was

confirmed from both ROADMAP patient therapeutic profiles and biobank. Of the

ROADMAP-enrolled patients, 31 patients had undergone cisplatin therapy, and the

response log was available. Biobank cell models derived from TNBC patients were

divided into cisplatin responders and non-responders according to drug high-throughput

screening (HTS) 50% inhibitory concentration (IC50) values. Cisplatin DR scores of

ROADMAP patients and biobank cells were compared between the two groups using the Wilcoxon rank-sum test.

We also explored genes involved in cisplatin response by analyzing DEGs between

cisplatin-responsive and -resistant samples using the Wilcoxon rank-sum test (P <

0.05) in the ROADMAP dataset. To avoid the artifact effect from the FFPE status, we

eliminated genes that showed differences in expression between FFPE and fresh

frozen samples (Wilcoxon rank-sum test, P < 0.1). Finally, pathways involved in

cisplatin response were investigated using ReactomeFI GSEA (31). Additionally,

survival analysis was performed to identify drug-related prognosis markers based on

DEGs. The combined CI and hazard ratio (HR) were computed to get an overall score

from independent datasets including aggregated data and the ROADMAP dataset of

prognostic marker genes.
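The comparison of cisplatin DR scores between responders and non-responders can be sketched with a hand-written Wilcoxon rank-sum test. This version uses the normal approximation without tie or continuity correction, so its P-values will differ from the exact tests that statistical packages apply to small samples. The function name and the toy scores are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the U statistic of group x and an approximate P-value.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values and give them the average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Toy cisplatin DR scores, made up for illustration.
responders = [0.1, 0.2, 0.3, 0.4]
non_responders = [0.5, 0.6, 0.7, 0.8]
u_stat, p_value = rank_sum_test(responders, non_responders)
print(u_stat, p_value)
```

The same function could serve for the FFPE-versus-fresh-frozen gene filter described above, with genes excluded when the returned P-value falls below the chosen cutoff.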

Results

CMS classification

The summary of our aggregated data, composed of primary (n = 601) and metastatic (n = 356) TNBC samples, is shown in Table 1. We confirmed elimination of the batch effect via a PCA plot (Supplementary Figure S1A) and eliminated samples that highly expressed ER, PR, or HER2 using a bimodal filter on the gene expression distribution. The Lehmann filtering strategy, the most stringent for ER, PR, and HER2, eliminated 15.3% of our aggregated data samples. On the other hand, 56% of Burstein samples that had passed IHC were eliminated based on mRNA expression.

Performance evaluation using 5-fold cross-validation confirmed the cluster size (n = 4), with an average silhouette width of 0.57 for our subtypes, indicating remarkably better performance than Lehmann’s (0.09) and Burstein’s (0.18) (Supplementary Figure S1B–D). The IM subtype showed the lowest sensitivity among the subtypes (Supplementary Figure S1B, C; 84%). However, its silhouette width (0.3) was still much higher than that of Burstein’s (0.08) and Lehmann’s (0.06) classifications. In independent signature-based validation, our CMS showed the best performance overall (Supplementary Figure S1E). The discriminative power of ESTIMATE immune scores for the IM subtype was weaker than that of the other signatures in both our classification and previous studies, but our CMS classified the IM subtype relatively well (Supplementary Figure S1E; P-value ours=1.299e-08; Lehmann=1.096e-16; Burstein=0.513).
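For readers unfamiliar with the silhouette width used to compare the classifications, a minimal sketch follows. It assumes Euclidean distance between sample feature vectors, which may differ from the distance used in the study. The function name and the toy points are hypothetical.

```python
import math


def silhouette_widths(points, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for every sample.

    a is the mean distance to the other members of the sample's own cluster;
    b is the smallest mean distance to the members of any other cluster.
    Requires at least two clusters.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        widths.append((b - a) / max(a, b))
    return widths


# Toy two-dimensional samples forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = ["SL", "SL", "IM", "IM"]
widths = silhouette_widths(points, labels)
average_width = sum(widths) / len(widths)
print(average_width)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest misassigned samples.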

In Lehmann’s classification, the mesenchymal stem-like (MSL) type showed the highest stromal infiltration, but the discriminative power using the EMT score was better in our CMS (t-test, metaCMS P = 2.86E-69, Lehmann P = 7.69E-31; Supplementary Figure S1E). The association of the mesenchymal type’s genomic characteristics with stemness was ambiguous (Supplementary Figure S1E): stemness was diffused across the basal-like (BL) 1, BL2, and mesenchymal subtypes. Consistent with a previous study, BL1 showed the worst average silhouette width (5). Burstein’s classification resembled our result more closely than Lehmann’s six subtypes did, except for the basal-like immune-activation type (P < 2.2e-16, chi-square test). In Burstein’s dataset (n = 198), the BL immune-activation (BLIA) type included a considerable proportion of stemness samples (18%; Supplementary Figure S1D, E). For the immune and TME signatures (Supplementary Figure S1E), our CMS was more discriminative than Burstein’s. Therefore, we finalized the TNBC CMS as the stem-like (SL, 32.1%), mesenchymal-like (MSL, 24.6%), immunomodulatory (IM, 12.9%), and luminal AR (LAR, 30.5%) subtypes (Fig. 1A).

Genomic characteristics and microenvironment of CMSs

Key pathways activated in each subtype were identified from computational scores (P < 0.1e-06). Interestingly, IM- and LAR-associated pathways were also up-regulated in the MSL subtype (Fig. 1A). EMT, transforming growth factor beta (TGF-β), and TP53 signals were constitutively up-regulated only in MSL. Our LAR showed the highest ER and AR responses and was particularly involved in adipogenesis and fatty acid metabolism. IFN-α response, IFN-γ response, IL6-JAK-STAT3, IL2-STAT5, and inflammatory responses were activated in IM (Fig. 1A). In addition, both the representative immune cell infiltration and TME signatures showed the highest scores in IM (Fig. 1B). This indicates that IL-JAK-STAT signaling is implicated in immune infiltration, as identified in previous studies on TNBC (37). Tumor necrosis factor alpha (TNF-α) and apoptosis were activated in both IM and MSL. The SL subtype showed the characteristics of ES cells: cell cycle, proliferation, and ES-associated pathways, such as mitosis, E2F, MYC, WNT-β catenin, and Notch signaling, were activated (Fig. 1A,B). As previously known (26,38), cell cycle deficiency in the SL subtype induced the highest CIN (Fig. 1B; P = 1.35e-120). Meanwhile, DNA repair and glycolysis were enhanced across SL and LAR.

The TME for each CMS presented distinct features in both stromal and immune cells.

First, the IM was strongly immune-infiltrated and harbored memory B cells, CD8 T

cells, activated CD4 T cells, gamma delta T cells, and activated natural killer cells

(Supplementary Figure S2A; P < 1.0e-07), while MSL showed a high incidence of

Research. on November 24, 2020. © 2019 American Association for Cancermcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on November 8, 2019; DOI: 10.1158/1541-7786.MCR-19-0453

Page 14: Genomic characteristics of triple-negative breast cancer ... · 11/8/2019  · 1 Original Article . Genomic characteristics of triple-negative breast cancer nominate molecular subtypes

14

M2 macrophages, which was consistent with another study (9). Next, we explored the

prognostic effect by immune cells to determine MFS or OS. The results showed that

activation of CD8 T cells was associated with good prognosis (Supplementary Figure

S2B; MFS, P = 0.005; OS, P = 0.02). Among the known immunotherapy targets such

as CD20, CD52, PD-L1, CTLA4, CD52, and CTLA4 were significantly enhanced in

IM (Supplementary Figure S2C,D; ANOVA P < 0.001 and Tukey test). Another bulk

cell deconvolution revealed distinct characteristics of seven mammary gland cells

depending on CMS (Fig. 1B; P < 3.66e-17). Myoepithelial cells were enriched in the

SL, consistent with a previous finding (39), and we additionally discovered that

luminal progenitor cells were enhanced in SL. In MSL, adipocytes and matrix

metalloproteinases categorizing as stromal cells were highly enriched, followed by

luminal epithelial and myoepithelial cells. Interestingly, key pathways of adipogenesis

and fatty acid metabolism were activated across the MSL and LAR (Fig. 1A), but

adipocytes were enriched only in the MSL type. Additionally, we confirmed that

tumor-associated stromal cells were infiltrated in MSL from up-regulation of nine

markers (Supplementary Figure S3) (40,41). Luminal epithelial cells showed a strong

presence in the LAR, and blood cells undergoing the mitotic cell cycle process were

mainly activated in the IM (Fig. 1B). In further investigation of stem-like

characteristics for the SL subtype, we referred to previously known target gene sets of

ES cell transcription factors and polycomb-regulated genes (25). As already identified

in a previous study, target genes of transcription factors NANOG, OCT4, SOX2, and

MYC were clearly activated in SL. Additionally, we confirmed that known target gene sets of

polycomb repressive complex 2 (PRC2), the PRC2 subunits SUZ12 and EED, and methylation

at histone 3 lysine 27 (H3K27) were down-regulated in SL but up-regulated

in the MSL subtype (Fig. 1B).

To investigate which subtype is dysregulated by BRCA1/2 germline mutations

associated with BRCAness, we classified TCGA BRCA TNBC samples into our

CMS and called germline mutations. Of the TCGA TNBC samples (n = 120), 54

belonged to the SL type. Of the 16 pathogenic BRCA1/2 germline mutations

identified, 8 occurred in SL type samples (Fisher’s test, P < 1.89e-07). SL key pathways,

G2M checkpoint, MYC, mitotic spindle and NOTCH activation in BRCA1/2 mutation

samples resemble those shown in Fig. 1A (Supplementary Figure S4; P < 2.13e-04).

To investigate prognosis, we examined 10-year survival rates based on OS and MFS

rates among CMSs. Survival test details are described in Supplementary Table S3. SL

clearly presented poor MFS (median MFS = 5.9 years) and showed an HR of 2.53

(95% confidence interval (CI): 1.69 – 3.7; Fisher’s test, P = 5.40e-06) compared with

MSL, which had good prognosis (Fig. 1C). Moreover, high-grade patients were

enriched in the SL subtype (44% of grade 3, chi-squared test P = 5.10e-05). The LAR

subtype showed poor MFS as well as OS, consistent with the primary TNBC cases of the

METABRIC project (Curtis 2012 dataset; Fig. 1C). LAR subtype patients also

relapsed rapidly, in line with the SL subtype (median MFS = 4.9 years), and had 1.5

times the risk of the MSL group (95% CI: 1.03 – 2.37; Fisher’s test, P =

0.03) (42). The MSL and IM groups presented relatively good prognosis in terms of

both MFS and OS.
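
Median survival times such as the median MFS values quoted above are usually read off a Kaplan-Meier curve. The text does not state which estimator or software produced them, so the following is only a minimal Python sketch under that assumption. The function names `kaplan_meier` and `median_survival` and the toy follow-up data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient (e.g. years)
    events : 1 if the event (e.g. metastasis) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        # Both events and censored patients leave the risk set.
        n_at_risk -= n_removed
    return curve


def median_survival(curve):
    """First time at which the survival estimate is 0.5 or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within follow-up


# Toy data (hypothetical): six patients, two of them censored.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print(median_survival(toy_curve))
```

Censored patients reduce the number at risk without lowering the curve, which is why a median can differ from the simple median of observed follow-up times.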

Comparison between primary and metastatic TNBC

We investigated the unique features between primary and metastatic TNBC.

Metastasis samples (37%; n = 356) originated from primary TNBC and were derived

from several tissues, including skin, liver, lung, and brain (Table 1). When clustering

all samples, key pathways shared with primary TNBC also discriminated metastasis

into four subtypes (Fig. 1A). Proportions of SL and LAR subtypes were increased in

metastasis (SL 35%, LAR 35%) compared with primary TNBC (SL 30%, LAR 28%),

and the remaining two subtypes in metastasis were relatively decreased. Meanwhile,

tissue types of metastatic samples were independent of CMS, as confirmed by a

chi-squared test.
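
The independence check between metastatic tissue type and CMS can be illustrated with a Pearson chi-squared statistic on a contingency table. This is a hedged sketch only: the counts below are invented for illustration, the function name `chi_squared_independence` is hypothetical, and the code returns the statistic and degrees of freedom without a P-value, which would need a chi-squared distribution function from a statistics library.

```python
def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table.

    table : list of rows of observed counts, e.g. rows = tissue of origin,
            columns = subtype. Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: two tissues x two subtypes, identical proportions.
independent_table = [[10, 20], [20, 40]]
# Hypothetical counts with a perfect tissue-subtype association.
dependent_table = [[10, 0], [0, 10]]

print(chi_squared_independence(independent_table))
print(chi_squared_independence(dependent_table))
```

A statistic close to zero relative to its degrees of freedom supports independence, which matches the direction of the conclusion drawn in the text.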

To characterize the functional difference between primary cases and metastasis, we

first found 734 DEGs (adjusted P < 0.001) and revealed the biological pathways that

were enriched in the DEGs using ReactomeFI (Methods; Fig. 2). These genes were

associated with complement and coagulation, IL6-mediated signaling, toll-like

receptors, TNF signaling, and JAK-STAT signaling pathways (P < 0.001; Fig. 2A).

Serpin family E member 1 (SERPINE1) and fibroblast growth factor 1 (FGF1),

both previously implicated in metastasis (43,44), were found to act through the EGFR

and FGF signaling pathways and angiogenesis (Fig. 2A,C).

We established metastasis prediction models based on identified prognostic markers

(Supplementary Figure S5A). The metastasis prediction gene set was chosen using an MFS

test (P < 0.05; n = 84) from the 734 DEGs between primary and metastasis samples,

and the prediction model was established using a random-forest machine-learning method

(Supplementary Table S4). The model was validated by 4-fold cross-validation

(AUC 0.8; Supplementary Figure S5B). Importantly, we report concordance indices (CIs) for

these metastasis prediction genes (Fig. 2B). Among metastasis prediction genes, CIs

were greater than 0.4 in FGF, JAK-STAT, and integrin signaling regulated by SHC

adaptor protein 1 (SHC1), integrin subunit alpha M (ITGAM), vascular cell adhesion

molecule 1 (VCAM1), integrin subunit alpha 1 (ITGA1), and jagged canonical Notch

ligand 2 (JAG2) (Fig. 2B; Supplementary Figure S5C,D). SHC1, which is involved in FGF

signaling and cell migration, is used as a prognostic marker for cancer treatment (45).

We show high concordance (c = 0.55; 95% CI: 0.47 – 0.62) of this gene, which was

activated in metastasis (Fig. 2C). ITGAM (c = 0.48; 95% CI: 0.44 – 0.53) and ITGA1

(c = 0.52; 95% CI: 0.48 – 0.58) seem to enhance distant metastasis via integrin

signaling and the coagulation cascade (Fig. 2A; Supplementary Table S5). JAG2 (c = 0.46;

95% CI: 0.43 – 0.52) was identified as a member of NOTCH signaling that promotes

metastasis (Fig. 2C).
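
The concordance values (c) reported for individual genes measure how well a gene's expression orders patients by time to metastasis. The sketch below assumes Harrell's pairwise formulation. The reference list includes the survcomp package (32), but this code is not that implementation, and the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's c: share of comparable pairs in which the patient with the
    earlier observed event also has the higher risk score.

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    scores : risk scores (e.g. expression of one gene); higher = higher risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before the follow-up time of patient j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in score count as half
    return concordant / comparable if comparable else float("nan")


# Toy data (hypothetical): the last patient is censored.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 1, 0]
perfect_scores = [4.0, 3.0, 2.0, 1.0]   # higher score, earlier event
reversed_scores = [1.0, 2.0, 3.0, 4.0]  # higher score, later event

print(concordance_index(toy_times, toy_events, perfect_scores))
print(concordance_index(toy_times, toy_events, reversed_scores))
```

Under this definition a value of 0.5 corresponds to random ordering, 1 to perfect ordering, and 0 to perfectly inverted ordering.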

Pharmacogenomics landscape investigation by CMS

We established a pharmacogenomics landscape depending on CMS. Computational

DR scores were calculated from 181 response or resistance drug signatures from

MSigDB. In total, 48 drugs from the 181 signatures were selected because they overlapped with breast

cancer clinical trials. DR scores were estimated in both our aggregated data and

the biobank. We extracted a final set of 18 DR scores for 17 drugs that showed CMS specificity

(ANOVA FDR < 0.05). Next, DR scores and correlated pathways (γ > 0.65) were

extracted as network edges for each subtype (Fig. 3). DR score differences among our

CMSs closely resembled those in the biobank (Supplementary Figure S6A). The 18 chosen

drug responses that differed among CMSs achieved high CIs and were partially

associated with clinical outcomes (Supplementary Figure S6B). Adaphostin,

doxorubicin, and bexarotene, assigned to the SL type, ranked in the top 3 by CI (CI ≤ 0.58) and

showed significant clinical outcomes (MFS P = 3.31e-05). Trabectedin response

in SL, and ribavirin response and imatinib resistance in MSL, were closely related to

MFS. Tamoxifen resistance was significantly detected in LAR.
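
The ANOVA FDR < 0.05 filter used to select CMS-specific DR scores implies a multiple-testing correction across drug signatures. The text does not name the procedure, so the sketch below assumes the Benjamini-Hochberg method. The function name `benjamini_hochberg` and the P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay
    # monotone: each is the minimum of p * m / rank over all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-drug ANOVA P-values.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
selected = [i for i, q in enumerate(toy_q) if q < 0.05]
print(toy_q)
print(selected)
```

Signatures whose adjusted value falls below the chosen threshold would be retained, in the same spirit as the 18 DR scores kept in the text.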

Dasatinib, imatinib and cisplatin were estimated to have the highest resistance scores

in the MSL subtype (Fig. 3A), and the DR scores for these three drugs correlated

strongly with the EMT and KRAS pathways (Fig. 3B). In particular, EMT activation

led to cisplatin resistance by functionally shared genes such as cellular

communication network factor 2 (CCN2), vimentin (VIM), cysteine-rich angiogenic

inducer 61 (CYR61), and 5’-nucleotidase (NT5E) (Fig. 3B). The genes involved in

EMT activation are known to be associated with the therapeutic effect of cisplatin

(46,47). Ribavirin, an eIF4E inhibitor, was suggested as a response drug in the MSL

subtype. Its target gene, eIF4E, is known to promote EMT and metastasis (Fig. 3A) (48).

Interestingly, our MSL drug-pathway network indicates that ribavirin inhibits

TNFα via the NF-κB and apoptosis pathways activated in MSL (Fig. 1A, 3B).
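
The drug-pathway networks discussed here link a DR score to a pathway when their correlation across samples exceeds 0.65. The type of correlation is not stated in this section, so the sketch below assumes a Pearson coefficient. The names `pearson` and `network_edges`, the drug and pathway labels, and all values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def network_edges(dr_scores, pathway_scores, threshold=0.65):
    """Return (drug, pathway, r) for every pair with correlation > threshold.

    dr_scores      : {drug name: per-sample DR scores}
    pathway_scores : {pathway name: per-sample pathway activity scores}
    """
    edges = []
    for drug, d in dr_scores.items():
        for pathway, p in pathway_scores.items():
            r = pearson(d, p)
            if r > threshold:
                edges.append((drug, pathway, round(r, 3)))
    return edges


# Hypothetical per-sample scores for one drug and two pathways.
toy_dr = {"drug_A_resistance": [1.0, 2.0, 3.0, 4.0]}
toy_pathways = {
    "pathway_X": [2.0, 4.0, 6.0, 8.0],  # rises with the DR score
    "pathway_Y": [4.0, 3.0, 2.0, 1.0],  # falls with the DR score
}
toy_edges = network_edges(toy_dr, toy_pathways)
print(toy_edges)
```

Only positively correlated pairs above the threshold become edges in this sketch. Whether negative correlations were also drawn in the original networks is not stated in the text.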

The LAR subtype showed high resistance scores for tamoxifen and

gefitinib, though androgen blockade was considered a suitable therapy (Fig. 3A,

3B). As already known, the androgen receptor promotes tamoxifen agonist activity. Our

prediction shows a strong intervention in ER early response (Fig. 3B). In the LAR

network, the drug with the highest resistance, gefitinib, inhibits insulin-like growth

factor I receptor (IGF1R) signaling. Consistent with this, the LAR subtype shows a significant

fold change in IGF1R expression (fold change = 8.38). On the other hand, troglitazone

was identified as the only response drug that inhibits fatty acid oxidation (49). Fatty

acid metabolism and adipogenesis pathways were activated across the MSL and LAR

subtypes (Fig. 1A), but the response to troglitazone was more closely correlated to the

LAR subtype.

The SL subtype showed resistance to doxorubicin, dasatinib, and fluorouracil treatment, but

was responsive to adaphostin, cisplatin, and trabectedin (Fig. 3A). E2F, G2M, and

MYC targets 1 signals were positively correlated with doxorubicin resistance (Fig. 3B,

Supplementary Figure S6A). Cyclin B2 (CCNB2), extra spindle pole bodies like 1,

separase (ESPL1), marker of proliferation Ki-67 (MKI67), and DNA topoisomerase II

alpha (TOP2A) genes, involved in E2F and G2M pathways, induced drug resistance

and showed high expression in SL. Generally, ESPL1 and TOP2A encode

topoisomerase or its binding partner, and the topoisomerase activity of these genes is known

to trigger doxorubicin resistance (50).

Additionally, to examine cisplatin efficacy in genomically unstable samples, we

calculated CIN scores for each of these cases. Genomically unstable cases with high CIN

scores were classified as cisplatin responsive (Supplementary Figure S7; response r =

0.41, P = 0.06). The glycogen synthase kinase 3 (GSK3) inhibitor response was activated in the IM type;

GSK3 regulates innate and adaptive immune responses mediated by NF-κB and STATs (51).

Interestingly, biobank data show that response to the GSK3 inhibitor SB216763 occurs not only

in IM, but also in some LAR and SL samples (Supplementary Fig. S5; see Discussion

for more details).

ROADMAP data analysis of cisplatin treatment patients

We validated our computational results from the pharmacogenomics approach

against the actual response of patients to cisplatin. Of 38 TNBC patients, 31 patients received

cisplatin-containing chemotherapy in the metastatic setting (Table 2 and

Supplementary Table S6). Our SVM model classified all patients into SL (55.3%),

MSL (10.5%), LAR (31.6%), and IM (2.6%; Fig. 3C and Supplementary Figure S7A). Key pathways

inferred from aggregated data were also investigated in ROADMAP data. EMT, ER

response, and MYC pathways clearly resembled subtype-dependent pathway activities

of the aggregated data, except for the IM subtype, which was due to sample size

limitation (Supplementary Figure S8A-B). Approximately 52% of the 31 patients

responded to cisplatin treatment (R group), whereas 48% were non-responders (NR

group). In progression-free survival (PFS) analysis, the NR group showed poor PFS

(P = 0.002; Fig. 3D).

Computational cisplatin DR scores of ROADMAP were associated with actual

patients’ drug response, and the computational scores showed subtype-dependency (P

< 0.1; Fig. 3E). Three MSL patients, all except one, belonged to the NR group. In contrast, six of

seven LAR patients showed response to cisplatin treatment. In other words, MSL type

patients appeared to be cisplatin-resistant compared with other subtypes, while the LAR

type was most sensitive to this treatment (P = 0.1; Wilcoxon test). Our method

predicted that ribavirin and thiazolidinedione response scores resembled cisplatin

resistance DR scores in ROADMAP patients (Fig. 3C, upper rectangle). Cisplatin was

suitable for the LAR and SL subtypes (Fig. 3C, middle rectangle). LAR showed both

tamoxifen resistance and androgen target treatment response (Fig. 3C, lower

rectangle). In addition, investigation of biobank drug HTS and expression profiles

indicated that cisplatin efficacy (IC50 value) was remarkably consistent with our DR

score (P = 6.0e-04; Fig. 3F).
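
The comparison of DR scores between patient groups relies on a Wilcoxon test. With groups as small as those in ROADMAP, an exact rank-sum test is feasible, and the sketch below shows one way to compute it. It assumes the unpaired (rank-sum, Mann-Whitney) form of the test. The function names and the toy scores are hypothetical and do not reproduce the study's values.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: number of (x_i, y_j) pairs with x_i > y_j.
    Tied pairs count as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def exact_two_sided_p(x, y):
    """Exact two-sided P-value obtained by enumerating every way of
    splitting the pooled values into groups of the observed sizes."""
    pooled = list(x) + list(y)
    n_x = len(x)
    mean_u = n_x * len(y) / 2.0  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - mean_u)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the observed one.
        if abs(mann_whitney_u(group_x, group_y) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical resistance scores for non-responders and responders.
toy_non_responders = [0.9, 0.8, 0.7]
toy_responders = [0.1, 0.2, 0.3, 0.4]

print(mann_whitney_u(toy_non_responders, toy_responders))
print(exact_two_sided_p(toy_non_responders, toy_responders))
```

Even complete separation of three versus four samples cannot reach P < 0.05 two-sided, which illustrates why small subtype groups yield modest P-values.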

Genes involved in cisplatin resistance were identified in ROADMAP using the DEG

set and GSEA analysis (Supplementary Table S7). We found SL- and MSL-related

pathways such as mitosis pathways, DNA damage bypass, Wnt signaling pathway,

and Beta1 integrin cell surface pathway to be enriched among the 847 drug response-

related DEGs (P < 0.0018; Supplementary Table S8). We next explored cisplatin

response markers for the prognosis of PFS based on the expression of genes, such as

E2F transcription factor 1 (E2F1), metastasis associated 1 (MTA1), plakophilin 1

(PKP1), and microtubule nucleation factor (TPX2). Up-regulation of these four genes,

characterized by shorter response times, indicated poor prognosis (P < 0.04;

Supplementary Figure S8C) and was also consistent with our aggregated data

(Supplementary Figure S8D).

Our DR scores strongly suggest that patients with cisplatin non-response are mainly

enriched in the MSL subtype, which could be treated with ribavirin and

thiazolidinedione (Fig. 3C). Ribavirin also inhibits histone methyltransferase (EZH2)

and treats leukemia cells (52). Thiazolidinedione (TZD) belongs to a class of drugs

used to improve glucose metabolism in type-2 diabetes, activates PPAR-γ to crosstalk

in mesenchymal stem cells, and inhibits adipocyte differentiation (53). In addition,

these drugs interfere with MSL-associated pathways that activate PRC2, H3K27, EED, and

SUZ12 (related to methyltransferase activity) and TZD-related PPAR-γ inhibition

of adipogenesis (Fig. 1B).

Discussion

Through this study, we proposed four TNBC subtypes using an integrative expression

profile. The robustness of the classification was validated using several gene

signatures and a machine-learning method (Supplementary Figure S1). In our

classification, the absence of DNA variant evidence could be a limitation, but

tumorigenic functions by variants could be inferred from transcriptional consensus

change (21,26). For example, BRCAness by BRCA1/2 mutation could be related to

the SL subtype activated in pathways associated with cell cycle (54). Our pathogenic

BRCA1/2 enrichment in SL was consistent with Lehmann’s identification in BL1 and

BL2 (5). Meanwhile, our results also summarized key pathways specific to subtypes

encompassing primary and metastatic TNBC. These consensus key pathways are in

accordance with previous investigations on the ES signature of aggressive breast

cancer, glioma, and bladder tumors (25).

We investigated stemness in detail using computational approaches such as

machine learning (Fig. 1C, left) and ES cell signature scoring based on transcription

factor target gene sets (Fig. 1C, right). ES cell signatures were more enriched

in the SL subtype, indicating that the transcription factors SOX2, OCT4, and

NANOG play a critical role in proliferation (25). However, MYC target activation

was not restricted to the SL type, in contrast to a previous report (25). MYC-amplified

samples outside the SL type were found in the IM subtype. In another study,

MYC reduction induced the down-regulation of CD47 and PD-L1 in mouse and

human cells, which was in line with our results (55). Therefore, we infer that MYC up-regulation

is involved across the SL and IM types. Meanwhile, we suggested another

regulator, PRC2, which was known to be exclusively activated in non-stemness

samples (25). We specifically revealed mutual exclusiveness of PRC2 targets between

SL and MSL, and this finding narrows down the target of PRC2 complex inhibitors (56).

Cellular heterogeneity and tumor progenitors of TNBC could be elucidated from

various cells in the breast tissue microenvironment. Based on our cell abundance

results, the proliferative characteristics of SL seem to originate from luminal

progenitor cells (Fig. 1C). In a previous study, BRCA1/2 defective tumor samples

originated from luminal progenitors (10). As already described, the SL is the closest

to BRCA1/2 deficiency. Therefore, our hypothesis is consistent with a previous study

that reported that BRCA1/2 mutation frequently recurred in TNBC that originated

from luminal progenitor and not from basal stem cells. Another study has shown that

adipocytes induce EMT in breast cancer cells (57), which was relevant to our results

that adipocytes were enriched in the MSL subtype (Fig. 1C). Thus, our cell abundance

analysis indicates the specific progenitor or interaction with other cells in the

microenvironment, which could be involved in tumorigenic biological functions.

We successfully established metastasis prediction machine-learning models using a set of 84

DEGs between primary and metastasis samples, and these genes could be

considered as prognostic markers. The metastasis prediction gene set was enriched in

coagulation, cell migration, and cell-cell interaction required for enhancing metastasis

(45). In particular, one of the identified genes, SHC1, regulates apoptosis and drug

resistance in mammalian cells and is essential for tumor progression and metastatic

spread in breast cancer mouse models (54).
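
The metastasis prediction model was validated by 4-fold cross-validation and summarized as an AUC. The sketch below illustrates that validation loop only. It deliberately replaces the random forest with a trivial nearest-mean scorer (`centroid_scorer`, a hypothetical stand-in) so that it stays self-contained, and it pools out-of-fold scores before computing one AUC, which is an assumption rather than the documented procedure. All names and data are illustrative.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_scorer(x_train, y_train, x_test):
    """Stand-in model, NOT a random forest: a sample scores higher the
    closer it lies to the class-1 mean relative to the class-0 mean."""
    ones = [x for x, y in zip(x_train, y_train) if y == 1]
    zeros = [x for x, y in zip(x_train, y_train) if y == 0]
    mean1 = sum(ones) / len(ones)
    mean0 = sum(zeros) / len(zeros)
    return [abs(x - mean0) - abs(x - mean1) for x in x_test]


# Hypothetical one-gene expression values and metastasis labels.
toy_x = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6]
toy_y = [0, 1, 0, 1, 0, 1, 0, 1]

out_of_fold = [0.0] * len(toy_x)
for train, test in k_fold_indices(len(toy_x), k=4):
    fold_scores = centroid_scorer([toy_x[i] for i in train],
                                  [toy_y[i] for i in train],
                                  [toy_x[i] for i in test])
    for i, s in zip(test, fold_scores):
        out_of_fold[i] = s  # each sample is scored by a model that never saw it

toy_auc = roc_auc(toy_y, out_of_fold)
print(toy_auc)
```

Real expression data would normally be shuffled or stratified before splitting so that every training fold contains both classes. The toy data are ordered so that contiguous folds already satisfy this.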

Our computational drug response estimated from the expression profile successfully

recapitulated the comprehensive characteristics of functional intervention and drug

target. Cisplatin response markers of the SL subtype, E2F1, MTA1, PKP1, and TPX2, are

discriminative for PFS and MFS (Supplementary Fig. S8). We suggest that these four

genes could emerge as important drug resistance predictors. Despite the limited

patient expression profiles in ROADMAP data, we successfully predicted high

cisplatin resistance score in MSL, while the LAR subtype was associated with good

outcome (Fig. 3E). Further, we identified several candidate drugs that could intervene

with key pathways activated in each subtype: aplidin, thiazolidinediones, and ribavirin

for the MSL subtype by inhibiting EMT and metastasis (58,59); cisplatin and

trabectedin for SL type cases with defective DNA-repair machinery (60). Such an

approach is useful for pre-clinical prediction. Additional evaluation using

biobank data further confirmed our data on cisplatin resistance and ribavirin response

in MSL (Supplementary Figure S6). Interestingly, the proportion of the IM subtype

(4.2%) in the biobank is lower than in the CMS analysis dataset (12.9%), and it overlaps

with some LAR and SL samples with respect to GSK3 inhibitor SB216763 response

(Supplementary Figure S6). This suggests that some IM samples are predicted as LAR

or SL. We infer that this limitation originates from artificial reconstruction of the TME

in the organoid culture system. In spite of the TME change in organoid culture, some

patients show a response to the GSK3 inhibitor that is consistent with the fresh

frozen samples of our aggregated dataset.

In summary, we defined TNBC CMS encompassing primary and metastatic tumors

using large-scale expression profiles. Our CMS recapitulated in-depth genomic

function and microenvironment heterogeneity. Particularly, we propose the abundance

of mammary gland cells as a marker for tumorigenic origin and prognosis. Further,

our computational approach to estimate drug response was remarkably consistent with

patients’ responses to cisplatin and drug screening of biobank organoid models.

Therapeutic candidates for each CMS were suggested with pathway intervention.

Thus, our study identifies molecular subtypes and provides a basis for relevant

diagnostic and prognostic markers. Finally, our systematic computational approach

suggests a potential to complement TNBC and drug response study.

Acknowledgments

We would like to acknowledge the TNBC patients and the clinical help obtained for

the ROADMAP project. TNBC gene expression meta-data were obtained from GEO,

and ROADMAP next-generation sequencing files have been deposited in NCBI SRA

under accession SRP156081. Our TNBC classification approach is provided as an R

package in Bioconductor

(https://bioconductor.org/packages/devel/bioc/html/TNBC.CMS.html).

References

1. Tseng LM, Hsu NC, Chen SC, Lu YS, Lin CH, Chang DY, et al.

Distant metastasis in triple-negative breast cancer. Neoplasma. 2013;60:290–4.

2. Isakoff SJ. Triple-negative breast cancer: role of specific chemotherapy agents.

Cancer J. 2010;16:53–61.

3. Litton JK, Rugo HS, Ettl J, Hurvitz SA, Gonçalves A, Lee K-H, et al.

Talazoparib in Patients with Advanced Breast Cancer and a Germline BRCA

Mutation. N Engl J Med. 2018;379:753–63.

4. Schmid P, Adams S, Rugo HS, Schneeweiss A, Barrios CH, Iwata H, et al.

Atezolizumab and Nab-Paclitaxel in Advanced Triple-Negative Breast Cancer.

N Engl J Med. 2018;379:2108–21.

5. Lehmann BD, Bauer JA, Chen X, Sanders ME, Chakravarthy AB, Shyr Y, et

al. Identification of human triple-negative breast cancer subtypes and

preclinical models for selection of targeted therapies. J Clin Invest.

2011;121:2750–67.

6. Lehmann BD, Jovanović B, Chen X, Estrada M V, Johnson KN, Shyr Y, et al.

Refinement of Triple-Negative Breast Cancer Molecular Subtypes:

Implications for Neoadjuvant Chemotherapy Selection. Sapino A, editor. PLoS

One. 2016;11:e0157368.

7. Ahn SG, Kim SJ, Kim C, Jeong J. Molecular Classification of Triple-Negative

Breast Cancer. J Breast Cancer. 2016;19:223.

8. Burstein MD, Tsimelzon A, Poage GM, Covington KR, Contreras A, Fuqua

SAW, et al. Comprehensive genomic analysis identifies novel subtypes and

targets of triple-negative breast cancer. Clin Cancer Res. 2015;21:1688–98.

9. Jézéquel P, Loussouarn D, Guérin-Charbonnel C, Campion L, Vanier A,

Gouraud W, et al. Gene-expression molecular subtyping of triple-negative

breast cancer tumours: importance of immune response. Breast Cancer Res.

2015;17:43.

10. Molyneux G, Geyer FC, Magnay F-A, McCarthy A, Kendrick H, Natrajan R, et

al. BRCA1 basal-like breast cancers originate from luminal epithelial

progenitors and not from basal stem cells. Cell Stem Cell. 2010;7:403–17.

11. Ungefroren H, Sebens S, Seidl D, Lehnert H, Hass R. Interaction of tumor cells

with the microenvironment. Cell Commun Signal. 2011;9:18.

12. Kim C, Gao R, Sei E, Brandt R, Hartman J, Hatschek T, et al. Chemoresistance

Evolution in Triple-Negative Breast Cancer Delineated by Single-Cell

Sequencing. Cell. 2018;173:879-893.e13.

13. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust

enumeration of cell subsets from tissue expression profiles. Nat Methods.

2015;12:453–7.

14. Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular

heterogeneity landscape. Genome Biol. 2017;18:220.

15. Chen B, Ma L, Paik H, Sirota M, Wei W, Chua M-S, et al. Reversal of cancer

gene expression correlates with drug efficacy and reveals therapeutic targets.

Nat Commun. 2017;8:16022.

16. Sengupta S, Sun SQ, Huang K-L, Oh C, Bailey MH, Varghese R, et al.

Integrative omics analyses broaden treatment targets in human cancer. Genome

Med. 2018;10:60.

17. Bruna A, Rueda OM, Greenwood W, Batra AS, Callari M, Batra RN, et al. A

Biobank of Breast Cancer Explants with Preserved Intra-tumor Heterogeneity

to Screen Anticancer Compounds. Cell. 2016;167:260-274.e22.

18. Chen C, Grennan K, Badner J, Zhang D, Gershon E, Jin L, et al. Removing

batch effects in analysis of expression microarray data: an evaluation of six

batch adjustment methods. Kliebenstein D, editor. PLoS One. 2011;6:e17238.

19. Lenz M, Müller F-J, Zenke M, Schuppert A. Principal components analysis and

the reported low intrinsic dimensionality of gene expression microarray data.

Sci Rep. 2016;6:25696.

20. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for

microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.

21. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P,

Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics.

2011;27:1739–40.

22. Tan TZ, Miow QH, Miki Y, Noda T, Mori S, Huang RY-J, et al. Epithelial-

mesenchymal transition spectrum quantification and its efficacy in deciphering

survival and drug responses of cancer patients. EMBO Mol Med. 2014;6:1279–

93.

23. Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-

Garcia W, et al. Inferring tumour purity and stromal and immune cell

admixture from expression data. Nat Commun. 2013;4:2612.

24. Malta TM, Sokolov A, Gentles AJ, Burzykowski T, Poisson L, Weinstein JN,

et al. Machine Learning Identifies Stemness Features Associated with

Oncogenic Dedifferentiation. Cell. 2018;173:338-354.e15.

25. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, et al. An embryonic stem cell–like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet. 2008;40:499–507.

26. Buccitelli C, Salgueiro L, Rowald K, Sotillo R, Mardin BR, Korbel JO. Pan-cancer analysis distinguishes transcriptional changes of aneuploidy from proliferation. Genome Res. 2017;27:501–11.

27. Han X, Wang R, Zhou Y, Fei L, Sun H, Lai S, et al. Mapping the Mouse Cell Atlas by Microwell-Seq. Cell. 2018;172:1091–1107.e17.

28. Becht E, Giraldo NA, Lacroix L, Buttard B, Elarouci N, Petitprez F, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17:218.

29. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, et al. From FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline. Curr Protoc Bioinforma. Hoboken, NJ, USA: John Wiley & Sons, Inc.; 2013. p. 11.10.1–11.10.33.

30. Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018;46:D1062–7.

31. Wu G, Dawson E, Duong A, Haw R, Stein L. ReactomeFIViz: a Cytoscape app for pathway and network-based data analysis. F1000Research. 2014;3:146.

32. Schröder MS, Culhane AC, Quackenbush J, Haibe-Kains B. survcomp: an R/Bioconductor package for performance assessment and comparison of survival models. Bioinformatics. 2011;27:3206–8.

33. Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36:D901–6.

34. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.

35. Wang K, Singh D, Zeng Z, Coleman SJ, Huang Y, Savich GL, et al. MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res. 2010;38:e178.

36. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011;12:323.

37. Barbie TU, Alexe G, Aref AR, Li S, Zhu Z, Zhang X, et al. Targeting an IKBKE cytokine network impairs triple-negative breast cancer growth. J Clin Invest. 2014;124:5411–23.

38. Curtis C, Shah SP, Chin S-F, Turashvili G, Rueda OM, Dunning MJ, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346–52.

39. Carter SL, Eklund AC, Kohane IS, Harris LN, Szallasi Z. A signature of chromosomal instability inferred from gene expression profiles predicts clinical outcome in multiple human cancers. Nat Genet. 2006;38:1043–8.

40. Nguyen QH, Pervolarakis N, Blake K, Ma D, Davis RT, James N, et al. Profiling human breast epithelial cells using single cell RNA sequencing identifies cell diversity. Nat Commun. 2018;9:2028.

41. Aboussekhra A. Role of cancer-associated fibroblasts in breast cancer development and prognosis. Int J Dev Biol. 2011;55:841–9.

42. Finak G, Bertos N, Pepin F, Sadekova S, Souleimanova M, Zhao H, et al. Stromal gene expression predicts clinical outcome in breast cancer. Nat Med. 2008;14:518–27.

43. Rangel R, Lee S-C, Hon-Kim Ban K, Guzman-Rojas L, Mann MB, Newberg JY, et al. Transposon mutagenesis identifies genes that cooperate with mutant Pten in breast cancer progression. Proc Natl Acad Sci U S A. 2016;113:E7749–58.

44. Brady N, Chuntova P, Bade LK, Schwertfeger KL. The FGF/FGFR axis as a therapeutic target in breast cancer. Expert Rev Endocrinol Metab. 2013;8:391–402.

45. van Roosmalen W, Le Dévédec SE, Golani O, Smid M, Pulyakhina I, Timmermans AM, et al. Tumor cell migration screen identifies SRPK1 as breast cancer metastasis determinant. J Clin Invest. 2015;125:1648–64.

46. Tsai H-C, Huang C-Y, Su H-L, Tang C-H. CCN2 Enhances Resistance to Cisplatin-Mediating Cell Apoptosis in Human Osteosarcoma. PLoS One. 2014;9:e90159.

47. Arumugam T, Ramachandran V, Fournier KF, Wang H, Marquis L, Abbruzzese JL, et al. Cancer Research. Cancer Res. 2009;63:2649–57.

48. Robichaud N, del Rincon S V, Huor B, Alain T, Petruccelli LA, Hearnden J, et al. Phosphorylation of eIF4E promotes EMT and metastasis via translational control of SNAIL and MMP-3. Oncogene. 2015;34:2032–42.

49. Fulgencio JP, Kohl C, Girard J, Pégorier JP. Troglitazone inhibits fatty acid oxidation and esterification, and gluconeogenesis in isolated hepatocytes from starved rats. Diabetes. 1996;45:1556–62.

50. Wijdeven RH, Pang B, van der Zanden SY, Qiao X, Blomen V, Hoogstraat M, et al. Genome-Wide Identification and Characterization of Novel Factors Conferring Resistance to Topoisomerase II Poisons in Cancer. Cancer Res. 2015;75:4176–87.

51. Beurel E, Michalek SM, Jope RS. Innate and adaptive immune responses regulated by glycogen synthase kinase-3 (GSK3). Trends Immunol. 2010;31:24–31.

52. De la Cruz-Hernandez E, Medina-Franco JL, Trujillo J, Chavez-Blanco A, Dominguez-Gomez G, Perez-Cardenas E, et al. Ribavirin as a tri-targeted antitumor repositioned drug. Oncol Rep. 2015;33:2384–92.

53. Blanquicett C, Roman J, Hart CM. Thiazolidinediones as anti-cancer agents. Cancer Ther. 2008;6:25–34.

54. Ahn R, Sabourin V, Bolt AM, Hébert S, Totten S, De Jay N, et al. The Shc1 adaptor simultaneously balances Stat1 and Stat3 activity to promote breast cancer immune suppression. Nat Commun. 2017;8:14638.

55. Casey SC, Tong L, Li Y, Do R, Walz S, Fitzgerald KN, et al. MYC regulates the antitumor immune response through CD47 and PD-L1. Science. 2016;352:227–31.

56. Kim KH, Roberts CWM. Targeting EZH2 in cancer. Nat Med. 2016;22:128–34.

57. Lee Y, Jung WH, Koo JS. Adipocytes can induce epithelial-mesenchymal transition in breast cancer cells. Breast Cancer Res Treat. 2015;153:323–35.

58. Cuadrado A, Garcia-Fernandez LF, Gonzalez L, Suarez Y, Losada A, Alcaide V, et al. Aplidin induces apoptosis in human cancer cells via glutathione depletion and sustained activation of the epidermal growth factor receptor, Src, JNK, and p38 MAPK. J Biol Chem. 2003;278:241–50.

59. Pettersson F, Del Rincon S V, Emond A, Huor B, Ngan E, Ng J, et al. Genetic and pharmacologic inhibition of eIF4E reduces breast cancer cell migration, invasion, and metastasis. Cancer Res. 2015;75:1102–12.

60. D’Incalci M, Zambelli A. Trabectedin for the treatment of breast cancer. Expert Opin Investig Drugs. 2016;25:105–15.

Table legends

Table 1. Summary of gene expression profiles from the aggregated data

Matched clinical information was obtained from the Gene Expression Omnibus (GEO). The sample counts remaining after ER-, PR-, and HER2-positive samples were filtered out are listed under Primary and Metastasis. Some of the samples are distant metastases, whose sites are described under Tissue type. In total, 601 primary and 356 metastasis samples were chosen for functional analysis.

Table 2. Molecular subtype and clinical summary of ROADMAP patients

Figure legends

Figure 1. Overview of TNBC molecular subtypes

a Heatmap of hallmark pathway activity derived from the expression profiles, showing clear differences among subtypes. Rows indicate the mean activity of the selected pathways, and columns represent all TNBC samples. b Enrichment status of gene signatures, mammary gland cell abundance, and embryonic stem cell and proliferation signatures in the different CMSs. MMP, matrix metalloproteinase; L. progenitor, luminal progenitor; L. epithelial, luminal epithelial; blood/mitotic, blood cell mitotic cell cycle process. c Overall survival and disease-free survival plots for our aggregated data and for Curtis 2012 (38), each classified into four CMSs. All p-values were calculated by Cox regression analysis.

Figure 2. Functional investigation and gene interactions of differentially expressed genes between primary and metastatic TNBC

a Significantly enriched pathways derived from the DEG set between primary and metastatic TNBC. b CI plot for metastasis prediction markers. Cox regression analysis was performed between two groups defined by gene expression. c Gene interaction network of DEGs after GSEA. Red nodes indicate genes activated in metastatic samples, and blue nodes indicate genes activated in primary samples. Node sizes represent −log(p-value); p-values were calculated by t-test between primary and metastatic samples.

Figure 3. Overall status of functional interactions between drug response and key pathways within each CMS

a Boxplots of DR scores estimated from aggregated data, MSL, LAR, and SL, respectively. b Interaction networks of drugs and key pathways that are highly correlated (r > 0.65) for each CMS. The networks are ordered by subtype in the same way as the boxplots. Square and circle nodes represent key pathways and drugs, respectively. Node colors of pathways indicate pathway activity scores. Red drug nodes indicate resistance signatures, and blue drug nodes indicate response signatures. Edge width represents the gene degree overlap between drug and pathway gene sets. Numbers on edges represent the greatest gene degree for each subtype. c Heatmap of DR scores estimated from the gene expression profiles of ROADMAP patients. d Progression-free survival plot of ROADMAP patients who underwent cisplatin therapy. e Two waterfall plots of the cisplatin resistance DR scores estimated from the ROADMAP expression profiles, colored by CMS (left panel) and by the patients’ cisplatin response (right panel). f Boxplot of cisplatin IC50 for 64 biobank TNBC cells, grouped into non-responders (NR) and responders (R) (left), and boxplot of cisplatin resistance DR scores for NR and R from the biobank (right).

Table 1. Summary of gene expression profiles from the aggregated data

Matched clinical information is based on Gene Expression Omnibus (GEO). Sample counts after filtering out ER, PR, and HER2 positive samples are listed below under primary and metastasis. A part of the samples include long-distance metastasis described in tissue type. In total, 601 primary and 356 metastasis samples were chosen for functional analysis.

| ID | Total | Primary | Metastasis | Tissue type | Platform |
|---|---|---|---|---|---|
| GSE103091 | 107 | 53 | 21 | 74 Breast | GPL570 |
| GSE11078 | 23 | 14 | 0 | 14 Breast | GPL570 |
| GSE2034 | 286 | 223 | 0 | 223 Breast | GPL96 |
| GSE46141 | 91 | 0 | 52 | 4 Breast, 8 Liver, 24 LN, 14 Skin, 1 Ascites, 1 Bone | GPL10379 |
| GSE12276 | 204 | 16 | 129 | 145 Breast | GPL570 |
| GSE14017 | 29 | 0 | 16 | 6 Bone, 8 Brain, 2 Lung | GPL570 |
| GSE14018 | 36 | 0 | 27 | 5 Bone, 5 Brain, 3 Liver, 14 Lung | GPL96 |
| GSE46563 | 94 | 14 | 0 | 14 Breast | GPL6884 |
| GSE46928 | 52 | 20 | 6 | 26 Breast | GPL96 |
| GSE5327 | 58 | 29 | 9 | 38 Breast | GPL96 |
| GSE54323 | 29 | 0 | 21 | 12 Bone, 6 Liver, 3 LN | GPL570 |
| GSE56493 | 120 | 0 | 74 | 11 Breast, 13 Liver, 29 LN, 20 Skin, 1 UNK | GPL10379 |
| GSE7390 | 198 | 141 | 0 | 141 Breast | GPL96 |
| GSE76124 | 198 | 91 | 1 | 92 Breast | GPL570 |
| Total | 1,525 | 601 | 356 | | |
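
The column totals reported for Table 1 can be cross-checked against the per-dataset rows. The following is a minimal sketch in Python, assuming the counts are transcribed correctly from the table; the names `TABLE1` and `column_totals` are illustrative and are not part of the original study.

```python
# Per-dataset counts transcribed from Table 1:
# (GEO ID, total samples, primary TNBC samples, metastatic TNBC samples).
# The variable and function names are hypothetical, chosen for this sketch.
TABLE1 = [
    ("GSE103091", 107, 53, 21),
    ("GSE11078", 23, 14, 0),
    ("GSE2034", 286, 223, 0),
    ("GSE46141", 91, 0, 52),
    ("GSE12276", 204, 16, 129),
    ("GSE14017", 29, 0, 16),
    ("GSE14018", 36, 0, 27),
    ("GSE46563", 94, 14, 0),
    ("GSE46928", 52, 20, 6),
    ("GSE5327", 58, 29, 9),
    ("GSE54323", 29, 0, 21),
    ("GSE56493", 120, 0, 74),
    ("GSE7390", 198, 141, 0),
    ("GSE76124", 198, 91, 1),
]


def column_totals(rows):
    """Sum the total, primary, and metastasis columns across all datasets."""
    total = sum(row[1] for row in rows)
    primary = sum(row[2] for row in rows)
    metastasis = sum(row[3] for row in rows)
    return total, primary, metastasis


if __name__ == "__main__":
    total, primary, metastasis = column_totals(TABLE1)
    print(f"datasets={len(TABLE1)} total={total} "
          f"primary={primary} metastasis={metastasis}")
```

Summing the rows this way gives 1,525 total, 601 primary, and 356 metastasis samples across the 14 datasets, matching the Total row and the legend.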

Table 2. Molecular subtype and clinical summary of ROADMAP patients

| Characteristics | | MSL n (%) | LAR n (%) | IM n (%) | SL n (%) | P |
|---|---|---|---|---|---|---|
| Number of samples | | 4 | 12 | 1 | 21 | |
| Age | < 50 y | 3 (75) | 5 (41.7) | — | 10 (47.6) | 0.67 |
| | ≥ 50 y | 1 (25) | 7 (58.3) | 1 (100) | 11 (52.4) | |
| Sample type | Fresh frozen | 2 (50) | 2 (16.7) | — | 6 (28.6) | 0.58 |
| | FFPE | 2 (50) | 10 (83.3) | 1 (100) | 15 (71.4) | |
| Metastases | No metastases | — | 4 (33.3) | — | 9 (42.9) | 0.42 |
| | Metastases | 4 (100) | 8 (66.7) | 1 (100) | 12 (57.1) | |
| Number of chemo before biopsy | 0 | — | 3 (25) | — | 5 (23.8) | 0.87 |
| | 1 | 3 (75) | 4 (33.3) | 1 (100) | 9 (42.9) | |
| | ≥ 2 | 1 (25) | 5 (41.7) | — | 7 (33.3) | |
| Cisplatin response | Response | 1 (25) | 7 (58.4) | 1 (100) | 7 (33.3) | 0.04 |
| | Non-response | 3 (75) | 1 (8.3) | — | 11 (52.4) | |
| | Unknown | — | 4 (33.3) | — | 3 (14.3) | |
| Survival status | Alive | — | 7 | 1 | 7 | 0.15 |
| | Dead | 3 | 5 | — | 13 | |
| | Unknown | 1 | — | — | 1 | |

Jihyun Kim, Doyeong Yu, Youngmee Kwon, et al. Genomic characteristics of triple-negative breast cancer nominate molecular subtypes that predict chemotherapy response. Mol Cancer Res. Published OnlineFirst November 8, 2019.

Updated version: Access the most recent version of this article at doi: 10.1158/1541-7786.MCR-19-0453

Supplementary Material: Access the most recent supplemental material at http://mcr.aacrjournals.org/content/suppl/2019/11/08/1541-7786.MCR-19-0453.DC1

Author Manuscript: Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.