

Transcriptional landscape of B cell precursor acute lymphoblastic leukemia based on an international study of 1,223 cases

Jian-Feng Li (a,1), Yu-Ting Dai (a,1), Henrik Lilljebjörn (b,1), Shu-Hong Shen (c), Bo-Wen Cui (a), Ling Bai (a), Yuan-Fang Liu (a), Mao-Xiang Qian (d), Yasuo Kubota (e), Hitoshi Kiyoi (f), Itaru Matsumura (g), Yasushi Miyazaki (h), Linda Olsson (b), Ah Moy Tan (i), Hany Ariffin (j), Jing Chen (c), Junko Takita (k), Takahiko Yasuda (l), Hiroyuki Mano (m), Bertil Johansson (b,n), Jun J. Yang (d,o), Allen Eng-Juh Yeoh (p), Fumihiko Hayakawa (q), Zhu Chen (a,r,s,2), Ching-Hon Pui (o,2), Thoas Fioretos (b,n,2), Sai-Juan Chen (a,r,s,2), and Jin-Yan Huang (a,s,2)

(a) State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200025 Shanghai, China;
(b) Department of Laboratory Medicine, Division of Clinical Genetics, Lund University, 22184 Lund, Sweden;
(c) Key Laboratory of Pediatric Hematology and Oncology, Ministry of Health, Department of Hematology and Oncology, Shanghai Children’s Medical Center, Shanghai Jiao Tong University School of Medicine, 200127 Shanghai, China;
(d) Department of Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN 38105;
(e) Department of Pediatrics, Graduate School of Medicine, The University of Tokyo, 1138654 Tokyo, Japan;
(f) Department of Hematology and Oncology, Nagoya University Graduate School of Medicine, 4668550 Nagoya, Japan;
(g) Division of Hematology and Rheumatology, Kinki University Faculty of Medicine, 5778502 Osaka, Japan;
(h) Department of Hematology, Atomic Bomb Disease Institute, Nagasaki University, 8528521 Nagasaki, Japan;
(i) Department of Paediatrics, KK Women’s & Children’s Hospital, 229899 Singapore;
(j) Paediatric Haematology-Oncology Unit, University of Malaya Medical Centre, 59100 Kuala Lumpur, Malaysia;
(k) Department of Pediatrics, Graduate School of Medicine, Kyoto University, 6068501 Kyoto, Japan;
(l) Clinical Research Center, Nagoya Medical Center, National Hospital Organization, 4600001 Nagoya, Japan;
(m) National Cancer Center Research Institute, 1040045 Tokyo, Japan;
(n) Department of Clinical Genetics, University and Regional Laboratories, Region Skåne, Lund 22185, Sweden;
(o) Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105;
(p) Centre for Translational Research in Acute Leukaemia, Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, 119228 Singapore;
(q) Department of Pathophysiological Laboratory Sciences, Nagoya University Graduate School of Medicine, 4618673 Nagoya, Japan;
(r) Key Laboratory of Systems Biomedicine, Ministry of Education, Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; and
(s) Pôle de Recherches Sino-Français en Science du Vivant et Génomique, Laboratory of Molecular Pathology, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China

Contributed by Zhu Chen, October 17, 2018 (sent for review August 29, 2018; reviewed by Christine J. Harrison and Patrick Tan)

Most B cell precursor acute lymphoblastic leukemia (BCP ALL) can be classified into known major genetic subtypes, while a substantial proportion of BCP ALL remains poorly characterized in relation to its underlying genomic abnormalities. We therefore initiated a large-scale international study to reanalyze and delineate the transcriptome landscape of 1,223 BCP ALL cases using RNA sequencing. Fourteen BCP ALL gene expression subgroups (G1 to G14) were identified. Apart from extending eight previously described subgroups (G1 to G8 associated with MEF2D fusions, TCF3–PBX1 fusions, ETV6–RUNX1–positive/ETV6–RUNX1–like, DUX4 fusions, ZNF384 fusions, BCR–ABL1/Ph–like, high hyperdiploidy, and KMT2A fusions), we defined six additional gene expression subgroups: G9 was associated with both PAX5 and CRLF2 fusions; G10 and G11 with mutations in PAX5 (p.P80R) and IKZF1 (p.N159Y), respectively; G12 with IGH–CEBPE fusion and mutations in ZEB2 (p.H1038R); and G13 and G14 with TCF3/4–HLF and NUTM1 fusions, respectively. In pediatric BCP ALL, subgroups G2 to G5 and G7 (51 to 65/67 chromosomes) were associated with low risk, G7 (with ≤50 chromosomes) and G9 were intermediate-risk, whereas G1, G6, and G8 were defined as high-risk subgroups. In adult BCP ALL, G1, G2, G6, and G8 were associated with high risk, while G4, G5, and G7 had relatively favorable outcomes. This large-scale transcriptome sequence analysis of BCP ALL revealed distinct molecular subgroups that reflect discrete pathways of BCP ALL, informing disease classification and prognostic stratification. The combined results strongly advocate that RNA sequencing be introduced into the clinical diagnostic workup of BCP ALL.

BCP ALL | RNA-seq | subtypes | gene fusion | gene mutation

B cell precursor acute lymphoblastic leukemia (BCP ALL), the most common childhood cancer, is a highly heterogeneous malignant hematological disorder (1). Previous genome- and/or transcriptome-wide analyses of BCP ALLs have greatly improved our understanding of the pathogenesis and prognostic impact of many molecular abnormalities in BCP ALL (2, 3). Structural chromosomal alterations as well as sequence mutations are common in childhood and adult BCP ALL. In the last four decades, most of the recurring chromosomal abnormalities, including aneuploidy, chromosomal rearrangements/gene fusions (e.g., ETV6–RUNX1, BCR–ABL1, and TCF3–PBX1), and rearrangements of KMT2A (previously MLL), were identified by

Significance

In BCP ALL, molecular classification is used for risk stratification and influences treatment strategies. We reanalyzed the transcriptomic landscape of 1,223 BCP ALLs and identified 14 subgroups based on their transcriptional profiles. Eight of these (G1 to G8) are previously well-known subgroups, harboring specific genetic abnormalities. The sample size allowed the identification of six previously undescribed subgroups, consisting of cases harboring PAX5 or CRLF2 fusions (G9), PAX5 (p.P80R) mutations (G10), IKZF1 (p.N159Y) mutations (G11), either ZEB2 (p.H1038R) mutations or IGH–CEBPE fusions (G12), HLF rearrangements (G13), or NUTM1 rearrangements (G14). In addition, this study allowed us to determine the prognostic impact of several recently defined subgroups. This study suggests that RNA sequencing should be a valuable tool in the routine diagnostic workup for ALL.

Author contributions: Z.C., C.-H.P., T.F., S.-J.C., and J.-Y.H. designed research; J.-F.L., Y.-T.D., H.L., S.-H.S., B.-W.C., L.B., Y.-F.L., M.-X.Q., Y.K., H.K., I.M., Y.M., L.O., A.M.T., H.A., J.C., J.T., T.Y., H.M., B.J., J.J.Y., A.E.-J.Y., F.H., Z.C., C.-H.P., T.F., S.-J.C., and J.-Y.H. performed research; S.-H.S., Y.-F.L., J.C., J.J.Y., and F.H. collected the samples and clinical data; J.-F.L., Y.-T.D., H.L., B.-W.C., L.B., Z.C., C.-H.P., T.F., S.-J.C., and J.-Y.H. analyzed data; Z.C., C.-H.P., T.F., S.-J.C., and J.-Y.H. wrote the paper; and J.-F.L., Z.C., C.-H.P., T.F., S.-J.C., and J.-Y.H. critically revised the manuscript.

Reviewers: C.J.H., Newcastle University; and P.T., Duke–NUS Medical School.

The authors declare no conflict of interest.

Published under the PNAS license.

1 J.-F.L., Y.-T.D., and H.L. contributed equally to this work.

2 To whom correspondence may be addressed. Email: [email protected], [email protected], [email protected], [email protected], or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1814397115/-/DCSupplemental.

Published online November 28, 2018.

www.pnas.org/cgi/doi/10.1073/pnas.1814397115 PNAS | vol. 115 | no. 50 | E11711–E11720


cytogenetics and fluorescence in situ hybridization. Subsequently, gene expression profiling revealed that these cytogenetic subgroups displayed specific gene expression patterns (3–5). With the advent of genome sequencing technology, several groups discovered a large number of novel gene mutations and fusions, such as those involving ZNF384, MEF2D, and DUX4 rearrangements (6–11), among those cases with no defining chromosomal abnormalities, termed “B-other-ALL.”

However, it remained unknown whether additional novel BCP ALL subtypes could be detected by integrated analysis of pooled

[Figure 1 (heatmap). Annotation tracks: age (adult, paediatric); gender (male, female); MEF2D fusions; TCF3-PBX1; ETV6-RUNX1; ETV6-RUNX1-like; DUX4 fusions; ZNF384/ZNF362 fusions; BCR-ABL1; Ph-like; CRLF2 fusions; hyperdiploidy; KMT2A fusions; PAX5 fusions; TCF3/4-HLF; NUTM1 fusions; IGH-CEBPE; PAX5 (p.P80R); ZEB2 (p.H1038R); IKZF1 (p.N159Y). Subgroup legend: G1 (MEF2D fusions), G2 (TCF3-PBX1), G3 (ETV6-RUNX1/-like), G4 (DUX4 fusions), G5 (ZNF384 fusions), G6 (BCR-ABL1/Ph-like), G7 (hyperdiploidy), G8 (KMT2A fusions), G9 (PAX5 and CRLF2 fusions), G10 [PAX5 (p.P80R) mutation], G11 [IKZF1 (p.N159Y) mutation], G12 [ZEB2 (p.H1038R)/IGH-CEBPE], G13 (TCF3/4-HLF), G14 (NUTM1 fusions). Color key: low to high expression.]

Fig. 1. Two-step unsupervised hierarchical clustering of the global gene expression profile from 1,223 BCP ALL patients. In the gene expression subgroups of G1 to G7 (Left) and G8 to G14 (Right), columns indicate 1,223 BCP ALL patients and rows represent gene expression levels or genetic features for each patient. Genes showing over- and underexpression in the heatmap are shown in red and blue, respectively. The first box above the heatmap indicates genotypes and fusion genes, followed by a box including three clusters of hotspot sequence mutations defined in this analysis. The first row below the heatmap specifies the 14 BCP ALL subgroups identified on the basis of gene expression profiles. In the unsupervised hierarchical clustering heatmap of G10 to G14 (Lower Right), columns represent patients and rows are top variance genes in G10 to G14. The box below the heatmap indicates the five gene expression subgroups, gender, and genotypes of the G10 to G14 clusters.
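The caption mentions that the G10 to G14 heatmap rows are "top variance genes". As a minimal sketch of what such a selection step could look like, the snippet below ranks genes by their variance across samples and keeps the highest-ranked ones. The function name `top_variance_genes`, the gene labels, the expression values, and the cutoff of two genes are all illustrative assumptions; the source does not state how many genes were retained or how the values were normalized.

```python
from statistics import pvariance


def top_variance_genes(expr, n_top):
    """Return the n_top gene names with the largest variance across samples.

    expr maps a gene name to a list of expression values, one per sample.
    """
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)
    return ranked[:n_top]


# Toy, made-up expression values (not data from the study).
toy_expr = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],  # nearly constant across samples
    "GENE_B": [0.0, 5.0, 0.0, 5.0],  # strongly variable
    "GENE_C": [2.0, 3.0, 2.0, 3.0],  # mildly variable
}

selected = top_variance_genes(toy_expr, 2)
print(selected)
```

Only the selected genes would then be passed on to the clustering step, which keeps genes that discriminate between samples and drops those that are flat across the cohort.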


datasets from studies with otherwise relatively small sample sizes. We hypothesized that the versatility provided by RNA-seq (sequencing) would uncover otherwise undetected genetic abnormalities in BCP ALL, providing that sufficient numbers of cases were analyzed. Thus, through the formation of an international consortium of five major study groups, we have delineated the transcriptomic landscape of BCP ALL and at the same time identified new subgroups of biological and clinical importance.

Results

Identification of BCP ALL Subgroups with Distinctive Gene Expression Profiles and Genomic Aberrations. To comprehensively identify BCP ALL subtypes, we first systematically classified gene expression profiles, gene fusions, and gene mutations from RNA-seq data of 1,223 BCP ALL cases from five significant patient cohorts (Table 1 and SI Appendix, Fig. S2 and Dataset S2). A consecutive two-step unsupervised clustering identified 14 distinct subgroups (G1 to G14) on the basis of their gene expression signatures (Fig. 1 and Table 2). Most of these gene expression subgroups segregated with well-known genetic abnormalities. TCF3–PBX1 fusions were present among the G2 subgroup (n = 76, 6%); ETV6–RUNX1 fusion belonged to G3; BCR–ABL1 (Ph) and BCR–ABL1–like (Ph-like, including a cluster with CRLF2 fusions) comprised G6 (n = 167, 14%); and cases with a hyperdiploid karyotype formed the subgroup G7 (n = 408, 33%). Three subgroups recently identified among B-other-ALL cases were those with MEF2D fusions (G1; n = 39, 3%), DUX4 rearrangements (G4; n = 63, 5%), and ZNF384 fusions (G5; n = 74, 6%) (6–11). These recently described subgroups formed distinctive gene expression-based clusters, consistent with prior reports (6, 7, 10, 11). The most recently defined BCP ALL ETV6–RUNX1–like cluster, characterized by the absence of ETV6–RUNX1 fusions but with similar gene expression profiles to ETV6–RUNX1–positive BCP ALLs (6), was also found among our combined datasets. In concordance with previous findings (6), both fusions involving ETV6 and fusions involving IKZF1 were common in these ETV6–RUNX1–like cases (Dataset S2). However, all ETV6–RUNX1–negative cases exhibiting a gene expression profile similar to ETV6–RUNX1–positive cases were defined as ETV6–RUNX1–like. Together, ETV6–RUNX1–positive/ETV6–RUNX1–like BCP ALL constituted G3 (n = 161, 13%). KMT2A-rearranged cases formed a distinct subgroup (G8; n = 56, 5%).

Notably, six previously undescribed gene expression subgroups (G9 to G14) with distinct genomic abnormalities were identified. G9 (n = 111, 9%) was associated with PAX5 fusions and “Ph-like” ALL with CRLF2 fusions (12). G10 (n = 23, 2%) and G11 (n = 6, <1%) were characterized by two hotspot mutations in PAX5 (p.P80R) (21/22, 96%) and IKZF1 (p.N159Y) (6/6, 100%), respectively. The subgroup G12 (n = 8, <1%) was enriched for hotspot mutation in ZEB2 (p.H1038R) (5/8, 63%) and IGH–CEBPE fusions (3/8, 38%). G13 (n = 11, <1%) and G14 (n = 20, 2%) were associated with TCF3/4–HLF (7/11, 64%) and NUTM1 (6/20, 30%) rearrangements, respectively.
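The subgroup sizes quoted above can be checked with simple arithmetic. The sketch below takes the case counts exactly as stated in the text, confirms that they sum to 1,223, and recomputes each subgroup's share of the cohort. The helper names `percent` and `round_half_up` are illustrative, and rounding halves upward is an assumption about how the percentages were rounded.

```python
from math import floor

# Case counts per subgroup as reported in the text.
SUBGROUP_COUNTS = {
    "G1": 39, "G2": 76, "G3": 161, "G4": 63, "G5": 74, "G6": 167, "G7": 408,
    "G8": 56, "G9": 111, "G10": 23, "G11": 6, "G12": 8, "G13": 11, "G14": 20,
}


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


def round_half_up(value):
    """Round to the nearest integer, with .5 rounded upward (assumed convention)."""
    return floor(value + 0.5)


total_cases = sum(SUBGROUP_COUNTS.values())
shares = {group: percent(n, total_cases) for group, n in SUBGROUP_COUNTS.items()}

for group, share in shares.items():
    label = "<1%" if share < 1 else f"{round_half_up(share)}%"
    print(f"{group}: n = {SUBGROUP_COUNTS[group]}, {label}")

# Within G12 (n = 8): 5 cases with ZEB2 (p.H1038R) and 3 with IGH-CEBPE fusions.
zeb2_share = round_half_up(percent(5, 8))  # 62.5% rounds to 63
igh_cebpe_share = round_half_up(percent(3, 8))  # 37.5% rounds to 38
```

The same two helpers apply to the within-subgroup fractions, for example 7 of 11 cases in G13 and 6 of 20 cases in G14.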

Nonsilent Sequence Mutation Profile. We next analyzed nonsilent sequence variants in available whole exome sequencing (WES) and RNA-seq data, based on in-house analysis criteria from previous studies (6, 8, 11, 13). We identified 44 genes that were recurrently mutated in at least 1% of the cases (12/1,223 cases). Nonsilent variants in NRAS, KRAS, FLT3, KMT2D, PAX5, PTPN11, CREBBP, and TP53 exhibited the highest mutation frequencies (3 to 14%) (SI Appendix, Figs. S5 and S6A). The mutated genes (>1%) were functionally divided into five categories: signaling molecules, transcription factors, epigenetic factors, cell cycle, and others (Dataset S3). Distinct gene mutation categories showed different levels of enrichment among the gene expression subgroups G1 to G14. Gene mutations among signaling molecules were enriched in subgroups G5, G7, G9, and G10, while G4, G10, G11, and G12 harbored a higher number of variants in transcription factor genes. HIST family (HIST1H2AG and HIST1H2AI) point mutations located in the histone H2A type 1 domain (SI Appendix, Fig. S7A) were highly correlated with G2 (TCF3–PBX1), while WHSC1 (NSD2) point mutations (p.E1099K) in the SET domain (SI Appendix, Fig. S7B) were significantly associated with G3 (ETV6–RUNX1–positive/ETV6–RUNX1–like; SI Appendix, Figs. S5–S7).

Co-occurrence or mutual exclusivity of mutations was also evaluated using two-sided Fisher’s exact test. A total of 36 gene pairs (for example, TP53 and MYC) exhibited significant co-occurrence (P < 0.05; SI Appendix, Fig. S6B). Along with the novel subgroups defined in this study (G9 to G14), 13 gene pairs (for example, PAX5 and PTPN11, and ZEB2 and NRAS) exhibited significant co-occurrence (SI Appendix, Fig. S6 C and D). In G9, four gene pairs, namely PAX5 and IKZF1, JAK1 and SETD2, SH2B3 and ASXL1, and CDKN2A and ARID1B, exhibited significant co-occurrence (P < 0.05; SI Appendix, Fig. S6D).

Enrichment of certain mutations differed between pediatric and adult BCP ALL patients. Transcription factor mutations, such as in RUNX1, were more frequent in adult ALL, while signaling molecule and epigenetic factor WHSC1 mutations were more prevalent in pediatric BCP ALL (Datasets S5 and S6).

ZNF362 Fusions Cluster with ZNF384 Rearrangements (G5) and Display Activation of the JAK-STAT Pathway. Four cases harbored previously undescribed ZNF362 rearrangements: SMARCA2–ZNF362 (n = 3) and TAF15–ZNF362 (n = 1). These cases clustered within the G5 subgroup, otherwise associated with ZNF384 fusions (Fig. 2 and SI Appendix, Figs. S8A and S9). ZNF384 and ZNF362 are homologous C2H2-type zinc-finger transcription factors containing six zinc fingers that belong to the zinc-finger protein 384/nuclear matrix transcription factor 4 (ZFAM4) gene family (14). Of note, the zinc-finger domains were retained in both fusion proteins (SI Appendix, Fig. S8B), and both clusters showed similar gene expression profiles with activated JAK-STAT signaling pathway (SI Appendix, Fig. S8C). Moreover, the fusion partners of ZNF362, namely TAF15 and SMARCA2, were also found to fuse to ZNF384, with similar breakpoints.

Previously Undescribed Subgroups Associated with Different Gene Fusions/Sequence Mutations.

G9: PAX5 and CRLF2 fusions are representative of this subgroup. According to the gene expression profiles, 46 cases with PAX5 fusions and 13 cases with CRLF2 fusions (accounting for 41 and 12%, respectively) clustered together in G9 (n = 111). Previous work identified CRLF2 fusions in Down syndrome ALL and Ph-like BCP ALL, each of them accounting for approximately half of the cases (12, 15). In our study, 30% of the cases with CRLF2 fusions (13/44) were found in G9 and 57% (25/44) in the BCR–ABL1/Ph-like subgroup (G6), with the remaining cases present in G7 and G10 (Fig. 1). Notably, all 13 CRLF2 fusions in G9 were P2RY8–CRLF2 fusions, in contrast to those in G6, in which the fusion partners of CRLF2 were either P2RY8 or IGH. Of the 13 CRLF2 fusion cases in G9, seven also carried PAX5 fusions. Signaling molecule mutations were also significantly enriched in G9 (P < 0.001; SI Appendix, Fig. S5 and Dataset S5), a feature reminiscent of Down syndrome ALL (12). Compared with the CRLF2 fusion clusters in G6, the PI3K-Akt signaling (e.g., FLT4 and EGF), cytokine–cytokine receptor interaction (e.g., CCL17 and IL2RA), and hematopoietic cell lineage (e.g., CD33 and CD34) pathways were significantly down-regulated in the CRLF2 fusion-positive cases in G9 (SI Appendix, Fig. S10), whereas a B cell-specific member of the tumor necrosis factor (TNF) receptor superfamily, TNFRSF13B, was up-regulated


[Figure 2, panels A to I. (A, D, F) Protein domain plots of PAX5, IKZF1, and ZEB2 showing the positions of missense, frameshift, nonsense, protein insertion, and protein deletion variants. Annotated hotspots: 21/22 cases with exon3:c.C239G (p.P80R) cluster in G10; 6/6 cases with exon5:c.A475T (p.N159Y) cluster in G11; 5/15 cases with exon10:c.A3113G (p.H1038R) cluster in G12. (C, E) Violin plots of expression level (FPKM); volcano plots with log2 (fold change) on the x axis and −log10 (adjusted P value) on the y axis, with down-regulated and up-regulated genes marked; GSEA plots (enrichment score, ES) for KEGG cell adhesion molecules (CAMs), HALLMARK PI3K_AKT_MTOR signaling, KEGG JAK-STAT signaling pathway, and KEGG B-cell receptor signaling pathway. (H) Sequencing read coverage of CEBPE on chr14. (I) Volcano plot for ZEB2 (p.H1038R)/CEBPE-IGH (G12).]

Fig. 2. Schematic representation of identified PAX5 (p.P80R) (G10), IKZF1 (p.N159Y) (G11), and ZEB2 (p.H1038R)/IGH–CEBPE (G12) subgroups in BCP ALL. (A, D, and F) Protein domain plots and the positions of amino acid substitutions in distinct domains of the PAX5, IKZF1, and ZEB2 proteins. Hotspot mutations enriched in BCP ALL subgroups are marked with a red star (G10 to G12). (B and G) Structure prediction of the PAX5 and ZEB2 point mutations. The crystal structures of both the PAX5 and ZEB2 proteins were generated based on the Protein Data Bank using homology modeling. (C) Gene expression levels and gene set enrichment analysis (GSEA) of PAX5 (p.P80R) mutated cases. The violin plot (Left) shows the comparison of PAX5 expression levels between clusters of PAX5 (p.P80R)-positive samples, other PAX5 mutations, and all other cases. The mean and 25th and 75th percentiles are presented in the middle box of violin plots. The volcano plot (Right) shows differentially expressed genes between PAX5 (p.P80R)-positive (G10) patients and other patients. The x axis represents log2-transformed fold-change values, while the y axis is a −log10-transformed P value. Significantly up-regulated and down-regulated genes are shown in red and blue, respectively. GSEA plot of B-lymphocyte maturation and cell-adhesion molecules in PAX5 (p.P80R)-positive (G10) patients and other cases. P values were calculated by 1,000-gene set two-sided permutation tests. ns, not significant; *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001. (E) Gene expression levels and GSEA of cases showing the IKZF1 (p.N159Y) mutation (G11). The violin plot (Left) shows the comparison of IKZF1 expression levels between the cluster of IKZF1 (p.N159Y) cases (G11), cluster of other IKZF1 mutations, and other patients. The P values were calculated using Student’s t test. The volcano plot (Right) shows differentially expressed genes between IKZF1 (p.N159Y)-positive (G11) and -negative cases. GSEA plot of B cell receptor and the JAK-STAT signaling pathway in IKZF1 (p.N159Y)-positive (G11) and -negative cases. (H) Sequencing read coverage of CEBPE in four cases with IGH–CEBPE–positive BCP ALL (three cases are clustered in G12). The blue arrows indicate the fusion breakpoints. (I) Gene expression volcano plot of ZEB2 (p.H1038R)/IGH–CEBPE (G12) cases. The volcano plot (Right) shows differentially expressed genes between ZEB2 (p.H1038R)/IGH–CEBPE (G12) cases and negative cases. FPKM, fragments per kilobase of transcript per million mapped reads; ns, not significant; NES, normalized enrichment score; *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.
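The caption describes volcano plots with a log2-transformed fold change on the x axis and a −log10-transformed P value on the y axis. The sketch below shows how one gene could be placed on such a plot and labeled as up-regulated or down-regulated. The function names, the pseudocount of 1, the fold-change cutoff of 1 (a twofold change), and the P value cutoff of 0.05 are illustrative assumptions; the source does not state the thresholds or the differential expression method used.

```python
from math import log2, log10


def volcano_point(mean_case, mean_other, p_value, pseudocount=1.0):
    """Return (log2 fold change, -log10 P) for one gene.

    The pseudocount is an assumed convention that avoids division by zero
    for genes with no detectable expression in one group.
    """
    lfc = log2((mean_case + pseudocount) / (mean_other + pseudocount))
    return lfc, -log10(p_value)


def classify(lfc, neg_log10_p, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as "up", "down", or "ns" (not significant)."""
    if neg_log10_p < -log10(p_cutoff) or abs(lfc) < lfc_cutoff:
        return "ns"
    return "up" if lfc > 0 else "down"


# Made-up mean expression values (e.g., FPKM) and P value for a single gene.
x, y = volcano_point(mean_case=31.0, mean_other=3.0, p_value=1e-6)
label = classify(x, y)
print(f"log2FC = {x:.2f}, -log10(P) = {y:.2f}, call = {label}")
```

In a figure like this one, genes labeled "up" would be drawn in red and genes labeled "down" in blue, matching the color convention given in the caption.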

E11714 | www.pnas.org/cgi/doi/10.1073/pnas.1814397115 Li et al.


among those cases with CRLF2 fusions in G9 (SI Appendix, Fig. S10C) (16). However, the expression patterns of cytokine receptor and tyrosine kinase signaling genes (CRLF2, PDGFRB, JAK1, JAK2, JAK3, and RAS) were similar in the CRLF2 fusion-positive cases in G9 and G6.

G10: PAX5 (p.P80R) point mutation is strongly associated with a distinct gene expression profile. PAX5 encodes the B cell lineage-specific activator protein that is normally expressed at the early stage of B cell differentiation (17). It has previously been reported that PAX5 haploinsufficiency is central to ALL pathogenesis (17). In the present study, 64 cases harbored PAX5 sequence mutations, including p.P80R (n = 22), p.V26G (n = 10), p.L58F/L58P (n = 4), and others. PAX5 (p.P80R), located in the DNA-binding domain, was correlated with increased expression of PAX5 (P < 0.001) compared with other BCP ALLs without PAX5 mutations (Fig. 2 A–C and SI Appendix, Fig. S11A). Previous studies have described heterozygous deletions of CDKN2A/B, IKZF1, and PAX5 in PAX5 (p.P80R)-positive BCP ALL patients (18). Notably, 21 of the 22 PAX5 (p.P80R) cases clustered in subgroup G10, with no other known driver gene abnormalities detected, except for one case with a P2RY8–CRLF2 fusion (C184) (Fig. 1). PAX5 (p.P80R)-positive cases showed up-regulation of PI3K/Akt/mTOR signaling and down-regulation of cell-adhesion molecules (Fig. 2C). As in G9, TNFRSF13B gene up-regulation was seen in this subgroup (SI Appendix, Figs. S10C and S12A).

G11: IKZF1 (p.N159Y) point mutation associated with a distinct gene expression profile and increased SALL1 expression. Inherited or somatic sequence mutations of IKZF1 have previously been described in BCP ALL (19–21). In the present series, 26 cases with IKZF1 sequence abnormalities were found, with mutations commonly located in its DNA-binding domain (Fig. 2D and SI Appendix, Fig. S11B). Notably, IKZF1 (p.N159Y) cases (n = 6) formed a gene expression subgroup (G11) without other detectable genomic rearrangements (Fig. 1). Pathway analysis showed down-regulation of B cell receptor signaling and of JAK-STAT signaling, including FLT3 (P < 0.001) and STAT5A (P < 0.001) (Fig. 2E). We also found that spalt-like transcription factor 1 (SALL1) was overexpressed (P < 0.001) in G11 (SI Appendix, Fig. S12B). Previous studies have reported that SALL1 can recruit histone deacetylase (HDAC) to mediate transcriptional repression and that its promoter is often methylated in BCP ALL (22, 23).

G12: hotspot point mutations in ZEB2 (p.H1038R) and IGH–CEBPE fusion. ZEB2 is a member of the Zfh1 family of two-handed zinc-finger/homeodomain proteins. We and others have previously reported mutations of ZEB2 in BCP ALL (9, 10, 24). Here, we showed that ZEB2 was recurrently mutated (n = 25), with the p.H1038R hotspot mutation (n = 15) being located within the DNA-binding domain (Fig. 2 F and G and SI Appendix, Fig. S11C). Based on unsupervised clustering of gene expression, cases with ZEB2 (p.H1038R) (n = 5) clustered closely with cases with IGH–CEBPE fusions (n = 3). The remaining 10 cases with ZEB2 (p.H1038R) mutations mostly coexisted with other known gene fusions, such as TCF3–PBX1 (n = 1), DUX4 fusions (n = 1), ZNF384 fusions (n = 5), and ZNF362 fusions (n = 1). A significant enrichment of NRAS mutations (5/8) was also found in the G12 cases. All four cases with IGH–CEBPE fusion exhibited a truncation of the 3′ UTR region of CEBPE (Fig. 2H). The known ALL driver gene LMO1 was up-regulated in G12 (Fig. 2I and SI Appendix, Fig. S12C).

G13: TCF3/4–HLF fusion. TCF3–HLF is a rare (<1%) fusion associated with high-risk BCP ALL and PAX5 haploinsufficiency from allelic deletion. It has been shown that TCF3–HLF–positive cases may respond to the BCL2 inhibitor venetoclax (25). It has also been shown that the homologous TCF4 may compensate for TCF3 in a conditional knockout mouse model (26). Herein, we identified one case with a TCF4–HLF fusion, which clustered with six cases of TCF3–HLF in G13 (Fig. 1 and SI Appendix, Fig. S13). Both TCF3–HLF and TCF4–HLF retained part of the HLF bZIP_2 domain (Fig. 3A) and displayed significantly up-regulated expression of HLF (Fig. 3 A–C). Down-regulation of the JAK-STAT signaling pathway and up-regulation of the NOTCH signaling pathway were also noted (Fig. 3D). Four cases with low expression of HLF, but lacking TCF3/4–HLF fusions, were assigned to this cluster (Fig. 1), based on evidence of expression signatures similar to TCF3/4–HLF fusion (e.g., BCL2, PAX5, JAK2, and STAT5), suggesting that alternative genetic alterations may elicit the same transcriptional program.

G14: NUTM1 fusions with aberrantly high expression of NUTM1. NUTM1 is a chromatin regulator that functions to recruit p300, leading to increased local histone acetylation (27). NUTM1 is normally only expressed in testis, but is frequently involved in NUT midline carcinoma (27). We found nine cases with distinct NUTM1 fusions (SI Appendix, Fig. S13), six of them clustering into the G14 subgroup (Fig. 1). The predicted protein structure showed that all NUTM1 fusions retained part of the NUT domain (Fig. 3 E and F). Furthermore, increased expression of NUTM1 resulting from the fusion was found (Fig. 3G), possibly leading to a global change in chromatin acetylation. We also noted up-regulation of ZYG11A, a cell-cycle regulator, and HOXA family genes (Fig. 3H), which were slightly down-regulated in the three NUTM1 fusion-positive cases that did not cluster in G14, especially ZYG11A and HOXA9 (Fig. 3I). In addition, gene set enrichment analysis showed a higher expression level of the NOTCH pathway and a down-regulation of genes in the Hedgehog pathway among the G14 subgroup (Fig. 3J).
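The volcano plots cited for each subgroup (Figs. 2 and 3) place the log2-transformed fold change on the x axis and the −log10-transformed P value on the y axis. The following is a minimal sketch of that transformation, not the authors' pipeline: the cutoffs (|log2 fold change| ≥ 1, P < 0.05), the pseudocount, the gene names, and the expression values are all assumptions chosen for illustration.

```python
import math


def log2_fold_change(mean_group, mean_rest, pseudocount=1.0):
    """log2 ratio of mean expression (e.g., FPKM) in a subgroup vs. all other cases.

    The pseudocount is an assumption that avoids division by zero.
    """
    return math.log2((mean_group + pseudocount) / (mean_rest + pseudocount))


def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return (label, y) where y = -log10(P) is the volcano-plot y coordinate.

    label is 'up', 'down', or 'ns' (not significant). Cutoffs are hypothetical.
    """
    y = -math.log10(p_value)
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        return "up", y
    if p_value < p_cutoff and log2_fc <= -fc_cutoff:
        return "down", y
    return "ns", y


# Hypothetical genes: (mean in subgroup, mean in other cases, P value).
genes = {
    "GENE_A": (31.0, 3.0, 1e-6),
    "GENE_B": (1.0, 7.0, 1e-4),
    "GENE_C": (5.0, 5.0, 0.5),
}

calls = {}
for name, (mean_group, mean_rest, p_value) in genes.items():
    fc = log2_fold_change(mean_group, mean_rest)
    calls[name] = classify_gene(fc, p_value)[0]
```

In this toy input, GENE_A would be drawn in red (up-regulated), GENE_B in blue (down-regulated), and GENE_C would remain uncolored.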

Prognostic Impact of Gene Expression Subgroups in BCP ALL. We were able to retrieve clinical follow-up data on 380 BCP ALL cases (31%), allowing us to investigate the prognostic impact of the different gene expression subgroups. As these patients were treated on different protocols, we used BCR–ABL1–positive cases (n = 35) as a reference group for “high-risk” and ETV6–RUNX1–positive cases (n = 96) as a reference group for “low-risk” BCP ALL. We then compared the outcomes in terms of 5-y overall survival and relapse-free survival rates of the other subtypes against these two reference groups and classified them into low-, intermediate-, or high-risk groups. Due to the small sample sizes with available clinical data in subgroups G10 to G14, only cases in G1 to G9 were analyzed for treatment outcome. In pediatric BCP ALL, no deaths occurred in G2 (TCF3–PBX1), ETV6–RUNX1–like (a part of G3), G5 (ZNF384 fusions), and high hyperdiploidy (G7; 51 to 65/67 chromosomes) (SI Appendix, Fig. S14). In addition to these subtypes, G4 (DUX4 fusions) was also considered as low-risk, as no significant difference in overall survival was found in comparison with G3 (n = 46, P = 0.476; SI Appendix, Fig. S14). PAX5 and CRLF2 fusions (n = 33) and other cases in G7 (≤50 chromosomes, n = 14), however, were classified into the intermediate-risk group due to an inferior 5-y overall survival compared with that of G3 (ETV6–RUNX1–positive/ETV6–RUNX1–like) (P < 0.05; SI Appendix, Fig. S14). In contrast, G1 (MEF2D fusions) and G8 (KMT2A fusions) were associated with high risk. Taken as a whole, among 295 pediatric patients, the RNA-seq–based subgroups stratified 193 (65%) as low-risk, 47 (16%) as intermediate-risk, and 55 (19%) as high-risk groups. Based on the Cox proportional-hazards model, the hazard ratio for intermediate versus low risk was 10.7 [95% confidence interval (CI) 3.3 to 34.1, P < 0.001] and for high versus low risk was 14.52 (95% CI 4.8 to 44.1, P < 0.001) (Fig. 4). For 5-y relapse-free survival, the hazard ratio for intermediate versus low risk was 2.1 (95% CI 1.0 to 4.5, P = 0.04) and for high versus low risk was 3.6 (95% CI 1.9 to 6.8, P < 0.001). In adult BCP ALL, in the absence of G3 cases, the BCR–ABL1–positive subgroup (G6) was used as the only reference, denoting high-risk BCP ALL. In this regard, G1 (MEF2D fusions), G2 (TCF3–PBX1), and G8 (KMT2A fusions) were associated with poor prognosis, while G4 (DUX4 fusions), G5 (ZNF384 fusions), and G7 (high hyperdiploidy) were associated with an intermediate prognosis (SI Appendix, Fig. S15). Overall, in adult BCP ALL, 47 (55%) of the

Li et al. PNAS | vol. 115 | no. 50 | E11715


Fig. 3. Schematic representation of identified TCF3/4–HLF and NUTM1 fusions in BCP ALL. (A) Protein structure of TCF3, TCF4, HLF, and their fusion proteins. The dotted red lines represent the joining points in the fusion proteins. (B) Violin plot of gene expression levels of HLF and NOTCH2 in TCF3/4–HLF fusion-positive and -negative patients. (C) Volcano plot of differentially expressed genes between TCF3/4–HLF fusion-positive and -negative patients. (D) GSEA plots of the JAK-STAT and NOTCH pathways in TCF3/4–HLF fusion-positive and -negative patients. (E) Protein structure of wild-type NUTM1 and distinct fusion partners. (F) Protein structure of each NUTM1 fusion protein. Red lines represent the joining points of the fusion proteins. (G) Violin plot of gene expression levels of NUTM1 in NUTM1 fusion-positive and -negative cases. (H) Volcano plot of differentially expressed genes between NUTM1 fusion-positive and -negative cases. (I) Violin plot of gene expression levels of ZYG11A and HOXA9 in NUTM1 fusion-positive and -negative patients, excluding KMT2A fusions. (J) GSEA plot of the NOTCH signaling and Hedgehog signaling pathways in NUTM1 fusion-positive (G14) and -negative patients, excluding cases with KMT2A fusions (G8). *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001.


85 patients were classified as intermediate-risk and 38 (45%) as high-risk.
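The 5-y overall survival and relapse-free survival rates compared in this section are Kaplan–Meier estimates. The sketch below shows how such an estimate is formed from follow-up times and event indicators; it is a standard-library illustration rather than the R code used in the study, and the follow-up data are invented.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    times: follow-up in months; events: 1 = event (e.g., death), 0 = censored.
    Cases censored at time t are counted as at risk for events occurring at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather every case whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


def survival_at(curve, t):
    """Step-function lookup: estimated survival probability at time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Hypothetical follow-up for eight patients (months, event indicator).
times = [6, 12, 12, 20, 30, 48, 60, 60]
events = [1, 0, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
five_year_os = survival_at(curve, 60)  # 5 y = 60 months
```

For this toy cohort the estimate at 60 months is 0.4. Comparing such curves between risk groups would additionally require a log-rank test and a Cox model, which are not sketched here.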

Discussion

In this comprehensive analysis of the transcriptomic landscape of 1,223 BCP ALL cases, we identified 14 subgroups of BCP ALL based on their gene expression profiles. Of these, eight were previously well-known subgroups, harboring specific genetic abnormalities (MEF2D fusions, TCF3–PBX1, ETV6–RUNX1–positive/ETV6–RUNX1–like, DUX4 fusions, ZNF384 fusions, BCR–ABL1 and Ph-like, high hyperdiploidy, and KMT2A fusions). Notably, the large sample size allowed us to identify six additional subgroups (G9 to G14), harboring distinct genetic alterations including gene fusions and/or sequence mutations. The number of cases for some of the candidate leukemogenic abnormalities identified, such as ZNF362 fusions, NUTM1 fusions, and PAX5/CRLF2 fusions, and hotspot mutations of PAX5 (p.P80R), IKZF1 (p.N159Y), and ZEB2 (p.H1038R), was relatively small, which may explain the lack of detection of such cases in previous studies.

We addressed survival among the various BCP ALL subtypes in relation to pediatric and adult patients. Although the outcome data originated from different study groups, we validated the prognostic impact of all previously known major subgroups of BCP ALL and were able to ascertain the prognostic impact of some of the newly defined subgroups. Among the pediatric cohort in this study, TCF3–PBX1 (G2), ETV6–RUNX1–positive/ETV6–RUNX1–like (G3), DUX4 fusions (G4), ZNF384 fusions (G5), and high hyperdiploidy (G7; 51 to 65/67 chromosomes) were defined as low-risk, PAX5 and CRLF2 fusions (G9) and other cases in G7 (≤50 chromosomes) were intermediate-risk, while MEF2D fusions (G1), BCR–ABL1/Ph-like (G6), and KMT2A fusions (G8) were defined as high-risk groups. In adults, MEF2D fusions (G1), TCF3–PBX1 (G2), BCR–ABL1/Ph-like (G6), and KMT2A fusions (G8) were high-risk, as previously described, while DUX4 fusions (G4), ZNF384 fusions (G5), and high hyperdiploidy (G7) showed relatively favorable outcomes, albeit inferior to those of their pediatric counterparts. Even though this is a large study, the numbers of patients with novel subgroups

[Fig. 4 image: Kaplan–Meier curves in four panels. (A) Pediatric, overall survival (%); (B) pediatric, relapse-free survival (%); (C) adult, overall survival (%); (D) adult, relapse-free survival (%). x axes: 0 to 60 mo; y axes: 0 to 100%.]

Hazard ratios annotated in the panels:
Pediatric, overall survival: intermediate-risk vs. low-risk, HR 10.7 (95% CI 3.3 to 34.1), P < 0.001; high-risk vs. low-risk, HR 14.52 (95% CI 4.8 to 44.1), P < 0.001.
Pediatric, relapse-free survival: intermediate-risk vs. low-risk, HR 2.1 (95% CI 1.0 to 4.5), P = 0.04; high-risk vs. low-risk, HR 3.6 (95% CI 1.9 to 6.8), P < 0.001.
Adult panels, high-risk vs. intermediate-risk: HR 3.2 (95% CI 1.5 to 6.9), P = 0.003; HR 3.0 (95% CI 1.5 to 6.3), P = 0.003.

Number at risk (number censored) at 0, 10, 20, 30, 40, 50, and 60 mo:
Pediatric (n = 295), low-risk | 193 (0) | 179 (14) | 151 (41) | 147 (44) | 144 (47) | 139 (53) | 127 (189)
Pediatric (n = 295), intermediate-risk | 47 (0) | 44 (0) | 35 (5) | 32 (7) | 31 (8) | 28 (11) | 26 (37)
Pediatric (n = 295), high-risk | 55 (0) | 50 (2) | 33 (9) | 31 (11) | 29 (12) | 28 (13) | 26 (41)
Pediatric (n = 276), low-risk | 188 (0) | 169 (15) | 145 (37) | 138 (40) | 129 (43) | 123 (47) | 112 (165)
Pediatric (n = 276), intermediate-risk | 41 (0) | 39 (0) | 29 (5) | 27 (7) | 25 (8) | 21 (11) | 19 (31)
Pediatric (n = 276), high-risk | 47 (0) | 38 (3) | 27 (8) | 24 (9) | 22 (10) | 20 (11) | 19 (31)
Adult (n = 85), intermediate-risk | 47 (0) | 36 (10) | 25 (15) | 20 (19) | 17 (21) | 10 (26) | 7 (36)
Adult (n = 85), high-risk | 38 (0) | 23 (9) | 8 (15) | 5 (16) | 4 (17) | 4 (17) | 1 (21)
Adult (n = 80), intermediate-risk | 45 (0) | 30 (9) | 21 (12) | 17 (16) | 14 (18) | 9 (23) | 6 (32)
Adult (n = 80), high-risk | 35 (0) | 13 (10) | 6 (14) | 5 (14) | 3 (15) | 3 (15) | 1 (18)

Fig. 4. Overall survival rates of pediatric and adult BCP ALL patients. Five-year overall survival (OS) curves (A) and 5-y relapse-free survival (RFS) curves (B) of pediatric patients with low, intermediate, and high risk. Five-year OS curves (C) and 5-y RFS curves (D) of adult patients with intermediate and high risk. The ranges of hazard ratios (HRs) between low and intermediate risk, and low and high risk, are presented below the survival curves. Survival curves were estimated with the Kaplan–Meier method and compared by two-sided log-rank test. Note: In pediatric cases, TCF3–PBX1 (G2), ETV6–RUNX1–like (G3), DUX4 fusions (G4), ZNF384 fusions (G5), and high hyperdiploidy (G7; 51 to 65/67 chromosomes) subgroups displayed a low risk, other cases in G7 (≤50 chromosomes) and PAX5 and CRLF2 fusions (G9) showed an intermediate risk, whereas MEF2D fusions (G1), BCR–ABL1 (G6), and KMT2A fusions (G8) defined high-risk subgroups. In adult cases, MEF2D fusions (G1), TCF3–PBX1 (G2), BCR–ABL1 (G6), and KMT2A fusions (G8) were associated with high risk, while DUX4 fusions (G4), ZNF384 fusions (G5), and hyperdiploidy (G7) had relatively favorable outcomes. (A–D) Numbers listed on the x axis are in months.


(G10 to G14) are small and the treatments are heterogeneous, and thus more cases need to be analyzed in independent studies in the future to validate their prognostic impact.

Notably, fusion genes were generally mutually exclusive, suggesting their role as drivers in the leukemogenic process. In contrast, while some hotspot gene mutations, such as PAX5 (p.P80R) and IKZF1 (p.N159Y), were independent abnormalities suggestive of their function as leukemia drivers, co-occurrence of many of the other point mutations indicated their potential cooperative role in leukemogenesis. A schematic summary of the major gene expressional/structural aberrations identified in our analysis is provided in Fig. 5. These alterations are functionally located within distinct and wide-ranging cellular compartments, from cell-surface receptors to cytosolic signaling pathways, to transcription factors/cofactors for transcriptional regulation essential for B-precursor development, and molecules involved in epigenetic regulation.

A large body of evidence suggests that many BCP ALL subgroups have a unifying gene expression signature driven by similar but not identical gene fusion/mutation events. Hence these genetic abnormalities molecularly “phenocopy” each other and point to a convergent signaling pathway related to pathogenesis within the same transcriptomic subgroup. For example, the similarities in gene expression profiles and genetic aberrations between the Ph-like and the BCR–ABL1 subtypes indicate that these are phenocopies of each other; a similar relationship appears to exist between the ETV6–RUNX1 and ETV6–RUNX1–like subtypes. This large dataset has allowed us to systematically identify such previously undescribed molecular phenocopies. ZNF384 fusions were the predominant fusions in the G5 subgroup, while ZNF362 fusions displayed the same expression signature and thus plausibly the same pathogenetic process. Also, the rare TCF4–HLF fusion evidently phenocopies TCF3–HLF rearrangements. Intriguingly, the hotspot point mutation ZEB2 (p.H1038R) appeared to phenocopy the IGH–CEBPE fusion, although the molecular relationship between these genetic aberrations is less obvious than in the previous examples. All of these observations point to a common theme in BCP ALL: There are likely a limited number of pathways leading to leukemogenesis in BCP ALL, and each is identified by a distinct gene expression pattern. However, there are presumably several factors, such as complex genetic backgrounds, coexisting genetic abnormalities, alternative partner genes of fusions, and different cells of origin, that all contribute to determine the dominating pathway in a single case, which can partially explain why cases sometimes present outside of the expected cluster.

In conclusion, we additionally defined six gene expression subgroups. These six subgroups included cases characterized by PAX5 and CRLF2 fusions; point mutations in PAX5 (p.P80R); point mutations in IKZF1 (p.N159Y); IGH–CEBPE fusion or mutations in ZEB2 (p.H1038R); TCF3/4–HLF fusion; and NUTM1 fusions. We have also demonstrated that transcriptome profiling by RNA sequencing allows the identification of distinct gene expression subgroups in BCP ALL with characteristic gene fusions and/or sequence mutations that can be readily called using the integrative

[Fig. 5 image: schematic of a cell with labeled elements. Gene fusions: NUTM1 fusions, CRLF2 fusions, IGH–EPOR, IGH–CEBPE, MEF2D fusions, TCF3–PBX1, BCR–ABL1, ZNF384 fusions, ZNF362 fusions, TCF3–HLF, TCF4–HLF, MLL fusions, PAX5 fusions, ETV6–RUNX1, ETV6–RUNX1–like. Pathway and receptor labels: JAK-STAT signaling, RAS signaling, B-cell receptor signaling, cell cycle, signaling molecules, antigen, IL-7R, CRLF2, JAK1, JAK2, RAS, EGFR, B-cell receptor. Mutated genes: FLT3; NRAS, KRAS; PTPN11, NF1, SH2B3, IL7R, STAT5B; TP53, MED12, CDKN2A/B. Transcription factors and cofactors: ZEB2, IKZF1, ETV6, MYC, RUNX1, MGA, PAX5, with hotspot mutations PAX5 (p.P80R), ZEB2 (p.H1038R), and IKZF1 (p.N159Y). Epigenetic factors, grouped as writers, erasers, readers, bind writers, bind erasers, and remodel chromatin: ASXL1/2, CHD4, NCOR2, TRRAP, KDM6A, CHD8, ARID1A/B, CTCF, CREBBP, EP300, EZH2, KMT2A/C/D, SET1B, SETD2, WHSC1. Histones: H2A, H2B, H3, H4.]

Fig. 5. Schematic figure of gene expression alterations and structural aberrations identified in this study. Representation of the various molecular abnormalities that lead to leukemogenesis in BCP ALL. Known and novel gene fusions and their subcellular localizations are schematically represented. Three hotspot mutations, ZEB2 (p.H1038R), IKZF1 (p.N159Y), and PAX5 (p.P80R), that define distinct BCP ALL subgroups are located in the DNA-binding domains of each protein. Identified mutations in epigenetic regulators, such as KMT2D and WHSC1, are colored in green and shown as a pentagon in the nucleus. Additionally, transcription factor mutations such as IKZF1 and PAX5 are depicted at the left in the nucleus near the DNA chain, and mutations in cell-cycle regulators are depicted at the top left of the nucleus. Mutations found in signaling pathways such as JAK-STAT, RAS, and B cell receptor are depicted below the cell-surface membrane. Note: The epigenetic regulatory genes that covalently modify histones are classified as writers, erasers, readers, and remodel. Writers: proteins that can add epigenetic modifications; erasers: proteins that erase epigenetic modifications; readers: proteins that can recognize epigenetic modifications; bind writers: proteins that can bind the writers; bind erasers: proteins that can bind the erasers. Remodel chromatin: proteins that are functionally relevant to chromatin remodeling. MLL fusions are also known as KMT2A fusions.


analysis described in this study. Apart from providing information on perturbed transcriptional programs/signaling pathways that may be amenable to therapeutic targeting, the identified gene expression subgroups are likely important for improved disease stratification and prognostication of BCP ALL. Hence, our combined results of this collaborative study strongly advocate for RNA-seq being applied in the clinical diagnostic workup of BCP ALL.

Materials and Methods

Patients. Transcriptome (RNA-seq) and other genomic data of all patients analyzed in this study are listed in Dataset S1. All of the included datasets have been analyzed as part of previous publications (3, 6–8, 10, 11, 18, 19). Basic clinical characteristics and genetic types of collected BCP ALL cohorts are shown in Table 1 and Dataset S2. The Lund University Hospital (LUH) cohort (cohort 2) (6) and the Singapore and Malaysia MaSpore cohort (MaSpore; cohort 4) (7) included only childhood BCP ALL cases. The vast majority of cohort-2 patients were treated according to the Nordic Society of Paediatric Haematology and Oncology (NOPHO) ALL 1992, 2000, or 2008 protocols (6), and cohort-4 patients were enrolled on the MaSpore frontline ALL protocols (7). The Japan Adult Leukemia Study Group (JALSG) cohort (cohort 3) (8) comprised adolescents and young adults with Philadelphia chromosome-negative ALL who were treated with the JALSG ALL202-U (adults) and TCCSG L04-16 (pediatric) protocols (8, 28). BCP ALL patients in the Chinese cohort (cohort 1) enrolled in this study were diagnosed and/or treated in the Multicenter Hematology-Oncology Protocols Evaluation System (M-HOPES) by the Shanghai Institute of Hematology (SIH)-based hospital network. Adult patients were enrolled in an SIH trial (Chinese Clinical Trial Registry; no. ChiCTR-ONRC-14004968), which was basically a modification of the vincristine, daunorubicin, L-asparaginase, cyclophosphamide, and prednisone regimen. Pediatric patients in the Chinese cohort were enrolled in the Shanghai Children’s Medical Center ALL-2005 protocol (Chinese Clinical Trial Registry; no. ONC-14005003) (10). There were two TARGET/COG (Therapeutically Applicable Research to Generate Effective Treatments/Children’s Oncology Group) cohorts (cohort 5 and

Table 1. Clinical characteristics and major genetic features of the BCP ALL cohorts included in the analysis

Characteristics | Cohort 1 (SIH, n = 166) | Cohort 2 (LUH, n = 182) | Cohort 3 (JALSG, n = 71) | Cohort 4 (MaSpore, n = 194) | Cohort 5 (TARGET/COG, n = 394) | Cohort 6 (TARGET/COG, n = 216) | Total (n = 1,223)
Age at diagnosis
Mean, y | 19.41 | 5.04 | 15.42 | 5.84 | 17.96 | 7.87 | 12.60
Median, y | 15.01 | 4.00 | 17.00 | 4.56 | 13.00 | 6.44 | 7.95
<18 y | 91 (55) | 182 (100) | 39 (55) | 194 (100) | 248 (63) | 152 (70) | 906 (74)
≥18 y | 75 (45) | 0 | 32 (45) | 0 | 145 (37) | 6 (3) | 258 (21)
Not available | 0 | 0 | 0 | 0 | 1 | 58 (27) | 59 (5)
Gender
Male | 95 (57) | 107 (59) | 30 (42) | 111 (57) | 209 (53) | 105 (49) | 657 (54)
Female | 71 (43) | 75 (41) | 41 (58) | 83 (43) | 185 (47) | 111 (51) | 566 (46)
Fusions
BCR–ABL1 | 27 (16) | 5 (3) | NA | 9 (5) | 12 (3) | 6 (3) | 59 (5)
ETV6–RUNX1 | 19 (11) | 45 (25) | 2 (3) | 36 (19) | 18 (5) | 14 (6) | 134 (11)
TCF3–PBX1 | 17 (10) | 13 (7) | 6 (8) | 13 (7) | 11 (3) | 16 (7) | 76 (6)
KMT2A | 8 (5) | 14 (8) | 2 (3) | 7 (4) | 9 (2) | 6 (3) | 46 (4)
DUX4 | 9 (5) | 8 (4) | 10 (14) | 23 (12) | NA | 2 (1) | 52 (4)
MEF2D | 7 (4) | 1 (1) | 7 (10) | 2 (1) | 18 (5) | 5 (2) | 40 (3)
ZNF384 | 15 (9) | 2 (1) | 10 (14) | 11 (6) | 11 (3) | 17 (8) | 66 (5)

Data are years or no. of patients (%). Percentages might not add up to 100% because of rounding. Note: Cohort 3 (JALSG) does not include BCR–ABL1 patients. NA, not available.

Table 2. Proposed BCP ALL subgroups based on gene expression and gene fusion/sequence mutation patterns

RNA-seq data-based subgroups | Frequency in the study cohort (n = 1,223), no. of patients (%) | Most frequently mutated genes (%)
MEF2D fusions (G1) | 39 (3) | MEF2D–BCL9 (67), MEF2D–HNRNPUL1 (21), NRAS (13), KMT2A (10)
TCF3–PBX1 (G2) | 76 (6) | TCF3–PBX1 (100), TP53 (8)
ETV6–RUNX1/–like (G3) | 161 (13) | ETV6–RUNX1 (82), WHSC1 (9), KRAS (7), NRAS (6)
DUX4 fusions (G4) | 63 (5) | DUX4–IGH (78), NRAS (30), MYC (11), TP53 (11), PTPN11 (11), KMT2D (11), CTCF (8), FLT3 (8), PAX5 (8)
ZNF384 fusions (G5) | 74 (6) | EP300–ZNF384 (53), TCF3–ZNF384 (12), TAF15–ZNF384 (11), SMARCA2–ZNF362 (4), NRAS (14), KRAS (12), FLT3 (14), PTPN11 (14), SETD1B (9), ZEB2 (8), EZH2 (8), KMT2D (7)
BCR–ABL1/Ph-like (G6) | 167 (14) | BCR–ABL1 (31), IGH–CRLF2 (10), JAK2 fusions (10), ABL1 fusions (7), IGH–EPOR (7), P2RY8–CRLF2 (5), KRAS (6), JAK2 (7), RUNX1 (5)
Hyperdiploidy (G7) | 408 (33) | NRAS (19), KRAS (18), FLT3 (13), PTPN11 (8), KMT2D (7), CREBBP (6)
KMT2A fusions (G8) | 56 (5) | KMT2A–AFF1 (29), KMT2A–MLLT1 (25), KMT2A–MLLT3 (13), KRAS (13), NRAS (14), FLT3 (7)
PAX5 and CRLF2 fusions (G9) | 111 (9) | P2RY8–CRLF2 (12), PAX5–NOL4L (8), PAX5–AUTS2 (6), NRAS (23), KRAS (23), PAX5 (12), FLT3 (11), JAK1 (8)
PAX5 (p.P80R) mutation (G10) | 23 (2) | PAX5 (96), PTPN11 (26), NRAS (22), KRAS (17), FLT3 (13), IL7R (9), SETD2 (9)
IKZF1 (p.N159Y) mutation (G11) | 6 (<1) | IKZF1 (100), KRAS (17), KMT2D (17)
ZEB2 (p.H1038R)/IGH–CEBPE (G12) | 8 (<1) | ZEB2 (75), NRAS (62), KMT2D (25), KRAS (12), KMT2A (12), CDKN2A (12)
TCF3/4–HLF (G13) | 11 (<1) | TCF3/4–HLF (64), KRAS (18), NRAS (9), ZEB2 (9), ASXL2 (9)
NUTM1 fusions (G14) | 20 (2) | NUTM1 fusions (30), TP53 (15), KRAS (10), CREBBP (15), KMT2D (10), SETD1B (10)


cohort 6), with the data accession nos. EGAS00001001952 and phs000463/phs000464, respectively (3, 11, 18, 29). Informed consent was obtained from all patients, and the study was approved by the ethics committee of Rui-jin Hospital. The clinical outcome data of the TARGET/COG cohorts were not available. The comparability of the clinical data from the different cohorts was supported by very similar survival curves for the favorable genetic subtype (ETV6–RUNX1) and the unfavorable genetic subtype (BCR–ABL1/Ph-like cases) of ALL among these cohorts (SI Appendix, Fig. S1).

RNA-Seq Data Analyses. Read pairs were aligned to human reference genomes hg38 (fusion gene analysis) and hg19 (gene expression and gene mutation calling). Principal component analysis was applied to the RNA-seq data of the 1,223 BCP ALL cases, and batch effects were adjusted with the SVA package (30) (SI Appendix, Fig. S3). To investigate whether cohort, age, gender, or race biased gene expression, we checked the distribution of well-known biomarkers in the gene expression clusters. No obvious bias based on cohort, age, gender, or race was found. The patients mainly clustered based on the different gene expression profiles related to underlying genetic abnormalities. Procedures for read-pair alignment, mutation calling from RNA-seq data, and gene expression/pathway analysis are listed in SI Appendix, Materials and Methods.

Statistical Analyses. We tested mutual exclusivity and co-occurrence of mutations for the 44 most frequently mutated genes (>1%). For each gene pair, we performed a two-sided Fisher’s exact test according to mutation status (positive or negative). The R package QVALUE (v2.10.1) (31) was used to control for multiple testing. Categorical variables were compared by Pearson’s χ2 test or Fisher’s exact test. Overall survival was calculated from time of diagnosis to death, while relapse-free survival was calculated from time of complete remission to relapse. The Kaplan–Meier method, log-rank test, and Cox proportional-hazards model were used to calculate estimates of survival probabilities and hazard ratios. Two-sided P values are reported, and the significance level was set to less than 0.05. Analyses were performed with the use of R (v3.4.4).
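The pairwise test for mutual exclusivity or co-occurrence described above can be illustrated with a small standard-library sketch. It is not the study's R code: the 2×2 counts are invented, and the Benjamini–Hochberg adjustment shown here is a simpler stand-in for the Storey q-values computed by the QVALUE package, so the adjusted values would differ from those in the analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows: gene 1 mutated / wild type; columns: gene 2 mutated / wild type.
    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (assumed stand-in for QVALUE)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical gene pair in 10 cases: no case carries both mutations.
p_exclusive = fisher_exact_two_sided(0, 5, 5, 0)

# Hypothetical raw P values from four gene pairs.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A small P value together with an odds ratio below 1 would point to mutual exclusivity, and one with an odds ratio above 1 to co-occurrence; the odds ratio is not computed in this sketch.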

ACKNOWLEDGMENTS. We thank TARGET/COG and St. Jude Children’s Research Hospital for providing the RNA-seq data in this analysis. The RNA-seq dataset and clinical information for the TARGET/COG ALL project used in this study are available in the database of Genotypes and Phenotypes (dbGaP) under accession phs000218.v20.p7 and European Genome-phenome Archive, accessions EGAS00001000654 and EGAS00001001952. This work was supported by Mega-Projects of Scientific Research for the 12th Five-Year Plan (2013ZX09303302); National Natural Science Foundation of China (Grants 81570122 and 81570122); Shanghai Municipal Education Commission-Gaofeng Clinical Medicine Grant Support (Grant 20161303); Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (Grant QD2015005); Fine Classification and Standardized Treatment of Children with Acute Leukemia of Multi Center Clinical Research (Grant 14411950600), Shanghai Municipal Science and Technology Commission; National Key Research and Development Program (Grant 2016YFC0902800); Practical Research for Innovative Cancer Control from the Japan Agency for Medical Research and Development; US National Institutes of Health Grants CA21765, CA36401, and U01 GM115279; American Lebanese Syrian Associated Charities (ALSAC); Swedish Cancer Society, Swedish Childhood Cancer Foundation, Swedish Research Council, Knut and Alice Wallenberg Foundation, and Governmental (ALF) Funding of Clinical Research within the Swedish National Health Service; NMRC/CSA/0053/2013; VIVA Foundation for Children with Cancer; Goh Foundation; Children’s Cancer Foundation, Singapore Totalisator Board; Samuel Waxman Cancer Research Foundation; and Center for HPC at Shanghai Jiao Tong University.

1. Pui C-H, Yang JJ, Bhakta N, Rodriguez-Galindo C (2018) Global efforts toward the cureof childhood acute lymphoblastic leukaemia. Lancet Child Adolesc Health 2:440–454.

2. Holmfeldt L, et al. (2013) The genomic landscape of hypodiploid acute lymphoblasticleukemia. Nat Genet 45:242–252.

3. Roberts KG, et al. (2012) Genetic alterations activating kinase and cytokine receptorsignaling in high-risk acute lymphoblastic leukemia. Cancer Cell 22:153–166.

4. Den Boer ML, et al. (2009) A subtype of childhood acute lymphoblastic leukaemiawith poor treatment outcome: A genome-wide classification study. Lancet Oncol 10:125–134.

5. Andersson A, et al. (2007) Microarray-based classification of a consecutive series of 121 childhood acute leukemias: Prediction of leukemic and genetic subtype as well as of minimal residual disease status. Leukemia 21:1198–1203.

6. Lilljebjörn H, et al. (2016) Identification of ETV6-RUNX1-like and DUX4-rearranged subtypes in paediatric B-cell precursor acute lymphoblastic leukaemia. Nat Commun 7:11790.

7. Qian M, et al. (2017) Whole-transcriptome sequencing identifies a distinct subtype of acute lymphoblastic leukemia with predominant genomic abnormalities of EP300 and CREBBP. Genome Res 27:185–195.

8. Yasuda T, et al. (2016) Recurrent DUX4 fusions in B cell acute lymphoblastic leukemia of adolescents and young adults. Nat Genet 48:569–574.

9. Zhang J, et al.; St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (2016) Deregulation of DUX4 and ERG in acute lymphoblastic leukemia. Nat Genet 48:1481–1489.

10. Liu YF, et al. (2016) Genomic profiling of adult and pediatric B-cell acute lymphoblastic leukemia. EBioMedicine 8:173–183.

11. Gu Z, et al. (2016) Genomic analyses identify recurrent MEF2D fusions in acute lymphoblastic leukaemia. Nat Commun 7:13331.

12. Schwartzman O, et al. (2017) Suppressors and activators of JAK-STAT signaling at diagnosis and relapse of acute lymphoblastic leukemia in Down syndrome. Proc Natl Acad Sci USA 114:E4030–E4039.

13. Chen B, et al. (2018) Identification of fusion genes and characterization of transcriptome features in T-cell acute lymphoblastic leukemia. Proc Natl Acad Sci USA 115:373–378.

14. Seetharam A, Bai Y, Stuart GW (2010) A survey of well conserved families of C2H2 zinc-finger genes in Daphnia. BMC Genomics 11:276.

15. Mullighan CG, et al. (2009) Rearrangement of CRLF2 in B-progenitor- and Down syndrome-associated acute lymphoblastic leukemia. Nat Genet 41:1243–1246.

16. Salzer U, et al. (2009) Relevance of biallelic versus monoallelic TNFRSF13B mutations in distinguishing disease-causing from risk-increasing TNFRSF13B variants in antibody deficiency syndromes. Blood 113:1967–1976.

17. Dang J, et al. (2015) PAX5 is a tumor suppressor in mouse mutagenesis models of acute lymphoblastic leukemia. Blood 125:3609–3617.

18. Roberts KG, et al. (2014) Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia. N Engl J Med 371:1005–1015.

19. Churchman ML, et al. (2015) Efficacy of retinoids in IKZF1-mutated BCR-ABL1 acute lymphoblastic leukemia. Cancer Cell 28:343–356.

20. Churchman ML, et al. (2018) Germline genetic IKZF1 variation and predisposition to childhood acute lymphoblastic leukemia. Cancer Cell 33:937–948.e8.

21. Olsson L, et al. (2015) Cooperative genetic changes in pediatric B-cell precursor acute lymphoblastic leukemia with deletions or mutations of IKZF1. Genes Chromosomes Cancer 54:315–325.

22. Ma C, et al. (2018) SALL1 functions as a tumor suppressor in breast cancer by regulating cancer cell senescence and metastasis through the NuRD complex. Mol Cancer 17:78.

23. Kuang SQ, et al. (2008) Genome-wide identification of aberrantly methylated promoter associated CpG islands in acute lymphocytic leukemia. Leukemia 22:1529–1538.

24. Ma X, et al. (2018) Pan-cancer genome and transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature 555:371–376.

25. Fischer U, et al. (2015) Genomics and drug profiling of fatal TCF3-HLF-positive acute lymphoblastic leukemia identifies recurrent mutation patterns and therapeutic options. Nat Genet 47:1020–1029.

26. Nguyen H, et al. (2009) Tcf3 and Tcf4 are essential for long-term homeostasis of skin epithelia. Nat Genet 41:1068–1075.

27. Alekseyenko AA, et al. (2015) The oncogenic BRD4-NUT chromatin regulator drives aberrant transcription within large topological domains. Genes Dev 29:1507–1523.

28. Takahashi H, et al. (2018) Treatment outcome of children with acute lymphoblastic leukemia: The Tokyo Children’s Cancer Study Group (TCCSG) study L04-16. Int J Hematol 108:98–108.

29. Pui CH, et al. (2015) Childhood acute lymphoblastic leukemia: Progress through collaboration. J Clin Oncol 33:2938–2948.

30. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28:882–883.

31. Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci USA 100:9440–9445.

E11720 | www.pnas.org/cgi/doi/10.1073/pnas.1814397115 Li et al.
