Interdisciplinary Sciences: Computational Life Sciences
https://doi.org/10.1007/s12539-019-00325-y

ORIGINAL RESEARCH ARTICLE

Most Variable Genes and Transcription Factors in Acute Lymphoblastic Leukemia Patients

Anil Kumar Tomar 1 · Rahul Agarwal 2 · Bishwajit Kundu 1

Received: 24 September 2018 / Revised: 21 January 2019 / Accepted: 26 February 2019
© International Association of Scientists in the Interdisciplinary Areas 2019

* Anil Kumar Tomar [email protected]

1 Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
2 Department of Reproductive Biology, All India Institute of Medical Sciences, New Delhi 110029, India

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s12539-019-00325-y) contains supplementary material, which is available to authorized users.

Abstract

Acute lymphoblastic leukemia (ALL) is a hematologic tumor caused by cell cycle aberrations due to accumulating genetic disturbances in the expression of transcription factors (TFs), signaling oncogenes and tumor suppressors. Though the survival rate in childhood ALL patients has increased up to 80% with recent medical advances, treatment of adults and childhood relapse cases remains challenging. Here, we performed bioinformatics analysis of mRNA expression data of 207 ALL patients retrieved from the ICGC data portal, with the objective of marking out the decisive genes and pathways responsible for ALL pathogenesis and aggression. For analysis, the 3361 most variable genes (out of 16,807), including 276 transcription factors, were selected based on the coefficient of variation. Silhouette width analysis classified the 207 ALL patients into 6 subtypes, and heat map analysis suggests the need for a large, multicenter dataset for non-overlapping subtype classification. Overall, 265 GO terms and 32 KEGG pathways were enriched. The lists were dominated by cancer-associated entries and highlight crucial genes and pathways that can be targeted for designing more specific ALL therapeutics. Differential gene expression analysis identified upregulation of two important genes, JCHAIN and CRLF2, in the cohort of deceased patients, suggesting their possible involvement in different clinical outcomes in ALL patients undergoing the same treatment.

Keywords Gene expression · KEGG pathways · Leukemia · Most variable genes · Subtype classification

1 Introduction

Leukemia, a cancer of the blood or bone marrow, is widely classified into four major categories: acute myeloid leukemia (AML), chronic myeloid leukemia (CML), acute lymphocytic leukemia (ALL) and chronic lymphocytic leukemia (CLL). The basic parameters of this classification are the rate of cancer progression and the site of cancer development (http://www.cancercenter.com/). ALL is a blood malignancy characterized by uncontrolled proliferation of lymphoblasts, i.e., immature B and/or T cells. Though ALL can occur at any age, it is most common in children and adolescents. B-cell acute lymphocytic leukemia (B-ALL) accounts for about 85% and 75% of childhood and adult ALL cases, respectively, with male predominance, while T-cell acute lymphocytic leukemia (T-ALL) accounts for the remaining cases [1, 2]. With recent medical advances in treatment protocols, the global survival rate in childhood ALL has increased substantially (> 80%); however, the survival rate in adults remains below 40% [3–5]. Survival of ALL patients who experience a relapse is also very poor [6].

Uncontrolled cell proliferation due to loss of cell cycle control is the hallmark of cancer [7, 8]. Chromosomal rearrangements are common genetic abnormalities in B-ALL, e.g., BCR-ABL1, ETV6-RUNX1 and TCF3-PBX1 [9]. Aberrant expression of transcription factors associated with lymphoid development, e.g., PAX5, EBF1 and IKZF1, has also been reported in more than 60% of B-ALL cases [10, 11]. CRLF2 rearrangements and JAK mutations are also detected in B-ALL cases [12]. Genomic profiling of high-risk ALL patients has identified rearrangements of ABL1, JAK2, PDGFRB, CRLF2 and EPOR, activating mutations of IL7R and FLT3, and deletion of SH2B3 [13]. Regardless of all the
advances in the molecular-level understanding of the disease, ALL remains a challenging and aggressive disease owing to high genetic heterogeneity among patients, and its prognosis is uncertain in relapse cases. To tailor effective and specific therapies, it is essential to classify patients and to recognize, at the time of diagnosis, those with a high probability of relapse. Comprehensive subtype classification of the risk groups of a disease is important for more specific treatment of patients and better therapeutic outcomes.

The International Cancer Genome Consortium (ICGC) coordinates a large number of projects elucidating the genomic changes in various cancer types. Through its data portal (https://dcc.icgc.org/), ICGC provides the research community worldwide with open access to the gene expression data of about 70 cancer projects. Here, we performed bioinformatics analysis of mRNA expression data (16,807 genes) of 207 ALL patients retrieved from the ICGC data portal. The primary objective was to delineate the crucial genes, transcription factors and pathways responsible for ALL pathogenesis. In addition, differential gene expression analysis was performed (male vs. female and alive vs. dead) to identify genomic variability in patient subgroups. The well-known male predominance in leukemia led us to compare male and female B-ALL patients to identify decisive gene(s), if any, for the high occurrence of ALL in males, while the alive and dead patient cohorts were compared to identify crucial genes that may define disease aggression. Recent studies have shown interest in global profiling of differentially expressed genes (DEGs) in ALL. Li et al. identified DEGs between diagnostic and relapsed cases with the aim of exploring the underlying mechanism of relapsed ALL [14]. In another study, Sedek et al. showed aberrant (over)expression of CD73, CD86 and CD304 in a substantial percentage of B-ALL patients [15].

2 Materials and Methods

2.1 Retrieval of Patient Data

Gene expression array matrices and associated clinical data of 207 high-risk B-ALL patients were retrieved from the ICGC data portal (Project: ALL-US; DCC data release; December 7, 2016), and the expression matrix of 16,807 genes for all patients was normalized. The sample details are given in Table 1.

2.2 Subtype Classification and Survival Analysis

To predict B-ALL subtypes using ICGC-ALL data, genes were filtered based on the coefficient of variation (CV). A CV value of ≥ 0.8 was used as the cut-off to define the most variable genes. Overall, 3361 most variable genes were selected for predicting the subtypes. Unsupervised hierarchical clustering was performed on these genes across all 207 patient samples using the Bioconductor R package ConsensusClusterPlus [16]. The final clustering reached consensus after 1000 iterations. The number of clusters that best represented the expression data was selected by the silhouette method of k-means clustering, which measures the separation between the resulting clusters by estimating how close each point in one cluster is to the points in the neighboring clusters. The value of a silhouette coefficient always lies in the range [−1, 1]. The R package cluster [17] was used to estimate these coefficients. Samples with positive silhouette coefficient values were selected for further analysis. Top variable genes were obtained for each k = 1 to n subtypes by employing the sam function from the Bioconductor package siggenes [18]. Overall median survival analysis of the predicted B-ALL subtypes was performed using the Cox proportional hazards (coxph) model [19], and Kaplan–Meier (KM) curves were used to present the results [20].
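The silhouette criterion described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relied on R packages; the Euclidean distance, the toy points and the two candidate labelings are assumptions made purely for illustration. For each sample, the width is (b − a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster.

```python
import math

def silhouette_widths(points, labels):
    """Return the silhouette width of every point (needs at least 2 clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Singleton cluster: the width is conventionally set to 0.
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

def average_silhouette(points, labels):
    """Mean silhouette width, used to compare candidate numbers of clusters."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)

# Hypothetical toy data: two well-separated groups of three samples each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_k2 = [0, 0, 0, 1, 1, 1]   # candidate partition with k = 2
labels_k3 = [0, 0, 1, 2, 2, 2]   # candidate partition with k = 3

score_k2 = average_silhouette(toy_points, labels_k2)
score_k3 = average_silhouette(toy_points, labels_k3)
best_k = 2 if score_k2 > score_k3 else 3

# Samples with a positive width would be retained for further analysis.
kept = [i for i, w in enumerate(silhouette_widths(toy_points, labels_k2)) if w > 0]
```

In practice the partitions would come from the consensus clustering output for each k, and the k with the highest average width would be preferred.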

2.3 Pathway Analysis

Gene ontology (GO) annotation and pathway analysis of the 3000 most variable genes among the 207 ALL patients were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) gene enrichment tool with default settings [21]. GO annotations and pathways with FDR < 0.05 were considered significant. The program lists the enriched GO terms and pathways as output, along with many other features.
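As a rough illustration of what an over-representation test computes, the following Python sketch evaluates a one-sided hypergeometric p value for a single term. It is a generic textbook test, not DAVID's own scoring (DAVID reports a modified Fisher exact p value, the EASE score), and the term size of 200 background genes is a hypothetical number chosen only for the example; the background of 16,807 genes, the list of 3000 genes and the 79 hits are taken from the text.

```python
from math import comb

def hypergeom_enrichment_p(list_hits, list_size, background_hits, background_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement
    from a background containing background_hits genes annotated to the term."""
    upper = min(list_size, background_hits)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total

# Assumption: the term annotates 200 of the 16,807 background genes.
p_term = hypergeom_enrichment_p(
    list_hits=79, list_size=3000, background_hits=200, background_size=16807
)
term_is_enriched = p_term < 0.05
```

With these numbers about 36 hits would be expected by chance, so 79 hits yields a very small p value. A full analysis would repeat the test for every term and then control the false discovery rate.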

2.4 Differential Gene Expression

Table 1 Sample details

Description                 Category    Percentage (number)
Total samples                           207
Sex                         Male        66.18% (137)
                            Female      33.81% (70)
Vital status                Alive       33.81% (70)
                            Deceased    25.12% (52)
                            No data     41.06% (85)
Age at diagnosis (years)    1–9         35.74% (74)
                            10–19       63.28% (131)
                            20–29       0.96% (2)

Gene expression array data of the 207 samples were pre-processed, and genes with more than 50% missing data were excluded. Genes with expression greater than 5 in more than 80% of the samples were used for further analysis. The patient samples were grouped into two biological categories according to the clinical data: male/female and dead/alive. Differential gene expression analysis was performed with the limma package for gene expression data analysis [22]. To account for multiple testing, adjusted p values were estimated using the Benjamini–Hochberg method. Genes with log2-fold change values > 1.5 (up-regulation) or < −1.5 (down-regulation) were considered significantly differentially expressed. Differentially expressed genes (DEGs) between the predicted subtypes associated with the least and maximum survival were also identified.
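The fold-change thresholding and Benjamini–Hochberg adjustment described in this section can be sketched in a few lines of Python. This is only an illustration of the two steps, not a substitute for limma's moderated statistics; the gene names, fold changes and p values in the toy input are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values are monotone non-decreasing in the raw p values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_degs(rows, lfc_cutoff=1.5):
    """rows: list of (gene, log2_fold_change, raw_p_value) tuples.
    Returns {gene: (direction, adjusted_p)} for genes passing the cut-off."""
    adjusted = benjamini_hochberg([p for _, _, p in rows])
    calls = {}
    for (gene, lfc, _), adj_p in zip(rows, adjusted):
        if lfc > lfc_cutoff:
            calls[gene] = ("up", adj_p)
        elif lfc < -lfc_cutoff:
            calls[gene] = ("down", adj_p)
    return calls

# Hypothetical results for four genes.
toy_rows = [
    ("GENE_1", 2.52, 0.005),
    ("GENE_2", 0.40, 0.04),
    ("GENE_3", -1.80, 0.03),
    ("GENE_4", 1.20, 0.01),
]
deg_calls = call_degs(toy_rows)
```

The sketch reports the adjusted p value next to each call rather than filtering on it, since the text specifies only the fold-change cut-off as the selection criterion.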

3 Results and Discussion

3.1 Variably Expressed Genes

As assessed by box plot analysis of 50 randomly selected samples, gene expression variability was high, indicating tumor heterogeneity among the patients (Fig. 1). The significantly variable genes (3361 out of 16,807) among all patients were shortlisted using the coefficient of variation (Supplementary file 1). The 20 most variable genes are listed in Table 2 and include S100 calcium-binding protein P (S100P), interferons (IFNB1, IFNA2), lactotransferrin (LTF), forkhead box genes (FOXC1, FOXR1), membrane spanning 4-domains A3 (MS4A3), protein tyrosine phosphatase, receptor type Z1 (PTPRZ1), metallothioneins (MT1E, MT1H), SRY-box 8 (SOX8), homeobox A5 (HOXA5) and annexin A3 (ANXA3). The proteins encoded by these genes are linked with the progression or suppression of various tumors, including leukemia.
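The variability filter can be sketched as follows in Python. The CV is the standard deviation divided by the mean; the use of the sample standard deviation and the toy expression values are assumptions, as the text specifies only the cut-off of 0.8.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        return float("nan")  # CV is undefined when the mean is zero
    return statistics.stdev(values) / abs(mean)

def most_variable_genes(expression, cutoff=0.8):
    """expression: {gene: [value per sample]}.
    Returns (gene, CV) pairs with CV >= cutoff, most variable first."""
    scored = []
    for gene, values in expression.items():
        cv = coefficient_of_variation(values)
        if cv >= cutoff:  # a NaN CV fails this comparison and is dropped
            scored.append((gene, cv))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical expression values for three genes across four samples.
toy_expression = {
    "GENE_A": [1.0, 9.0, 2.0, 12.0],
    "GENE_B": [5.0, 5.1, 4.9, 5.0],
    "GENE_C": [0.5, 4.0, 0.2, 0.1],
}
selected = most_variable_genes(toy_expression)
selected_names = [gene for gene, _ in selected]
```

Here GENE_B is nearly constant across samples and is filtered out, while the other two genes pass the cut-off and are ranked by their CV.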

S100P, a metastasis-inducing protein, has been associated with the regulation of cell cycle progression, differentiation and poor patient survival [23, 24]. Interferons are cytokines naturally produced by the immune system for defense against viral infections; they also exhibit anti-tumor activity [25]. The LTF gene codes for lactotransferrin, a well-known iron-binding glycoprotein involved in several physiological and protective functions [26]. LTF and its peptides have been widely explored for their anti-cancer potential and found to prevent different cancer stages, including initiation and progression [27, 28]. Bovine LTF has been shown to induce apoptosis in and kill human T-ALL cells [29–32]. MS4A3 participates in the innate immune system pathway and acts as a tumor suppressor in CML [33]. PTPRZ1 is the receptor of the heparin-binding glycoprotein pleiotrophin (PTN), a crucial cytokine that regulates various physiological functions [34]. Over-expression of PTN has been observed in various malignant tumors, resulting in poor prognosis of patients [35, 36]. Similarly, aberrant expression of PTPRZ1 is frequent in several cancer types [37–39]. The metallothioneins (MTs) are cysteine-rich metal-binding proteins, and MTs encoded by MT-1 genes, such as MT1E, MT1G and MT1H, are closely associated with carcinogenesis in various human tumors [40]. MT1H functions as a tumor suppressor and is consistently down-regulated, in comparison to healthy tissues, in human malignancies including neuroblastoma, breast cancer, lung cancer, colon cancer, prostate cancer, B-cell lymphoma and leukemia [41, 42]. ANXA3 is a reported prognostic biomarker of various cancer types, including breast [43], prostate [44] and gastric [45] cancers; however, it has not been much explored in leukemia. Four of the 20 most variable genes (FOXC1, FOXR1, SOX8 and HOXA5) were transcription factors.

In addition, the list of most variable genes contained several other important genes that have been extensively studied and linked with the development of different cancer types, such as DLX6 antisense RNA 1 (DLX6-AS1), interleukin 32 (IL-32), sphingosine-1-phosphate receptor 2 (S1PR2), gastrokine 1 (GKN1), tubulin polymerization promoting protein family member 2 (TPPP2), polypeptide GalNAc transferase 6 (GALNT6) and solute carrier family 6 member 14 (SLC6A14). Hypermethylation of DLX6-AS1 was reported in aggressive metastatic neuroblastoma in comparison to low-grade tumors [46]. IL-32 is a plausible chemotactic factor that participates in crosstalk between stromal and leukemia cells, resulting in chemo-resistance [47]. Normal epithelial cells exert tumor-suppressive activity by sensing and actively eliminating neighboring transformed cells, but the mechanism is largely unknown. A recent study has shown that S1PR2 mediates activation of Rho in normal epithelial cells and thus helps in apical extrusion of surrounding transformed cells [48]. GKN1 protein is under-expressed in gastric tumor tissues and is considered a tumor suppressor because its over-expression induces apoptosis in gastric cancer cells [49]; its absence is also associated with metastasis [50]. GALNT6 is hardly detectable in normal human tissues and is specifically expressed in higher amounts in several cancer types [51]. SLC6A14, an amino acid transporter, helps cancer cells meet their increased amino acid demand and is over-expressed in many types of cancer, including colon, pancreatic, cervical and breast cancers [52]. For this reason, it has been suggested as a potential target for cancer therapy. In addition, SLC6A14 has been tested as a probable delivery system for drugs as well as for amino acid-based pro-drugs [53].

Fig. 1 Box plot analysis of 50 randomly selected ALL samples. The plot shows high gene expression variability among the samples

Table 2 List of most variable genes and transcription factors among 207 ALL patients

S. no.    Most variable genes    Most variable TFs
1         S100P                  FOXC1
2         SFTPA1                 FOXR1
3         GJB6                   SOX8
4         LTF                    HOXA5
5         IFNB1                  NFIB
6         FOXC1                  IRX3
7         CD1E                   MYT1L
8         FOXR1                  SIX3
9         MS4A3                  SALL4
10        PTPRZ1                 MEIS1
11        RBFOX2                 ZNF521
12        MT1E                   IRX2
13        S100A12                SOX11
14        SOX8                   CEBPD
15        MT1H                   ID1
16        IFNA2                  HES1
17        HOXA5                  CEBPB
18        IFI27                  PBX1
19        CLDN5                  WT1
20        ANXA3                  IRX1

3.2 Gene Enrichment and Pathway Analysis

GO terms (biological processes, molecular functions and cellular components) and pathways (KEGG, Reactome, BBID and Biocarta) associated with the 3000 most variable genes were identified with the DAVID gene enrichment tool. The GO analysis enriched 168 biological processes, 62 molecular functions and 35 cellular components (Supplementary file 2). The enriched GO terms were sorted by their p values (lowest to highest). The top biological processes included inflammatory response, cell–cell signaling, cell adhesion, immune response, chemokine-mediated signaling, G-protein coupled receptor signaling, neutrophil chemotaxis, homophilic cell adhesion via plasma membrane adhesion molecules, multicellular organism development and positive regulation of the ERK1 and ERK2 cascade. The molecular functions associated with these genes were binding (calcium, heparin, receptor, heme, protease, sequence-specific DNA, etc.) and cytokine, growth factor, chemokine, hormone, oxygen transporter and transcriptional activator activities. Overall, 32 predominantly cancer-related KEGG pathways were enriched, such as PI3K–Akt signaling, Jak–STAT signaling, complement and coagulation cascades, transcriptional misregulation in cancer, cytokine–cytokine receptor interaction, chemokine signaling, natural killer cell-mediated cytotoxicity, ECM–receptor interaction and transforming growth factor-β (TGF-β) signaling (Supplementary file 2). The most significant KEGG pathways enriched were cytokine–cytokine receptor interaction (hsa04060), systemic lupus erythematosus (hsa05322) and neuroactive ligand–receptor interaction (hsa04080).

To obtain more insights, the gene clusters of the top KEGG pathways were reanalyzed. In total, 79 genes were connected with KEGG pathway hsa04060, including interleukins, tumor necrosis factor (TNF) superfamily members, chemokines and their receptors, and interferon genes. Further analysis using PANTHER tools [54] revealed that these 79 genes are part of 14 crucial pathways (Supplementary file 3), most of which have been widely studied in various cancer types including leukemia. These pathways include interleukin signaling, interferon-gamma signaling, apoptosis, inflammation mediated by chemokine and cytokine signaling, Wnt signaling, toll receptor signaling, TGF-beta signaling, CCKR signaling and PDGF signaling. In total, 54 genes were clustered in hsa05322; these are mostly histone cluster genes and are involved in seven crucial pathways, including Wnt signaling, inflammation mediated by chemokine and cytokine signaling, apoptosis, T cell activation, interleukin signaling and interferon-gamma signaling. Similarly, 81 genes were associated with hsa04080, and deeper analysis shows that they are involved in 25 PANTHER pathways, including glutamate receptor pathways, inflammation mediated by chemokine and cytokine signaling, blood coagulation and transcriptional regulation. In addition to widely studied pathways such as NOTCH1 signaling, JAK signaling, PI3K–AKT signaling and BCL-2 pathways, these pathways can also be explored and targeted in high-risk B-ALL. As leukemia is a complex and heterogeneous blood malignancy, integrative analysis of these multiple pathways can provide comprehensive disease insights from which improved and specific therapeutics can be derived.

3.3 Transcription Factors

Comprehensive analysis of TFs and their pathways is crucial for a better understanding of disease regulation; thus, we shortlisted transcription factors out of the 3361 most variable genes. For this purpose, a database of 1474 human TFs (Supplementary file 4) was retrieved from the Animal Transcription Factor DataBase (http://bioinfo.life.hust.edu.cn/AnimalTFDB1.0) and used as a reference to search for TFs in the list of most variable genes (Supplementary file 1). Overall, 276 TFs were found among the most variable genes (Supplementary file 5). The top 20 variable TFs are listed in Table 2 and include forkhead box genes (FOXC1, FOXR1), SRY-related HMG-box (SOX8), nuclear factor I B (NFIB), homeobox genes (HOXA5, IRX1-3, SIX3, MEIS1, PBX1), spalt-like TF 4 (SALL4), zinc finger protein 521 (ZNF521), CCAAT enhancer-binding proteins (CEBPD, CEBPB), inhibitor of DNA binding 1, HLH protein (ID1), hes family bHLH TF 1 (HES1) and Wilms tumor 1 (WT1).

Deregulated expression of FOX proteins has been reported in human malignancies including leukemia [55]. FOXC1, which is expressed in human AML patients but not in healthy populations, collaborates with the leukemic gene HOXA9 and accelerates the onset of leukemia [56]. TFs of the SOX family are well-established regulators of cell fate during development, and their deregulation causes various diseases including cancers [57–59]. HOXA5, IRX1-3, SIX3, MEIS1 and PBX1 belong to the highly conserved gene family of homeodomain (HOX) transcription factors, and their aberrant expression is associated with several malignancies including ALL and AML [60]. SALL4 regulates expression of BMI-1, a proto-oncogene and a suggested prognostic marker of pediatric ALL [61]. The ZNF521 TF is expressed in human hematopoietic cells and can act as either a repressor or an activator. Its translocation with PAX5 is linked with pediatric ALL [62]. CEBPD and CEBPB play key roles in cell proliferation and differentiation and act as suppressors of leukemogenesis [63]. Several TFs, including HES1 and ID1, were found to be over-expressed in NOTCH1-transduced T-ALL, indicating their role in leukemia progression [64]. NOTCH1 signaling is the prominent pathway in T-ALL, which promotes proliferation and inhibits apoptosis. The human WT1 gene can function as both an oncogene and a tumor suppressor, and it has been found to be over-expressed in leukemia and solid tumors. In addition, somatic mutations of WT1 are common in AML, CML and ALL [65].
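The TF shortlisting step amounts to looking up each variable gene in the reference TF list. A minimal Python sketch is given below, assuming that both lists use gene symbols and that matching is case-insensitive; the short inputs are excerpts chosen for illustration (the ranked genes follow the first rows of Table 2), not the full supplementary files.

```python
def shortlist_tfs(ranked_genes, tf_reference):
    """Return the genes that appear in the TF reference, keeping their ranking."""
    reference = {symbol.upper() for symbol in tf_reference}
    return [gene for gene in ranked_genes if gene.upper() in reference]

# First eight most variable genes from Table 2, in rank order.
ranked_excerpt = ["S100P", "SFTPA1", "GJB6", "LTF", "IFNB1", "FOXC1", "CD1E", "FOXR1"]
# Hypothetical excerpt of a human TF reference list.
tf_excerpt = {"FOXC1", "FOXR1", "SOX8", "HOXA5"}

variable_tfs = shortlist_tfs(ranked_excerpt, tf_excerpt)
```

Because the input order is preserved, the result is already ranked by variability, which is how a "top 20 variable TFs" list could be read off directly.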

Other crucial TFs were Kruppel-like factor 4 (KLF4), goosecoid homeobox (GSC), Ikaros family zinc finger 3 (IKZF3), zinc finger protein 300 (ZNF300), runt-domain transcription factor (RUNX3), CCAAT enhancer-binding protein alpha (CEBPA), thymocyte selection associated high mobility group box (TOX), NK2 homeobox 5 (NKX2-5), T-box 21 (TBX21) and E2F transcription factor 2 (E2F2). KLF4 acts as a tumor suppressor in leukemic T cells, as its over-expression induces apoptosis. Down-regulated expression of KLF4 is a prerequisite for early human T cell development and homeostasis [66]. It is well reported that IKZF3 regulates lymphopoiesis and that IKZF3 mutations may lead to speedy progression of leukemia and lymphoma [67, 68]. One study has suggested that ZNF300 plays a key role in leukemia development and progression [69]. The TFs of the RUNX family (e.g., RUNX1, RUNX2, RUNX3) play imperative roles in the regulation of hematopoiesis [70]. RUNX3 is one of the master regulators of gene expression in major developmental pathways and is considered a tumor suppressor in a number of cancer types [71]. Like CEBPB and CEBPD, CEBPA is recognized as a tumor suppressor TF because loss-of-function mutations in CEBPA can contribute to AML development. In addition, expression of CEBPA is dysregulated in human cancers of various origins, including liver, breast and lung [72]. TOX participates in T-cell maturation [73]. Based on copy number alterations, TOX was shown to be associated with relapse in pediatric ALL [74]. TBX21 is expressed in immune cells and plays a vital role in the cytotoxic activity of NK cells [75–77].

Overall, five KEGG pathways were enriched among the most variable TFs, viz. TGF-β signaling pathway, acute myeloid leukemia, pathways in cancer, maturity onset diabetes of the young (MODY) pathway and prostate cancer. It is well known that the TGF-β signaling pathway plays a complex role in cancer development, progression and metastasis. The MODY pathway was surprising because no studies were found suggesting a close and strong association with leukemia; thus, we reviewed the literature for six genes of this pathway, viz. HES1, BHLHA15, FOXA3, HNF4G, NKX2-2 and NKX6-1, and found that all of them were closely linked with leukemia. More interestingly, HES1 plays a central role in the control of NOTCH1-induced leukemia cell survival [78], and its expression has been suggested as a useful prognostic factor in AML patients [79].

3.4 Classification of ALL Subtypes and Patient Survival Analysis

To predict disease subtypes, we applied unsupervised hierarchical clustering with Pearson correlation to the 3361 significantly variable genes. Each subtype predicted here attained a final cluster after bootstrapping (n = 1000), and output was generated for k subtypes (k = 1 to n). The specific k was selected based on the silhouette width, the cumulative distribution function and the heat map plot. Silhouette width analysis of the expression data estimated the highest silhouette width for k = 6 (Fig. 2). The heat map generated through stratification of B-ALL patients based on gene expression data predicted six distinct subtypes, designated A, B, C, D, E and F (Fig. 3). These subtypes accommodated all 207 patient samples, A (103), B (13), C (39), D (20), E (18) and F (14) (Supplementary file 6), and overall analysis based on gene expression variability shows that the subtypes are well separated. Median survival analysis shows that subtype C patients have the longest median survival (4151 days), while subtype B patients have the shortest (508 days), as compared to patients belonging to the other predicted subtypes (Fig. 4). Comparative gene expression analysis between subtype B and subtype C patients identified 412 (p value cut-off ≤ 0.05) and 237 (p value cut-off ≤ 0.01) DEGs. Out of 412, 400 were upregulated and 2 were down-regulated in subtype C. The top 10 DEGs were CYTL1, SHANK3, IFI44L, DLL1, CCND2, CMTM2, ITGA6, PDE4B, SH3BP5 and EGFL7, all of which were upregulated in subtype C.
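How a median survival time is read from a Kaplan–Meier curve can be shown with a small Python sketch. It implements only the product-limit estimator for a single group, not the Cox model used in the study, and the follow-up times and event flags are hypothetical. The median is taken as the first event time at which the estimated survival probability falls to 0.5 or below.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t).
    times: follow-up in days; events: True if death was observed, False if censored.
    Returns [(time, survival)] at each distinct time with at least one death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored subjects leave the risk set
    return curve

def median_survival(curve):
    """First time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical follow-up data for one subtype (days, death observed?).
toy_times = [100, 200, 300, 400, 500, 600]
toy_events = [True, False, True, True, False, True]

toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

Applied separately to each predicted subtype, such an estimate gives one median per subtype; a median is undefined (None here) when fewer than half of the patients in a group have died by the end of follow-up.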

3.5 Differential Gene Expression

DEGs in the two selected comparisons, male vs. female and alive vs. dead, sorted on the basis of log2-fold change > 1.5 (up-regulation) and < −1.5 (down-regulation), are listed in Supplementary file 7. Overall, 13 genes were found to be differentially expressed between males and females, and 11 of these were over-expressed in males. Some of these genes have been studied in the context of leukemia in general, and their associations are established. However, it would be biased to correlate these genes with either the high rate of B-ALL incidence in males or the low rate of incidence in females, as the genes over-expressed in male patients were all Y chromosome associated, while those over-expressed in female patients were located on the X chromosome. To ascertain their association with B-ALL incidence/progression, their expression needs to be evaluated in comparison to healthy controls.

Fig. 2 Silhouette width plot. Here, maximum silhouette width is for k = 6 and thus, ALL patients can likely be classified into six subtypes

Fig. 3 Heat map of gene expression in predicted ALL subtypes

In alive vs. dead cohorts, only two genes were found significantly differentially expressed, viz. joining chain of multimeric IgA and IgM (JCHAIN; fold change = 2.52; p value = 7.00E−05) and cytokine receptor-like factor 2 (CRLF2; fold change = 1.77; p value = 1.52E−04) and interestingly, both were upregulated in dead patients’ cohort. Adjusted p values that account for multiple testing (Benjamini–Hochberg method) were higher (0.14 and 0.16), possibly due to high heterogeneity among the samples of the same class. JCHAIN and CRLF2 encode immunoglobulin J chain and cytokine receptor-like factor 2, respectively. JCHAIN links monomer units of IgA and IgM and also helps them to bind with secretory components. CRLF2 along with thymic stromal lymphopoietin (TSLP) and interleukin 7 receptor (IL7R) activates three important pathways: STAT3, STAT5 and JAK2. These pathways are known to control various processes such as cell proliferation and hematopoietic system development. Several previous studies have linked over-expression of CRLF2 and JCHAIN with low treatment response and poor survival in ALL patients [12, 80–82]. The functional partners of JCHAIN and CRLF2 were predicted and analyzed using STRING interactions [83]. Ten functional partners were predicted for JCHAIN (alias IGJ), including CD79A, PAX5, PAX8, SPI1, and IL2 (Fig. 5a, Supplementary file 8). KEGG pathway analysis reveals that three of these proteins (PAX5, PAX8 and SPI1) are associated with transcriptional misregulation in cancer. On the other hand, CRLF2 interacts with JAK1, JAK2, JAK3, IL3, IL7, etc., and KEGG pathway analysis returned 14 pathways (Fig. 5b, Supplementary file 8). Interestingly, all of them (CRLF2 and its functional partners) participate in the JAK–STAT signaling pathway. Another important pathway term was PI3K–Akt signaling, and the associated proteins were IL3, IL7, IL7R, JAK1, JAK2, JAK3, OSM and PRL. Their involvement in cancer progression pathways is suggestive evidence that upregulation of JCHAIN and CRLF2 genes in dead patients’ cohort is likely associated with ALL aggression.
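The Benjamini–Hochberg adjustment mentioned for the alive vs. dead comparison can be sketched in a few lines. The function below is a generic illustration of the procedure, not the code used in this study, and the p values in the example are hypothetical. Each raw p value is multiplied by m/rank, where m is the number of tests and rank is its position in ascending order, and monotonicity is then enforced from the largest p value downwards.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each adjusted value
    # never exceeds the adjusted value of the next larger p value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Because the scaling factor m/rank grows with the number of genes tested, a small raw p value for a top-ranked gene can correspond to a much larger adjusted value when thousands of genes are tested at once.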

Fig. 4 Overall survival analysis of ALL patients. Kaplan–Meier survival analysis of six subtypes (A, B, C, D, E and F) was performed based on clinical data of the patients
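For readers unfamiliar with the Kaplan–Meier estimator behind the survival curves and median survival times, a minimal sketch is given here. It is a generic product-limit implementation on hypothetical follow-up data, not the analysis code of this study; follow-up times are in days, and an event flag of False marks a censored patient (still alive at last follow-up).

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time at which a death occurs.

    times:  follow-up time of each patient
    events: True if the patient died at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below; None if it never does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical cohort of six patients.
toy_times = [100, 200, 200, 300, 400, 500]
toy_events = [True, True, False, True, False, True]

toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("median survival:", median_survival(toy_curve))
```

The median is undefined (None in this sketch) for a group whose estimated survival never falls to 50%, which is worth keeping in mind when comparing subtypes with very different outcomes.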

Fig. 5 Protein–protein interaction networks of differentially expressed genes (alive vs. dead): a CRLF2 and b JCHAIN


4 Conclusions

This study presents gene expression analysis in 207 high-risk B-ALL patients. Sorting genes by their expression variability among patients, together with the associated clinical information, has exposed a number of interesting genes and pathways that can be exploited in future for better understanding of disease pathogenesis as well as for designing specific B-ALL therapy. The most variable TFs identified and their associated pathways may help us to draw a comprehensive map of B-ALL regulation. Six subtypes were identified based on the most variable genes, and the patients of subtypes C and B had the highest and lowest probability of survival, respectively, as per the clinical data. Differential gene expression analysis revealed over-expression of JCHAIN and CRLF2 genes in dead patients’ cohort in comparison to alive patients’ cohort. We believe that these genes can further be explored and targeted in high-risk B-ALL relapse cases.

Acknowledgements This work was supported by grants received by AKT from Science and Engineering Research Board (SERB), Department of Science & Technology, Govt. of India, New Delhi, under National-Postdoctoral Fellowship Scheme (File Number: PDF/2015/000979). Authors also thank the IIT Delhi HPC facility for computational resources.

Compliance with Ethical Standards

Conflict of interest Authors declare no conflict of interest.

References

1. Chiaretti S, Foa R (2009) T-cell acute lymphoblastic leukemia. Haematologica 94:160–162. https://doi.org/10.3324/haematol.2008.004150

2. Pui CH, Behm FG, Singh B, Schell MJ, Williams DL, Rivera GK, Kalwinsky DK, Sandlund JT, Crist WM, Raimondi SC (1990) Heterogeneity of presenting features and their relation to treatment outcome in 120 children with T-cell acute lymphoblastic leukemia. Blood 75:174–179

3. Paul S, Kantarjian H, Jabbour EJ (2016) Adult acute lymphoblastic leukemia. Mayo Clin Proc 91:1645–1666. https://doi.org/10.1016/j.mayocp.2016.09.010

4. Redaelli A, Laskin BL, Stephens JM, Botteman MF, Pashos CL (2005) A systematic literature review of the clinical and epidemiological burden of acute lymphoblastic leukaemia (ALL). Eur J Cancer Care (Engl) 14:53–62. https://doi.org/10.1111/j.1365-2354.2005.00513.x

5. You MJ, Medeiros LJ, Hsi ED (2015) T-lymphoblastic leukemia/lymphoma. Am J Clin Pathol 144:411–422. https://doi.org/10.1309/AJCPMF03LVSBLHPJ

6. Salzer WL, Devidas M, Carroll WL, Winick N, Pullen J, Hunger SP, Camitta BA (2010) Long-term results of the pediatric oncology group studies for childhood acute lymphoblastic leukemia 1984–2001: a report from the children’s oncology group. Leukemia 24:355–370. https://doi.org/10.1038/leu.2009.261

7. Ferrando AA, Neuberg DS, Staunton J, Loh ML, Huard C, Raimondi SC, Behm FG, Pui CH, Downing JR, Gilliland DG, Lander ES, Golub TR, Look AT (2002) Gene expression signatures define novel oncogenic pathways in T cell acute lymphoblastic leukemia. Cancer Cell 1:75–87

8. Sherr CJ (1996) Cancer cell cycles. Science 274:1672–1677

9. Pui CH, Robison LL, Look AT (2008) Acute lymphoblastic leukaemia. Lancet 371:1030–1043. https://doi.org/10.1016/S0140-6736(08)60457-2

10. Kuiper RP, Schoenmakers EF, van Reijmersdal SV, Hehir-Kwa JY, van Kessel AG, van Leeuwen FN, Hoogerbrugge PM (2007) High-resolution genomic profiling of childhood ALL reveals novel recurrent genetic lesions affecting pathways involved in lymphocyte differentiation and cell cycle progression. Leukemia 21:1258–1266. https://doi.org/10.1038/sj.leu.2404691

11. Mullighan CG, Su X, Zhang J, Radtke I, Phillips LA, Miller CB, Ma J, Liu W, Cheng C, Schulman BA, Harvey RC, Chen IM, Clifford RJ, Carroll WL, Reaman G, Bowman WP, Devidas M, Gerhard DS, Yang W, Relling MV, Shurtleff SA, Campana D, Borowitz MJ, Pui CH, Smith M, Hunger SP, Willman CL, Downing JR, Children’s Oncology Group (2009) Deletion of IKZF1 and prognosis in acute lymphoblastic leukemia. N Engl J Med 360:470–480. https://doi.org/10.1056/NEJMoa0808253

12. Harvey RC, Mullighan CG, Chen IM, Wharton W, Mikhail FM, Carroll AJ, Kang H, Liu W, Dobbin KK, Smith MA, Carroll WL, Devidas M, Bowman WP, Camitta BM, Reaman GH, Hunger SP, Downing JR, Willman CL (2010) Rearrangement of CRLF2 is associated with mutation of JAK kinases, alteration of IKZF1, Hispanic/Latino ethnicity, and a poor outcome in pediatric B-progenitor acute lymphoblastic leukemia. Blood 115:5312–5321. https://doi.org/10.1182/blood-2009-09-245944

13. Roberts KG, Morin RD, Zhang J, Hirst M, Zhao Y, Su X, Chen SC, Payne-Turner D, Churchman ML, Harvey RC, Chen X, Kasap C, Yan C, Becksfort J, Finney RP, Teachey DT, Maude SL, Tse K, Moore R, Jones S, Mungall K, Birol I, Edmonson MN, Hu Y, Buetow KE, Chen IM, Carroll WL, Wei L, Ma J, Kleppe M, Levine RL, Garcia-Manero G, Larsen E, Shah NP, Devidas M, Reaman G, Smith M, Paugh SW, Evans WE, Grupp SA, Jeha S, Pui CH, Gerhard DS, Downing JR, Willman CL, Loh M, Hunger SP, Marra MA, Mullighan CG (2012) Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer Cell 22:153–166. https://doi.org/10.1016/j.ccr.2012.06.005

14. Li S, Wang C, Wang W, Liu W, Zhang G (2018) Abnormally high expression of POLD1, MCM2, and PLK4 promotes relapse of acute lymphoblastic leukemia. Medicine (Baltimore) 97(20):e10734. https://doi.org/10.1097/MD.0000000000010734

15. Sędek Ł, Theunissen P, Sobral da Costa E, van der Sluijs-Gelling A, Mejstrikova E, Gaipa G, Sonsala A, Twardoch M, Oliveira E, Novakova M, Buracchi C, van Dongen JJM, Orfao A, van der Velden VHJ, Szczepański T, EuroFlow Consortium (2018) Differential expression of CD73, CD86 and CD304 in normal vs. leukemic B-cell precursors and their utility as stable minimal residual disease markers in childhood B-cell precursor acute lymphoblastic leukemia. J Immunol Methods. https://doi.org/10.1016/j.jim.2018.03.005

16. Wilkerson MD, Hayes DN (2010) ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26:1572–1573. https://doi.org/10.1093/bioinformatics/btq170

17. Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2013) cluster: Cluster analysis basics and extensions. R package v1.14.4 edn. https://www.rdocumentation.org/packages/cluster

18. Schwender H (2012) siggenes: Multiple testing using SAM and Efron’s empirical Bayes approaches. R package v1.46.0 edn. https://www.rdocumentation.org/packages/siggenes


19. Cox DR (1972) Regression models and life tables. J R Stat Soc B 34:187–220

20. Kaplan E, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481. https://doi.org/10.2307/2281868

21. Huang DW, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC, Lempicki RA (2007) DAVID bioinformatics resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res 35:W169–W175. https://doi.org/10.1093/nar/gkm415

22. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43:e47. https://doi.org/10.1093/nar/gkv007

23. Ishii Y, Kasukabe T, Honma Y (2005) Immediate up-regulation of the calcium-binding protein S100P and its involvement in the cytokinin-induced differentiation of human myeloid leukemia cells. Biochim Biophys Acta 1745:156–165. https://doi.org/10.1016/j.bbamcr.2005.01.005

24. Clarke C, Gross SR, Ismail TM, Rudland PS, Al-Medhtiy M, Santangeli M, Barraclough R (2017) Activation of tissue plasminogen activator by metastasis-inducing S100P protein. Biochem J 474(19):3227–3240. https://doi.org/10.1042/BCJ20170578

25. Westcott MM, Liu J, Rajani K, D’Agostino R Jr, Lyles DS, Porosnicu M (2015) Interferon beta and interferon alpha 2a differentially protect head and neck cancer cells from vesicular stomatitis virus-induced oncolysis. J Virol 89:7944–7954. https://doi.org/10.1128/JVI.00757-15

26. Giansanti F, Panella G, Leboffe L, Antonini G (2016) Lactoferrin from milk: nutraceutical and pharmacological properties. Pharmaceuticals (Basel) 9(4):E61. https://doi.org/10.3390/ph9040061

27. Benaissa M, Peyrat JP, Hornez L, Mariller C, Mazurier J, Pierce A (2005) Expression and prognostic value of lactoferrin mRNA isoforms in human breast cancer. Int J Cancer 114:299–306. https://doi.org/10.1002/ijc.20728

28. Hoedt E, Hardiville S, Mariller C, Elass E, Perraudin JP, Pierce A (2010) Discrimination and evaluation of lactoferrin and delta-lactoferrin gene expression levels in cancer cells and under inflammatory stimuli using TaqMan real-time PCR. Biometals 23:441–452. https://doi.org/10.1007/s10534-010-9305-5

29. Lee SH, Hwang HM, Pyo CW, Hahm DH, Choi SY (2010) E2F1-directed activation of Bcl-2 is correlated with lactoferrin-induced apoptosis in Jurkat leukemia T lymphocytes. Biometals 23:507–514. https://doi.org/10.1007/s10534-010-9341-1

30. Lu Y, Zhang TF, Shi Y, Zhou HW, Chen Q, Wei BY, Wang X, Yang TX, Chinn YE, Kang J, Fu CY (2016) PFR peptide, one of the antimicrobial peptides identified from the derivatives of lactoferrin, induces necrosis in leukemia cells. Sci Rep 6:20823. https://doi.org/10.1038/srep20823

31. Mader JS, Salsman J, Conrad DM, Hoskin DW (2005) Bovine lactoferricin selectively induces apoptosis in human leukemia and carcinoma cell lines. Mol Cancer Ther 4:612–624. https://doi.org/10.1158/1535-7163.MCT-04-0077

32. Richardson A, de Antueno R, Duncan R, Hoskin DW (2009) Intracellular delivery of bovine lactoferricin’s antimicrobial core (RRWQWR) kills T-leukemia cells. Biochem Biophys Res Commun 388:736–741. https://doi.org/10.1016/j.bbrc.2009.08.083

33. Eiring AM, Khorashad JS, Agarwal A, Mason CC, Yu F, Redwine HM, Bowler AD, Gantz KC, Reynolds KR, Clair PM (2015) MS4A3 improves imatinib response and survival in BCR-ABL1 primary TKI resistance and in blastic transformation of chronic myeloid leukemia. Blood 126:14

34. Yokoi H, Kasahara M, Mori K, Ogawa Y, Kuwabara T, Imamaki H, Kawanishi T, Koga K, Ishii A, Kato Y, Mori KP, Toda N, Ohno S, Muramatsu H, Muramatsu T, Sugawara A, Mukoyama M, Nakao K (2012) Pleiotrophin triggers inflammation and increased peritoneal permeability leading to peritoneal fibrosis. Kidney Int 81:160–169. https://doi.org/10.1038/ki.2011.305

35. Chang Y, Zuka M, Perez-Pinera P, Astudillo A, Mortimer J, Berenson JR, Deuel TF (2007) Secretion of pleiotrophin stimulates breast cancer progression through remodeling of the tumor microenvironment. Proc Natl Acad Sci USA 104:10888–10893. https://doi.org/10.1073/pnas.0704366104

36. Du ZY, Shi MH, Ji CH, Yu Y (2015) Serum pleiotrophin could be an early indicator for diagnosis and prognosis of non-small cell lung cancer. Asian Pac J Cancer Prev 16:1421–1425

37. Ma Y, Ye F, Xie X, Zhou C, Lu W (2011) Significance of PTPRZ1 and CIN85 expression in cervical carcinoma. Arch Gynecol Obstet 284:699–704. https://doi.org/10.1007/s00404-010-1693-9

38. Makinoshima H, Ishii G, Kojima M, Fujii S, Higuchi Y, Kuwata T, Ochiai A (2012) PTPRZ1 regulates calmodulin phosphorylation and tumor progression in small-cell lung carcinoma. BMC Cancer 12:537. https://doi.org/10.1186/1471-2407-12-537

39. Shi Y, Ping YF, Zhou W, He ZC, Chen C, Bian BS, Zhang L, Chen L, Lan X, Zhang XC, Zhou K, Liu Q, Long H, Fu TW, Zhang XN, Cao MF, Huang Z, Fang X, Wang X, Feng H, Yao XH, Yu SC, Cui YH, Zhang X, Rich JN, Bao S, Bian XW (2017) Tumour-associated macrophages secrete pleiotrophin to promote PTPRZ1 signalling in glioblastoma stem cells for tumour growth. Nat Commun 8:15080. https://doi.org/10.1038/ncomms15080

40. Thirumoorthy N, Shyam Sunder A, Manisenthil Kumar K, Senthil Kumar M, Ganesh G, Chatterjee M (2011) A review of metallothionein isoforms and their role in pathophysiology. World J Surg Oncol 9:54. https://doi.org/10.1186/1477-7819-9-54

41. Han YC, Zheng ZL, Zuo ZH, Yu YP, Chen R, Tseng GC, Nelson JB, Luo JH (2013) Metallothionein 1h tumour suppressor activity in prostate cancer is mediated by euchromatin methyltransferase 1. J Pathol 230:184–193. https://doi.org/10.1002/path.4169

42. Zheng Y, Jiang L, Hu Y, Xiao C, Xu N, Zhou J, Zhou X (2017) Metallothionein 1H (MT1H) functions as a tumor suppressor in hepatocellular carcinoma through regulating Wnt/beta-catenin signaling pathway. BMC Cancer 17:161. https://doi.org/10.1186/s12885-017-3139-2

43. Zhou T, Li Y, Yang L, Tang T, Zhang L, Shi J (2017) Annexin A3 as a prognostic biomarker for breast cancer: a retrospective study. Biomed Res Int 2017:2603685. https://doi.org/10.1155/2017/2603685

44. Hamelin-Peyron C, Vlaeminck-Guillem V, Haidous H, Schwall GP, Poznanovic S, Gorius-Gallet E, Michel S, Larue A, Guillotte M, Ruffion A, Choquet-Kastylevsky G, Ataman-Onal Y (2014) Prostate cancer biomarker annexin A3 detected in urines obtained following digital rectal examination presents antigenic variability. Clin Biochem 47:901–908. https://doi.org/10.1016/j.clinbiochem.2014.05.063

45. Wang K, Li J (2016) Overexpression of ANXA3 is an independent prognostic indicator in gastric cancer and its depletion suppresses cell proliferation and tumor growth. Oncotarget 7:86972–86984. https://doi.org/10.18632/oncotarget.13493

46. Olsson M, Beck S, Kogner P, Martinsson T, Caren H (2016) Genome-wide methylation profiling identifies novel methylated genes in neuroblastoma tumors. Epigenetics 11:74–84. https://doi.org/10.1080/15592294.2016.1138195

47. Lopes MR, Pereira JK, de Melo Campos P, Machado-Neto JA, Traina F, Saad ST, Favaro P (2017) De novo AML exhibits greater microenvironment dysregulation compared to AML with myelodysplasia-related changes. Sci Rep 7:40707. https://doi.org/10.1038/srep40707

48. Yamamoto S, Yako Y, Fujioka Y, Kajita M, Kameyama T, Kon S, Ishikawa S, Ohba Y, Ohno Y, Kihara A, Fujita Y (2016) A role of the sphingosine-1-phosphate (S1P)-S1P receptor 2 pathway in epithelial defense against cancer (EDAC). Mol Biol Cell 27:491–499. https://doi.org/10.1091/mbc.E15-03-0161


49. Altieri F, Di Stadio CS, Federico A, Miselli G, De Palma M, Rippa E, Arcari P (2017) Epigenetic alterations of gastrokine 1 gene expression in gastric cancer. Oncotarget 8:16899–16911. https://doi.org/10.18632/oncotarget.14817

50. Xing R, Cui JT, Xia N, Lu YY (2015) GKN1 inhibits cell invasion in gastric cancer by inactivating the NF-kappaB pathway. Discov Med 19:65–71

51. Park JH, Nishidate T, Kijima K, Ohashi T, Takegawa K, Fujikane T, Hirata K, Nakamura Y, Katagiri T (2010) Critical roles of mucin 1 glycosylation by transactivated polypeptide N-acetylgalactosaminyltransferase 6 in mammary carcinogenesis. Cancer Res 70:2759–2769. https://doi.org/10.1158/0008-5472.CAN-09-3911

52. Bhutia YD, Babu E, Prasad PD, Ganapathy V (2014) The amino acid transporter SLC6A14 in cancer and its potential use in chemotherapy. Asian J Pharm Sci 9:293–303. https://doi.org/10.1016/j.ajps.2014.04.004

53. Ganapathy ME, Ganapathy V (2005) Amino acid transporter ATB0,+ as a delivery system for drugs and prodrugs. Curr Drug Targets Immune Endocr Metabol Disord 5:357–364

54. Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D, Thomas PD (2017) PANTHER version 11: expanded annotation data from gene ontology and reactome pathways, and data analysis tool enhancements. Nucleic Acids Res 45:D183–D189. https://doi.org/10.1093/nar/gkw1138

55. Zhu H (2014) Targeting forkhead box transcription factors FOXM1 and FOXO in leukemia (Review). Oncol Rep 32:1327–1334. https://doi.org/10.3892/or.2014.3357

56. Somerville TD, Wiseman DH, Spencer GJ, Huang X, Lynch JT, Leong HS, Williams EL, Cheesman E, Somervaille TC (2015) Frequent derepression of the mesenchymal transcription factor gene FOXC1 in acute myeloid leukemia. Cancer Cell 28:329–342. https://doi.org/10.1016/j.ccell.2015.07.017

57. Sarkar A, Hochedlinger K (2013) The sox family of transcription factors: versatile regulators of stem and progenitor cell fate. Cell Stem Cell 12:15–30. https://doi.org/10.1016/j.stem.2012.12.007

58. Oliemuller E, Kogata N, Bland P, Kriplani D, Daley F, Haider S, Shah V, Sawyer EJ, Howard BA (2017) SOX11 promotes invasive growth and ductal carcinoma in situ progression. J Pathol 243(2):193–207. https://doi.org/10.1002/path.4939

59. Xie C, Han Y, Liu Y, Han L, Liu J (2014) miRNA-124 down-regulates SOX8 expression and suppresses cell proliferation in non-small cell lung cancer. Int J Clin Exp Pathol 7:7518–7526

60. Alharbi RA, Pettengell R, Pandha HS, Morgan R (2013) The role of HOX genes in normal hematopoiesis and acute leukemia. Leukemia 27:1000–1008. https://doi.org/10.1038/leu.2012.356

61. Peng HX, Liu XD, Luo ZY, Zhang XH, Luo XQ, Chen X, Jiang H, Xu L (2017) Upregulation of the proto-oncogene Bmi-1 predicts a poor prognosis in pediatric acute lymphoblastic leukemia. BMC Cancer 17:76. https://doi.org/10.1186/s12885-017-3049-3

62. Yu M, Al-Dallal S, Al-Haj L, Panjwani S, McCartney AS, Edwards SM, Manjunath P, Walker C, Awgulewitsch A, Hentges KE (2016) Transcriptional regulation of the proto-oncogene Zfp521 by SPI1 (PU.1) and HOXC13. Genesis 54:519–533. https://doi.org/10.1002/dvg.22963

63. Akasaka T, Balasas T, Russell LJ, Sugimoto KJ, Majid A, Walewska R, Karran EL, Brown DG, Cain K, Harder L, Gesk S, Martin-Subero JI, Atherton MG, Bruggemann M, Calasanz MJ, Davies T, Haas OA, Hagemeijer A, Kempski H, Lessard M, Lillington DM, Moore S, Nguyen-Khac F, Radford-Weiss I, Schoch C, Struski S, Talley P, Welham MJ, Worley H, Strefford JC, Harrison CJ, Siebert R, Dyer MJ (2007) Five members of the CEBP transcription factor family are targeted by recurrent IGH translocations in B-cell precursor acute lymphoblastic leukemia (BCP-ALL). Blood 109:3451–3461. https://doi.org/10.1182/blood-2006-08-041012

64. Chadwick N, Zeef L, Portillo V, Fennessy C, Warrander F, Hoyle S, Buckle AM (2009) Identification of novel Notch target genes in T cell leukaemia. Mol Cancer 8:35. https://doi.org/10.1186/1476-4598-8-35

65. Bielinska E, Matiakowska K, Haus O (2017) Heterogeneity of human WT1 gene. Postepy Hig Med Dosw (Online) 71:595–601

66. Shen Y, Park CS, Suppipat K, Mistretta TA, Puppi M, Horton TM, Rabin K, Gray NS, Meijerink JP, Lacorazza HD (2017) Inactivation of KLF4 promotes T-cell acute lymphoblastic leukemia and activates the MAP2K7 pathway. Leukemia 31(6):1314–1324. https://doi.org/10.1038/leu.2016.339

67. Kronke J, Hurst SN, Ebert BL (2014) Lenalidomide induces degradation of IKZF1 and IKZF3. Oncoimmunology 3:e941742. https://doi.org/10.4161/21624011.2014.941742

68. Winandy S, Wu P, Georgopoulos K (1995) A dominant mutation in the Ikaros gene leads to rapid development of leukemia and lymphoma. Cell 83:289–299

69. Xu JH, Wang T, Wang XG, Wu XP, Zhao ZZ, Zhu CG, Qiu HL, Xue L, Shao HJ, Guo MX, Li WX (2010) PU.1 can regulate the ZNF300 promoter in APL-derived promyelocytes HL-60. Leuk Res 34:1636–1646. https://doi.org/10.1016/j.leukres.2010.04.009

70. de Bruijn M, Dzierzak E (2017) Runx transcription factors in the development and function of the definitive hematopoietic system. Blood 129:2061–2069. https://doi.org/10.1182/blood-2016-12-689109

71. Selvarajan V, Osato M, Nah GS, Yan J, Chung TH, Voon DC, Ito Y, Ham MF, Salto-Tellez M, Shimizu N, Choo SN, Fan S, Chng WJ, Ng SB (2017) RUNX3 is oncogenic in natural killer/T-cell lymphoma and is transcriptionally regulated by MYC. Leukemia 31(10):2219–2227. https://doi.org/10.1038/leu.2017.40

72. Lourenco AR, Coffer PJ (2017) A tumor suppressor role for C/EBPalpha in solid tumors: more than fat and blood. Oncogene 36(37):5221–5230. https://doi.org/10.1038/onc.2017.151

73. Wilkinson B, Chen JY, Han P, Rufner KM, Goularte OD, Kaye J (2002) TOX: an HMG box protein implicated in the regulation of thymocyte selection. Nat Immunol 3:272–280. https://doi.org/10.1038/ni767

74. Mullighan CG, Phillips LA, Su X, Ma J, Miller CB, Shurtleff SA, Downing JR (2008) Genomic analysis of the clonal origins of relapsed acute lymphoblastic leukemia. Science 322:1377–1380. https://doi.org/10.1126/science.1164266

75. Gordon SM, Chaix J, Rupp LJ, Wu J, Madera S, Sun JC, Lindsten T, Reiner SL (2012) The transcription factors T-bet and Eomes control key checkpoints of natural killer cell maturation. Immunity 36:55–67. https://doi.org/10.1016/j.immuni.2011.11.016

76. Lazarevic V, Glimcher LH, Lord GM (2013) T-bet: a bridge between innate and adaptive immunity. Nat Rev Immunol 13:777–789. https://doi.org/10.1038/nri3536

77. Yu H, Yang J, Jiao S, Li Y, Zhang W, Wang J (2014) T-box transcription factor 21 expression in breast cancer and its relationship with prognosis. Int J Clin Exp Pathol 7:6906–6913

78. Schnell SA, Ambesi-Impiombato A, Sanchez-Martin M, Belver L, Xu L, Qin Y, Kageyama R, Ferrando AA (2015) Therapeutic targeting of HES1 transcriptional programs in T-ALL. Blood 125:2806–2814. https://doi.org/10.1182/blood-2014-10-608448

79. Tian C, Tang Y, Wang T, Yu Y, Wang X, Wang Y, Zhang Y (2015) HES1 is an independent prognostic factor for acute myeloid leukemia. Onco Targets Ther 8:899–904. https://doi.org/10.2147/OTT.S83511

80. Dou H, Chen X, Huang Y, Su Y, Lu L, Yu J, Yin Y, Bao L (2017) Prognostic significance of P2RY8-CRLF2 and CRLF2 overexpression may vary across risk subgroups of childhood B-cell acute lymphoblastic leukemia. Genes Chromosomes Cancer 56:135–146. https://doi.org/10.1002/gcc.22421

81. Palmi C, Savino AM, Silvestri D, Bronzini I, Cario G, Paganin M, Buldini B, Galbiati M, Muckenthaler MU, Bugarin C, Della Mina P, Nagel S, Barisone E, Casale F, Locatelli F, Lo Nigro L, Micalizzi C, Parasole R, Pession A, Putti MC, Santoro N, Testi AM, Ziino O, Kulozik AE, Zimmermann M, Schrappe M, Villa A, Gaipa G, Basso G, Biondi A, Valsecchi MG, Stanulla M, Conter V, Te Kronnie G, Cazzaniga G (2016) CRLF2 over-expression is a poor prognostic marker in children with high risk T-cell acute lymphoblastic leukemia. Oncotarget 7:59260–59272. https://doi.org/10.18632/oncotarget.10610

82. Cruz-Rodriguez N, Combita AL, Enciso LJ, Quijano SM, Pinzon PL, Lozano OC, Castillo JS, Li L, Bareno J, Cardozo C, Solano J, Herrera MV, Cudris J, Zabaleta J (2016) High expression of ID family and IGJ genes signature as predictor of low induction treatment response and worst survival in adult Hispanic patients with B-acute lymphoblastic leukemia. J Exp Clin Cancer Res 35:64. https://doi.org/10.1186/s13046-016-0333-z

83. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C (2015) STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43:D447–D452. https://doi.org/10.1093/nar/gku1003