bioinformatic identification of disease driver …

66
Institute for Molecular Medicine Finland, FIMM University of Helsinki, Helsinki, Finland Doctoral Programme in Biomedicine (DPBM) BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER NETWORKS USING FUNCTIONAL PROFILING DATA Agnieszka Szwajda ACADEMIC DISSERTATION To be presented, with the permission of the Faculty of Medicine, University of Helsinki, for public examination in lecture hall 3, Biomedicum Helsinki, Haartmaninkatu 8, on Friday 2 nd of February 2018, at 12 noon. Helsinki, 2018

Upload: others

Post on 04-May-2022

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

Institute for Molecular Medicine Finland, FIMM University of Helsinki, Helsinki, Finland

Doctoral Programme in Biomedicine (DPBM)

BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER NETWORKS USING

FUNCTIONAL PROFILING DATA

Agnieszka Szwajda

ACADEMIC DISSERTATION

To be presented, with the permission of the Faculty of Medicine, University of Helsinki,

for public examination in lecture hall 3, Biomedicum Helsinki, Haartmaninkatu 8, on Friday 2nd of February 2018, at 12 noon.

Helsinki, 2018

Page 2: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

2

Supervised by: Professor Tero Aittokallio, Ph.D. Institute for Molecular Medicine Finland, FIMM University of Helsinki Helsinki, Finland Krister Wennerberg, Ph.D. Institute for Molecular Medicine Finland, FIMM University of Helsinki Helsinki, Finland Reviewed by: Associate Professor Harri Lähdesmäki, D.Sc. Department of Computer Science Aalto University School of Science Espoo, Finland Associate Professor Henri Xhaard, Ph.D. Division of Pharmaceutical Chemistry and Technology University of Helsinki Helsinki, Finland Opponent: Professor Matti Nykter, Ph.D. Institute of Biosciences and Medical Technology University of Tampere Tampere, Finland Custos: Professor Samuli Ripatti, Ph.D. Institute for Molecular Medicine Finland, FIMM University of Helsinki Helsinki, Finland © Agnieszka Szwajda Cover layout by Anita Tienhaara ISBN 978-951-51-3959-7 (paperback) ISBN 978-951-51-3960-3 (PDF) ISSN 2342-3161 (print) ISSN 2342-317X (online) Unigrafia Oy, Helsinki 2018

Page 3: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

3

Table of contents ABBREVIATIONS ....................................................................................................................................... 5

LIST OF ORIGINAL PUBLICATIONS ............................................................................................................ 6

AUTHOR’S CONTRIBUTIONS .................................................................................................................... 7

ABSTRACT ................................................................................................................................................ 8

1. INTRODUCTION ................................................................................................................................. 10

2. REVIEW OF LITERATURE .................................................................................................................... 12

2.1. Methods for the identification of disease drivers (II, III) ............................................................ 12

2.2. Oncogene addiction and other molecular vulnerabilities in cancer (II) ..................................... 13

2.3. Functional proteomics using mass spectrometry (III, IV) ........................................................... 14

2.3.1. Quantitative MS ................................................................................................................... 15

2.3.2. Affinity Purification (AP) coupled with MS .......................................................................... 15

2.3.3. Limitation of MS-approaches .............................................................................................. 15

2.4. Protein-protein interaction networks (III, IV) ............................................................................. 16

2.5. Protein phosphorylation and nitrosylation (II, IV) ...................................................................... 17

2.6. Chemical and genetic perturbation screening (II) ...................................................................... 18

2.7. Quantitative scoring of single and combinatorial drug responses (II) ........................................ 20

2.8. Compound-target selectivity profiling (I, II) ............................................................................... 21

2.9. Background of diseases used in the studies (II, IV) .................................................................... 23

2.9.1. Breast cancer ....................................................................................................................... 23

2.9.2. Alzheimer’s disease ............................................................................................................. 24

3. AIMS OF THE STUDY .......................................................................................................................... 25

4. MATERIALS AND METHODS ............................................................................................................... 26

4.1. Comparison of kinase inhibitor profiling datasets (I) ................................................................. 26

4.2. Evaluation of reproducibility and reliability of kinase inhibitor profiling datasets (I) ................ 26

4.3. Integration of the target bioactivity measurements (I) .............................................................. 27

4.4. Kinase inhibition sensitivity score (KISS) for prediction of molecular cancer addictions (II) ..... 27

4.5. Combinatorial KISS for prediction of co-essential target pairs and synergistic drug combinations (II) ...................................................................................................................................................... 27

4.6. Compound sensitivity testing (II) ................................................................................................ 28

4.7. Compound-target mappings (II) ................................................................................................. 28

4.8. Comparison of driver de-convolution methods (II) .................................................................... 29

4.9. Independent compound and siRNA testing to validate KISS predictions (II).............................. 30

4.10. Construction of addiction networks (II) .................................................................................... 30

4.11. Generation of PME-1 interactome using MS/AP (III) ................................................................ 31

4.12. Relevance rank platform (RRP) for predicting functional similarity (III) ................................... 31

Page 4: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

4

4.13. Identification of nitrosylated proteins (IV) ............................................................................... 33

4.14. Functional analysis of the proteomic data (IV) ......................................................................... 33

5. RESULTS ............................................................................................................................................. 34

5.1. Comparative evaluation and integration of target selectivity profiles (I) ................................... 34

5.1.1. Comparative analysis of the compound bioactivity studies ................................................ 34

5.1.2. Evaluation of the reliability of the compound bioactivity studies ....................................... 34

5.1.3. Integration of kinase inhibitor bioactivity data ................................................................... 35

5.2. KISS for prediction of single and combinatorial cancer drivers (II)............................................. 35

5.2.1. Comparison of KISS with other driver deconvolution approaches...................................... 36

5.2.2. Comparison of KISS and transcriptomic profiles ................................................................. 37

5.2.3. KISS-predicted kinase addictions showed consistency between independent drug collections ...................................................................................................................................... 37

5.2.4. KISS-based prediction of pharmacologically actionable drivers in breast cancer cell lines 37

5.2.5. Predicted signal addictions form cell line specific driver networks .................................... 38

5.2.6. Combinatorial KISS prediction of co-essential target pairs and synergistic drug combinations ................................................................................................................................. 39

5.3. RRP analysis of PIN1 and PME-1 interactome (III) ...................................................................... 39

5.3.1. Functional similarity ranking of PIN1 interactome .............................................................. 39

5.3.2. Functional similarity ranking of AP-MS-derived PME-1 interactome .................................. 41

5.4. Functional analysis of S-nitrosylated proteins (IV) ..................................................................... 42

6. DISCUSSION AND CONCLUSIONS....................................................................................................... 44

7. ACKNOWLEDGEMENTS...................................................................................................................... 49

8. REFERENCES....................................................................................................................................... 51

Page 5: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

5

ABBREVIATIONS

AD – Alzheimer’s disease

AP – affinity purification

AP-MS – affinity purification coupled with mass spectrometry

APP-amyloid precursor protein

CML- chronic myeloid leukemia

DSS – drug sensitivity score

EC50 - half-maximal effective concentration

ER – estrogen receptor

GSEA – gene set enrichment analysis

HSA - highest singe agent

IC50 – half-maximal inhibitory concentration

Kd – dissociation constant

Ki – inhibition constant

KIBA - kinase inhibitor bioactivity

KISS - kinase inhibition sensitivity score

KS – Kolmogorov-Smirnoff

MS-mass spectrometry

NGS – next generation sequencing

NO – nitrogen oxide

PPI- protein-protein interaction

PR – progesterone receptor

PTM – posttranslational modification

qDT – quantitative drug target (data set)

RRP – relevance rank platform

SNO -S-nitrosothiol

TNBC – triple negative breast cancer

uDT – unary drug-target (data set)

URSA – universal response surface approach

ZIP - zero interaction potency

Page 6: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

6

LIST OF ORIGINAL PUBLICATIONS

This thesis is based on the following four publications (referred to in the text by their Roman numerals I-IV):

I. Tang J, Szwajda A, Shakyawar S, Xu T, Hintsanen P, Wennerberg K, Aittokallio T. Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis. J Chem Inf Model. 2014 Mar 24; 54(3):735-43.

II. Szwajda A, Gautam P, Karhinen L, Jha SK, Saarela J, Shakyawar S, Turunen L, Yadav B, Tang J, Wennerberg K, Aittokallio T. Systematic mapping of kinase addiction combinations in breast cancer cells by integrating drug sensitivity and selectivity profiles. Chem Biol. 2015 Aug 20; 22(8):1144-55.

III. Pokharel YR*, Saarela J*, Szwajda A*, Rupp C, Rokka A, Karna SKL, Teittinen K, Corthals G, Kallioniemi O, Wennerberg K, Aittokallio T, Westermarck J. Relevance Rank Platform (RRP) for functional filtering of high content protein-protein interaction data. Mol Cell Proteomics. 2015 Dec; 14(12):3274-83.

IV. Zaręba-Kozioł M, Szwajda A, Dadlez M, Wysłouch-Cieszyńska A, Lalowski M. Global analysis of S-nitrosylation sites in the wild type (APP) transgenic mouse brain-clues for synaptic pathology. Mol Cell Proteomics. 2014 Sep; 13(9):2288-305.

* Equal contribution

Page 7: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

7

AUTHOR’S CONTRIBUTIONS

Publication I: Pre-processing of the datasets to enable comparative analysis: collection of kinase and compound name synonyms and identifiers from external databases; comparison of the three bioactivity profiling studies (Anastassiadis et al., 2011; Davis et al., 2011; Metz et al., 2011) in terms of target, compound and interaction overlaps; evaluation of consistency between different bioactivity measurements; extraction of the target profiling data from ChEMBL database and comparison of their distributions to the three studies; analysis of the frequencies of different bioactivity types in ChEMBL; evaluation of the reproducibility of the data from the three studies using ChEMBL; participation in writing of the manuscript.

Publication II: Method development and implementation; all data analysis needed for the publication: generation of single and combinatorial kinase addiction rankings, comparison with other target deconvolution methods, method evaluations, design and analysis of the validation experiments from the drug and siRNA combinations, functional analysis of the protein interaction networks, literature-based evaluation of the top predictions, writing of the manuscript. The compound screening and validation experiments were performed by other co-authors.

Publication III: Participation in the design of the ranks of relevance rank platform (RRP), application of the rank definitions and RSA method to the siRNA and gene expression data, analysis of the protein interaction data using Prohits platform and FunCoup database as well as comparison of the results to RRP; participation in writing of the manuscript. The contributions of the other first authors are related to the experimental part of the publication: Jani Saarela performed siRNA screens in PC3 and PNT2 cell lines; Yuba R. Pokharel designed the RNAi libraries and performed the functional experiments.

Publication IV: Functional analysis of the mass spectrometry data: finding enriched GO biological process and KEGG pathway terms for S-nitrosylated proteins in FVB and hAPP mouse brain synaptosomes as well as in the reference set using ClueGO plug-in for Cytoscape; generation of disease network connecting APP with differentially nitrosylated proteins using GeneMania database; participation in preparing the manuscript and figures.

Page 8: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

8

ABSTRACT

Genomics-based drug discovery utilizing sequencing data for elucidation of candidate targets has led to the development of a number of successful treatments in the last decades. However, the molecular driver signals for many complex diseases cannot be easily derived from genome sequencing. Functional profiling studies, such as those involving the detection of protein interaction networks or the effects of perturbations with small molecules or siRNAs on cellular phenotypes, offer a complementary approach for the identification of molecular vulnerabilities that can be exploited in the development of new treatment strategies. The goal of this thesis was to develop computational systems biology methods for supporting such functional endeavors, and through their application use cases, to elucidate novel disease driver signals in cancer and Alzheimer’s disease networks. The availability of functional profiling data (such as biochemical target selectivity information or efficacy readouts) for numerous small molecule compounds has enabled building interaction network models to predict cancer addictions i.e. genes that are essential for disease progression but are not necessarily mutated. In this work, network-based computational methods (such as kinase inhibition sensitivity score – KISS) were developed to infer disease addictions (either single genes or sub-networks) using functional data from high-throughput drug sensitivity screens, and applied in breast cancer cell lines. Further extension of the KISS method, named combinatorial KISS, was introduced as a novel approach to predict synergistic drug combinations and their underlying co-essential target pairs. Driver deconvolution from drug response profiles relies on extensive and reliable drug-target interaction networks. Therefore, a systematic evaluation of target selectivity profiles was performed among recently published large-scale biochemical assays of kinase inhibitors, combined with data reported in the drug-target databases ChEMBL and STITCH. Our comparative evaluation revealed relative benefits and potential limitations among various bioactivity types, including IC50, Ki, and Kd. To make better use of the complementary information captured by the various bioactivity types, we developed a model-based integration approach, termed KIBA, and demonstrated how it can be used to classify kinase inhibitor targets. As a result, we created kinome-level, quantitative drug–target interaction network for further modeling studies. Besides the analysis of drug responses, another way to find novel disease drivers or molecular vulnerabilities is to explore the interaction partners of known oncogenes, since the existence of a protein-protein interaction suggests their involvement in the same biological pathway and thereby in the same biological process. However, one of the major challenges in the protein-protein interaction screens is the identification of functionally relevant interactions from the long hit list, in particular when their functional annotations are missing. This motivated the development of Relevance Rank Platform (RRP) approach that can suggest the candidate proteins from the high throughput screens that most likely contribute to the function of the

Page 9: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

9

bait protein. The method predicts functionally similar candidate interactors regardless of either the reliability of the mass spectrometry-based identification, or the knowledge of the biological function of the putative interactor. RRP was applied and validated in PIN1 (Peptidyl-prolyl cis-trans isomerase NIMA-interacting 1) and PME-1 (Protein Phosphatase Methylesterase 1) interaction networks in prostate cancer. Finally, we carried out functional comparison of nitrosylated proteins in the brain synaptosomes of Alzheimer’s disease (AD) mouse models and their healthy controls, with the aim to reveal the disease processes in which this posttranslational modification plays a role. We also elucidated the amyloid precursor protein (APP) - centered Alzheimer’s disease network of differentially nitrosylated proteins that are likely to be implicated in this neurological disorder. Taken together, this thesis work introduces novel experimental-computational strategies for the deconvolution or prioritization of potential disease drivers, either single proteins or their subnetworks. These methods are applicable to various cell lines or patient-derived samples. They can provide directly druggable therapeutic targets for personalized treatment applications and may be used in the development of novel therapeutic options.

Page 10: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

10

1. INTRODUCTION

The advances in DNA sequencing technologies have led to a massive growth of data describing the genomic aberrations implicated in complex diseases such as cancers. As a result, thousands of mutations of various types have been identified and associated with clinical phenotypes (Campbell et al., 2008; Forbes et al., 2008; Stephens et al., 2009). Mutations shown to be causative of a given disease have typically been referred to as disease drivers, as opposed to the passengers, i.e., mutations without causative effect (Bozic et al., 2010). Translation of such sequencing-based findings into clinically actionable drug targets presents several challenges related to the extensive heterogeneity of the cancer samples and the possibility of changing roles of driver and passenger mutations during disease progression (Garraway and Lander, 2013; Vandin et al., 2012). In addition, the development of treatment resistance due to compensatory signals highlights the need for combinatorial treatments (Al-Lazikani et al., 2012). The translation is further complicated by the fact that many of the genetic aberrations are not clinically actionable due to the lack of existing drugs or they may be difficult to modulate by a small molecule. A good example of such genomic driver is RAS gene whose mutations have been found in many cancers, while its therapeutic targeting continues to be a significant challenge (Cox et al., 2014). Thus, it has become evident that the knowledge of DNA sequence must be complemented with functional studies to enable effective translation into new therapeutic strategies.

Functional and phenotypic profiling of disease samples has become widely used in recent years. These profiling efforts involve the investigation of cellular response following a perturbation by small chemical molecules or siRNAs as well as experimental or in silico mapping of protein-protein and drug-target interactions (Ba-alawi et al., 2016; Barretina et al., 2012; Brough et al., 2011). Personalized medicine programs have started offering high throughput drug sensitivity and resistance testing for rapid identification of effective compounds in patient samples (Pemovska et al., 2013). The availability of these data provides an opportunity for the development of novel approaches for disease driver identification, complementary to those based on DNA sequencing. In addition, computational methods for predicting effective combinations of drugs or co-essential targets for a given disease are necessary because experimental testing of all possible pairs is laborious and expensive even in automated high-throughput settings. Therefore, the goal of this thesis was to develop computational systems biology strategies to address these needs and to elucidate essential disease drivers from functional studies. The definition of a disease driver in the context of this thesis is broader from the one described above, because it does not imply the existence of a genomic mutation. Here, disease drivers are equivalent to signal addictions i.e. molecular vulnerabilities essential for the disease process and inferred from functional profiling data. Since driver signal deconvolution from high throughput compound response screens requires the knowledge of targeted proteins, we also aimed to compare and integrate the available small molecule selectivity profiles in order to elucidate the relevant compound-target interactions.

Page 11: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

11

It is important to remember that cellular phenotypes, including disease phenotypes, are ultimately established at the protein level, even if they have genetic basis (Lage, 2014). For example, DNA sequencing does not capture posttranslational modifications or protein degradation. Moreover, it does not tell if and how a mutation affects the protein function, how it impacts the protein interaction networks and re-wires the signal flow. On the road from genotype to phenotypes, it is the protein signaling that ultimately determine the phenotype (Sevimoglu and Arga, 2014). Protein interaction networks are thus crucial cellular effectors and the link between genotype and phenotype. Therefore, deciphering the interactomes of disease drivers may provide additional layer of information for better understanding of their function and how they could be modulated (Yi et al., 2017). These networks are, however, tissue and disease-specific and highly dynamic (Yeger-Lotem and Sharan, 2015). This presents a challenge in the translation of the existing knowledge of protein interactomes, because not all known interactions of a disease driver are indeed functionally relevant for a particular biological question (Yeger-Lotem and Sharan, 2015). Therefore, we aimed to develop a method that may assess functional relevance of protein interaction networks. As a case study, we chose proteins that play a part in cancer and we focused on predicting the interactors whose function would be relevant for this oncogenic role. Taken together, the systems biology strategies presented in this work will contribute to the elucidation of single and combinatorial disease drivers as well as relevant drug-target and protein-protein interactions. These methods may become valuable in the translation of the functional profiling data into novel therapeutic options and, through their application use cases, they may improve the understanding of complex diseases such as cancer or Alzheimer’s disease.

Page 12: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

12

2. REVIEW OF LITERATURE

The focus of this work was to develop integrative data analysis methods aiming at the identification of disease drivers and their underlying networks. As input for this analysis we used data from various high throughput profiling experiments obtained from publically available databases or generated by our collaborators. These included readouts from chemical and genetic perturbation screening, compound-target selectivity profiling as well as mass spectrometry–derived protein-protein interactions and posttranslational modifications. In this literature review I will first briefly introduce the concept of disease drivers and the typical approaches for their identification. Next, I will give an overview of the experimental methods used to generate the data that served as input for the subsequent bioinformatic identification of disease drivers in this work. Finally, I will provide a general description of the diseases that were used in our method application case studies.

2.1. Methods for the identification of disease drivers (II, III) Identification of cancer driver genes has typically involved next generation sequencing (NGS)-based genomic studies (Wood et al., 2007; Vogelstein et al., 2013). The advances in DNA sequencing technologies have enabled rapid identification of the spectrum of genetic changes in cancer genomes, where hundreds or thousands of various types of mutations are often found (Jones et al., 2008; Sjöblom et al., 2006). However, only a small portion of these mutations has a functional (or causative) effect, providing a selective growth advantage to the cancer cell and is therefore positively selected in a mixed cell population (Carter et al., 2009; Vogelstein et al., 2013). In order to elucidate the molecular mechanisms behind cancer development, which could be used as diagnostic markers or therapeutic targets, these so-called driver mutations need to be distinguished from the passengers, i.e., the non-causative sequence variants that do not contribute to the cancer development (Stratton et al., 2009). In the last decade, various methods have been developed to differentiate between driver and passenger somatic mutations revealed from large-scale cancer genome sequencing data (Gnad et al., 2013). Some of the statistical approaches focus on comparing the frequencies of mutations found in cancer samples, assuming that the driver genes tend to be mutated more often than passengers in cancer cells. Examples of implementation of this type of approach include CancerMutationAnalysis R package (Parmigiani et al., 2014) and Mutsig (for ‘Mutation Significance’) (Lawrence et al., 2013). Other computational predictive methods for distinguishing drivers from passengers are based on machine learning (for example CHASM (Carter et al., 2009; Wong et al., 2011), Polyphen-2 (Adzhubei et al., 2013), SNAP (Bromberg and Rost, 2007) or on assessment of functional impact of mutations (for example TransFIC (Gonzalez-Perez et al., 2012) or Oncodrive-fm (Gonzalez-Perez and Lopez-Bigas, 2012)). However, comparative analysis of these approaches has revealed significant differences in their prediction results (Gnad et al., 2013). Moreover, these approaches are challenged by the possibility of changing roles of driver and passenger mutations during tumor progression, among several other drawbacks (Garraway

Page 13: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

13

and Lander, 2013; Raphael et al., 2014; Vandin et al., 2012). In particular, they may often pinpoint drivers that are not druggable or clinically actionable, if there are no targeting drugs available. It is also known that many apparent driver mutations do not link to addictions to the function of the gene products. Moreover, genes not altered at the genomic level may also play essential role in the disease progression (Pe’er and Hacohen, 2011). Translation of such genomic findings into therapeutic strategies is further complicated by extensive mutational heterogeneity both between different individuals diagnosed with the same cancer and within a given tumor (Vandin et al., 2012). Further, the functional importance and physiological impact of the majority of cancer genetic alterations is unclear (Akavia et al., 2010). All these issues highlight the need for the development of complementary, functional approaches for disease driver identification.

2.2. Oncogene addiction and other molecular vulnerabilities in cancer (II) In cancer the presence of oncogenic driver mutations not only allows for malignant transformation but also supports the survival of established tumours (Luo and Lam, 2013). This leads to a phenomenon called ‘oncogene addiction’, in which tumor cells survival and growth is dependent on an activated mutant oncogene or an inactivated tumor suppressor (Yan et al., 2011). Consequently, despite the numerous genetic and epigenetic abnormalities accumulated in cancer cell, the reversal of only one or a few of them can have significant therapeutic benefits (Torti and Trusolino, 2011; Weinstein and Joe, 2008). The concept of oncogene addition has become the basic rationale for the development of targeted therapies for cancer treatment (Torti and Trusolino, 2011; Weinstein and Joe, 2006; Yan et al., 2011).

However, the clinical targeting of mutated cancer drivers is often hampered by clonal heterogeneity and mutational evolution of cancer cells that ultimately leads to treatment resistance (Bozic et al., 2014; Shah et al., 2012; Torti and Trusolino, 2011). The difficulties in clinical translation of oncogenetic driver mutations resulted in the search for other molecular vulnerabilities that can be exploited by new therapies (Luo et al., 2009). In particular, the proper functioning of non-mutated genes has been shown to be critical for maintaining the malignant phenotype (Bradner et al., 2017; Chen et al., 2010; Freije et al., 2011). This phenomenon called non-oncogene addiction involves the dependency of cancer cells on the activities of a wide variety of genes and pathways, many of which are not inherently oncogenic themselves (Solimini et al., 2007). Importantly, these genes and pathways are essential to enhance cancer survival but are not required to the same degree for the viability of normal cells (Luo et al., 2009). Detection of such addictions or molecular vulnerabilities usually involves siRNA or CRISPR- based gene knock-downs and is often referred to as gene essentiality screening (Brough et al., 2011; Cheung et al., 2011; Tsherniak et al., 2017; Wang et al., 2017).

Non-oncogene addiction can arise as a result of synthetic lethal relations (Nagel et al., 2016). Synthetic lethality is an extreme case of genetic interaction in which two single gene perturbations that individually cause no changes in phenotype are lethal in combination (O’Neil et al., 2017). This concept originates from studies in yeast where it was observed that the presence of a mutation in two genes causes cell death but a mutation in only of them has

Page 14: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

14

no impact on survival (Tong et al., 2001). Identification and targeting of synthetic lethal partners of inactivated tumor suppressors can therefore be utilized in new treatment strategies. An example of such therapy that reached clinic is the treatment of BRCA1- or BRCA2- deficient breast cancer with PARP inhibitors (Dedes et al., 2011). Further examples of non-oncogene addiction include the dependency of PTEN-mutant breast cancer cells on the mitotic checkpoint kinase (TKK) or the addiction of KRAS-mutant lung cancer to GATA2 (Brough et al., 2011; Kumar et al., 2012). In the context of this work, the definition of synthetic lethality is equivalent to decreased cell viability observed only upon joint but not individual inhibition of two target proteins. These two target proteins are thus co-essential and jointly constitute the signal co-addiction.

The presence of non-oncogene addictions including synthetic lethal interactions reflects the complex interplay and interdependence of biological pathways. A mutation in one gene can alter the activity or expression of multiple other genes making them essential for disease process (Bradner et al., 2017). Complex diseases like cancer often exhibit more than one molecular vulnerability (Brough et al., 2011; Cheung et al., 2011). In addition, driver mutations tend to cluster around certain pathways suggesting an addiction to the signal flow in a given interaction network (Tonon, 2008). These observations suggest that the concept of ’network addiction’ or ’driver network’ is more suitable to describe the dependency of the disease process on multiple interacting genes or proteins (Tonon, 2008; Yan et al., 2011). The elucidation of such networks whose components drive the whole system from normal to disease state has recently gained a lot of attention not only in cancer research, but also in other complex diseases (Dutta et al., 2012; Huan et al., 2015; Li et al., 2014; Mäkinen et al., 2015; Ren et al., 2017).

The term ’disease addiction’ or ‘disease driver’ used later in the context of this thesis is used in a broad sense to describe a molecular vulnerability and encompasses both oncogene and non-oncogene addictions. This is because the presence of a molecular addiction in this work does not imply the existence of any genetic aberrations, but only a dependence of disease phenotype on a given protein activity or the signal flow in the interaction network it is part of. Addictions can thus refer to both single genes or proteins and their combinations or networks.

2.3. Functional proteomics using mass spectrometry (III, IV) Functional proteomics investigates the biological function of proteins by detecting their interconnectivity based on the assumption that the association of proteins is indicative of their common involvement in a biological process (Köcher and Superti-Furga, 2007). Modern global proteomics relies primarily on mass spectrometry (MS), the development of which has revolutionized biomedical proteomic studies, similar to the impact of next generation sequencing on genomics (Stapley et al., 2010). The MS instruments work by converting the analyte molecules into gas-phase ions and separating them based on their mass-to-charge ratio. Next, the number of ions at each mass-to-charge value is recorded and from these measurements a spectrum is generated (Han et al., 2008). The mass-to-charge spectrum enables the identification of the molecules based on their molecular weights and also allows

Page 15: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

15

for quantification based on the height to intensity ratio of the peak for each ionized molecule. The advent of the techniques to ionize peptides (MALDI and electrospray) made it possible to implement MS in biology where it has found many uses (Banerjee and Mazumdar, 2012; El-Aneed et al., 2009); those relevant to this thesis are briefly described below.

2.3.1. Quantitative MS Quantitative MS enables the identification of protein complexes at near to endogenous levels as well as the determination of protein abundance and their post-translational modifications (PTMs) (Aebersold and Mann, 2003; Matthiesen et al., 2011). There are basically two experimental approaches to protein quantification, labeled and label-free. The label-free approach to quantitative MS has become commonly used method for determining protein levels due to its simplicity since it does not require any special sample preparations. This method is solely based on the relative quantification of spectral counts or peak heights in the mass-to-charge spectrum, and it is commonly being used for estimation of the relative abundance of proteins within a complex sample (Old et al., 2005). For instance, label-free quantitative MS was recently used to create a map of the human proteome (Kim et al., 2014; Wilhelm et al., 2014) and for characterizing the proteomic landscape of triple negative breast cancers (Lawrence et al., 2015). Labeling the proteins of one sample can be achieved by culturing those cells in medium containing amino acids made up of heavier nitrogen isotopes (the so called stable isotope labeling by amino acids in cell culture, SILAC) (Ong et al., 2002). Analyzing labeled peptides from cells cultured in normal vs. heavy-isotope medium enables side-by-side comparison of the peaks in the mass-to-charge spectrum and thus relative peptide quantification. Alternative methods for labeled, quantitative MS include the isobaric tags for relative and absolute quantification (iTRAQ) (Ross et al., 2004) and tandem mass tags (TMTs) (Dayon et al., 2008; Thompson et al., 2003).

2.3.2. Affinity Purification (AP) coupled with MS Protein-protein interactions (PPIs) can be detected using affinity purification coupled with MS (AP-MS). This method involves the introduction of a tag to mark the protein of interest (bait) inside cells in culture through recombinant DNA-technology. There are many different types of tags that can be used for these experiments, including Strep, FLAG and TAP-tag (Li, 2011). By capturing the tag after lysing the cells, it is possible to isolate the tagged protein along with any binding partners it may have. The tag-capture affinity purification followed by MS has become a standard method for characterization of protein complexes, and it has been used to characterize the interactomes of Saccharomyces cerevisiae (Gavin et al., 2006; Krogan et al., 2006), Escherichia coli (Butland et al., 2005; Hu et al., 2009), as well as for large scale analysis of human complexes (Bouwmeester et al., 2004; Jeronimo et al., 2007). Currently, there is a variety of tags and their combinations used in the purification process, as well as several types of tandem MS instruments, which allow for efficient sequencing of peptides derived from digested complexes. The various experimental setups are reviewed in more detail in previous work (Gingras et al., 2007; Li, 2011; Paul et al., 2011).

2.3.3. Limitation of MS-approaches The usage of MS technology for protein identification has several drawbacks, including the dependence on the predefined sequences in protein databases or being biased against

Page 16: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

16

identifying interactions with proteins expressed at low levels. Weak, transient associations can also be missed in AP-MS (Cho et al., 2004; Völkel et al., 2010). The false negative findings in MS can happen as a consequence of a disturbance in complex formation. This may occur at the level of bait tagging, during expression, lysis or complex purification (von Mering et al., 2002). Failure to detect true interactions in AP-MS can also arise due to a problem with relative abundances of proteins in the AP sample. On the other hand, contaminants present in the sample and the difficulties in distinguishing background from complex-forming protein may give rise to false positive interactions (Altelaar et al., 2013). However, the use of double purification protocols in a large-scale experiment helps to minimize false positive rates due to increased sample quality as well as the possibility to apply noise filtering techniques, since it is easier to identify contaminating proteins pulled-down with many unrelated baits in a high-throughput setting (Cho et al., 2004). The frequency of co-purification in the data obtained from multiple interaction screens has been collected in CRAPome database (Mellacheruvu et al., 2013), which is a useful resource for elimination of likely contaminants in AP-MS experiments. The development of statistical approaches such as SAINT (Choi et al., 2011; Skarra et al., 2011), which assign confidence scores for identified protein interactions, has allowed for better data filtering and improvement of hit reliability.

2.4. Protein-protein interaction networks (III, IV) PPIs are intrinsic to every cellular process, ranging from DNA replication through all stages of gene expression, cell cycle to formation of macrostructures such as cytoskeleton or mitotic spindle. Transient PPIs are responsible for most modifications of proteins, such as phosphorylation or transfer of other functional groups that are crucial for proper protein functioning. These interactions are in turn fundamental to signal transduction and metabolic pathways. The analysis of proteins with known functions shows that many proteins that share the same cellular functions interact with each other (von Mering et al., 2002). Moreover, a single protein can interact with a diverse number of partners under different conditions, which results in different phenotypes (so-called plasticity). In other words, the proteome is a highly dynamic entity, constantly changing through time and space or under various environmental cues. This, in turn, can have important consequences for many diseases (Cho et al., 2004).

A protein exerts its functions through its stable and/or transient interactions with other molecules. Predicting gene/protein function on the basis of comparative sequence analysis is a powerful approach but its use is limited (Huynen et al., 2000; Korbel et al., 2004). One reason for this is that sequence homology oversees the fact that the regulation and context-specificity of gene expression constitute an important part of gene function. For example, regulatory proteins, such as Ras or Myc, may have many disparate, and sometimes opposite functions, e.g. either proliferation or differentiation or apoptosis, depending on the cell type and state as well as the activity of other proteins (Huang, 2004; Huang and Ingber, 2000). Similarly, C-Jun N-terminal kinase (JNK) can be both anti- and pro-apoptotic, depending on the state of the network when it receives a tumor necrosis factor (TNF) or epidermal growth factor (EGF) cue. The context (i.e. the state of the molecular network) also accounts for 60-80% of any in vivo kinase substrate specificity during phosphorylation events (Linding et al., 2007). Thus, it

Page 17: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

17

seems reasonable to assume that most proteins exert their biological function as members of a molecular network.

The interconnectivity of PPI networks, together with the resulting functional interdependency between the underlying cellular components, implies that the potential impact of a mutation on a given gene is not limited only to the protein the gene encodes, but it can also spread along the links in the interaction network, and thereby alter the activity of other gene products that may be otherwise free of any defects (Yeger-Lotem and Sharan, 2015). Therefore, in-depth elucidation and understanding of molecular networks is essential to determine the impact of any defect on the phenotype (Jørgensen and Linding, 2010). Consequently, a disease seldom results from abnormality of a single network component, such as gene or protein, but reflects a perturbation of the complex network that spans the tissues and organs (Barabási et al., 2011). In particular, complex diseases, such as cancers and diabetes, usually arise from small defect in many genes, rather than from large defects in a few genes (Schadt, 2009).

Finding additional components in PPI networks provides a more global view of a given cellular process in a normal healthy tissue or in its disease state. This, in turn, can be fruitful for understanding disease mechanisms, identifying new biomarkers and discovering novel therapeutic targets. For instance, analysis of the first draft of human interactome revealed that protein network connectivity and diseases are strongly associated, which provided insights into the molecular mechanisms underlying complex diseases (Zanzoni et al., 2009). These findings suggest that both PPIs and their sub-networks (so-called network modules) could be used as targetable entities in novel therapeutic strategies. Complex traits and diseases, such as cancer, diabetes or atherosclerosis, are in fact currently regarded as diseases of networks rather than of individual components (Tegnér et al., 2007).

Awareness of the importance of the network context has led to the emergence of the so-called network medicine that aims to understand pathological phenotypes, which are regarded as network diseases, at the systems level (Barabási et al., 2011). It has also influenced drug discovery strategies, which have recently shifted towards identification of “network drugs”, i.e., targeting the architecture and dynamics of the aberrant signaling networks associated with a given disease (Erler and Linding, 2010; Pawson and Linding, 2008). This approach may also enable the prediction of patient groups of responders or side effects of the candidate drugs (Schadt, 2009). For all these reasons, many PPI maps have been elucidated for various organisms, including human, both proteome-wide (interactomes), and more focused on a particular disease or process (Albers et al., 2005; Ito et al., 2001; Li et al., 2004; Rual et al., 2005; Stanyon et al., 2004; Stelzl et al., 2005).

2.5. Protein phosphorylation and nitrosylation (II, IV) Following translation, the newly produced proteins are often covalently modified with specific non-peptide moieties, depending on their designated function (e.g., membrane proteins are often glycosylated). One common type of a reversible posttranslational modification (PTM) that is of special interest to cancer is phosphorylation, which involves the

Page 18: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

18

covalent transfer of a phosphate group from ATP to serine, threonine, histidine or tyrosine residues at specific sites on a protein. Covalent attachment of one or more phosphate groups creates conformational changes to the protein that may either activate or inhibit its function. Kinases, which are enzymes that perform phosphorylation, are the key actors in signal transduction networks that regulate cell growth, replication and apoptosis (Ciesla et al., 1996; Johnson, 2009; Johnson et al., 2014). Since intracellular kinases are essential gatekeepers of cell replication and survival, their activity is normally tightly controlled as unchecked kinase activity may cause aberrant cell proliferation, as seen with the BCR-ABL fusion protein in chronic myeloid leukemia (CML). When fused to BCR through a translocation between chromosomes 9 and 22, the ABL tyrosine kinase activity becomes very high, which promotes proliferation and survival (Di Bacco et al., 2000). Another well-known kinase driver is B-Raf, which when mutated at specific points, is unleashed from its tight regulation and starts sending proliferation/survival signals through phosphorylation of its targets. BCR-ABL in CML and mutated B-Raf in many malignant melanomas are two known examples of cancers being addicted to a single kinase driver. In the case of BCR-ABL, the drug imatinib is usually able to completely control CML-progression (Assouline and Lipton, 2011). Though often effective for only a limited period of time, inhibitors of B-Raf, such as vemurafenib, are also approved drugs for treatment of metastatic melanomas. In general, however, kinases are rarely mutated in cancers, even though many cancer cells exhibit an addiction to kinase signaling i.e. their survival is dependent on kinase activity (Torkamani et al., 2009; Tyner et al., 2013).

S-nitrosylation is another type of reversible PTM that involves the covalent attachment of reactive nitrogen oxide (NO) to the free thiol-residue of certain exposed cysteines on nearby proteins, forming an S-nitrosothiol (SNO) (Stamler et al., 2001). NO is a free radical that is locally produced from L-arginine and NADPH by NO synthase enzyme. Analogous to phosphorylations, S-nitrosylation may modulate protein activity and function through conformational changes. This PTM is believed to be one of the key mechanisms in NO-based signaling, which is found in almost all biological systems (Hess et al., 2005; Stamler et al., 2001). The cellular processes that are regulated by nitrosylation include for instance transcription regulation (Marshall et al., 2000), DNA repair (Foster et al., 2009), and apoptosis (Mannick, 2007). Aberrant S-nitrosylation has been associated with cancer (Wang, 2012), heart disease (Murphy et al., 2012), as well as with many neurodegerenerative diseases including Alzheimer’s (Nakamura and Lipton, 2016; Nakamura et al., 2013). In neurodegeneration the nitrosylation process, which is important for proper neural function, is dysregulated by the decreased antioxidant production and the increased production of free radicals that occurs during aging, along with certain environmental toxins. This dysregulation affects protein folding, mitochondrial fragmentation and synaptic transmission (Nakamura et al., 2013).

2.6. Chemical and genetic perturbation screening (II) High-throughput screening of chemical compound libraries is the comprehensive testing of the effect of multiple small molecules on protein targets or cellular phenotypes such as cell proliferation or death. This technology has become increasingly popular because it provides a

Page 19: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

19

readout that is useful for various medical applications. In the drug discovery process, for instance, compound screening has been traditionally used to find molecules interacting with a target of interest (McFedries et al., 2013). In cancer research, profiling of cancer cell responses to a diverse collection of anti-cancer drugs and other drug-like molecules provides an opportunity to discover not only novel therapeutic options but also new indications to the already existing drugs (so-called drug repurposing). For example, CML patient-based ex vivo drug sensitivity and resistance screening recently showed that axitinib, an anti-angiogenic drug used in the therapy of renal cancer, show selective efficacy also in a subset of CML patient cells resistant to the standard treatment with ABL1 inhibitors (Pemovska et al., 2015).

When combined with drug target information, phenotypic screening of cellular responses can also be used to suggest the signaling pathways and networks behind drug sensitivity or resistance and to identify druggable molecular vulnerabilities in individual cancer cell lines or patient samples (Heiser et al., 2012; Pemovska et al., 2013). Large scale profiling of compound responses across hundreds of small molecules in various concentration and across many cancer samples facilitates stratification of patients and disease subtypes. In general, such ex vivo drug sensitivity and resistance screens are an important component of personalized medicine programs, which aim at testing the individual patient responses and thereby helping to choose the right therapeutic options for each patient (Pemovska et al., 2013). Recent in vitro studies have also linked specific genetic aberration with the observed drug responses or found genomic biomarkers predictive of the drug sensitivity in cancer cell lines (Barretina et al., 2012; Garnett et al., 2012).

Functional investigation of disease-related proteins has traditionally been carried out using their knock-downs via antibody-based inhibition on a protein level or RNAi-induced silencing of the target genes. In RNAi-based technology, repression of the target gene expression is primarily achieved by the degradation of mRNA transcript using synthetic anti-sense oligonucleotides or plasmids. High-throughput implementation of this method led to identification of genes associated with specific biological processes (Badertscher et al., 2015; Simpson et al., 2012) and to the emergence of large scale vulnerability maps in various cancer cell line models (Brough et al., 2011; Campbell et al., 2016; Cheung et al., 2011). More recently, CRISPR-Cas9 knockout-based technology has been developed and applied to investigate gene function and essentiality in cancer (Hart et al., 2015; Tzelepis et al., 2016; Wang et al., 2014). However, the physiological effects of such perturbations are not equivalent to pharmacological modulation of a target protein by a small drug-like molecule (Weiss et al., 2007). Therefore, it has been realized that screening of chemical compounds specifically targeting the proteins under study could offer a complementary and more pharmaceutically-relevant readout for evaluation of the consequences of their inhibition.

Going beyond single gene perturbation, double knockdowns to find synthetic lethal or synergistic interactions have recently gained a lot of attention (Luo et al., 2009; Wang et al., 2014). Simultaneous perturbation of such gene pairs, for example by a drug or siRNA combination, may provide highly selective means to kill cancer cells without severe damage to normal cells (Porcelli et al., 2012). It can also prevent the emergence of treatment resistance which is less likely to arise when multiple targets are jointly inhibited (Jokinen et

Page 20: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

20

al., 2014). Therefore, computational prediction of synthetic lethal interactions as well as synergistic drug combinations targeting co-essential proteins has become an area of intensive research in systems biology (Goswami et al., 2015; Huang et al., 2014; Jerby-Arnon et al., 2014; Sinha et al., 2017).

2.7. Quantitative scoring of single and combinatorial drug responses (II) The analysis of high-throughput ex vivo small molecule response profiles is highly dependent on the metric used for their quantification. Traditionally, measuring the degree of drug efficacy in a cell viability/death assay has involved the estimation of half-maximal inhibitory or effective concentration (IC50 or EC50, respectively) on the basis of the dose-response curve (Barretina et al., 2012; Garnett et al., 2012; Tyner et al., 2013). Even though these scoring metrics have turned out to be sufficient for many applications, they involve only a single parameter and can therefore capture only limited information about the drug response. The relative IC50 values for a given pair of compounds can in fact be identical despite a clear difference in the shapes of their dose response curves (Figure 1A). This limitation poses challenges in high-throughput profiling setting that involves screening of multiple compounds and cell types using several compound concentrations, especially when the aim is to identify selective responses for a particular subset of samples (e.g. patients vs. controls) (Yadav et al., 2014).

Figure 1. Quantitative scoring of small molecule responses tested across several concentrations in a cell viability assay. Two compounds may have the same relative half-maximal inhibitory concentration (IC50) i.e. the concentration of the inhibitor resulting in 50% of the maximal observed cell growth inhibition effect, even though their dose-response curves and thus their efficacies are clearly different (A). The drug sensitivity score (DSS) is a novel metric designed to effectively capture complex dose-response relationships. Its calculation is based on the integration of the area under the dose-response curve that exceeds a given minimum activity level (Amin) taking into account several parameters such as IC50, the slope of the curve at IC50 as well as the bottom and top asymptotes of the curve (Rmin and Rmax) (B).

Page 21: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

21

Recently, we developed a novel metric, named drug sensitivity score (DSS), which was shown to perform better than the traditional quantification by IC50 in various application cases and experimental settings (Yadav et al., 2014). The DSS metric takes into account multiple response parameters estimated through fitting the logistic dose-response function, and thereby allows for better capturing of the difference in response profiles between various cell types or between normal and cancer cells. More specifically, DSS is based on the integration of the area under the estimated dose response curve over the tested concentration range that exceeds a given minimum activity level (Yadav et al., 2014). The parameters used in the DSS calculation include the half-maximal inhibitory concentration (IC50), the slope of the curve at IC50 as well as the bottom and top asymptotes of the curve (Rmin and Rmax) (Figure 1B).

Quantification of drug combination effects in cell viability or cytotoxicity assays and their classification into synergistic, antagonistic or non-interactive is often based on the comparison of the observed and expected effects using reference models of non-interaction. The most commonly used reference models include Bliss independence (Bliss, 1939), Loewe additivity (Loewe, 1953) and highest single agent (HSA) model (Berenbaum, 1989). These models differ in their assumption regarding the expected effect of non-interaction. In HSA, this effect is defined as the higher individual drug effect, whereas in Loewe additivity model it is equivalent to the effect of combination of a given drug with itself. The definition of the expected effect of non-interaction in Bliss model assumes independent mechanisms of action of the combined drugs and is calculated using the complete additivity concept from probability theory. Drug combinations resulting in greater than expected response are considered to be synergistic and combinations whose effect is smaller than expected are classified as antagonistic.

All the synergy scoring models mentioned above were originally developed for small scale experiments involving a limited number of drugs and their concentrations. Thus, they are not always compatible with the complex drug interaction patterns observed in high-throughput experiments of multiple compounds across a range of concentrations in a matrix format. However, several extensions of these models such as URSA (universal response surface approach) (Greco et al., 1990) or a novel interaction model, named zero interaction potency (ZIP) (Yadav et al., 2015), have been developed to address some of these limitations by incorporating the response surface over all tested concentrations. For instance, the ZIP model combines the Bliss and the Loewe models, and utilizes a delta score to characterize the entire synergy landscape resulting from high throughput experiments in a dose-response matrix format.

2.8. Compound-target selectivity profiling (I, II) The drug-like molecules often affect multiple targets and thus the efficacy observed in their screening is a result of modulating both their on and off-target proteins. This applies not only to their toxic responses but also to the therapeutic effects (Anighoro et al., 2014; Peters, 2013). At the same time, targeting multiple disease-specific targets is a promising strategy in drug development (so-called polypharmacology), with the potential to overcome or reduce the

Page 22: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

22

drug resistance and side effects, while enhancing the therapeutic effects (Antolin et al., 2016). Identifying the drug mode of action and the targets underlying the functional responses from small compound screens (so-called target deconvolution) is an important part of phenotype-based drug discovery (Lee and Bogyo, 2013; Terstappen et al., 2007). Target or driver deconvolution is highly dependent on the coverage and accuracy of compound-target interaction networks. Accordingly, target selectivity profiling is essential for the understanding of drug action mechanisms, for the identification of molecular vulnerabilities, and for the development of improved treatment strategies (Plouffe et al., 2008; Tyner et al., 2013).

Recently, several attempts to predict the polypharmacological drug effects have been made using network models, which connect the compounds’ responses with their on and off protein targets, and integrate them with the signaling pathway information (so-called network pharmacology) (Hopkins, 2008; Tang and Aittokallio, 2014; Zhao and Iyengar, 2012). Although the binary network models are a typical way to depict the drug-target interactions, they lack the quantitative bioactivity information and ignore the dependence of these interactions on the tested compound concentrations. This missing information can be obtained using biological assays developed for the compound bioactivity profiling, as in three recent large-scale studies exploring the selectivity of small molecule kinase inhibitors (Anastassiadis et al., 2011; Davis et al., 2011; Metz et al., 2011). In these studies the inhibitory ability of compounds was quantified by thermodynamic dissociation constant (Kd in Davis et al.), inhibition constant (Ki in Metz et al) or the percentage of enzyme activity that remains following its inhibition (activity % in Anastassiadis et al.) In addition, there exist several databases for experimentally-derived or computationally-predicted binary target interaction for various group of chemical compounds, such as KEGG BRITE/DRUG (Kanehisa et al., 2012; Ogata et al., 1999), DrugBank (Knox et al., 2011; Wishart et al., 2006) or Therapeutic Target Database (TTD) (Chen et al., 2002; Zhu et al., 2012).

The most commonly used metrics to quantify compound-target interactions involve measurements of thermodynamic dissociation constant (Kd), inhibition constant (Ki) or the concentration of inhibitor necessary to reduce the target activity by half (IC50) (Gaulton et al., 2012). The equilibrium dissociation constant is determined in a saturation assay. It is equivalent to the concentration of a compound need to occupy 50% of the receptors i.e. the concentration at which 50% of protein target is bound. Thus, the affinity of the drug-like ligand is inversely proportional to the value of Kd (Salahudeen and Nishtala, 2017). The inhibition constant measures the binding affinity of the inhibitor. For a competitive inhibitor which competes with the substrate for binding of the enzyme’s active site, the relationship of Ki and IC50 is described by the Cheng-Prusoff equation:

where [S] is the experimental substrate concentration and Km (Michaelis–Menten constant) is substrate concentration at which enzyme activity is at half maximal. When substrate concen-

Page 23: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

23

tration [S] is very low in relation to Km, the IC50 is essentially equal to Ki (Cheng and Prusoff, 1973). While Ki value for a given inhibitor and its target is constant, the magnitude of IC50 is relative and dependent on the concentration of substrate (Burlingham and Widlanski, 2003).

The currently available large-scale bioactivity datasets have several drawbacks. They are highly heterogeneous because they originate from diverse types of assays and evidence sources, such as pharmacodynamic experiments, examination of direct binding or downstream treatment effects, or computational predictions using, for instance, text mining methods. In addition, these drug target data sources often lack the information about the non-targets since they do not list all the proteins that have been assessed for target selectivity. This missing distinction between true negatives and unreported drug target interactions may have profound consequences on both the specificity and sensitivity of drug-target interaction prediction models (Chen and Zhang, 2013; Pahikkala et al., 2015). Quantitative information about the bioactive compound - target associations is available in ChEMBL, which is one of the most comprehensive chemical biology databases (Gaulton et al., 2012). The possibility to define both positive and negative compound-target interactions contributes to the growing popularity of this resource for many applications, including the validation of computational drug-target predictions (Gfeller et al., 2013; Lounkine et al., 2012; Martínez-Jiménez et al., 2013).

2.9. Background of diseases used in the studies (II, IV)

2.9.1. Breast cancer Breast cancer is the most common cancer type among women and it is increasing in incidence (Youlden et al., 2012). There are several inherited and postnatally acquired genetic mutations that predispose breast cancer development, most notably in BRCA1, BRCA2, BRAF, PIK3CA, ATM, CHEK2 and PALB2. While these genetic defects (particularly BRCA1 and 2) have a high penetrance and are associated with very high rates of breast cancer for the individual carriers, they only account for a minor part of all breast cancers (Ford et al., 1998). Therefore, novel approaches are needed to better stratify breast cancer subtypes, and to develop more selective treatments for each breast cancer patient individually (so-called precision medicine). There are a several ways how to classify and subdivide the breast cancers. Traditionally, pathologists have separated lobular from ductal breast cancer and used the different grading and staging systems to determine the prognosis for the patient (Yersal and Barutca, 2014). Though the old classification system is not obsolete, newer immunohistochemistry stainings provide information that is useful in predicting the response to therapy (Liu, 2014). For instance, estrogen-receptor (ER)-positive tumors generally respond well to an estrogen receptor antagonist, such as tamoxifen (Early Breast Cancer Trialists’ Collaborative Group (EBCTCG) et al., 2011), and tumors with HER2-receptor amplification can be targeted with an anti HER2 monoclonal antibody such as trastuzumab (Vogel et al., 2002). Besides these two markers, the expression of the progesterone receptor (PR) is also routinely screened for breast cancer (Liu, 2014).

Page 24: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

24

Among the various subtypes of breast cancer, the triple negative breast cancer (TNBC), defined by the lack or low expression of ER and PR as well as the absence of HER2 amplification, is associated with poor prognosis and survival rates (Foulkes et al., 2010). Compared to other subtypes, TNBC is a highly aggressive and heterogeneous class of breast cancers, with currently no targeted treatments available, mainly due to the lack of known disease drivers (Grigoriadis et al., 2012; Peddi et al., 2012). Therefore, TNBC serves both as a clinically important as well as highly challenging disease model to identify druggable molecular vulnerabilities and combinatorial co-dependencies in a cell type-specific manner (Gautam et al., 2016).

2.9.2. Alzheimer’s disease Alzheimer’s disease (AD) is a neurodegenerative disease that is characterized by progressive loss of memory as well as other cognitive handicaps. Its incidence increases with age and with an aging population this disease is projected to be one of the major health care challenges of the future (Abbott, 2012). Therefore, novel treatment strategies will be increasingly needed in the developed countries, where the population is aging at increasing rates. Although the morphological changes in AD-brain are well-characterized, the pathogenesis and causes of the typical AD symptoms are not fully understood (Komarova and Thalhauser, 2011). A few of the AD-cases are inherited while the vast majority is sporadic (Chartier-Harlin et al., 1991). Three mutated genes account for most of the familial cases, β-amyloid precursor protein (APP), presenilin 1 (PSEN1) and presenilin 2 (PSEN2) (Bagyinszky et al., 2014). The prevailing amyloid hypothesis of AD development states that the accumulation of neuritic plaques and neurofibrillary tangles (the hallmarks of AD) arise from disruptions in the homeostatic processes that control cleavage of amyloid precursor protein (APP) into Aβ peptides (De-Paula et al., 2012). The brain atrophy visible in brain imaging of AD-patients is caused by the loss of neurons and synapses in the neocortex as well as some subcortical areas. This cell death is caused by the extracellular accumulation of insoluble neurotoxic Aβ peptides that go on to form the characteristic Alzheimer plaques (De-Paula et al., 2012). Imbalance in the cellular redox-systems has been suggested to be one of the major causes of the disrupted homeostasis of APP-processing that is at the core of AD-development (Malinski, 2007). AD increases with age and it is well known that accumulation of oxidative lesions in DNA and other macromolecules is one of the major contributors to the aging process (Romano et al., 2010). It has already been shown that AD patient brains have increased oxidative stress (Sayre et al., 2008). Nitric oxide (NO) is a free radical that functions as an important signaling molecule in almost all tissues. The brain actually ranks among the top tissues when it comes to expression of NO synthase (Bredt et al., 1991), which is the enzyme that generates NO. Nitrosylation, the covalent binding of NO to certain exposed cysteine-residues is an important component of NO-signaling (Hess et al., 2005), but in situations of increased oxidative stress, aberrant nitrosylation has been demonstrated to cause protein misfolding and aggregation as well mitochondrial dysfunction, disrupting cellular homeostasis (Nakamura et al., 2013).

Page 25: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

25

3. AIMS OF THE STUDY

The aim of this thesis was to develop integrative data analysis strategies that can contribute to the identification of disease driver networks from functional profiling experiments. Such experimental data include protein-protein interactions, drug response measurements, profiling of drug targets and perturbation screens by small chemical molecules or siRNAs. The particular aim was to develop network-based experimental-computational approaches for (i) direct prioritization of drug targets or their combinations, (ii) identification of proteins functionally similar to the disease drivers of interest, and (iii) better use of the drug target profiling data.

The specific goals of this work were:

1. Comparison and integration of drug target interaction profiles from kinase inhibitor bioactivity measurements (Publication I)

2. Prediction of cancer-selective druggable signal addictions (single or combinatorial) and synergistic drug combinations from functional screening of chemical compounds (Publication II)

3. Development of bioinformatic method to assess the functional relevance of protein interaction networks (Publication III)

4. Functional network analysis of S-nitrosylated proteins in Alzheimer’s disease mouse brains and healthy controls (Publication IV)

Page 26: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

26

4. MATERIALS AND METHODS

4.1. Comparison of kinase inhibitor profiling datasets (I) Three recent studies containing quantitative bioactivity profiling information for kinase inhibitors were subjected to comparative analysis. In the first study by Davis et al. (2011), dissociation constant (Kd) values were determined on the basis of ATP site-dependent competition binding assays for 72 compounds and 442 kinases. The second study by Metz et al. (2011) measured the inhibition constant (Ki) of the transfer of a radioactive phosphate group to target peptides for a large collection of drug-like molecules and 172 kinases. In the third study, Anastassiadis et al. (2011) calculated the percentage of kinase catalytic activity remaining after the inhibition by a compound at 0.5 μM for 300 kinases and 178 inhibitors. The datasets were subjected to several preprocessing steps to facilitate their comparison. The missing values representing untested compound-target pairs in Metz and Anastassiadis studies were filtered out. The missing values from Davis et al. (2011), which indicated a weak binding (Kd>10 000 nM) or undetected at 10 000 nM inhibitor concentration, were imputed with 10 000 nM. Only the catalytically active human protein kinases were used in the comparisons. As the primary identifiers of compounds and kinases, we used the CheMBL IDs and UniProt IDs respectively. The final number of compounds, kinases and drug-target pairs in each study as well as their overlaps is presented in Figure 1 and Table 1 of Publication I.

4.2. Evaluation of reproducibility and reliability of kinase inhibitor profiling datasets (I) In order to evaluate the reproducibility of these three distinct types of bioactivity data, we compared them with the high-confidence human target profiling information from independent studies available from the ChEMBL database (confidence level ≥ 8, version 17). In case of multiple measurements for a given drug-target pair in ChEMBL the median values were used in the comparisons. The distributions for each bioactivity type based on ChEMBL data and each of the studies were compared using two sample Kolmogorov−Smirnov test for goodness-of-fit. Kendall correlation coefficient was calculated to evaluate the consistency between the bioactivity measurements. In the reliability assessment of the data from the three studies, we used two reference sets of positive drug-target interactions: STITCH and PRIMARY. The first one contained high confidence target annotations (with combined confidence scores higher than 900) from the STITCH database (version 3.0) (Kuhn et al., 2012). The latter consisted of 134 drug-target interactions from the Supplementary Table 3 of the Davis study that have been experimentally-validated in multiple studies. The goodness of fit between the reference lists and the data from the three studies was assessed by calculating the enrichments scores on the basis of the Kolmogorov−Smirnov statistic in the GSEA (Gene Set Enrichment Analysis) tool.

Page 27: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

27

4.3. Integration of the target bioactivity measurements (I) In order to integrate different types of bioactivity measurements available from ChEMBL, such as Kd, Ki and IC50, we developed the kinase inhibitor bioactivity (KIBA) score, which is calculated based on the adjusted Kd value, adjusted Ki value or their average, depending on the available target profiling measurement types:

The adjusted Kd and Ki values were defined using the parametric model:

where Li and Ld are the parameters that determine the weights of IC50 and whose values are optimized to maximize the consistency between Ki and Kd. Median values of Kd, Ki and IC50 were used in the calculations in case of multiple measurements of the same bioactivity type.

4.4. Kinase inhibition sensitivity score (KISS) for prediction of molecular cancer addictions (II) To predict druggable molecular cancer addictions using drug-target networks and drug sensitivity data, we developed kinase inhibition sensitivity score (KISS). The score is calculated for each kinase target as an average drug response of all its inhibitors (Figure 3, left). KISS calculation ranks the targets of the sensitivity-tested compounds according to their likelihood of being essential for cancer growth. The method was tested and evaluated in a panel of 21 breast cancer cell lines screened with FIMM compound collection as described below. The chemical library contained 40 kinase inhibitors in common with the selectivity profiling study of Davis et al. (2011), which was used as a source of target annotations. The statistical significance of a KISS value was determined by calculating the empirical p-values using permutation tests.

4.5. Combinatorial KISS for prediction of co-essential target pairs and synergistic drug combinations (II) Since cancer cells may also be addicted to several signals simultaneously, we extended the KISS model to a combinatorial version enabling the prediction of co-essential target pairs. The combinatorial KISS score is calculated for each kinase pair as the average drug response of the inhibitors targeting both kinases (Figure 3, right). Its calculation ranks the target pairs according to their likelihood of being jointly essential for cancer growth. To focus on synthetic lethal type of kinase pairs, termed as co-essential, we compared combinatorial KISS to a complement score, defined as the average drug response of the inhibitors targeting only

Page 28: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

28

one member of the pair, and ranked only those pairs for which the difference between the scores was above a selected cut-off value. Hence, the combinatorial scoring was applied only for those kinase pairs which had both common inhibitor(s) as well as individual inhibitors available among the target-annotated compounds in FIMM collection. These pairs were ranked by the magnitude of their KISS reflecting their likelihood of being co-essential for cell survival. The strongest individual inhibitors (i.e. with the lowest Kd values in Davis et al., 2011) of the kinases in top-ranked pairs were predicted to form synergistic drug combinations. The highest scoring co-essential kinase pairs and the corresponding drug pairs of their individual inhibitors were experimentally tested in HCC1937 breast cancer cell line.

4.6. Compound sensitivity testing (II) 21 breast cancer cell lines were screened with 239 individual compounds in 5 concentrations across a 10 000-fold range using the drug sensitivity and resistance platform (Pemovska et al., 2013). A percent growth inhibition values were determined for each compound concentrations after 72h incubation with the cells and served as the basis for the establishment of dose-response curves which were subsequently quantified by half-maximal inhibitory concentrations (IC50) and the drug sensitivity score (DSS) (Yadav et al., 2014). MCF10A control cell line or the mean over all cell lines were used for calculation of the differential DSS and IC50 values. Drug combinations from the top-ranked combinatorial KISS predictions in the HCC1937 cell line were tested in an 8x8 dose-matrix format. First row and column of each matrix contained 7 increasing concentrations of each drug whereas the middle part contained all their pair-wise combination. The negative control (0.1% DMSO) and the cell killing positive control (100 μM benzethonium chloride) were placed in the top left and bottom right corner, respectively. The raw fluorescence intensity values were measured after 2 h incubation with CellTiter-Blue reagent and converted to percent inhibition using the plate average of positive and negative controls. Each row and column from the 8x8 matrices was subsequently fitted in GraphPad Prism software using logistic model in order to reduce the dispensing-related experimental variation from well to well. The average of the two fitted values per each well as well as the single drug inhibition values were used to calculate the expected combinatorial effects based on the Bliss independence model (Bliss, 1956).

4.7. Compound-target mappings (II) Dissociation constant (Kd) values from the biochemical competition binding assay were extracted from the target selectivity study performed by Davis et al. (Davis et al., 2011), where 40 inhibitors from our FIMM drug collection were tested. The 205 non-mutated kinases whose Kd values were within 50 fold from the lowest Kd (strongest target) of a given inhibitor or below 100 nM (whichever threshold came first) were considered to be targets and formed the quantitative drug–target data set (qDT).

Page 29: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

29

Kinases listed as targets in DrugBank (www.drugbank.ca; Wishart et al., 2006), the Therapeutic Target Database (TTD, http://bidd.nus.edu.sg/BIDD-Databases/TTD/TTD.asp; Chen et al., 2002) and KEGG (www.genome.jp/kegg; Ogata et al., 1999) were combined to form non-quantitative, unary drug-target data (uDT). This type of selectivity data contains the reported targets of a given drug, but lacks the information about the kinases which were reported as non-targets or were not tested for being a target. The unary data set contained 296 kinase targets for 142 drugs in our FIMM drug collection. For the 295 GlaxoSmithKline (GSK) compounds, we subdivided the kinase activity assay results from Knapp et al. (2013) into 5 categories (from 0, inactive, to 4, very active), based on the inhibition of the kinase activity at 1 μM and at 0.1 μM concentrations of a particular compound, according to the following thresholds: 0 - <25% inhibition at 1 μM; 1 - activity >1 μM: 25-50% inhibition at 1 μM 2 - activity <1 μM: >50% inhibition at 1 μM, <50% inhibition at 0.1 μM 3 - activity <0.1 μM: >50% inhibition at 1 μM, 50-90% inhibition at 0.1 μM 4 - activity <<0.1 μM: >50% inhibition at 1 μM, >90% inhibition at 0.1 μM Kinases classified into categories 3 and 4 were considered to be targets of the GSK probes. Kinases belonging to the second category were considered to be targets only if the compound had no targets in higher categories. To check whether the KISS predictions would be affected by extending the target space for FIMM drugs, we extracted high confidence targets from the KIBA database (Publication I), which combines quantitative drug-target bioactivity data (Kd, Ki and IC50) from several studies reported in the ChEMBL database (https://www.ebi.ac.uk/chembl/). For drug target mapping, we used a KIBA bioactivity score cut-off of <=2.7, which was equivalent to the log 10 of the highest Kd value in the qDT drug target interaction set.

4.8. Comparison of driver de-convolution methods (II) To systematically compare KISS with other target deconvolution methods, the kinase targets were ranked based on the Fisher’s test (Wei et al., 2012), Spearman’s correlation (Tran et al., 2014) and Tyner’s score (Tyner et al., 2013). As input for these methods we used drug-target data (as qDT, uDT or the union of both sets) and drug response data (defined as IC50 or DSS) in their differential and non-differential versions. In the Tyner’s score kinases were subdivided into 5 tiers depending on the magnitude of their Kd or IC50 value for a given inhibitor (Tyner et al., 2013). The most potent targets were assigned the lowest tier based on the lowest Kd or IC50. Similarly, the drugs tested in a given sample were subdivided into active and inactive categories based on a patient-specific IC50 threshold of 5-fold less than the cohort median IC50. The addition and subtraction of points from the effective and ineffective inhibitors, respectively, of a given kinase resulted in the final cumulative score for each kinase (Tyner et al., 2013). To calculate the p-value from the Fisher’s test, a contingency table containing the number of inactive and active inhibitors as well as inactive and active non-inhibitors was created for each kinase (Wei et al., 2012). The top 20% of the most effective drugs in a given cell line were considered to be active. Spearman’s correlation method ranked the kinases in

Page 30: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

30

each cell line based on the correlation between the kinase selectivity profile and the drug response.

To compare the kinase addiction rankings based on KISS, Spearman’s, Fisher’s and Tyner’s scores, together with IC50/DSS drug sensitivity quantification and a given target selectivity data types, we extracted the ranks of the known addictions in our cell line collection and calculated their median rank. Our set of positive controls consisted of ERBB2 in BT474, MDA-MB-435 and SKBR3, BRAF in DU4475, and PIK3CA in MDA-MB-435, ZR751 and MCF7 and was based on the data available in COSMIC database (as of June, 2012). In this evaluation, we included only those mutations that were shown to have an impact on the protein function in a given cell line based on the data from IntOGen database (as of March, 2013; http://beta.intogen.org/web/cell-lines).

4.9. Independent compound and siRNA testing to validate KISS predictions (II) Independent set of previously untested kinase inhibitors from Davis et al. (2011), as well as siRNA knockdowns were used to validate the top single KISS predictions. The experiments were carried out in selected cell lines predicted to be positive and negative for each kinase addiction. The compound sensitivity tests were performed as described above (Drug sensitivity testing) and DSS>=7 was considered to indicate a positive result. SiRNA experiments involved three sequences per kinase tested separately in three replicates. The siRNA inhibition score was calculated for each kinase by averaging two highest mean inhibition values from the replicates of each sequence. The scores >=35% were considered to indicate a positive validation.

The top co-essential kinase pairs and the corresponding synergistic drug combinations predicted by the combinatorial KISS were tested in HCC1937 cells. The drug combination experiments were performed as described above (Drug sensitivity testing). The co-essentiality of kinase pairs was validated by siRNA combination assays, in which all pairwise combinations of three siRNAs per each kinase were tested in a 4x4 matrix format using 8 nM of each siRNA. The first row and column of each matrix was occupied by single siRNAs for each kinase at 8 nM. Mean fluorescence of 16 positive and 24 negative control was used to calculate the percent inhibition values. To make the single and double kinase knock-down results comparable we normalized the single siRNA percent inhibition values by averaging the effects of each single kinase knock-down and those double kinase knock-downs that included this kinase with lower inhibition effects. The siRNA combination assay result was considered to be positive if it exceeded the effect of the maximum single siRNA knockdown.

4.10. Construction of addiction networks (II) Kinase addictions predicted by KISS to be significant in MDA-MB-436 and HDQP1 (p<0.06) were used as input to the GeneMania database (http://www.genemania.org) (Warde-Farley et al., 2010), which was then queried to extract their physical, genetic and pathway links using gene ontology (GO) biological process weighting method and allowing for maximum 20 and 10 bridging proteins in MDA-MB-436 and HDQP1, respectively. Enriched GO biological processes were calculated using GeneMania’s default settings.

Page 31: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

31

To construct CAL51 addiction network high confidence interactions, both physical and associations, in breast neoplasm for the top KISS predictions were extracted from HIPPIE database (http://cbdm.mdc-berlin.de/tools/hippie) (Schaefer et al., 2012). To limit the network complexity, only the interactors connected with at least three input proteins were retained. The resulting network was further connected using pathway and genetic links extracted from GeneMania (http://www.genemania.org) (Warde-Farley et al., 2010). For clarity, genetic interactions were filtered out unless they involved two proteins among the top KISS predictions. The network was visualized using Cytoscape v.3.0.1 (www.cytoscape.org; Shannon et al., 2003). Enrichment of gene ontology biological processes was analyzed using ClueGo v.2.8 Cytoscape plugin (Bindea et al., 2009), with minimum number and percentage of genes per term set to 4 and 2, respectively, and the Benjamini Hochberg method was used for the multiple comparison correction (Benjamini and Hochberg, 1995).

4.11. Generation of PME-1 interactome using MS/AP (III) PME-1 interacting proteins were identified using one step Strep tag affinity purification followed by mass spectrometry as previously described. The spectral data was automatically collected using Applied Biosystems Analyst QS 1.1 software. The proteins were identified by searching the SwissProt database (version 57.6) using our in-house Mascot (version 2.2.06) and p<0.05 as the significance threshold. Mascot search results were uploaded to Prohits platform (Liu et al., 2010) for further analysis using default settings and subsequently merged by applying the sum of spectral counts for all samples from the same biological replicate. Two biological PME-1 bait replicates and controls were used as input for SAINT express algorithm (Choi et al., 2011) in order to determine the reliability of detected protein interactions. For assessment of functional coupling between PME-1 and its identified interaction partners, we used the probabilistic scores from FunCoup database (Schmitt et al., 2012) derived from integrated evidence from all available species and data types.

4.12. Relevance rank platform (RRP) for predicting functional similarity (III) To identify the most functionally related interaction partners of PME-1 and PIN1, we developed a relevance rank platform (RRP), which combines evidence from siRNA screens and expression profiles into a single ranking of genes. RRP is itself based on three individual rankings (Figure 2). The first component of RRP was generated using siRNA perturbations in PC3 prostate cancer cells against PME-1 or PIN1 and their interacting partners. We tested three non-overlapping siRNAs for each gene. Reasoning that disrupting functionally related proteins should yield similar knockdown effects, the percent inhibition values for all interacting protein siRNAs were compared to the average percent inhibition caused by PIN1 or PME-1 siRNAs by calculating their absolute difference. The top rank was then assigned to the siRNAs with the lowest absolute difference. The second component of RRP involved testing the same siRNAs but in the PNT2 cell line representing the normal prostate cells. As in the first ranking, the absolute difference was determined but this time using the average % inhibition measured in the PNT2 cells, and again the siRNAs were ranked from the smallest absolute difference to the highest. The third component of RRP involved ranking of the siRNAs based on their expression similarity with PIN1 and PME-1 in prostate

Page 32: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

32

adenocarcinoma and healthy prostate tissue. For this purpose, we used expression profiles of clinical samples from the MediSapiens database (http://www.medisapiens.com/) (Kilpinen et al., 2008) and summed up the correlations of expression values between normal and prostate cancer tissues for PIN1/PME-1 and each of their interacting partners. The siRNA for the gene where this sum was highest was ranked at the top. Finally, to move from individual siRNAs to the final gene-level similarity ranks, we applied redundant siRNA activity (RSA) method (Konig et al., 2007), which takes into account potential off-target effects by considering the enrichment of ranks of the siRNAs targeting the same gene.

Figure 2. Overview of the steps comprising the relevance rank platform (RRP). Inh – percent inhibition, g – expression profile of a candidate gene, b – expression profile of the bait

Page 33: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

33

4.13. Identification of nitrosylated proteins (IV) The presence of posttranslational nitrosylation among the synaptosomal proteins in Alzheimer’s disease versus normal brain was studied using normal control mice (FVB) and age/sex-matched transgenic Alzheimer mice (hAPP) that express the human amyloid precursor protein with the London mutation [Tg(APPV717I)]. The brains from four hAPP Alzheimer and four FVB control mice (14-15 months old) were removed and homogenized following decapitation. The synaptosomal fraction was isolated through Ficoll-gradient centrifugation and then used for the subsequent biotin-switch MS identification of nitrosylated proteins. To detect the nitrosylation sites we used a SNO site identification (SNOSID) method - a biotin switch technique adapted for high-throughput proteomics thanks to improved enrichment efficiency (Hao et al., 2006). In this procedure, the nitrosylated cysteines (SNO-Cys) of the synaptosomal proteins were chemically altered to become biotinylated cysteines (biotin-SS-Cys) using Biotin-HPDP in the presence of sodium ascorbate (catalyst). After the substitution the proteins were added onto Neutravidin-beads which have very high affinity biotin. Thus the biotinylated synaptosomal proteins became strongly attached to the beads while other proteins were washed out. The attached biotinylated synaptosomal proteins were then digested using trypsin and washed once again so that only peptides containing biotin-SS-Cys remained. These peptides were then eluted using buffer containing DTT which reduces and breaks the disulphide-bridge between the biotin and the cysteine. The eluted peptides were loaded and separated on a Nano Aquity Liquid Chromatography system (Waters, Milford, MA) and then analyzed on a LTQ-FTICR mass spectrometer (Thermo Scientific). The SNO peptides were subjected to LC-MS to achieve label free quantitation and LC-MS/MS for identification.

4.14. Functional analysis of the proteomic data (IV) A network connecting the human orthologues of mouse differentially S-nitrosylated proteins with Amyloid Precursor Protein (APP) was created using physical, genetic, predicted, and pathway interactions from GeneMania database (Warde-Farley et al., 2010). Human orthologues of mouse genes were extracted using NCBI homologene (www.ncbi.nlm.nih.gov/homologene) and used as an input. Functional network analysis for differentially mouse S-nitrosylated proteins identified in FVB or hAPP mice was performed using ClueGOv1.4, a Cytoscape plug-in (http://www.ici.upmc.fr/cluego/) (Bindea et al., 2009), which applies GO/KEGG hierarchical characteristics for deciphering the enriched biological terms. For GO biological process term enrichment calculations, we created a separate reference set consisting of human orthologues of proteins measured by MS in mouse synaptosomes (Malinowska et al., submitted). This set was further enriched with non-redundant genes from two most comprehensive, expert- curated synaptic databases (SynsysNet; http://bioinformatics.charite.de/synsys/ (von Eichborn et al., 2013)) and SynaptomeDB; http://psychiatry.igm.jhmi.edu/SynaptomeDB/ (Pirooznia et al., 2012). The term enrichments of GO biological process functional categories were calculated using right-sided hypergeometric test.

Page 34: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

34

5. RESULTS

5.1. Comparative evaluation and integration of target selectivity profiles (I)

Identification of molecular vulnerabilities or synergistic target and drug combinations from drug response data is dependent on the knowledge of drug-target interactions. Since the strength of these interactions can be quantified using various bioactivity measures, we set out to compare and integrate three recent studies with different target selectivity profiling readouts, including Kd (Davis et al., 2011), Ki (Metz et al., 2011) and activity percentage (Anastassiadis et al., 2011). The main results from the comparative evaluations are described in the following.

5.1.1. Comparative analysis of the compound bioactivity studies We first evaluated the overlaps between the three target profiling datasets in terms of the tested compounds, kinases and the detected drug-target interactions. The studies shared on average 14 % of the profiled drug−target interactions, but the overlaps between the tested proteins and compounds were relatively small. Next, we compared the bioactivity types using the common set of drug-target pairs (Figure 2 in Publication I). We observed a significant correlation between the Davis Kd and the Metz Ki inhibition constants (Kendall rank correlation 0.5, p <10−15). The Anastassiadis activity data measuring the remaining enzyme activity showed less consistency with the Ki or Kd constants (Kendall correlation of 0.32 and 0.4, respectively).

5.1.2. Evaluation of the reliability of the compound bioactivity studies In the reliability assessment, the bioactivity values reported in the three studies were compared against the reference sets of drug-target interactions using 1000nM as a threshold for determining true positive interactions with high confidence (sensitivity: 0.85; hypergeometric test, p <10−15). Both the Metz and the Davis studies contained the majority of the high confidence PRIMARY and STITCH interactions. However, nearly 90% of the positive drug-target pairs identified in both studies were absent from the STITCH reference set at the time of this analysis. In the enrichment analysis, the Kd values showed better goodness of fit with both reference sets of positive interactions than the other bioactivity types (Publication I, Table 2). The enrichment of positive drug-target pairs in the lower end of the Kd spectrum supports the applicability of this readout for discriminating between the true positive and negative interactions.

To evaluate the reproducibility of the different bioactivity readouts, we compared the data from the three studies against the high confidence interactions reported in ChEMBL. The distributions of Kd values from Davis et al. and ChEMBL showed markedly better overall similarity as compared to Ki or activity values (KS test, p=0.66 vs p<0.01). All bioactivity readouts showed significant Kendall correlations with the ChEMBL data (p<0.001), but the consistency in the ranked activity values was nearly twice lower than in the other two bioactivity measures (Table 2 in Publication I).

Page 35: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

35

5.1.3. Integration of kinase inhibitor bioactivity data To integrate different bioactivity readouts for kinase inhibitor profiling, we focused on the ones that were most abundant in ChEMBL database: Kd, Ki and IC50. These readouts were available for 246 088 compound−kinase interactions, including 52 498 compound and 467 kinases. Calculating KIBA scores using the empirically determined parameters Li = 0.3 and Ld = 1.3 turned out to improve the consistency between all three bioactivity types based on the increase in their pairwise correlations. The integrated KIBA score was also utilized to construct a quantitative bioactivity matrix among kinase inhibitors and their potential targets, which can provide guidelines on classifying positive and negative interactions and thus contribute to improved reliability of the drug-target networks.

5.2. KISS for prediction of single and combinatorial cancer drivers (II)

In order to elucidate molecular vulnerabilities or functional addictions in cancer directly from drug response data, we developed a target addiction score that integrates drug sensitivity and resistance profiles with drug selectivity data. This approach enables the ranking of drug targets according to their probability of being essential for cancer progression. Our addiction scoring model is based on the assumption that the sensitivity of cancer cells to a given compound implies that some of its targets are important for cancer growth, whereas the lack of response indicates that the cancer cells are not addicted to the targets of the compound under study. Since we tested and evaluated our method using breast cancer cell lines screened with a panel of target annotated kinase inhibitors, we named it kinase inhibition sensitivity score (KISS), but this approach can be used for prioritization of any pharmacologically actionable drug targets.

Formally, KISS is defined as the average drug response of all inhibitors of a given target (Figure 3, left). Since the observed drug sensitivity can be mediated by addictions that consist of either single targets or a group of targets whose individual inhibition does not necessarily cause any effect, we extended our approach also to target pairs by means of a combinatorial KISS (Figure 3, right). This joint scoring method enables the prediction of such kinase pairs whose simultaneous inhibition results in cell death. To focus on the synthetic lethal type of interactions, a filtering step was introduced to retain only those target pairs where individual inhibition of only one target results in a considerably smaller increase in cell death in comparison to simultaneous inhibition of both members of the pair. The compounds inhibiting such co-essential target pairs were hypothesized to have synergistic effect.

Page 36: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

36

Figure 3. Schematic overview of the KISS approach. Drug selectivity and sensitivity data is combined for prediction of single and combinatorial target addictions. In the single KISS approach (left), kinase targets (k) are ranked based on the average drug response (DR) of their inhibitors (d) which reflects their essentiality for cancer cell survival. In the combinatorial KISS (right), the top-ranked kinase pairs whose simultaneous inhibition resulted in markedly higher cell death compared to the inhibition of each kinase individually were hypothesized to be co-essential and form synthetic lethal type interactions. Consequently, the combinations of strongest non-common inhibitors of each individual kinase from these target pairs were predicted to be synergistic.

5.2.1. Comparison of KISS with other driver deconvolution approaches In order to evaluate the KISS approach, we made several comparative analyses. First, we compared its performance to other target deconvolution methods, such as Spearman’s correlation (Tran et al., 2014), Tyner’s score (Tyner et al., 2013) and Fisher’s test (Wei et al., 2012). For this purpose, we used the drug sensitivity data for 21 breast cancer cell lines tested with a panel of anticancer compound available at FIMM (Gautam et al., 2016). Different types of target profiling data and drug response quantification metrics were used as an input for each deconvolution method resulting in 31 technically possible combinations of method, response metric and bioactivity types. Each of these combinations gave rise to a separate ranking of targets reflecting the likelihood of their importance for cancer. The median rank of

Page 37: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

37

the known breast cancer kinase drivers served as the basis method comparison. We found that KISS was the top performing approach for pinpointing the known cancer addictions, especially when calculated using quantitative drug target mapping and differential DSS. We did not however observe improvements in the predictive accuracy after combining unary and quantitative drug selectivity data, suggesting that also negative drug-target interactions have an important role in the driver predictions (Figure 2D in Publication II). Since KISS calculated on the basis of quantitative target annotations and differential DSS was found to be the best combination in the comparison, it was the version of choice for further evaluations and applications of the KISS model.

5.2.2. Comparison of KISS and transcriptomic profiles To explore the similarity of clusters originating from KISS and patient transcriptomic profiles, we used the genome-wide molecular profiling of 587 TNBC cases (Lehmann et al., 2011). The unsupervised clustering of the KISS profiles revealed 3 major groups of cell lines with diverse kinase signal addictions that closely resembled patient-derived TNBC subtypes (Figure 3 in Publication II). The KISS-based cluster solution correctly identified the basal-like and mesenchymal-like subgroups as well as even the smaller subgroups within basal-like subtype. We also observed higher similarity of KISS-based functional clustering to patient-derived classification compared to the one originating only from drug response profiles, as confirmed by the increased Rand cluster evaluation index (Figure 3B in Publication II). KISS rankings also provided additional information beyond the established cancer subtypes. In particular KISS profiles of MDA-MB-436 and DU-4475 did not cluster together with any other cell lines, suggesting that these two cell lines have relatively distinct sets of druggable kinase addictions.

5.2.3. KISS-predicted kinase addictions showed consistency between independent drug collections To test the influence of the compound collection on the KISS predictions, we applied KISS scoring to the targets of 295 kinase inhibitors from GlaxoSmithKline (GSK) (Knapp et al., 2013). The top functional predictions in the KISS rankings of the 87 targets shared between FIMM and GSK chemical library showed a clear overlap when applied to HER2 positive and ER negative cell lines (Figure 6A in Publication II). The significant association between the target addiction rankings obtained using the drug responses from two independent compound libraries (p<0.05, Fisher’s exact test) demonstrated the robustness of KISS-based scoring of signal addictions (Figure 6B in Publication II). The consistency of KISS rankings was also preserved after extending the target space of FIMM compounds using the KIBA database.

5.2.4. KISS-based prediction of pharmacologically actionable drivers in breast cancer cell lines In addition to the known oncogenic drivers, KISS approach suggested novel druggable addictions in heterogeneous breast cancer cell lines. We tested selected top KISS predictions

Page 38: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

38

from a subset of cell lines using siRNA-mediated kinase knockdowns and a set of previously untested drugs. Our validations confirmed the importance of AMPK –related kinases, such as ARK5 (NUAK1), SNARK (NUAK2), QSK and MARK3, in CAL51 cell line. In addition, CAMK2A, PKN1, RIOK2, PRP4 and SRPK3 were validated by observing the growth inhibition following their knockdown by siRNSs. Both siRNA and drug inhibition results were positive for ARK5, SANRK, MST1 as well as dasatinib targets: EPHA5, EPHB4, HCK and TXK. HDQP1 cell line was predicted to be strongly addicted to CSK, BMX, HCK and IRAK and to a lesser degree to ephrin receptor tyrosine kinases. Its addictions partially overlapped with CAL51 and Hs578 cell line addictions (Figure 4 in Publication II). In addition, several of the top predictions were previously shown to be implicated in cancer-related processes. The number of kinase predictions confirmed either in our validation experiments or in other studies provides proof-of-concept support for the approach and its applicability to finding novel cancer cell-specific target addictions.

5.2.5. Predicted signal addictions form cell line specific driver networks When exploring the connectivity of top KISS prediction in selected TNBC cell lines, we found that the top hits form well-connected signaling networks consisting of physical, genetic and pathway interactions, suggesting that the individually most essential kinase targets play a role in shared biological processes. Further, the gene ontology analysis revealed an enrichment of several immune function and inflammation-related terms in MDA-MB-436 addiction network, such as Toll-like receptor signaling, regulation of innate immune response and positive regulation of NF-kappa B transcription factor activity, which are known to be involved in many cancer phenotypes (Supplemental Figure S2 in Publication II). The Toll-like receptor signaling and regulation of innate immune response were also enriched in HDQP1 network in addition to ephrin receptor signaling and cell-substrate adhesion. The enrichment of proteins involved in Toll-like receptor signaling was also observed in DU4475, even though this cell line did not have common drivers with MDA-MB-436 and HDQP1. These findings are consistent with several recent studies showing the involvement of Toll-like receptor signaling pathway in breast tumor cell invasion, survival, and metastasis (Gonzalez-Reyes et al., 2010; Merrell et al., 2006; Yang et al., 2010). Since most of the validations involved CAL51 cell line, where multiple addictions were predicted, we took a closer look at their interaction network in the context of breast cancer. The network analysis revealed that the KISS-predicted kinases formed a highly connected sub-network with SYK as its main hub. Importantly, siRNA knockdown of a number of significant kinase addiction predictions in the SYK-interactome showed marked growth inhibition (Figure 5B in Publication II). The CAL51 addiction network was also highly enriched in several cancer-related gene ontology (GO) processes.

Page 39: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

39

5.2.6. Combinatorial KISS prediction of co-essential target pairs and synergistic drug combinations Since cancer cells may also be dependent on simultaneous inhibition of multiple target signals, we extended the KISS approach to predict also combinatorial molecular addictions, so-called druggable co-dependencies, in each cancer cell type individually. The predicted co-essential target pairs and synergistic compound combinations were experimentally tested in HCC1937 TNBC cell line using siRNA-mediated target knockdown and combinatorial drug response measurements in increasing inhibitor concentrations, respectively. In 9 out of 11 tested kinase pairs, we observed a co-essential effect, defined as higher growth inhibition following simultaneous silencing as compared to that caused by the knock down of each individual member of the pair (Figure 8C in Publication II). The joint drug responses from the predicted top combinations of the most potent inhibitors of the single kinases turned out to be synergistic under all drug interaction scoring models. Specifically, we manage to confirm the synergy of nintedanib with enzastaurin, bosutinib with pazopanib or foretinib, and dasatinib with axitinib (Figure 8A in Publication II). The kinase pairs predicted to underlie the synergistic effects of these compound combinations (EPHB6 with AURKC, GCK with TAOK3 or CDK7 as well as GSK3b with PCTK1) were also shown to be co-essential. Taken together, these results suggest that the combinatorial KISS approach can identify unexpected synergistic interactions between compounds, which are often due to synthetic lethal type interactions between the kinase targets of the individual inhibitors.

5.3. RRP analysis of PIN1 and PME-1 interactome (III)

Another way to find novel cancer drivers or signal addictions is to explore the interaction partners of known oncogenes, since the existence of the interaction suggests involvement in the same biological process. However, one of the major challenges in protein-protein interaction screens is the identification of functionally relevant interactions from their long list in particular when the functional annotations are missing. This motivated the development of a method that could advice, regardless of either the reliability of the mass spectrometry-based identification, or known biological function of the putative interactor, what are the candidate proteins from the protein interaction screens that most likely contribute to the function of the bait protein. We applied the RRP strategy to the interactomes of PIN1 and PME-1 derived from PINA, a database of PPI, and MS experiments, respectively.

5.3.1. Functional similarity ranking of PIN1 interactome PIN1 has a prognostic role in prostate cancer (Ayala et al., 2003) and promotes tumor development in immunodeficient mice (Ryo et al., 2005). In order to find proteins which are relevant for the oncogenic function of PIN1, we extracted its interaction partners from integrated protein interactome database PINA (Wu et al., 2009) and analyzed their functional

Page 40: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

40

similarity using RRP platform and PC3 prostate cancer cell line, where PIN1 enhances the colony growth and MYC expression. In the first ranking the 81 PIN1 interacting proteins were ordered according to their cell viability effects following the siRNA knowckdowns in PC3 prostate cancer cell line. The siRNAs whose cancer growth inhibition effects were most similar to those exerted by PIN1 were ranked at the top. In the second ranking we examined the similarity of cell viability effects of PIN1 and its interaction partners in immortalized prostate epithelium PNT2 cell line. Since functionally similar genes are usually co-expressed (Lee et al., 2004), in the third ranking we determined the similarity of expression profiles between PIN1 and its interactors in prostate cancer and healthy tissue based on the data from MediSapiens database (www.medisapiens.com) (Kilpinen et al., 2008). The average ranks for each siRNA were used as input for RSA analysis, which enabled the transition from individual siRNAs to the final gene-level similarity ranks (Figure 2).

The RRP analysis ranked the proteins interacting with PIN1 according to their functional relevance for the oncogenic function of PIN1. To validate the predictions we focused on the top 10 ranked proteins (Figure 2 in Publication III and Figure 4). Several of them have been validated previously as functionally relevant in Pin 1 biology or as interaction partners of PIN1. Specifically, Cdc27, which had the highest predicted functional similarity, is a known PIN1 target implicated in mitosis regulation (Shen et al., 1998). We also experimentally confirmed the novel functional association of CSKN2B and PTOV1 with PIN1 by showing their role in PIN1 target c-Jun phosphorylation. Further, we experimentally validated the interactions between PIN1 and DDX24, PRPF40A, FTSJ1, CSNK2B and PTOV1 in PC3 cell line. In addition, we compared the RRP results with the predictions from FunCoup database, which provides the probabilistic score reflecting the likelihood of functional association between proteins. The top RRP ranked proteins had a very high average FunCoup score of 0.91. Moreover, the rankings of the entire PIN1 interactome were significantly correlated (Spearman correlation 0.32, p=0.018), suggesting similar capacity of these two methods to rank the PIN1 interactors according to their functional similarity to PIN1, whenever the interaction partner was available in the FunCoup database. Since proteins may have different roles in different cellular context, the observed correlation could perhaps be even higher if FunCoup scores were possible to calculate solely based on the evidence from the same cell line, PC3. Taken together, our analysis demonstrates the potential of RRP to guide the selection of functionally relevant protein interactions from high-content PPI data for follow-up experiments.

Page 41: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

41

Figure 4. PIN1 interactome. Proteins interacting with PIN1 were extracted from PINA database and subjected to relevance ranking to determine their importance for the oncogenic function of PIN1. The intensity of the node color represents the RRP rank with the darker tones corresponding to top positions in the ranked list. The top 10 ranked proteins are marked by red node boarder and elliptical shape. Thickened edges represent both their high functional similarity with PIN1 predicted by FunCoup database (score >0.8) and experimental validation.

5.3.2. Functional similarity ranking of AP-MS-derived PME-1 interactome To further test the robustness of RRP to supervise the prioritization of interacting proteins most functionally relevant to the protein of interest, we applied our method to PME-1 interactome that has not been subjected to reliability filtering. PME-1 (PPME1) is a methylesterase that inhibits protein phosphatase 2A (PP2A) and thereby enhances the activity of ERK pathway in human glioblastoma (Puustinen et al., 2009). Its interactome containing 49 proteins was identified based on AP-MS approach using a Strep tag and one-step purification (see Materials and Methods for details). Importantly, the identified interaction partners were not filtered based on any criteria traditionally used to assess the reliability of AP-MS results other than the Mascot score significance threshold (p<0.05) in order to challenge the ability of RRP to discriminate between the relevant and non-relevant interactions. The known PME-1 effector proteins MAPK3 (ERK1) and AKT1 were added to the list of PME-1 interactors as internal controls. The RRP relevance ranking was performed in the same way as for the PIN1 interactome using the PC3 cells in which the PME-1 depletion lead to a marked growth inhibition. The final results presented in Figure 5 of Publication III demonstrate the potential of RRP to select functionally similar interactors for a

Page 42: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

42

given bait in a cellular context under study, since both of the known PME-1 effectors were among the top ranked proteins. None of the other top 10 ranked proteins have been previously functionally associated with PME-1. We therefore assessed the performance of RRP using publicly available reliability filtering tools, such as CRAPome and SAINT, and in house experiments for selected hits. Based on CRAPome database, except for chaperone protein HSPA8, the top ten ranked proteins were not among frequent contaminants in previous AP-MS experiments with Streptactin affinity column. SAINT analysis results also supported the reliability of the interactions predicted to be most relevant, with the exception of LRRC8E. Importantly, we also validated FDPS and HRNR as PME-1 interactors by antibody based detection from an independent Strep-PME-1 purification sample. Both FDPS and PME-1 have been shown to regulate the ERK MAPK pathway activity either downstream of RAS or upstream of MEK, respectively (Puustinen et al., 2009; Reilly et al., 2002). Although the detailed mechanism by which FDPS and HRNR affect PME-1 biology remain unknown, the high degree of genomic changes in HRNR and FDPS observed in various cancer types suggests their likely involvement in cancer-related processes. In addition we validated Lamin A/C as PME-1 interactor using in situ proximity ligation assay. The lack of publically available information on the identified PME interactors undermines the applicability of tools such as FunCoup for assessing the RRP predictions. Nevertheless, two of the top 10 ranked proteins: MAPK3 (the positive control) and EEF2 had a high EEF2 FunCoup score. Recent studies revealed that EEF2 is in fact a target protein of PP2A (McDermott et al., 2014; Yang et al., 2013). Taken together, our analysis indicates that RRP can be used to prioritize the interaction partners relevant for the biology of PME-1 and is especially useful for ranking of the interacting proteins for which there is limited prior knowledge. The combination of our method with reliability filtering tools such as SAINT or CRAPome is likely to provide the ranking of high confidence interactions that are relevant to the function of the bait in a given cellular context.

5.4. Functional analysis of S-nitrosylated proteins (IV)

Neurodegenerative disorders including Alzheimer's disease are characterized by impaired signalling in the synapse and alterations in S-nitrosylation were previously shown in animal AD models and patients. In order to elucidate the molecular mechanisms in the synapse, in which S-nitrosylation might play an important role, we performed functional comparison of nitrosylated proteins in the brain synaptosomes of Alzheimer’s disease mouse models and their healthy controls.

Synaptosomal proteins which were found to be S-nitrosylated in hPPP and FVB mice controls using SNOSID-LC-MS/MS assay were subjected to functional enrichment analyses of terms from Gene Ontology Biological Process (GOBP) and KEGG. The analysis was performed using ClueGO algorithm and a comprehensive “synaptic reference set” of 5,600 genes (see Material and Methods) in order to elucidate the molecular mechanisms in the synapse, in which S-nitrosylation might play an important role.

Page 43: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

43

A variety of protein classes were found to be modulated by S-nitrosylation within the synapse microenvironment in FVB wild type mice. Their functional GO BP analysis pinned down multiple statistically significantly enriched terms, such as generation of precursor metabolites (P=3.11-15), gluconeogenesis (P=7.41-08), synaptic transmission (P=1.22-4) and neurotransmitter transport (P=1.07-03), indicating a global character of this post-translational modification. The analysis of KEGG pathways revealed additional functional categories representing other metabolic processes associated with the normal physiology of neurons, including oxidative phosphorylation, TCA cycle and synaptic organization. Interestingly, 16 SNO-proteins from FVB mice (~12% of all identified mouse brain synaptosomal SNO-proteins formed a cluster of functionally grouped terms previously implicated in Alzheimer’s, Parkinson’s and Huntington’s diseases (Figure 5 in Publication IV).

In the Alzheimer disease mice model overexpressing the human APP we identified more S-nitrosylated proteins and S-nitrosylated sites in comparison to the FVB wild type controls, which suggests a putative role of this modification in the disease pathology. 38 out of 138 identified SNO-proteins and 108 out of 249 SNO-sites were found only in hAPP mice synaptosomes. In the GO BP enrichment analysis we identified two main functional clusters formed by these differentially S-nitrosylated proteins: one associated mostly with axon guidance and the other with gluconeogenesis/glycolysis as well as generation of precursor metabolites and energy. The latter subnetwork contained also terms related to APP neuronal trafficking such as exocytosis and vesicle mediated transport (Groemer, 2011; Vella, 2008). The results of KEGG pathway enrichment analysis further supported the involvement of the differentially SNO-proteins in the regulation of actin cytoskeleton and axon guidance (Figure 6 in Publication IV).

The human orthologues of differentially SNO-proteins were connected with APP using the pathway sharing information as well as physical, genetic and predicted interactions from GeneMania database. In the resulting highly- connected network of 59 nodes, 4 proteins (UBE2D3, NEGR1, NCAM1, GAPDH) were directly linked to APP. In addition ACTB (β-actin) has been shown to form a molecular complex with APP (Cottrell, 2005). The two most enriched biological processes in this network were axon guidance (P=2.74-04) and participation in cytoplasmic vesicle (P=1.17-03). They were shared by 10 and 9 proteins, respectively, including APP and 6 differentially SNO-proteins in each case. (Figure 6C in Publication IV). Western blotting also validated 10 proteins that were found to be differentially nitrosylated in hAPP synaptosomes (Figure 7 in Publication IV). The resulting disease network is likely to play an important role in AD pathology and can serve as the basis for further studies of molecular pathogenesis as well as for the development of new treatment strategies.

Page 44: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

44

6. DISCUSSION AND CONCLUSIONS

In this thesis we developed several computational-experimental strategies for high throughput functional assays that can contribute to deciphering disease driver networks and thus provide candidate therapeutic targets for further validations. The KISS-based approach integrates functional perturbation profiles from small molecule screening and compound-target annotations in order to systematically identify druggable single and combinatorial molecular vulnerabilities. Our comparative analysis revealed that KISS in combination with differential DSS metric and quantitative drug target selectivity profiles performed better than previously introduced target deconvolution approaches in predicting known kinase drivers (Tran et al, 2014; Tyner et al, 2013; Wei et al, 2012). When applied to compound screening of 21 heterogeneous breast cancer cell lines, KISS resulted in best average rank of our positive controls: ERBB2 and PIK3CA (Figure 2 in publication II). We also demonstrated how this model enabled the classification of selective kinase addiction patterns in agreement with the clinical subtypes of the disease. The cancer cell-specific networks identified a number of experimentally validated kinase addictions as well as combinatorial co-dependencies among drug target pairs (synthetic lethal type of interactions). Experimental validation using independent drug and siRNA combinations confirmed a number of drug synergy predictions, such as dasatinib with axitinib, bosutinib with foretinib or with pazopanib, which cannot be explained by the efficacy of the single compounds when used alone.

A conceptually similar approach to identifying functional cancer drivers was recently developed by Tyner et al (2013). To directly compare the performance of KISS against other methods on the same set of cell lines, we implemented the Tyner method in the same way as it was described in the original article, i.e., using non-differential IC50 as drug response metric and quantitative drug selectivity profiles from Davis et al (2011). The ranks of the known kinase drivers in the TNBC cells demonstrated that KISS improved pinpointing the known signal addictions in these cells (Figure 2 in publication II). This improvement is likely due to the fact that, unlike Tyner’s or Fisher’s methods, KISS does not require binarization of the drug response data into active and non-active compounds, using pre-defined and potentially biased cut-off thresholds. Rather, it can use the full spectrum of drug responses when de-convoluting the most essential kinase target signals from the quantitative drug response metrics such as DSS. Additionally, the usage of DSS contributes to the improved performance independently of other factors such as the deconvolution method or type of compound-target annotations. This is evident from the fact that all DSS-based versions of tested methods performed better in driver prediction than their IC50 equivalents. This finding is however expected, since DSS has been previously shown to provide a more accurate quantification of the compound response curve and be more informative in detecting selective responses (Yadav et al., 2014). Although we tried to make as systematic evaluation as possible, not all possible combinations of the deconvolution methods and drug selectivity data could be tested here due to technical limitations. For example, since there is no information available about non-targets or all tested drug-target pairs in the public drug databases, such as DrugBank, TTD or KEGG, the Fisher’s and correlation-based methods could not be applied to this type

Page 45: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

45

of unary drug-target interactions. By making the crude assumption that all or majority of missing values in the drug-target matrix are “non-targets”, these types of data could potentially be used with KISS. However, our rank analysis has shown that the unary drug-target data did not lead to improvements in the prediction results. The importance of using the quantitative target selectivity profiles that contain information of both experimentally validated as well as experimentally invalidated drug-target interactions was also supported by the comparative evaluations among various kinase inhibitor bioactivity resources in publication I.

Loss-of-function mutations of cancer suppressors are often very challenging to re-activate by chemical treatments, and synthetic lethal interactions have been suggested as a novel approach to target especially interaction partners of the driving tumor suppressors (Ashworth et al, 2011; Chan and Giaccia, 2011). However, due to both experimental and computational challenges, synthetic lethal interactions between drugs and/or genes have remained extremely difficult to identify or predict in a genotype-specific manner (Brough et al, 2011). By applying the combinatorial KISS approach to the kinase target pairs allowed us to predict the pairs whose joint inhibition leads to synergistic co-essential effects as compared to their individual inhibition. Even though there were no known loss-of-function mutations among the limited number of tested kinase pairs, it can be argued that such synthetic lethal type of interactions, or co-dependencies, can be identified using the KISS model; importantly, this concept goes beyond the current synthetic lethal paradigm as it is also applicable to interaction partners of such non-drivers that may still lead to increased combinatorial cancer killing effect. We tested this concept here by subsequent mapping of the top kinase pairs into drug pairs, consisting of inhibitors of each kinase alone, which led to a ranking of synergistic drug combinations. Since the kinase pairs jointly inhibited by the same drugs have equal scores from combinatorial KISS, both the kinase and drug combination rankings may have ties (i.e. there may be more than one kinase pair potentially explaining the efficacy of their common set of inhibitors). Therefore, it was not expected that all the top kinase or drug pairs would be synergistic, but rather that at least one drug or kinase pair per combinatorial KISS class showed synergistic effect. In our experimental validations, we indeed found one synergistic drug pair in each of the KISS classes (Figure 6 in publication II).

The relatively small number of inhibitors with comprehensive target annotations available for the KISS study resulted in the presence of ties in both single and combinatorial kinase rankings, whenever the kinase targets were individually or jointly inhibited by the same compounds. Some kinases being part of such highly-ranked ties may actually be false positives. However, these ties in the current rankings could be completely eliminated given a sufficient number of compounds with large-scale target selectivity profiles. On the other hand, false negative findings, i.e., low-ranked kinases that are important for cancer cell survival, can also be present in the current predictions due to relatively short assessment time of drug sensitivity (3 days), which may be insufficient to affect the viability readout. For instance, the inhibitors with target annotations in our collection could not elucidate the signal addictions in HCC1599, since none of the kinases had a positive differential KISS score in this cell line. Further, BL1 cell lines and other non-TNBC cells belonging to the same cluster had mostly

Page 46: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

46

negative KISS values due to their lack of sensitivity to any of the kinase inhibitors in our panel (Supplementary Figure in publication II). Since the KISS analysis had to be limited to a relatively small number of compounds and only a subset of the kinome, which may in turn lead to unequal representation of pathways relevant for cancer cell survival, we further examined the effect of extending the target space on the KISS rankings. For this purpose, we used our integrative collection of comprehensive biochemical selectivity profiles from publication I. Specifically, we extracted all the kinase targets with high confidence binding affinity score in the same range as in quantitative drug-target set. It was found out that KISS rankings remained largely unaffected by the additional target annotations (data not shown), pointing mainly to the same set of key molecular addictions. However, it is likely that increased number of kinase inhibitors with validated target spaces will lead to more accurate prediction of new cancer drivers using KISS and similar methodologies in the future.

We systematically tested the predictive power of both single and combinatorial KISS in breast cancer cell lines, in which a number of molecular dependencies have already been established, serving here as positive controls, and where in-depth evaluations of kinase target essentiality and synthetic lethal target interactions are possible via siRNA and drug combination experiments in vitro. Experiments with siRNA combinations offer a more direct way of validating the combinatorial KISS predictions, although the observed synergy upon joint siRNA inhibition may also be due to off-target effects or variable reagent sensitivity, known to hamper RNAi studies. Moreover, the functional consequences of loss-of-function siRNA perturbations are not equal to those from chemical perturbations (Weiss et al, 2007), since the loss of gene expression may affect protein complex formation, and thereby signal transduction and cellular phenotypes such as viability, differently than just blocking the activity of a target that may still interact and bridge some complexes. Nevertheless, the siRNA knockdowns gave complementary support to the co-essentiality of the kinase pairs predicted to be synergistic by the combinatorial KISS. As a future development, it might be useful to combine the chemical and RNAi-based perturbation experiments with genomic analyses to provide potentially improved identification of molecular addictions and co-dependencies, along the lines suggested in previous integrative studies (Gatza et al, 2014; Shiu et al, 2014; Sundaramurthy et al, 2014; Vizeacoumar et al, 2013; Zaman et al, 2013). It is noted, however, that systematic combinatorial silencing of exponentially increasing number of potential kinase pairs using combinatorial RNAi screens is currently experimentally not feasible in human cells. In model organisms, large scale drug combination screens have found that synergistic drug pairs are rare (4–10%); Interestingly, however, these often correspond to synergistic genetic interactions between the corresponding targets of the individual compounds (Cokol et al, 2011), hence suggesting an unbiased strategy to predict synergistic drug combinations. What has remained lacking, however, are systematic experimental-computational approaches, such as KISS, which enable predicting and prioritizing the most promising combinations for experimental validation. Such methods are necessary since experimental testing of all possible target or drug combinations is laborious and expensive even in automated high throughput setting.

Page 47: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

47

KISS can be applied in various cell lines or patient samples to provide pharmacologically actionable therapeutic targets and can contribute to the development of new personalized treatment strategies. As broad-scale drug sensitivity testing is increasingly used for profiling patient-derived cancer cells, KISS can benefit the translational applications by revealing oncogenic driver signals and suggesting combinatorial treatments targeting synthetic lethal type of interactions. When applied to drug sensitivity profiles of cancer patients derived at various stages of disease progression, it elucidated the network addictions not obvious from drug screens or genomic information (Pemovska et al, 2013). Further improvements in KISS-based target prioritization could arise from integration of functional and molecular profiles of patients, including for instance gene expression, genotypes or protein levels. Assessing the impact of effective small molecules on the disease networks by further molecular profiling of cells following drug perturbation could also contribute to improved understanding of the disease driving processes. In addition, a group of patient samples with the same drivers may be hypothesized to have a common underlying mutation or gene expression signature predictive of the drug response.

The predictive accuracy of disease driver deconvolution methods based on small molecule response profiles is highly dependent on the coverage and accuracy of target annotations. The databases with quantitative target annotations, such as ChEMBL, contain heterogeneous data obtained under various experimental settings. The extracted bioactivity readouts for a given compound-target interaction may be incomparable or difficult to evaluate in terms of their reliability without going into the experimental details. This in turn may not be feasible when analyzing large amounts of annotations. In addition, full information about experimental conditions in ChEMBL is often missing which limits the applicability of the known relations between the bioactivity types, such as the Cheng-Prusoff equation. The KIBA model presented in publication I facilitates the usage of this heterogeneous data by providing a summary score which maximizes the consistency between different metrics. Thus, KIBA provides a way for integration and fine-tuning of the compound-targets networks obtained from various compound bioactivity profiling assays and thereby facilitates better classification and translation of compound-target interactions. Expanding the knowledge of these interactions should in turn contribute to increased reliability of cancer addictions predicted by methods such as KISS. It would also facilitate testing more kinase combinations because in combinatorial KISS we could only test the pairs for which both common and single inhibitors were available. In addition, better coverage of compound target space facilitates understanding of the mechanisms of action of tested compounds, both in terms of their therapeutic and side effects. In the future KIBA-based target evaluation could be extended to incorporate other types of readouts, such as thermal stability shift assay (Fedorov et al., 2007) or chaperone interaction assay (Taipale et al., 2013).

As the proteomic technologies evolve and develop to provide better coverage and accuracy of PPI maps including the disease networks around the known molecular drivers, there is a growing need for methods that could help in the interpretation and functional translation of such data. In this thesis we focused on the challenge of prioritizing the interactions with high functional similarity to the disease driver of interest in a given cellular context. The RRP

Page 48: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

48

approach contributes to the functional translation of PPI networks by predicting the interactions relevant for a given biological question. RRP consists of modular functional and bioinformatic filters to rank the interaction partners according to their probability of contributing to the oncogenic function of the bait. RRP was applied to two cancer relevant proteins, PIN1 and PME-1, and turned out especially useful for the analysis of interactomes of baits for which there is limiter prior knowledge available, such as PME-1. For these networks the existing tools for functional similarity predictions, such as FunCoup which rely on the data in the public domain, turned out to be uninformative. The RRP-based rankings of interactors according to the predicted functional similarity to the bait of interest were enriched in the validated or known effector proteins for both PIN1 and PME-1. In addition, the RRP analysis of the MS-identified PME-1 interactome provides a possibility for better understanding of the role of this protein in human malignancies. Our evaluations have shown that the top ranked interactors in both rankings would have been ranked much lower based solely on the RNAi screening results, that was the method of choice in some previous studies (Friedman et al., 2011; Ramage et at., 2015). Unlike other methods typically used for filtering of high throughput protein interaction data, such as SAINT or CRAPome, RRP does not assess the reliability of the detected interactions, but focuses on addressing the functional similarity of the proteins. Therefore, combining RRP with the tools for selection of high confidence interactions offers a possibility to pinpoint the interaction partners which are both reliable and functionally relevant to the function of the bait in a given biological context.

Taken together, the experimental-computational approaches for the identification of disease driver networks facilitate better translation of the growing amount of biological data and development of novel treatment strategies. Further efforts are however needed to decipher the information flow across the dynamic and context-dependent disease networks, in addition to good coverage of PPIs and drug-target interactions. Integration of additional types of data, such as genomic sequence or time-series measurements of protein signaling under various conditions is therefore necessary to better understand the disease mechanisms and to design the perturbations that would result in therapeutic effects.

Page 49: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

49

7. ACKNOWLEDGEMENTS

The work presented here was carried out in the Computational Systems Medicine group at the Institute for Molecular Medicine Finland (FIMM) between 2011 and 2015. I would like to express my deepest gratitude to all the people who directly or indirectly contributed to its completion. In particular I wish to thank the group leader Prof. Tero Aittokallio for his guidance and supervision of this work. I appreciate that you always found time to comment my manuscripts, presentations, poster abstracts or grant applications. I hope at least some of your writing skills diffused into me. I sincerely thank my co-supervisor, Dr. Krister Wennerberg, for all the discussions and finding time in his overbooked calendar. Your knowledge of chemical biology was invaluable for this work. Thank you for all the insightful comments.

I wish to thank all my collaborators and co-authors of my publications. Your contributions were essential for this work. Many thanks to Dr. Prson Gautam, Dr. Leena Karhinen, Laura Turunen, Sawan Kumar Jha and Dr. Jani Saarela for generating data for my analysis and performing the experimental validations for the KISS paper. I’m also grateful to Karoliina Laamanen, Katja Suomi, Elina Huovari and Dr Carina von Schantz-Fant from the High Throughput Biology Unit at FIMM, for their technical assistance. Prof. Jukka Westermarck and Dr. Yuba Pokharel are greatly acknowledged for the collaboration in the RRP project. I warmly thank Dr. Maciej Lalowski for a fruitful collaboration and enjoyable discussions whether science related or not. Working with you was a pleasure and your passion for science is infectious! Thanks to Dr. Monika Zaręba-Kozioł and other people in Warsaw for the possibility to work with you on AD project. Dr. Enzo Scifo, thank you for the neuroscience collaboration. I appreciate I could take part in those projects, even though the results were not included in my thesis.

Prof. Harri Lähdesmäki and Prof. Tuula Nyman are acknowledged for being part of my thesis committee and providing valuable feedback on this work. I also thank Prof. Matti Nykter for accepting the invitation to act as my opponent and Prof. Samuli Ripatti for being the custos at my thesis defense. The reviewers of this work, Associate Professors Harri Lähdesmäki and Henri Xhaard, are greatly acknowledged for their comments and suggestions that improved the final thesis manuscript.

Many thanks to everyone working at FIMM during my PhD years, the former director Prof. Olli Kallioniemi and the administrative stuff, for providing the facilities and creating such a great place to work with friendly and supportive atmosphere. In particular, I warmly thank all the fellow PhD students at FIMM and members of the Aittokallio and Wennerberg groups, especially my office mates Dr. Bhagwan Yadav and Dr. Jing Tang for all your help, nice company and discussions over the years. I’m also grateful to Dr. Gretchen Repasky for her guidance as a research training coordinator.

Page 50: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

50

I also thank my former boss Prof. Jussi Taipale and my undergraduate study supervisor Dr. Minna Taipale for stimulating my interest in systems biology. I probably wouldn’t be here if you didn’t take me onboard.

I also wish to thank the Doctoral Program in Biomedicine, the Biomedicum Helsinki Foundation and Finnish Cancer Society for financial support.

Finally, I’m very thankful to my family and all my friends, especially to Roland, Marcela, Tony, Edua and Jan, for their encouragement, support and help at various stages of my stay in Finland and during the writing process.

Page 51: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

51

8. REFERENCES

Abbott, A. (2012). Cognition: The brain’s decline. Nature 492, S4–S5.

Adzhubei, I., Jordan, D.M., and Sunyaev, S.R. (2013). Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet Chapter 7, Unit7.20.

Aebersold, R., and Mann, M. (2003). Mass spectrometry-based proteomics. Nature 422, 198–207.

Akavia, U.D., Litvin, O., Kim, J., Sanchez-Garcia, F., Kotliar, D., Causton, H.C., Pochanard, P., Mozes, E., Garraway, L.A., and Pe’er, D. (2010). An integrated approach to uncover driv-ers of cancer. Cell 143, 1005–1017.

Albers, M., Kranz, H., Kober, I., Kaiser, C., Klink, M., Suckow, J., Kern, R., and Koegl, M. (2005). Automated yeast two-hybrid screening for nuclear receptor-interacting proteins. Mol. Cell Proteomics 4, 205–213.

Altelaar, A.F.M., Frese, C.K., Preisinger, C., Hennrich, M.L., Schram, A.W., Timmers, H.T.M., Heck, A.J.R., and Mohammed, S. (2013a). Benchmarking stable isotope labeling based quantitative proteomics. J Proteomics 88, 14–26.

Altelaar, A.F.M., Munoz, J., and Heck, A.J.R. (2013b). Next-generation proteomics: towards an integrative view of proteome dynamics. Nat Rev Genet 14, 35–48.

Anastassiadis, T., Deacon, S.W., Devarajan, K., Ma, H., and Peterson, J.R. (2011). Compre-hensive assay of kinase catalytic activity reveals features of kinase inhibitor selectivity. Nat. Biotechnol. 29, 1039–1045.

Anighoro, A., Bajorath, J., and Rastelli, G. (2014). Polypharmacology: challenges and oppor-tunities in drug discovery. J. Med. Chem. 57, 7874–7887.

Antolin, A.A., Workman, P., Mestres, J., and Al-Lazikani, B. (2016). Polypharmacology in Precision Oncology: Current Applications and Future Prospects. Curr. Pharm. Des. 22, 6935–6945.

Ashworth, A., Lord, C.J., and Reis-Filho, J.S. (2011). Genetic interactions in cancer progres-sion and treatment. Cell 145, 30–38.

Assouline, S., and Lipton, J.H. (2011). Monitoring response and resistance to treatment in chronic myeloid leukemia. Curr Oncol 18, e71–e83.

Ayala, G., Wang, D., Wulf, G., Frolov, A., Li, R., Sowadski, J., Wheeler, T.M., Lu, K.P., and Bao, L. (2003). The prolyl isomerase Pin1 is a novel prognostic marker in human prostate cancer. Cancer Res. 63, 6244–6251.

Ba-alawi Wail, Soufan, O., Essack, M., Kalnis, P., and Bajic, V.B. (2016). DASPfind: new efficient method to predict drug–target interactions. Journal of Cheminformatics 8, 15.

Di Bacco, A., Keeshan, K., McKenna, S.L., and Cotter, T.G. (2000). Molecular abnormalities in chronic myeloid leukemia: deregulation of cell growth and apoptosis. Oncologist 5, 405–415.

Badertscher, L., Wild, T., Montellese, C., Alexander, L.T., Bammert, L., Sarazova, M., Ste-bler, M., Csucs, G., Mayer, T.U., Zamboni, N., et al. (2015). Genome-wide RNAi Screening Identifies Protein Modules Required for 40S Subunit Synthesis in Human Cells. Cell Reports

Page 52: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

52

13, 2879–2891.

Bagyinszky, E., Youn, Y.C., An, S.S.A., and Kim, S. (2014). The genetics of Alzheimer’s disease. Clin Interv Aging 9, 535–551.

Banerjee, S., and Mazumdar, S. (2012). Electrospray Ionization Mass Spectrometry: A Tech-nique to Access the Information beyond the Molecular Weight of the Analyte. International Journal of Analytical Chemistry 2012, e282574.

Barabási, A.-L., Gulbahce, N., and Loscalzo, J. (2011). Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12, 56–68.

Barretina, J., Caponigro, G., Stransky, N., Venkatesan, K., Margolin, A.A., Kim, S., Wilson, C.J., Lehár, J., Kryukov, G.V., Sonkin, D., et al. (2012). The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological) 57, 289–300.

Berenbaum, M.C. (1989). What is synergy? Pharmacol. Rev. 41, 93–141.

Bindea, G., Mlecnik, B., Hackl, H., Charoentong, P., Tosolini, M., Kirilovsky, A., Fridman, W.-H., Pagès, F., Trajanoski, Z., and Galon, J. (2009). ClueGO: a Cytoscape plug-in to deci-pher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25, 1091–1093.

Bliss, C.I. (1939). The Toxicity of Poisons Applied Jointly1. Annals of Applied Biology 26, 585–615.

Bouwmeester, T., Bauch, A., Ruffner, H., Angrand, P.-O., Bergamini, G., Croughton, K., Cruciat, C., Eberhard, D., Gagneur, J., Ghidelli, S., et al. (2004). A physical and functional map of the human TNF-alpha/NF-kappa B signal transduction pathway. Nat. Cell Biol. 6, 97–105.

Bozic, I., and Nowak, M.A. (2014). Timing and heterogeneity of mutations associated with drug resistance in metastatic cancers. Proc. Natl. Acad. Sci. U.S.A. 111, 15964–15968.

Bozic, I., Antal, T., Ohtsuki, H., Carter, H., Kim, D., Chen, S., Karchin, R., Kinzler, K.W., Vogelstein, B., and Nowak, M.A. (2010). Accumulation of driver and passenger mutations during tumor progression. Proc. Natl. Acad. Sci. U.S.A. 107, 18545–18550.

Bradner, J.E., Hnisz, D., and Young, R.A. (2017). Transcriptional Addiction in Cancer. Cell 168, 629–643.

Bredt, D.S., Glatt, C.E., Hwang, P.M., Fotuhi, M., Dawson, T.M., and Snyder, S.H. (1991). Nitric oxide synthase protein and mRNA are discretely localized in neuronal populations of the mammalian CNS together with NADPH diaphorase. Neuron 7, 615–624.

Bromberg, Y., and Rost, B. (2007). SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 35, 3823–3835.

Brough, R., Frankum, J.R., Sims, D., Mackay, A., Mendes-Pereira, A.M., Bajrami, I., Costa-Cabral, S., Rafiq, R., Ahmad, A.S., Cerone, M.A., et al. (2011). Functional Viability Profiles of Breast Cancer. Cancer Discov 1, 260–273.

Burlingham, B.T., and Widlanski, T.S. (2003). An Intuitive Look at the Relationship of Ki and IC50: A More General Use for the Dixon Plot. J. Chem. Educ. 80, 214.

Butland, G., Peregrín-Alvarez, J.M., Li, J., Yang, W., Yang, X., Canadien, V., Starostine, A.,

Page 53: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

53

Richards, D., Beattie, B., Krogan, N., et al. (2005). Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 433, 531–537.

Campbell, J., Ryan, C.J., Brough, R., Bajrami, I., Pemberton, H.N., Chong, I.Y., Costa-Cabral, S., Frankum, J., Gulati, A., Holme, H., et al. (2016). Large-Scale Profiling of Kinase Dependencies in Cancer Cell Lines. Cell Rep 14, 2490–2501.

Campbell, J.M., Lockwood, W.W., Buys, T.P.H., Chari, R., Coe, B.P., Lam, S., and Lam, W.L. (2008). Integrative genomic and gene expression analysis of chromosome 7 identified novel oncogene loci in non-small cell lung cancer. Genome 51, 1032–1039.

Campbell, P.J., Stephens, P.J., Pleasance, E.D., O’Meara, S., Li, H., Santarius, T., Stebbings, L.A., Leroy, C., Edkins, S., Hardy, C., et al. (2008). Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat Genet 40, 722–729.

Carter, H., Chen, S., Isik, L., Tyekucheva, S., Velculescu, V.E., Kinzler, K.W., Vogelstein, B., and Karchin, R. (2009). Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res. 69, 6660–6667.

Chan, D.A., and Giaccia, A.J. (2011). Harnessing synthetic lethal interactions in anticancer drug discovery. Nat Rev Drug Discov 10, 351–364.

Chartier-Harlin, M.C., Crawford, F., Houlden, H., Warren, A., Hughes, D., Fidani, L., Goate, A., Rossor, M., Roques, P., and Hardy, J. (1991). Early-onset Alzheimer’s disease caused by mutations at codon 717 of the beta-amyloid precursor protein gene. Nature 353, 844–846.

Chen, H., and Zhang, Z. (2013). A Semi-Supervised Method for Drug-Target Interaction Pre-diction with Consistency in Networks. PLOS ONE 8, e62975.

Chen, D.Y., Liu, H., Takeda, S., Tu, H.-C., Sasagawa, S., Van Tine, B.A., Lu, D., Cheng, E.H.-Y., and Hsieh, J.J.-D. (2010). Taspase1 functions as a non-oncogene addiction protease that coordinates cancer cell proliferation and apoptosis. Cancer Res. 70, 5358–5367.

Chen, X., Ji, Z.L., and Chen, Y.Z. (2002). TTD: Therapeutic Target Database. Nucleic Acids Res. 30, 412–415.

Cheng, Y., and Prusoff, W.H. (1973). Relationship between the inhibition constant (K1) and the concentration of inhibitor which causes 50 per cent inhibition (I50) of an enzymatic reac-tion. Biochem. Pharmacol. 22, 3099–3108.

Cheung, H.W., Cowley, G.S., Weir, B.A., Boehm, J.S., Rusin, S., Scott, J.A., East, A., Ali, L.D., Lizotte, P.H., Wong, T.C., et al. (2011). Systematic investigation of genetic vulnerabili-ties across cancer cell lines reveals lineage-specific dependencies in ovarian cancer. PNAS 108, 12372–12377.

Cho, S., Park, S.G., Lee, D.H., and Park, B.C. (2004). Protein-protein interaction networks: from interactions to networks. J. Biochem. Mol. Biol. 37, 45–52.

Choi, H., Larsen, B., Lin, Z.-Y., Breitkreutz, A., Mellacheruvu, D., Fermin, D., Qin, Z.S., Tyers, M., Gingras, A.-C., and Nesvizhskii, A.I. (2011). SAINT: probabilistic scoring of af-finity purification-mass spectrometry data. Nat. Methods 8, 70–73.

Cieśla, J., Frączyk, T., and Rode, W. (2011). Phosphorylation of basic amino acid residues in proteins: important but easily missed. Acta Biochim. Pol. 58, 137–148.

Cokol, M., Chua, H.N., Tasan, M., Mutlu, B., Weinstein, Z.B., Suzuki, Y., Nergiz, M.E., Cos-tanzo, M., Baryshnikova, A., Giaever, G., et al. (2011). Systematic exploration of synergistic drug pairs. Mol. Syst. Biol. 7, 544.

Page 54: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

54

Cottrell, B.A., Galvan, V., Banwait, S., Gorostiza, O., Lombardo, C.R., Williams, T., Schil-ling, B., Peel, A., Gibson, B., Koo, E.H., et al. (2005). A pilot proteomic study of amyloid precursor interactors in Alzheimer’s disease. Ann. Neurol. 58, 277–289.

Cowley, M.J., Pinese, M., Kassahn, K.S., Waddell, N., Pearson, J.V., Grimmond, S.M., Bian-kin, A.V., Hautaniemi, S., and Wu, J. (2012). PINA v2.0: mining interactome modules. Nu-cleic Acids Res. 40, D862–D865.

Cox, A.D., Fesik, S.W., Kimmelman, A.C., Luo, J., and Der, C.J. (2014). Drugging the undruggable Ras: mission possible? Nat Rev Drug Discov 13, 828–851.

Davis, M.I., Hunt, J.P., Herrgard, S., Ciceri, P., Wodicka, L.M., Pallares, G., Hocker, M., Treiber, D.K., and Zarrinkar, P.P. (2011). Comprehensive analysis of kinase inhibitor selec-tivity. Nat Biotech 29, 1046–1051.

Dayon, L., and Sanchez, J.-C. (2012). Relative protein quantification by MS/MS using the tandem mass tag technology. Methods Mol. Biol. 893, 115–127.

Dayon, L., Hainard, A., Licker, V., Turck, N., Kuhn, K., Hochstrasser, D.F., Burkhard, P.R., and Sanchez, J.-C. (2008). Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal. Chem. 80, 2921–2931.

Dedes, K.J., Wilkerson, P.M., Wetterskog, D., Weigelt, B., Ashworth, A., and Reis-Filho, J.S. (2011). Synthetic lethality of PARP inhibition in cancers lacking BRCA1 and BRCA2 muta-tions. Cell Cycle 10, 1192–1199.

Dutta, B., Pusztai, L., Qi, Y., André, F., Lazar, V., Bianchini, G., Ueno, N., Agarwal, R., Wang, B., Shiang, C.Y., et al. (2012). A network-based, integrative study to identify core bio-logical pathways that drive breast cancer clinical subtypes. Br J Cancer 106, 1107–1116.

Early Breast Cancer Trialists’ Collaborative Group (EBCTCG), Davies, C., Godwin, J., Gray, R., Clarke, M., Cutter, D., Darby, S., McGale, P., Pan, H.C., Taylor, C., et al. (2011). Rele-vance of breast cancer hormone receptors and other factors to the efficacy of adjuvant tamoxi-fen: patient-level meta-analysis of randomised trials. Lancet 378, 771–784.

Von Eichborn, J., Dunkel, M., Gohlke, B.O., Preissner, S.C., Hoffmann, M.F., Bauer, J.M.J., Armstrong, J.D., Schaefer, M.H., Andrade-Navarro, M.A., Le Novere, N., et al. (2013). Syn-SysNet: integration of experimental data on synaptic protein-protein interactions with drug-target relations. Nucleic Acids Res. 41, D834–D840.

El-Aneed, A., Cohen, A., and Banoub, J. (2009). Mass Spectrometry, Review of the Basics: Electrospray, MALDI, and Commonly Used Mass Analyzers. Applied Spectroscopy Reviews 44, 210–230.

Erler, J.T., and Linding, R. (2010). Network-based drugs and biomarkers. J. Pathol. 220, 290–296.

Fedorov, O., Marsden, B., Pogacic, V., Rellos, P., Müller, S., Bullock, A.N., Schwaller, J., Sundström, M., and Knapp, S. (2007). A systematic interaction map of validated kinase inhib-itors with Ser/Thr kinases. Proc Natl Acad Sci U S A 104, 20523–20528.

Forbes, S.A., Bhamra, G., Bamford, S., Dawson, E., Kok, C., Clements, J., Menzies, A., Teague, J.W., Futreal, P.A., and Stratton, M.R. (2008). The Catalogue of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet CHAPTER, Unit – 10.11.

Ford, D., Easton, D.F., Stratton, M., Narod, S., Goldgar, D., Devilee, P., Bishop, D.T., Weber, B., Lenoir, G., Chang-Claude, J., et al. (1998). Genetic heterogeneity and penetrance analysis of the BRCA1 and BRCA2 genes in breast cancer families. The Breast Cancer Linkage Con-

Page 55: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

55

sortium. Am. J. Hum. Genet. 62, 676–689.

Foster, M.W., Hess, D.T., and Stamler, J.S. (2009). Protein S-nitrosylation in health and dis-ease: a current perspective. Trends Mol Med 15, 391–404.

Foulkes, W.D., Smith, I.E., and Reis-Filho, J.S. (2010). Triple-negative breast cancer. N. Engl. J. Med. 363, 1938–1948.

Freije, J.M.P., Fraile, J.M., and Lopez-Otin, C. (2011). Protease Addiction and Synthetic Le-thality in Cancer. Front. Oncol. 1.

Friedman, A.A., Tucker, G., Singh, R., Yan, D., Vinayagam, A., Hu, Y., Binari, R., Hong, P., Sun, X., Porto, M., et al. (2011). Proteomic and Functional Genomic Landscape of Receptor Tyrosine Kinase and Ras to Extracellular Signal–Regulated Kinase Signaling. Sci Signal 4, rs10.

Garnett, M.J., Edelman, E.J., Heidorn, S.J., Greenman, C.D., Dastur, A., Lau, K.W., Greninger, P., Thompson, I.R., Luo, X., Soares, J., et al. (2012). Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483, 570–575.

Garraway, L.A., and Lander, E.S. (2013). Lessons from the cancer genome. Cell 153, 17–37.

Gatza, M.L., Silva, G.O., Parker, J.S., Fan, C., and Perou, C.M. (2014). An integrated ge-nomics approach identifies drivers of proliferation in luminal-subtype human breast cancer. Nat. Genet. 46, 1051–1059.

Gaulton, A., Bellis, L.J., Bento, A.P., Chambers, J., Davies, M., Hersey, A., Light, Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., et al. (2012). ChEMBL: a large-scale bio-activity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107.

Gautam, P., Karhinen, L., Szwajda, A., Jha, S.K., Yadav, B., Aittokallio, T., and Wennerberg, K. (2016). Identification of selective cytotoxic and synthetic lethal drug responses in triple negative breast cancer cells. Mol. Cancer 15, 34.

Gavin, A.-C., Aloy, P., Grandi, P., Krause, R., Boesche, M., Marzioch, M., Rau, C., Jensen, L.J., Bastuck, S., Dümpelfeld, B., et al. (2006). Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631–636.

Gfeller, D., Michielin, O., and Zoete, V. (2013). Shaping the interaction landscape of bioac-tive molecules. Bioinformatics 29, 3073–3079.

Gingras, A.-C., Gstaiger, M., Raught, B., and Aebersold, R. (2007). Analysis of protein com-plexes using mass spectrometry. Nat. Rev. Mol. Cell Biol. 8, 645–654.

Gnad, F., Baucom, A., Mukhyala, K., Manning, G., and Zhang, Z. (2013). Assessment of computational methods for predicting the effects of missense mutations in human cancers. BMC Genomics 14 Suppl 3, S7.

Gonzalez-Perez, A., and Lopez-Bigas, N. (2012). Functional impact bias reveals cancer driv-ers. Nucleic Acids Res. 40, e169.

Gonzalez-Perez, A., Deu-Pons, J., and Lopez-Bigas, N. (2012). Improving the prediction of the functional impact of cancer mutations by baseline tolerance transformation. Genome Med 4, 89.

González-Reyes, S., Marín, L., González, L., González, L.O., del Casar, J.M., Lamelas, M.L., González-Quintana, J.M., and Vizoso, F.J. (2010). Study of TLR3, TLR4 and TLR9 in breast carcinomas and their association with metastasis. BMC Cancer 10, 665.

Goswami, C.P., Cheng, L., Alexander, P., Singal, A., and Li, L. (2015). A New Drug Combi-

Page 56: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

56

natory Effect Prediction Algorithm on the Cancer Cell Based on Gene Expression and Dose–Response Curve. CPT Pharmacometrics Syst Pharmacol 4.

Greco, W.R., Park, H.S., and Rustum, Y.M. (1990). Application of a new approach for the quantitation of drug synergism to the combination of cis-diamminedichloroplatinum and 1-beta-D-arabinofuranosylcytosine. Cancer Res. 50, 5318–5327.

Grigoriadis, A., Mackay, A., Noel, E., Wu, P.J., Natrajan, R., Frankum, J., Reis-Filho, J.S., and Tutt, A. (2012). Molecular characterisation of cell line models for triple-negative breast cancers. BMC Genomics 13, 619.

Groemer, T.W., Thiel, C.S., Holt, M., Riedel, D., Hua, Y., Hüve, J., Wilhelm, B.G., and Klingauf, J. (2011). Amyloid precursor protein is trafficked and secreted via synaptic vesicles. PLoS ONE 6, e18754.

Han, X., Aslanian, A., and Yates, J.R. (2008). Mass Spectrometry for Proteomics. Curr Opin Chem Biol 12, 483–490.

Hao, G., Derakhshan, B., Shi, L., Campagne, F., and Gross, S.S. (2006). SNOSID, a proteo-mic method for identification of cysteine S-nitrosylation sites in complex protein mixtures. Proc. Natl. Acad. Sci. U.S.A. 103, 1012–1017.

Hart, T., Chandrashekhar, M., Aregger, M., Steinhart, Z., Brown, K.R., MacLeod, G., Mis, M., Zimmermann, M., Fradet-Turcotte, A., Sun, S., et al. (2015). High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities. Cell 163, 1515–1526.

Heiser, L.M., Sadanandam, A., Kuo, W.-L., Benz, S.C., Goldstein, T.C., Ng, S., Gibb, W.J., Wang, N.J., Ziyad, S., Tong, F., et al. (2012). Subtype and pathway specific responses to anti-cancer compounds in breast cancer. Proc. Natl. Acad. Sci. U.S.A. 109, 2724–2729.

Hess, D.T., Matsumoto, A., Kim, S.-O., Marshall, H.E., and Stamler, J.S. (2005). Protein S-nitrosylation: purview and parameters. Nat. Rev. Mol. Cell Biol. 6, 150–166.

Hopkins, A.L. (2008). Network pharmacology: the next paradigm in drug discovery. Nat. Chem. Biol. 4, 682–690.

Hu, P., Janga, S.C., Babu, M., Díaz-Mejía, J.J., Butland, G., Yang, W., Pogoutse, O., Guo, X., Phanse, S., Wong, P., et al. (2009). Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol. 7, e96.

Huan, T., Meng, Q., Saleh, M.A., Norlander, A.E., Joehanes, R., Zhu, J., Chen, B.H., Zhang, B., Johnson, A.D., Ying, S., et al. (2015). Integrative network analysis reveals molecular mechanisms of blood pressure regulation. Molecular Systems Biology 11, 799.

Huang, S. (2004). Back to the biology in systems biology: what can we learn from biomolecu-lar networks? Brief Funct Genomic Proteomic 2, 279–297.

Huang, S., and Ingber, D.E. (2000). Shape-dependent control of cell growth, differentiation, and apoptosis: switching between attractors in cell regulatory networks. Exp. Cell Res. 261, 91–103.

Huang, L., Li, F., Sheng, J., Xia, X., Ma, J., Zhan, M., and Wong, S.T.C. (2014). DrugCom-boRanker: drug combination discovery based on target network analysis. Bioinformatics 30, i228–i236.

Huynen, M., Snel, B., Lathe, W., and Bork, P. (2000). Predicting Protein Function by Ge-nomic Context: Quantitative Evaluation and Qualitative Inferences. Genome Res 10, 1204–1210.

Page 57: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

57

Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001). A comprehen-sive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. U.S.A. 98, 4569–4574.

Jerby-Arnon, L., Pfetzer, N., Waldman, Y.Y., McGarry, L., James, D., Shanks, E., Seashore-Ludlow, B., Weinstock, A., Geiger, T., Clemons, P.A., et al. (2014). Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell 158, 1199–1209.

Jeronimo, C., Forget, D., Bouchard, A., Li, Q., Chua, G., Poitras, C., Thérien, C., Bergeron, D., Bourassa, S., Greenblatt, J., et al. (2007). Systematic analysis of the protein interaction network for the human transcription machinery reveals the identity of the 7SK capping en-zyme. Mol. Cell 27, 262–274.

Johnson, L.N. (2009). The regulation of protein phosphorylation. Biochem. Soc. Trans. 37, 627–641.

Johnson, L.N., Noble, M.E.M., and Owen, D.J. (1996). Active and Inactive Protein Kinases: Structural Basis for Regulation. Cell 85, 149–158.

Jokinen, E., Laurila, N., Koivunen, P., and Koivunen, J.P. (2014). Combining targeted drugs to overcome and prevent resistance of solid cancers with some stem-like cell features. Onco-target 5, 9295–9307.

Jones, S., Zhang, X., Parsons, D.W., Lin, J.C.-H., Leary, R.J., Angenendt, P., Mankoo, P., Carter, H., Kamiyama, H., Jimeno, A., et al. (2008). Core Signaling Pathways in Human Pan-creatic Cancers Revealed by Global Genomic Analyses. Science 321, 1801–1806.

Jørgensen, C., and Linding, R. (2010). Simplistic pathways or complex networks? Curr. Opin. Genet. Dev. 20, 15–22.

Kanehisa, M., Goto, S., Sato, Y., Furumichi, M., and Tanabe, M. (2012). KEGG for integra-tion and interpretation of large-scale molecular data sets. Nucleic Acids Res. 40, D109–D114.

Kilpinen, S., Autio, R., Ojala, K., Iljin, K., Bucher, E., Sara, H., Pisto, T., Saarela, M., Skotheim, R.I., Björkman, M., et al. (2008). Systematic bioinformatic analysis of expression levels of 17,330 human genes across 9,783 samples from 175 types of healthy and pathologi-cal tissues. Genome Biol. 9, R139.

Kim, M.-S., Pinto, S.M., Getnet, D., Nirujogi, R.S., Manda, S.S., Chaerkady, R., Madugundu, A.K., Kelkar, D.S., Isserlin, R., Jain, S., et al. (2014). A draft map of the human proteome. Nature 509, 575–581.

Knapp, S., Arruda, P., Blagg, J., Burley, S., Drewry, D.H., Edwards, A., Fabbro, D., Gilles-pie, P., Gray, N.S., Kuster, B., et al. (2013). A public-private partnership to unlock the untar-geted kinome. Nat. Chem. Biol. 9, 3–6.

Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K., Mak, C., Neveu, V., et al. (2011). DrugBank 3.0: a comprehensive resource for “omics” research on drugs. Nucleic Acids Res. 39, D1035–D1041.

Köcher, T., and Superti-Furga, G. (2007). Mass spectrometry–based functional proteomics: from molecular machines to protein networks. Nat Meth 4, 807–815.

Komarova, N.L., and Thalhauser, C.J. (2011). Calculating stage duration statistics in multi-stage diseases. PLoS ONE 6, e28298.

König, R., Chiang, C., Tu, B.P., Yan, S.F., DeJesus, P.D., Romero, A., Bergauer, T., Orth, A., Krueger, U., Zhou, Y., et al. (2007). A probability-based approach for the analysis of large-scale RNAi screens. Nat Meth 4, 847–849.

Page 58: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

58

Korbel, J.O., Jensen, L.J., von Mering, C., and Bork, P. (2004a). Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat. Biotechnol. 22, 911–917.

Korbel, J.O., Jensen, L.J., von Mering, C., and Bork, P. (2004). Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat. Biotechnol. 22, 911–917.

Krogan, N.J., Cagney, G., Yu, H., Zhong, G., Guo, X., Ignatchenko, A., Li, J., Pu, S., Datta, N., Tikuisis, A.P., et al. (2006). Global landscape of protein complexes in the yeast Saccha-romyces cerevisiae. Nature 440, 637–643.

Kuhn, M., Szklarczyk, D., Franceschini, A., von Mering, C., Jensen, L.J., and Bork, P. (2012). STITCH 3: zooming in on protein-chemical interactions. Nucleic Acids Res. 40, D876–D880.

Lage, K. (2014). Protein–protein interactions and genetic diseases: The interactome. Bio-chimica et Biophysica Acta (BBA) - Molecular Basis of Disease 1842, 1971–1980.

Lawrence, M.S., Stojanov, P., Polak, P., Kryukov, G.V., Cibulskis, K., Sivachenko, A., Carter, S.L., Stewart, C., Mermel, C.H., Roberts, S.A., et al. (2013). Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218.

Lawrence, R.T., Perez, E.M., Hernández, D., Miller, C.P., Haas, K.M., Irie, H.Y., Lee, S.-I., Blau, C.A., and Villén, J. (2015). The proteomic landscape of triple-negative breast cancer. Cell Rep 11, 630–644.

Al-Lazikani, B., Banerji, U., and Workman, P. (2012). Combinatorial drug therapy for cancer in the post-genomic era. Nat Biotech 30, 679–692.

Lee, J.A., and Berg, E.L. (2013). Neoclassic drug discovery: the case for lead generation us-ing phenotypic and functional approaches. J Biomol Screen 18, 1143–1155.

Lee, J., and Bogyo, M. (2013). Target deconvolution techniques in modern phenotypic profil-ing. Curr Opin Chem Biol 17, 118–126.

Lee, H.K., Hsu, A.K., Sajdak, J., Qin, J., and Pavlidis, P. (2004). Coexpression analysis of human genes across many microarray data sets. Genome Res. 14, 1085–1094.

Lehmann, B.D., Bauer, J.A., Chen, X., Sanders, M.E., Chakravarthy, A.B., Shyr, Y., and Pie-tenpol, J.A. (2011). Identification of human triple-negative breast cancer subtypes and pre-clinical models for selection of targeted therapies. J. Clin. Invest. 121, 2750–2767.

Li, Y. (2011). The tandem affinity purification technology: an overview. Biotechnol. Lett. 33, 1487–1499.

Li, M., Zeng, T., Liu, R., and Chen, L. (2014). Detecting tissue-specific early warning signals for complex diseases based on dynamical network biomarkers: study of type 2 diabetes by cross-tissue analysis. Brief. Bioinformatics 15, 229–243.

Li, S., Armstrong, C.M., Bertin, N., Ge, H., Milstein, S., Boxem, M., Vidalain, P.-O., Han, J.-D.J., Chesneau, A., Hao, T., et al. (2004). A map of the interactome network of the metazoan C. elegans. Science 303, 540–543.

Linding, R., Jensen, L.J., Ostheimer, G.J., van Vugt, M.A.T.M., Jørgensen, C., Miron, I.M., Diella, F., Colwill, K., Taylor, L., Elder, K., et al. (2007). Systematic discovery of in vivo phosphorylation networks. Cell 129, 1415–1426.

Liu, H. (2014). Application of immunohistochemistry in breast pathology: a review and up-

Page 59: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

59

date. Arch. Pathol. Lab. Med. 138, 1629–1642.

Liu, G., Zhang, J., Larsen, B., Stark, C., Breitkreutz, A., Lin, Z.-Y., Breitkreutz, B.-J., Ding, Y., Colwill, K., Pasculescu, A., et al. (2010). ProHits: integrated software for mass spectrome-try-based interaction proteomics. Nat. Biotechnol. 28, 1015–1017.

Loewe, S. (1953). The problem of synergism and antagonism of combined drugs. Arzneimit-telforschung 3, 285–290.

Lounkine, E., Keiser, M.J., Whitebread, S., Mikhailov, D., Hamon, J., Jenkins, J.L., Lavan, P., Weber, E., Doak, A.K., Côté, S., et al. (2012). Large-scale prediction and testing of drug activity on side-effect targets. Nature 486, 361–367.

Luo, S.Y., and Lam, D.C. (2013). Oncogenic driver mutations in lung cancer. Transl Respir Med 1, 6.

Luo, J., Emanuele, M.J., Li, D., Creighton, C.J., Schlabach, M.R., Westbrook, T.F., Wong, K.-K., and Elledge, S.J. (2009). A Genome-wide RNAi Screen Identifies Multiple Synthetic Lethal Interactions with the Ras Oncogene. Cell 137, 835–848.

Luo, J., Solimini, N.L., and Elledge, S.J. (2009). Principles of cancer therapy: oncogene and non-oncogene addiction. Cell 136, 823–837.

Mäkinen, V.-P., Civelek, M., Meng, Q., Zhang, B., Zhu, J., Levian, C., Huan, T., Segrè, A.V., Ghosh, S., Vivar, J., et al. (2014). Integrative Genomics Reveals Novel Molecular Pathways and Gene Networks for Coronary Artery Disease. PLoS Genet 10.

Malinski, T. (2007). Nitric oxide and nitroxidative stress in Alzheimer’s disease. J. Alz-heimers Dis. 11, 207–218.

Mannick, J.B. (2007). Regulation of apoptosis by protein S-nitrosylation. Amino Acids 32, 523–526.

Marshall, H.E., Merchant, K., and Stamler, J.S. (2000). Nitrosation and oxidation in the regu-lation of gene expression. FASEB J. 14, 1889–1900.

Martínez-Jiménez, F., Papadatos, G., Yang, L., Wallace, I.M., Kumar, V., Pieper, U., Sali, A., Brown, J.R., Overington, J.P., and Marti-Renom, M.A. (2013). Target prediction for an open access set of compounds active against Mycobacterium tuberculosis. PLoS Comput. Biol. 9, e1003253.

Matthiesen, R., Azevedo, L., Amorim, A., and Carvalho, A.S. (2011). Discussion on common data analysis strategies used in MS-based proteomics. Proteomics 11, 604–619.

McDermott, M.S., Browne, B.C., Conlon, N.T., O’Brien, N.A., Slamon, D.J., Henry, M., Meleady, P., Clynes, M., Dowling, P., Crown, J., et al. (2014). PP2A inhibition overcomes acquired resistance to HER2 targeted therapy. Mol. Cancer 13, 157.

McFedries, A., Schwaid, A., and Saghatelian, A. (2013). Methods for the Elucidation of Pro-tein-Small Molecule Interactions. Chemistry & Biology 20, 667–673.

Mellacheruvu, D., Wright, Z., Couzens, A.L., Lambert, J.-P., St-Denis, N.A., Li, T., Miteva, Y.V., Hauri, S., Sardiu, M.E., Low, T.Y., et al. (2013). The CRAPome: a contaminant reposi-tory for affinity purification-mass spectrometry data. Nat. Methods 10, 730–736.

Von Mering, C., Krause, R., Snel, B., Cornell, M., Oliver, S.G., Fields, S., and Bork, P. (2002). Comparative assessment of large-scale data sets of protein-protein interactions. Na-ture 417, 399–403.

Merrell, M.A., Ilvesaro, J.M., Lehtonen, N., Sorsa, T., Gehrs, B., Rosenthal, E., Chen, D.,

Page 60: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

60

Shackley, B., Harris, K.W., and Selander, K.S. (2006). Toll-like receptor 9 agonists promote cellular invasion by increasing matrix metalloproteinase activity. Mol. Cancer Res. 4, 437–447.

Metz, J.T., Johnson, E.F., Soni, N.B., Merta, P.J., Kifle, L., and Hajduk, P.J. (2011). Navi-gating the kinome. Nat. Chem. Biol. 7, 200–202.

Murphy, E., Kohr, M., Sun, J., Nguyen, T., and Steenbergen, C. (2012). S-nitrosylation: a radical way to protect the heart. J. Mol. Cell. Cardiol. 52, 568–577.

Nagel, R., Semenova, E.A., and Berns, A. (2016). Drugging the addict: non oncogene addic-tion as a target for cancer therapy. EMBO Rep 17, 1516–1531.

Nakamura, T., and Lipton, S.A. (2016). Protein S-Nitrosylation as a Therapeutic Target for Neurodegenerative Diseases. Trends Pharmacol. Sci. 37, 73–84.

Nakamura, T., Tu, S., Akhtar, M.W., Sunico, C.R., Okamoto, S.-I., and Lipton, S.A. (2013). Aberrant protein s-nitrosylation in neurodegenerative diseases. Neuron 78, 596–614.

Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., and Kanehisa, M. (1999). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 27, 29–34.

Old, W.M., Meyer-Arendt, K., Aveline-Wolf, L., Pierce, K.G., Mendoza, A., Sevinsky, J.R., Resing, K.A., and Ahn, N.G. (2005). Comparison of label-free methods for quantifying hu-man proteins by shotgun proteomics. Mol. Cell Proteomics 4, 1487–1502.

O’Neil, N.J., Bailey, M.L., and Hieter, P. (2017). Synthetic lethality and cancer. Nat Rev Genet advance online publication.

Ong, S.-E., Blagoev, B., Kratchmarova, I., Kristensen, D.B., Steen, H., Pandey, A., and Mann, M. (2002). Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell Proteomics 1, 376–386.

Pahikkala, T., Airola, A., Pietilä, S., Shakyawar, S., Szwajda, A., Tang, J., and Aittokallio, T. (2015). Toward more realistic drug-target interaction predictions. Brief. Bioinformatics 16, 325–337.

Parmigiani, G., Boca, S., Ding, J., and Trippa, L. (2014). Statistical tools and R software for cancer driver probabilities. Methods Mol. Biol. 1101, 113–134.

Paul, F.E., Hosp, F., and Selbach, M. (2011). Analyzing protein-protein interactions by quan-titative mass spectrometry. Methods 54, 387–395.

De-Paula, V.J., Radanovic, M., Diniz, B.S., and Forlenza, O.V. (2012). Alzheimer’s disease. Subcell. Biochem. 65, 329–352.

Pawson, T., and Linding, R. (2008). Network medicine. FEBS Lett. 582, 1266–1270.

Peddi, P.F., Ellis, M.J., and Ma, C. (2012). Molecular basis of triple negative breast cancer and implications for therapy. Int J Breast Cancer 2012, 217185.

Pe’er, D., and Hacohen, N. (2011). Principles and strategies for developing network models in cancer. Cell 144, 864–873.

Pemovska, T., Kontro, M., Yadav, B., Edgren, H., Eldfors, S., Szwajda, A., Almusa, H., Bespalov, M.M., Ellonen, P., Elonen, E., et al. (2013). Individualized systems medicine strat-egy to tailor treatments for patients with chemorefractory acute myeloid leukemia. Cancer Discov 3, 1416–1429.

Pemovska, T., Johnson, E., Kontro, M., Repasky, G.A., Chen, J., Wells, P., Cronin, C.N.,

Page 61: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

61

McTigue, M., Kallioniemi, O., Porkka, K., et al. (2015). Axitinib effectively inhibits BCR-ABL1(T315I) with a distinct binding conformation. Nature 519, 102–105.

Peters, J.-U. (2013). Polypharmacology - foe or friend? J. Med. Chem. 56, 8955–8971.

Pirooznia, M., Wang, T., Avramopoulos, D., Valle, D., Thomas, G., Huganir, R.L., Goes, F.S., Potash, J.B., and Zandi, P.P. (2012). SynaptomeDB: an ontology-based knowledgebase for synaptic genes. Bioinformatics 28, 897–899.

Plouffe, D., Brinker, A., McNamara, C., Henson, K., Kato, N., Kuhen, K., Nagle, A., Adrián, F., Matzen, J.T., Anderson, P., et al. (2008). In silico activity profiling reveals the mechanism of action of antimalarials discovered in a high-throughput screen. Proc. Natl. Acad. Sci. U.S.A. 105, 9059–9064.

Porcelli, L., Quatrale, A.E., Mantuano, P., Silvestris, N., Brunetti, A.E., Calvert, H., Paradiso, A., and Azzariti, A. (2012). Synthetic lethality to overcome cancer drug resistance. Curr. Med. Chem. 19, 3858–3873.

Puustinen, P., Junttila, M.R., Vanhatupa, S., Sablina, A.A., Hector, M.E., Teittinen, K., Raheem, O., Ketola, K., Lin, S., Kast, J., et al. (2009). PME-1 protects extracellular signal-regulated kinase pathway activity from protein phosphatase 2A-mediated inactivation in hu-man malignant glioma. Cancer Res. 69, 2870–2877.

Ramage, H.R., Kumar, G.R., Verschueren, E., Johnson, J.R., Von Dollen, J., Johnson, T., Newton, B., Shah, P., Horner, J., Krogan, N.J., et al. (2015). A Combined Prote-omics/Genomics Approach Links Hepatitis C Virus Infection with Nonsense-Mediated mRNA Decay. Mol Cell 57, 329–340.

Raphael, B.J., Dobson, J.R., Oesper, L., and Vandin, F. (2014). Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine. Ge-nome Med 6, 5.

Reilly, J.F., Martinez, S.D., Mickey, G., and Maher, P.A. (2002). A novel role for farnesyl pyrophosphate synthase in fibroblast growth factor-mediated signal transduction. Biochem. J. 366, 501–510.

Romano, A.D., Serviddio, G., de Matthaeis, A., Bellanti, F., and Vendemiale, G. (2010). Oxi-dative stress and aging. J. Nephrol. 23 Suppl 15, S29–S36.

Ross, P.L., Huang, Y.N., Marchese, J.N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., et al. (2004). Multiplexed protein quantitation in Saccharo-myces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics 3, 1154–1169.

Rual, J.-F., Venkatesan, K., Hao, T., Hirozane-Kishikawa, T., Dricot, A., Li, N., Berriz, G.F., Gibbons, F.D., Dreze, M., Ayivi-Guedehoussou, N., et al. (2005). Towards a proteome-scale map of the human protein-protein interaction network. Nature 437, 1173–1178.

Ryo, A., Uemura, H., Ishiguro, H., Saitoh, T., Yamaguchi, A., Perrem, K., Kubota, Y., Lu, K.P., and Aoki, I. (2005). Stable suppression of tumorigenicity by Pin1-targeted RNA inter-ference in prostate cancer. Clin. Cancer Res. 11, 7523–7531.

Salahudeen, M.S., and Nishtala, P.S. (2017). An overview of pharmacodynamic modelling, ligand-binding approach and its application in clinical practice. Saudi Pharm J 25, 165–175.

Sayre, L.M., Perry, G., and Smith, M.A. (2008). Oxidative stress and neurotoxicity. Chem. Res. Toxicol. 21, 172–188.

Schadt, E.E. (2009). Molecular networks as sensors and drivers of common human diseases.

Page 62: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

62

Nature 461, 218–223.

Schaefer, M.H., Fontaine, J.-F., Vinayagam, A., Porras, P., Wanker, E.E., and Andrade-Navarro, M.A. (2012). HIPPIE: Integrating Protein Interaction Networks with Experiment Based Quality Scores. PLOS ONE 7, e31826.

Schmitt, T., Ogris, C., and Sonnhammer, E.L.L. (2014). FunCoup 3.0: database of genome-wide functional coupling networks. Nucleic Acids Res. 42, D380–D388.

Sevimoglu, T., and Arga, K.Y. (2014). The role of protein interaction networks in systems biomedicine. Comput Struct Biotechnol J 11, 22–27.

Shah, S.P., Roth, A., Goya, R., Oloumi, A., Ha, G., Zhao, Y., Turashvili, G., Ding, J., Tse, K., Haffari, G., et al. (2012). The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature 486, 395–399.

Shannon, P., Markiel, A., Ozier, O., Baliga, N.S., Wang, J.T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res 13, 2498–2504.

Shen, M., Stukenberg, P.T., Kirschner, M.W., and Lu, K.P. (1998). The essential mitotic pep-tidyl-prolyl isomerase Pin1 binds and regulates mitosis-specific phosphoproteins. Genes Dev. 12, 706–720.

Shiu, K.-K., Wetterskog, D., Mackay, A., Natrajan, R., Lambros, M., Sims, D., Bajrami, I., Brough, R., Frankum, J., Sharpe, R., et al. (2014). Integrative molecular and functional profil-ing of ERBB2-amplified breast cancers identifies new genetic dependencies. Oncogene 33, 619–631.

Simpson, J.C., Joggerst, B., Laketa, V., Verissimo, F., Cetin, C., Erfle, H., Bexiga, M.G., Singan, V.R., Hériché, J.-K., Neumann, B., et al. (2012). Genome-wide RNAi screening iden-tifies human proteins with a regulatory function in the early secretory pathway. Nat Cell Biol 14, 764–774.

Sinha, S., Thomas, D., Chan, S., Gao, Y., Brunen, D., Torabi, D., Reinisch, A., Hernandez, D., Chan, A., Rankin, E.B., et al. (2017). Systematic discovery of mutation-specific synthetic lethals by mining pan-cancer human primary tumor data. Nat Commun 8, 15580.

Sjöblom, T., Jones, S., Wood, L.D., Parsons, D.W., Lin, J., Barber, T.D., Mandelker, D., Leary, R.J., Ptak, J., Silliman, N., et al. (2006). The consensus coding sequences of human breast and colorectal cancers. Science 314, 268–274.

Skarra, D.V., Goudreault, M., Choi, H., Mullin, M., Nesvizhskii, A.I., Gingras, A.-C., and Honkanen, R.E. (2011). Label-free quantitative proteomics and SAINT analysis enable in-teractome mapping for the human Ser/Thr protein phosphatase 5. Proteomics 11, 1508–1516.

Solimini, N.L., Luo, J., and Elledge, S.J. (2007). Non-oncogene addiction and the stress phe-notype of cancer cells. Cell 130, 986–988.

Stamler, J.S., Lamas, S., and Fang, F.C. (2001). Nitrosylation. the prototypic redox-based signaling mechanism. Cell 106, 675–683.

Stanyon, C.A., Liu, G., Mangiola, B.A., Patel, N., Giot, L., Kuang, B., Zhang, H., Zhong, J., and Finley, R.L. (2004). A Drosophila protein-interaction map centered on cell-cycle regula-tors. Genome Biol. 5, R96.

Stapley, J., Reger, J., Feulner, P.G.D., Smadja, C., Galindo, J., Ekblom, R., Bennison, C., Ball, A.D., Beckerman, A.P., and Slate, J. (2010). Adaptation genomics: the next generation. Trends in Ecology & Evolution 25, 705–712.

Page 63: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

63

Stelzl, U., Worm, U., Lalowski, M., Haenig, C., Brembeck, F.H., Goehler, H., Stroedicke, M., Zenkner, M., Schoenherr, A., Koeppen, S., et al. (2005). A human protein-protein interaction network: a resource for annotating the proteome. Cell 122, 957–968.

Stephens, P.J., McBride, D.J., Lin, M.-L., Varela, I., Pleasance, E.D., Simpson, J.T., Stebbings, L.A., Leroy, C., Edkins, S., Mudie, L.J., et al. (2009). Complex landscapes of so-matic rearrangement in human breast cancer genomes. Nature 462, 1005–1010.

Stratton, M.R., Campbell, P.J., and Futreal, P.A. (2009). The cancer genome. Nature 458, 719–724.

Sundaramurthy, V., Barsacchi, R., Chernykh, M., Stöter, M., Tomschke, N., Bickle, M., Kalaidzidis, Y., and Zerial, M. (2014). Deducing the mechanism of action of compounds identified in phenotypic screens by integrating their multiparametric profiles with a reference genetic screen. Nat Protoc 9, 474–490.

Taipale, M., Krykbaeva, I., Whitesell, L., Santagata, S., Zhang, J., Liu, Q., Gray, N.S., and Lindquist, S. (2013). Chaperones as thermodynamic sensors of drug–target interactions reveal kinase inhibitor specificities in living cells. Nat Biotechnol 31, 630–637.

Tang, J., and Aittokallio, T. (2014). Network pharmacology strategies toward multi-target anticancer therapies: from computational models to experimental design principles. Curr. Pharm. Des. 20, 23–36.

Tang, J., Karhinen, L., Xu, T., Szwajda, A., Yadav, B., Wennerberg, K., and Aittokallio, T. (2013). Target inhibition networks: predicting selective combinations of druggable targets to block cancer survival pathways. PLoS Comput. Biol. 9, e1003226.

Tegnér, J., Skogsberg, J., and Björkegren, J. (2007). Thematic review series: systems biology approaches to metabolic and cardiovascular disorders. Multi-organ whole-genome measure-ments and reverse engineering to uncover gene networks underlying complex traits. J. Lipid Res. 48, 267–277.

Terstappen, G.C., Schlüpen, C., Raggiaschi, R., and Gaviraghi, G. (2007). Target deconvolu-tion strategies in drug discovery. Nat Rev Drug Discov 6, 891–903.

Thompson, A., Schäfer, J., Kuhn, K., Kienle, S., Schwarz, J., Schmidt, G., Neumann, T., Johnstone, R., Mohammed, A.K.A., and Hamon, C. (2003). Tandem mass tags: a novel quan-tification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75, 1895–1904.

Tong, A.H., Evangelista, M., Parsons, A.B., Xu, H., Bader, G.D., Pagé, N., Robinson, M., Raghibizadeh, S., Hogue, C.W., Bussey, H., et al. (2001). Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294, 2364–2368.

Tonon, G. (2008). From oncogene to network addiction: the new frontier of cancer genomics and therapeutics. Future Oncol 4, 569–577.

Torkamani, A., Verkhivker, G., and Schork, N.J. (2009). Cancer driver mutations in protein kinase genes. Cancer Letters 281, 117–127.

Torti, D., and Trusolino, L. (2011). Oncogene addiction as a foundational rationale for target-ed anti-cancer therapy: promises and perils. EMBO Mol Med 3, 623–636.

Tran, T.P., Ong, E., Hodges, A.P., Paternostro, G., and Piermarocchi, C. (2014). Prediction of kinase inhibitor response using activity profiling, in vitro screening, and elastic net regression. BMC Syst Biol 8, 74.

Tsherniak, A., Vazquez, F., Montgomery, P.G., Weir, B.A., Kryukov, G., Cowley, G.S., Gill,

Page 64: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

64

S., Harrington, W.F., Pantel, S., Krill-Burger, J.M., et al. (2017). Defining a Cancer Depend-ency Map. Cell 170, 564–576.e16.

Tyner, J.W., Yang, W.F., Bankhead, A., Fan, G., Fletcher, L.B., Bryant, J., Glover, J.M., Chang, B.H., Spurgeon, S.E., Fleming, W.H., et al. (2013). Kinase pathway dependence in primary human leukemias determined by rapid inhibitor screening. Cancer Res. 73, 285–296.

Tzelepis, K., Koike-Yusa, H., De Braekeleer, E., Li, Y., Metzakopian, E., Dovey, O.M., Mupo, A., Grinkevich, V., Li, M., Mazan, M., et al. (2016). A CRISPR Dropout Screen Iden-tifies Genetic Vulnerabilities and Therapeutic Targets in Acute Myeloid Leukemia. Cell Rep 17, 1193–1205.

Vandin, F., Upfal, E., and Raphael, B.J. (2012). De novo discovery of mutated driver path-ways in cancer. Genome Research 22, 375–385.

Vella, L.J., Sharples, R.A., Nisbet, R.M., Cappai, R., and Hill, A.F. (2008). The role of exo-somes in the processing of proteins associated with neurodegenerative diseases. Eur. Biophys. J. 37, 323–332.

Vizeacoumar, F.J., Arnold, R., Vizeacoumar, F.S., Chandrashekhar, M., Buzina, A., Young, J.T.F., Kwan, J.H.M., Sayad, A., Mero, P., Lawo, S., et al. (2013). A negative genetic interac-tion map in isogenic cancer cell lines reveals cancer cell vulnerabilities. Mol. Syst. Biol. 9, 696.

Vogel, C.L., Cobleigh, M.A., Tripathy, D., Gutheil, J.C., Harris, L.N., Fehrenbacher, L., Slamon, D.J., Murphy, M., Novotny, W.F., Burchmore, M., et al. (2002). Efficacy and safety of trastuzumab as a single agent in first-line treatment of HER2-overexpressing metastatic breast cancer. J. Clin. Oncol. 20, 719–726.

Vogelstein, B., Papadopoulos, N., Velculescu, V.E., Zhou, S., Diaz, L.A., and Kinzler, K.W. (2013). Cancer Genome Landscapes. Science 339, 1546–1558.

Völkel, P., Le Faou, P., and Angrand, P.-O. (2010). Interaction proteomics: characterization of protein complexes using tandem affinity purification-mass spectrometry. Biochem. Soc. Trans. 38, 883–887.

Wang, Z. (2012). Protein S-nitrosylation and cancer. Cancer Lett. 320, 123–129.

Wang, T., Yu, H., Hughes, N.W., Liu, B., Kendirli, A., Klein, K., Chen, W.W., Lander, E.S., and Sabatini, D.M. (2017). Gene Essentiality Profiling Reveals Gene Networks and Synthetic Lethal Interactions with Oncogenic Ras. Cell 168, 890–903.e15.

Wang, X., Fu, A.Q., McNerney, M.E., and White, K.P. (2014). Widespread genetic epistasis among cancer genes. Nature Communications 5, 4828.

Warde-Farley, D., Donaldson, S.L., Comes, O., Zuberi, K., Badrawi, R., Chao, P., Franz, M., Grouios, C., Kazi, F., Lopes, C.T., et al. (2010). The GeneMANIA prediction server: biologi-cal network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38, W214–W220.

Wei, X., Hoffman, A.F., Hamilton, S.M., Xiang, Q., He, Y., So, W.V., So, S.-S., and Mark, D. (2012). A simple statistical test to infer the causality of target/phenotype correlation from small molecule phenotypic screens. Bioinformatics 28, 301–305.

Weinstein, I.B., and Joe, A. (2008). Oncogene addiction. Cancer Res. 68, 3077–3080; discus-sion 3080.

Weinstein, I.B., and Joe, A.K. (2006). Mechanisms of disease: Oncogene addiction--a ra-tionale for molecular targeting in cancer therapy. Nat Clin Pract Oncol 3, 448–457.

Page 65: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

65

Weiss, W.A., Taylor, S.S., and Shokat, K.M. (2007). Recognizing and exploiting differences between RNAi and small-molecule inhibitors. Nat. Chem. Biol. 3, 739–744.

Wilhelm, M., Schlegl, J., Hahne, H., Gholami, A.M., Lieberenz, M., Savitski, M.M., Ziegler, E., Butzmann, L., Gessulat, S., Marx, H., et al. (2014). Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587.

Wishart, D.S., Knox, C., Guo, A.C., Shrivastava, S., Hassanali, M., Stothard, P., Chang, Z., and Woolsey, J. (2006). DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672.

Wong, W.C., Kim, D., Carter, H., Diekhans, M., Ryan, M.C., and Karchin, R. (2011). CHASM and SNVBox: toolkit for detecting biologically important single nucleotide muta-tions in cancer. Bioinformatics 27, 2147–2148.

Wood, L.D., Parsons, D.W., Jones, S., Lin, J., Sjöblom, T., Leary, R.J., Shen, D., Boca, S.M., Barber, T., Ptak, J., et al. (2007). The Genomic Landscapes of Human Breast and Colorectal Cancers. Science 318, 1108–1113.

Wu, J., Vallenius, T., Ovaska, K., Westermarck, J., Mäkelä, T.P., and Hautaniemi, S. (2009). Integrated network analysis platform for protein-protein interactions. Nat. Methods 6, 75–77.

Yadav, B., Pemovska, T., Szwajda, A., Kulesskiy, E., Kontro, M., Karjalainen, R., Majumder, M.M., Malani, D., Murumägi, A., Knowles, J., et al. (2014). Quantitative scoring of differen-tial drug sensitivity for individually optimized anticancer therapies. Sci Rep 4, 5193.

Yadav, B., Wennerberg, K., Aittokallio, T., and Tang, J. (2015). Searching for Drug Synergy in Complex Dose-Response Landscapes Using an Interaction Potency Model. Comput Struct Biotechnol J 13, 504–513.

Yan, W., Zhang, W., and Jiang, T. (2011). Oncogene addiction in gliomas: Implications for molecular targeted therapy. Journal of Experimental & Clinical Cancer Research 30, 58.

Yang, C.-W., Lee, Y.-Z., Hsu, H.-Y., Wu, C.-M., Chang, H.-Y., Chao, Y.-S., and Lee, S.-J. (2013). c-Jun-mediated anticancer mechanisms of tylophorine. Carcinogenesis 34, 1304–1314.

Yang, H., Zhou, H., Feng, P., Zhou, X., Wen, H., Xie, X., Shen, H., and Zhu, X. (2010). Re-duced expression of Toll-like receptor 4 inhibits human breast cancer cells proliferation and inflammatory cytokines secretion. J. Exp. Clin. Cancer Res. 29, 92.

Yeger-Lotem, E., and Sharan, R. (2015). Human protein interaction networks across tissues and diseases. Front Genet 6, 257.

Yersal, O., and Barutca, S. (2014). Biological subtypes of breast cancer: Prognostic and ther-apeutic implications. World J Clin Oncol 5, 412–424.

Yi, S., Lin, S., Li, Y., Zhao, W., Mills, G.B., and Sahni, N. (2017). Functional variomics and network perturbation: connecting genotype to phenotype in cancer. Nat Rev Genet advance online publication.

Youlden, D.R., Cramb, S.M., Dunn, N.A.M., Muller, J.M., Pyke, C.M., and Baade, P.D. (2012). The descriptive epidemiology of female breast cancer: an international comparison of screening, incidence, survival and mortality. Cancer Epidemiol 36, 237–248.

Zaman, N., Li, L., Jaramillo, M.L., Sun, Z., Tibiche, C., Banville, M., Collins, C., Trifiro, M., Paliouras, M., Nantel, A., et al. (2013). Signaling network assessment of mutations and copy number variations predict breast cancer subtype-specific drug targets. Cell Rep 5, 216–223.

Page 66: BIOINFORMATIC IDENTIFICATION OF DISEASE DRIVER …

66

Zanzoni, A., Soler-López, M., and Aloy, P. (2009). A network medicine approach to human disease. FEBS Lett. 583, 1759–1765.

Zhao, S., and Iyengar, R. (2012). Systems pharmacology: network analysis to identify mul-tiscale mechanisms of drug action. Annu. Rev. Pharmacol. Toxicol. 52, 505–521.

Zhu, F., Shi, Z., Qin, C., Tao, L., Liu, X., Xu, F., Zhang, L., Song, Y., Liu, X., Zhang, J., et al. (2012). Therapeutic target database update 2012: a resource for facilitating target-oriented drug discovery. Nucleic Acids Res. 40, D1128–D1136.