genetic analysis of mutation and response in cancer

Genetic Analysis of Mutation and Response in Cancer

A DISSERTATION SUBMITTED TO THE FACULTY OF THE

UNIVERSITY OF MINNESOTA BY

MONICA KIEFFER AKRE

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

BRIAN VAN NESS, PHD ADVISOR

AUGUST 2017

i

Acknowledgements

“But science and everyday life cannot and should not be separated.”

Rosalind Franklin (1940, letter to Ellis Franklin)

In these words, Dr. Franklin and I are kindred. Science is my sustenance. I

have the wonderment of a child. I often anthropomorphize my work: the

chromosomes, the cells, the code—whatever it is I am working on, I feel a

relationship. I’ve been known to talk to chromosomes or cells. I come home from

a day of work and tell my children about what Nature had to say to me today. For

Dr. Franklin, the struggle must have been heart-wrenching. I’m comforted that

she seemed to find a space in science, if not among Nobel Laureates, then in the

peaceful folds of the mystery.

This is the greatest piece I must acknowledge: all the women who have

come before me. Not just in science, but also in all things. To my Great

Grandmother Anna Hammel, who homesteaded as a woman at the turn of the

last century in western North Dakota and kept it in her name her entire life—a

pioneer and intrepid role-model. To my Grandmother Lois Maroney Hammel,

who struggled with mental illness for which there was nothing to be done. To my

Great Grandmother Elizabeth Killian Sonsalla, married off at age 14 and raised 5

children alone during the Great Depression. And to my Grandfather Herman

Hammel who said, “ain’t that something” when told I was going to graduate

school—he was a great ally. I come from fortitude. And not just within the

ii

immortal thread of heredity, but from the wisdom of all the elders from whom I

have had the pleasure to listen as they share their secrets to happiness.

My family—Erik, Alexa, and Lily—was patient with me through so much.

This course did not run smooth, but they were grounding and supportive. They

shared the blood, sweat, and tears—and there was much of all three. We defend

together.

I am daily aware of, and grateful for, the opportunity to learn. And thank all

who have made the journey possible. All of the wonderful people I gained

experience with during rotations: the labs of Naoko Shima, Ph.D, Reuben Harris,

Ph.D, and Tim Starr, Ph.D.

I owe so much to the Harris lab. The individual lab members, each and

every one, showed me science, compassion, and strength and I am forever

grateful. Nadine Shaban, Ph.D., has given me strength, laughter, and

encouragement.

I found a welcoming home in the Van Ness lab of curious seekers and

have loved the communal hypothesizing. Thank you, Amit Mitra, Ph.D, and

Taylor Harding for making me feel welcome and grow as a scientist. The

neighboring Griffith lab has been a source of reagents and joy. Thank you.

My committee members past and present have helped guide me and

challenge me: Tim Starr Ph.D., David Largaespada Ph.D., Anja Bielinsky, Ph.D.,

Chad Myers, Ph.D., Ran Blekhman, Ph.D., Nik Somia, Ph.D., and Naoko Shima,

Ph.D. Thank you.

iii

And finally, thank you to Dr. Brian Van Ness. The past 2 years in your lab,

I have felt like a child in a playground able to wonder. Thank you for bringing me

back to focus when my wonder turned to wander. The opportunity to discuss and

debate science (usually during your lunch—I promise I didn’t do that on purpose)

was invaluable in shaping me as a critical scientist.

iv

Abstract

This thesis is divided into two distinct parts: mutation and treatment of

cancer. Each abstract is described separately.

Molecular, cellular, and clinical studies have combined to demonstrate a

contribution from the DNA cytosine deaminase APOBEC3B (A3B) to the overall

mutation load in breast, head/neck, lung, bladder, cervical, ovarian, and other

cancer types. However, the complete landscape of mutations attributable to this

enzyme has yet to be determined in a controlled human cell system. In Chapter

2, we report a conditional and isogenic system for A3B induction, genomic DNA

deamination, and mutagenesis. Targeted sequencing of portions of TP53 and

MYC demonstrated greater mutation accumulation in the A3B-eGFP exposed

pools. Clones were generated and microarray analyses were used to identify

those with the greatest number of SNP alterations for whole genome sequencing.

A3B-eGFP exposed clones showed global increases in C-to-T transition

mutations, enrichments for cytosine mutations within A3B-preferred trinucleotide

motifs, and more copy number aberrations. The 293-based system characterized

here still yielded a genome-wide view of A3B-catalyzed mutagenesis in human

cells and a system for additional studies on the compounded effects of

simultaneous mutation mechanisms in cancer.

Personalized medicine includes the identification of the biomarkers and/or

expression profiles important in the treatment of cancer. Cancer cell lines offer a

representative diversity of molecular processes involved in both malignant

v

phenotypes as well as therapeutic responses. Coupling cancer line expression

with high-throughput drug screens allows for examination of perturbed pathways

that influence drug sensitivities. Here we use The Sanger Institute’s Genomics of

Drug Sensitivity in Cancer to develop a bioinformatic and pathway analysis

approach that identifies activation of the BCR pathway as an important biomarker

in response to the tyrosine kinase inhibitor, dasatinib, used in hematologic cancer

therapies. This approach establishes a process by which data from cell line

repositories can be used to identify biomarkers associated with drug response in

the treatment of cancers.

vi

Table of Contents List of Figures .................................................................................................................. x

Chapter 1 Introduction

Part I: Background for Mutational Processes ........................................................... 1

Controlled Endogenous Mutagenesis .................................................................... 2

Antibody Diversification ......................................................................................... 2

Foreign DNA Restriction ........................................................................................ 3

Nuclear APOBEC3 Activity .................................................................................... 4

Enzymatic-induced Mutation in Cancer ................................................................. 4

Substrate Context Preference ............................................................................... 5

A3B as Mutagen .................................................................................................... 6

Statement of Thesis ............................................................................................... 7

Part II: Background for Drug Prediction Expression Profiles ................................. 7

Components of Precision Medicine ....................................................................... 7

Diagnostic Mutations ............................................................................................. 8

Prognostic Mutations ............................................................................................. 9

Prognostic Expression ........................................................................................... 9

Targeted Treatment ............................................................................................. 10

Drug Metabolism .................................................................................................. 10

Genetic Approaches to Predictive Problems ....................................................... 11

Precision Medicine Summary .............................................................................. 11

Commonalities Across Cancers ........................................................................... 12

Opportunities in Public Repositories .................................................................... 12

vii

Diving In to Big Data ............................................................................................ 13

B-cell Development .............................................................................................. 15

Statement of Thesis ............................................................................................. 16

Chapter 2 Mutation Processes in 293-Based Clones Overexpressing the DNA Cytosine Deaminase APOBEC3B

Overview ..................................................................................................................... 28

Introduction ................................................................................................................ 29

Materials and Methods .............................................................................................. 32

Cell Lines ............................................................................................................. 32

Immunoblots ........................................................................................................ 32

DNA Deaminase Activity Assays ......................................................................... 33

Differential DNA Denaturation (3D) PCR Experiments ........................................ 34

SNP Analysis ....................................................................................................... 35

Whole Genome Sequencing (WGS) .................................................................... 36

Results ........................................................................................................................ 37

System for conditional A3B expression ............................................................... 37

Iterative rounds of A3B exposure ........................................................................ 38

Targeted DNA sequencing provides evidence for A3B mutagenesis .................. 40

Genome wide mutation analyses ......................................................................... 41

A3B mutational landscape by whole genome sequencing .................................. 43

Discussion ................................................................................................................. 46

Acknowledgements ................................................................................................... 48

viii

Chapter 3 Development of Expression-based Biomarkers of Dasatinib Response in

Hematologic Malignancies

Overview ..................................................................................................................... 57

Introduction ................................................................................................................ 58

Methods ...................................................................................................................... 60

Expression Analysis ............................................................................................. 60

Normalization of CCLE values ............................................................................. 60

Statistical Analysis ............................................................................................... 60

Tissue Culture ...................................................................................................... 61

Results ........................................................................................................................ 62

Classification of Response and Resistance ........................................................ 62

Differential gene expression and feature selection .............................................. 63

BCR and B-cell Development .............................................................................. 64

Evaluation of the genes in the BCR pathway ...................................................... 64

Validation ............................................................................................................. 65

Independent Cell Lines ........................................................................................ 65

Clinical Associations ............................................................................................ 67

Bortezomib Resistance, CD19,

and Collatoral Dasatinib Sensitivity ..................................................................... 67

Discussion ................................................................................................................. 68

Acknowledments: ...................................................................................................... 70

Chapter 4

Discussion and Future and Future Aims ................................................................ 87

ix

Mutational Processes ................................................................................................ 87

Future Directions ................................................................................................. 87

Predictive Expression Profiles ................................................................................. 88

Future Directions ................................................................................................. 88

The Importance of Pathway Analysis .................................................................. 89

Personal Future Directions .................................................................................. 89

Bibliography .............................................................................................................. 94

x

List of Figures

Chapter 1

Fig 1. Antibody Formation: Use for Controlled Mutagenesis ................................ 17

Fig. 2 Outcomes of A3B Deamination ..................................................................... 19

Fig 3. APOBEC3 Family of Cytosine Deaminases .................................................. 22

Fig 4. Precision Medicine in Cancer Timeline ......................................................... 23

Fig 5. B-cell Differentiation ....................................................................................... 26

Chapter 2

Fig 1. A conditional system for A3B expression. ................................................... 49

Fig 2. A3B induction optimization and targeted sequencing results ................... 51

Fig 3. SNP analyses to estimate new mutation accumulation. ............................. 53

Fig 4. Summary of somatic mutations detected by WGS. ..................................... 55

Chapter 3

Fig 1. Workflow .......................................................................................................... 72

Fig 2. Diversity of Dasatinib Response and Expression Correlation ................... 74

Fig 3. BCR Pathway Activated in Extreme Responders ........................................ 76

Fig 4. B-cell Differentiation Stages with Malignancy and CD19 Expression of Cell

Lines ........................................................................................................................... 78

Fig 5. Statistically Significant Signature Genes ..................................................... 80

Fig 6. Gene Expression Signature of Extreme Response ..................................... 82

Fig 7. Validation of Lines With 5 Gene Signature ................................................... 84

Supp Fig 1. Expression Comparison between Multiple Myeloma and

Waldenström’s macroglobulinemia ......................................................................... 86

xi

Chapter 4

Fig 1. Genes of the BCR Pathway ............................................................................ 91

Fig 2. Potential of Expression Profiling in Precision Medicine ............................. 93

1

Chapter 1 Introduction

This thesis is divided into two distinct parts: mutation and treatment of

cancer. The background of each will be done separately.

Part I: Background for Mutational Processes

It has long been suspected that cancer arises as an imprecise passing on

of hereditary material of somatic cells. Indeed, over 100 years ago, through

astute observation of dividing sea urchin cells Theodor Boveri discovered that an

exact chromosome complement was necessary for controlled growth. Boveri

hypothesized in 1902 that an aberrant number of chromosomes in a single cell

gave rise to uncontrolled growth similar to a tumor(1). Since the time Boveri and

his wife, Marcella, made these observations, many more observed the specifics

of mutations and their contribution to oncogenesis.

Viruses proved to be not only an exogenous source of cancer, but also

communicable (2). Mutations accrued in irradiated mice resulted in tumors (3).

More sources of mutations were added to the list of growing scientific

understanding. These sources could be divided into two groups: exogenous and

endogenous.

Mutations have been accepted as one cause of cancer. But there are also

beneficial uses for mutations. Indeed, we would still be bubbles in the primordial

ooze rather than multi-cellular, multi-function beings had evolution not used

2

mutation to retool gene products into the complexity we see in the diversity of life

forms.

Within humans, 3 important mutational processes occur: antibody

formation and diversification, foreign DNA restriction, and T-cell receptor

rearrangements. I will describe in detail the first two processes.

Controlled Endogenous Mutagenesis

Antibody Diversification

Mutations are necessary for diversification during antibody production.

Antibody diversity and specificity is a result of recombination of the heavy and

light (either kappa or lambda) chain components of the future antibody.

Recombination activating gene (RAG) enzymes are responsible for controlled

cutting of variable, diversity, and joining (V(D)J) regions of the heavy chain (HC)

which are then rejoined in a newly re-configured genetic sequence by non-

homologous end joining (NHEJ) (4). This process allows construction of

antibodies with an enormous diversity of gene rearrangements. RAG1/2-assisted

recombination also occurs at either the kappa or lambda light chain (LC) variable

and joining (VJ) gene regions (Fig 1A-C). The heavy and light chains are

presented together via disulfide bonds upon the IgM stem embedded in the

plasma membrane. The new, unique configuration is now tested for efficacy

against potential immunological threats.

Activation-Induced Cytosine Deaminase (AID) is an enzyme responsible

for removing amine groups from single-stranded cytosines. The resulting uracil

lesion is mismatched with the no-longer-complementary guanine on the opposite

3

strand. This lesion can be repaired back to an error-free cytosine through base-

excision repair. However, the lesion can also result in mutation (Fig 2).

1. Replication over the uracil results in a C>T transition,

2. translesion synthesis can bring about C>T transitions or C>A

transversions,

3. mutagenic mismatch repair can result in transitions or

transversions,

4. and abasic sites left un-repaired may result in double-stranded

breaks(5).

AID deamination is responsible for somatic hypermutation (SHM) that

modifies antibodies and diversifies them to acquire a “best fit” antibody to attack

foreign invaders with precision (Fig 1D)(6). Class-switch recombination (CSR)

occurs when AID-induced double-stranded breaks recombine via non-

homologous end joining to result in isotype of the antibody produced (Fig 1E)(4).

Antibody construction is an orchestrated process and the enzymes involved are

well controlled and largely sequester their activity to the antigen binding sites

(hypervariable regions) of IGH, IGL (7).

Foreign DNA Restriction

Cellular immunity is a defense system against foreign DNA. It has been

formed as a set of 7 genes in tandem on chromosome 22, decendents from AID:

the APOBEC3 genes (Fig 4) (8). Like their progenitor, these genes also code for

active cytosine deaminases (9). The canonical function of the APOBEC3 proteins

is foreign DNA restriction in the cytoplasm during viral infection (10).

4

These enzymes deaminate DNA in the cytoplasm from viral infection or

even uptake of foreign DNA material protecting the cell from infection or

disruptive integration into host DNA (8,11–13).

The enzymes work on single-stranded DNA of retroviruses deaminating

cytosines and rendering the code ineffective in infection. The cytoplasmic

localization of the APOBEC3s reflect their function as that is where foreign DNA

first presents (14).

Nuclear APOBEC3 Activity

There are several APOBEC3s found in the nucleus (Fig 4). One important

function of nuclear localization APOBEC3s is the suppression of Alu and LINE

elements—ancient retrotransposon “jumping genes” (12,15–17) and therefore

their nuclear occupation may be justified. The restriction of retro-elements does

not require enzymatic activity (10). But are nuclear APOBEC3s enzymatically

active? Can they deaminate genomic cytosines?

Of the APOBEC3s found in the nucleus, APOBEC3B (A3B) is the only A3

member that is exclusively nuclear (14,18). A3B is, therefore, a good candidate

to study for evidence of genomic DNA mutation. What is the impact of A3B’s

enzymatic activity in the nucleus. A3B’s enzymatic activity is potent (11). Is it

under tight control, or does it, indeed cause mutation of the very DNA from which

it came?

Enzymatic-induced Mutation in Cancer

During controlled breakage of the IGH region, the DNA is vulnerable to

NHEJ with another region, or another chromosome altogether. AID was

5

discovered to be responsible for the t(8;14) chromosomal translocation involving

the newly severed IGH gene region, that was destined to recombine within the

same gene region, is now recombined with the MYC oncogene. This

translocation juxtaposing the MYC gene under the influence of IGH’s strong

promoter is the hallmark of Burkitt lymphoma (19). Balb/c mice are prone to

spontaneously developing c-Myc/IgH translocations and plasmacytomas (20).

However, Balb/c AID-/- mice do not develop neoplasms (19). Multiple other

translocations with evidence of AID-assisted mutagenesis involving the IGH gene

region have since been reported(21). This was the first example of the controlled

mutation machinery of antibody formation contributing to an oncogenic

translocation(19).

Substrate Context Preference

Given the distribution of cytosines in the genome, random chance would

predict a pattern of C>T transition with a fairly equal chance of being positioned

3’ of any one of the 4 bases. However, exhaustive analysis of substrate context

revealed A3B has a strong preference for deaminating cytosines flanked by a 5’

T and 3’ W base (5’TCW3’ context). This signature mirrors the context of point

mutations seen in multiple cancer patient samples (22–24).

As described in Figure 2 the resulting uracils can be properly excised and

repaired. However, mutations can occur if the repair of the site is incomplete or

erroneously repaired(5).

6

A3B as Mutagen

The APOBEC3 member most likely to cause mutations in human genomic

DNA is A3B. Firstly, A3B has import mechanisms to the nucleus allowing it

access to genomic DNA (18). It is the only member that has exclusive nuclear

localization.

Further, breast cancers with a high level of A3B expression also had

significantly higher numbers of mutations within A3B’s preferred substrate

context (22). Since first discovering the correlation between high A3B expression

and mutation load in breast cancer, more cancers have been examined for their

A3B expression, mutation load, and sequence context specific to A3B—the

cytosine substrate preceded by a thymine at the 5’ position and followed by a

weak base—adenine or thymine—in the 3’ position (5,23). The sequence context

of mutations have been examined by whole genome sequencing of tumors

(24,25). These studies show a mutational signature that is indicative of the

enzymatic activity of an APOBEC3 as shown in vitro (22).

The mounting data implicated A3B as a causal agent in mutations

(5,22,23,26,27). However, there are implications that another cytosine

deaminase, APOBEC3A, could be responsible for the mutation signature seen in

breast cancer (28). Further, tumors from individuals with a germline deletion of

the coding region of A3B also carry an APOBEC3 signature, even in patients

homozygous for the deletion (29,30). If A3B was deleted, what was responsible

for the mutations seen in these tumors?

7

The field required an investigation into what a single APOBEC3 molecule,

A3B, could do to human genomic DNA. We, therefore, developed a clean,

isogenic system to monitor A3B’s mutagenic impact over time and the genomic

landscape it shaped.

Statement of Thesis

Given the increased mutational load of breast cancers with higher A3B

expression, we decided to examine the accumulation of mutations by A3B in a

controlled, isogenic cell line system to prove the causal link between A3B and

resulting increased mutation load and context specificity. In an A3B-GFP

doxycline-inducible system with a GFP control derived from the same clone, 2

biological replicates were grown side-by-side and induced weekly for 10 weeks.

Our hypothesis was that A3B over-expressing lines would accumulate more

mutation over the course of the experiment compared to the GFP controls and

that these mutations would prefer the TCW sequence, characteristic of A3B’s in

vitro enzymatic activity.

Part II: Background for Gene Expression

Drug Prediction Profiles

Components of Precision Medicine

There are multiple components to precision medicine. Genetic profiles

contribute to precision medicine through identification of:

1. diagnostic and prognostic mutations,

2. diagnostic and prognostic expression,

8

3. targeted treatment, and

4. treatment metabolism by congenital variants.

Each of these components will be briefly examined for their historical

contributions to the overall campaign to successfully treat cancer (Fig 5).

Diagnostic Mutations

The first structural chromosomal defect of cancer was identified with the

discovery of the Philadelphia chromosome in 1959. Cytogenetic analysis of

chronic myelogenous leukemia (CML) patients (in Philadelphia and so, thusly

named) revealed a small chromosome that was soon discovered to be the

product of the causal translocation between chromosomes 9 and 22 (31). The

resulting derivative 22 contained a fusion gene of BCR (Breakpoint Cluster

Region) and ABL. The BCR gene region should not be confused with B-cell

Receptor, which shares its abbreviation. For this reason, the gene region will

always be followed by the words “gene region.”

ABL is a tyrosine kinase that is constitutively active when fused in the

BCR gene region. This extremely specific translocation led to one of the first

targeted therapies, albeit over 40 years after the initial discovery of the

translocation.

Imatinib mesylate was approved for use in the United States in 2001 and

dramatically improved CML treatment and outcomes (32). Mainstream media

responded with sensational headlines like Time’s, “There Is New Ammunition In

the War Against Cancer. These Are the Bullets” with a picture of imatinib

mesylate pills on the cover of the May 2001 edition (33).

9

Imatinib mesylate is a targeted tyrosine kinase inhibitor and, at least

initially, the hype seemed warranted. However, CML patients started relapsing.

Their cancers had mutated or amplified the fusion gene so that imatinib mesylate

no longer managed their disease (34). This highlights one of the difficulties in

precision medicine: the moving target of treating a dynamic disease—one that

can change, or is heterogeneous at the start of treatment.

Prognostic Mutations

The Döhner classification of chronic lymphocytic leukemia (CLL) is based

on cytogenetic analysis identifying deletions, duplications, translocations, and

aneusomies. The analysis stratifies patients into 5 prognostic groupings allowing

practitioners and patients to make treatment decisions based on their prognostic

classification (35).

Prognostic Expression

Her2 expression is an example of a prognostic marker turned targeted

therapy. The Her2 (ERBB2) protein is overexpressed in about 25% of breast

cancers often due to gene amplification. Her2 amplified breast cancers were

aggressive and patients were given a poor prognosis (36). Cleverly, a

monoclonal antibody against Her2 was used to neutralize the proliferation signal

from the amplified protein. A humanized form of the original mouse monoclonal

was created to allow for patients’ immune system to better tolerate the presence

of the foreign antibody (37). For Her2-amplified breast cancer, trastuzumab,

turned a poor prognosis into a treatable disease (36,38–40).

10

Targeted Treatment

Both imatinib mesylate and trastuzumab represent treatments targeted to

mutations and expression respectively. Success of these molecular targeting

therapies resulted in a race of clinicians and researchers looking for similar

molecular aberrations in other cancer types. As a result, imatinib mesylate is now

used as an effective treatment in gastrointestinal stromal tumors (41,42).

Trastuzumab is used in treating Her2-amplified metastatic gastric cancer and

gastroesophogeal junction adenocarcinoma (43–45). Thus, therapies designed

for a specific cancer were found to have possible broader applications. A shift

began in which cancer treatments were based on common genetic events, even

across cancers of different tissue types.

Drug Metabolism

The final component of precision medicine to be discussed here is

congenital variants of metabolism. Liver enzymes that either pre-process drugs

from pro-drug inactive to active forms or break down drugs to be cleared have a

variety of reported single nucleotide polymorphisms (SNPs). For example,

Tamoxifen is an anti-estrogen drug but is only effective when converted by the

liver enzyme CYP2D6 into its active endoxifen form (46). Single Nucleotide

Polymorphisms (SNP) variants of CYP2D6 with less enzymatic activity convert

less drug to endoxifen resulting in a weakened therapeutic benefit. However,

even with enzymatically potent CYP2D6 variants, tamoxifen benefit can be

reduced as other prescription drugs can reduce the activity of CYP2D6 (47).

11

Genetic Approaches to Predictive Problems

Cytogenetic analysis was our first experience with diagnostic precision

medicine—the discovery of the Philadelphia chromosome. Indeed, DNA analysis

for mutations and variations make up a greater portion of clinically available tests

for diagnosis, prognosis, targeted treatment, and metabolic variants.

However, there has been a place for gene expression, even relatively

early on with assays for receptors that classify tumors or indicate overexpression

via immunohistochemistry (IHC). The technique was developed in 1941, long

before the cancer field had a use for it (48). IHC and other immunofluorescent

techniques in research labs have now become routine. In 1996, a Phase II trial to

test patients for Her2 overexpression status via IHC, brought the utility into

clinical precision medicine (49).

The central dogma of biology: DNA makes RNA makes protein, would

make the argument that what can be learned from the DNA is also in the RNA.

And, importantly, it is the amount of RNA that may define cellular activity,

including response to therapies. In Chapter 3, I make use of expression data

from cell lines to describe expression signatures predictive of drug response.

Precision Medicine Summary

The patient-specific cancer characteristics described above also

determine which therapies will or won’t be effective. Further, a patient’s

congenital enzymatic make-up determines if drugs are converted into active

conformations, metabolized and cleared too quickly to illicit a response, or too

slowly and build up to toxic levels. The multiple layers of information that can be

12

gathered from a patient and tumor make precision medicine difficult and highlight

a necessity to integrate genetic data from a larger body of evidence.

Commonalities Across Cancers

Tumor suppressors and proto-oncogenes are within every non-tumor cell.

The disruption of their function can be found across different cancer types. For

example, the ALK gene was first found rearranged in large-cell lymphoma has

been found in non-small cell lung carcinoma and clear cell renal cell carcinoma

(50–52). TP53 is estimated to be mutated, deleted, or silenced in nearly every

type of cancer with frequencies ranging from 10% of hematologic malignancies to

nearly 100% of high-grade serous ovarian cancer (53). Likewise, mis-regulation

of MYC-targeted genes is involved in a statistically significant proportion of 11 of

the 17 cancer types examined in the the Cancer Genome Atlas (TCGA) (54).

Discoveries of similar aberrations across cancer types has led to a movement of

classification of cancer into genetic-based categories rather than the more broad

categories based on histology or tissue of origin (55–57).

The difficulties presented by variant components of precision medicine

require utilizing information from all cancers. Large databases representing

multiple cancer types allow us to cross tissue barriers as repositories are

collections of multiple cancer types, tissues, stages, and individuals. Chapter 3

utilizes big data from multiple cancer types.

Opportunities in Public Repositories

Big data presents the opportunity to explore the gene expression-drug

response relationship in greater detail and across multiple cancer types. Cancer

13

cell line collections such as the NCI-60, Genomics of Drug Sensitivity in Cancer

(GDSC), and Cancer Cell Line Encyclopedia (CCLE) offer mutation status,

expression profiles, and drug response data (58–60).

The GDSC has 1074 cell lines with whole-genome sequence, micro-array

expression data, and response profiles to 265 drugs (58). Examination of the

expression profiles of cell lines derived from diverse primary tumors and their

corresponding response to drugs allows us to dissect important gene and

pathway expression that contributes to either response or resistance to a drug.

Once these patterns are established and tested clinically, patient tumor profiles

could direct therapy selection toward a more effective treatment.

Diving In to Big Data

The gene-drug approach described above can, as I show in Chapter 3, be

extended beyond one gene to include multiple genes of a pathway. In order to

accomplish this, multiple steps in limiting and classifying the data were

employed.

Firstly, we subset the cell lines to only include B-cell malignancies, leaving

us with expression data for 94 lines. This was done to limit any confounding

impact of tissue-specific expression on analysis, but still benefit from the

combination of multiple cancer types to increase sample size as well as broaden

what is understood about the all the cancers of B-cell origin. This is a small

version of the pan-cancer analysis described earlier.

14

Secondly, we examined 14 drugs with targets currently being used for

treatment in B-cell malignancies. We chose to focus on Dasatinib, which is a

multi-kinase inhibitor.

Area under the survival curve (AUSC) is a measurement of drug response

that is, in this data set, normalized between 0 and 1. Lines with an AUSC for a

drug that approaches 1 is more resistant than a line that has a value of 0.5 or

less. It’s important to note that there is no specific criterion for what is responsive

in these studies. Relative responsiveness is the measure by which value cutoffs

for AUSC were based. Cutoffs were chosen based on breaks in the AUSC values

that we assess represented a natural cutoff between a physiological response to

the drug. These separations in AUSC values are drug-specific and, in some

drugs, do not exist, but rather a gradual, continuous, distribution of AUSC values

is observed.

We describe lines as Responders of dasatinib when they have an AUSC

of less than 0.75 with the additional criterion of 50% of cell death (IC50) at a drug

concentration below the maximum testing concentration divided by 4. The IC50

criterion was to ensure that there was cell death assayed well within the survival

curve plot. There were 14 Responders of the 71 lines for which drug testing was

available. We described lines as Non-Responders when their AUSC was greater

than 0.98. This is an extremely high value and there is no ambiguity that these

lines’ metabolism was unaffected by dasatinib.

It was important that we have cell lines that displayed extreme responses

so as not to confound analysis. We identified differential expression between

15

Non-Responders and Responders. Once a we narrowed down candidate gene

expression markers of dasatinib response, we examined the expression of these

candidates in the remaining 46 llines. We were looking for trends in lines with

partial response or resistance. Did they have expression values similar to their

nearest extreme responders?

B-cell Development

The hematopoietic stem cell starts down a course of differentiation that

has multiple bifurcations and destinations. The one arm we will focus on here is

that of B-cell differentiation. The differentiation of a B-cell is marked by surface

receptors, activation of the B-cell receptor (BCR) pathway, and the maturation of

the antibody described in Part I of this introduction (Fig 6).

The pluripotent pre-B cell has an activated B-cell receptor pathway with

high expression of BCR components such as CD19, CD79A, and CD79B

presented on the surface of the cell (61). The HC first recombines the D-J

regions followed by joining this newly recombined section with a rearrangement

of the V region. The pre-B-cell adds the HC to the BCR assembly as prototype of

an antibody with a surrogate LC substituting for the yet-to-be recombined IGL

region (either kappa or lambda). Should the surrogate LC bind sufficiently bind

and present the HC, the new pro-B cell moves forward in differentiation and

replaces the surrogate LC with the newly-formed LC to the BCR ensemble and

the naïve cell lies in wait. Once stimulated by an antigen, the intercellular

portions of CD79A and CD79B are phosphorylated and a proliferation and

survival cascade ensues in the mature B cell (61).

16

After the BCR has signaled, its usefulness is obsolete and the cell

downregulates the BCR pathway components. The fully mature product of the

differentiated B-cell, the plasma cell, in fact, has essentially no expression or

presentation of BCR components, such as CD19 (61).

The subset of B-cell malignancies from the GDSC has expression data for

71 cell lines, which also have drug response for dasatinib. It is to be expected

that molecules involved in differentiation will be differentially expressed based on

at what stage in development the malignancy arose. What is also expected is the

cell lines carry enough in common to reveal what is truly different between the

Responders and Non-Responders.

Statement of Thesis

We hypothesize that the phenotype of extreme response and non-

response to drug treatment in cell lines must be due to differences between the

lines that can be revealed by analysis of RNA expression. We focused our

attention on dasatinib response in B-cell malignancies. We augmented our

analysis by including pathway analsyis of the differentially expressed genes. Our

gene expression response signature was confirmed by validating the biomarkers

on untested cell lines and primary patient samples and supported by outcomes of

a clinical trial.

17

Fig1.AntibodyFormation:UseforControlledMutagenesis

A) The HC Variable, Diversity, and Joining (V(D)J) loci undergo recombination

via the double-stranded breaks induced by the RAG enzymes (depicted as white

circles). Non-homologous end joining joins breaks to reconfigure the locus into a

unique region.

B) The LC (either the kappa or lambda locus) also undergoes recombination,

however there is no diversity sequence within the LC.

C) After being transcribed and translated, the antibody assembles. The HC ends

in a constant region (black) that abuts to the IgM locus that creates the “stem” of

the antibody. The LC also ends in a constant region.

D) Somatic Hypermutation (SHM) occurs as AID deaminates cytosines within the

heavy and light chain loci. SHM of the HC is depicted here with the mutations

arrowed on resultant antibody.

E) Class Switch Recombination (CSR) occurs when the AID-induced lesion

results in a double-stranded break that recombines the locus designating the

type of antibody to be produced.

18

V(D)J Recombination

AIDU

UAID

U AID

G

G

G

Somatic HyperMutation (SHM) Class-Switch Recombination

Fig 1. Antibody Formation: Use for Controlled Mutagenesis

VJ Recombination

Heavy C

hain

Lig

ht C

hain

(kappa o

r la

mbda)

IgA

IgE

IgM

IgD

IgG

AID

AID

IgG

A

B

C

D E

variable

div

ers

ity

join

ing

variable

join

ing

19

Fig.2OutcomesofA3BDeamination

A deaminated cytosine has multiple outcomes. Repair enzymes can, excise, and

replace the cytosine resulting in error-free repair. However, all along the process

of repair, mutations can result as

1) the lesion is used as a template during replication

2) error-prone mismatch repair can result in transition or transversions

3) Translesion synthesis with an error-prone polymerase can also result in

transition and transversions.

4) Double-stranded breaks can occur at weakened abasic sites or when two

abasic sites are near and nicked with AP endonuclease.

20

T

CA T

C AT AG

A G T

Highlighted Cytosines are within the A3B preferred context

Single-stranded DNA allows A3B catalytic acess

A3B

A3B

T

UA T

U AT AG

A G T

T

UA T

U AT AG

A G T

T

UA T

U AT AG

A G T

T

CA T

C AT AG

A G T

Lesions

Error-free Repair

Uracil DNA

Glycosylase Excision

T

UA T

U AT AG

A G T

T

UA T

U AT AG

A G T

T

A T

AT AG

A G T

Abasic Sites

Backbone Breakage

by AP endonuclease

T

UA T

U AT AG

A G T

T

UA T

U AT AG

A G T

T

A T

AT AG

A G T

T

A

AG

T

A

T

AG

T

Double-stranded

Break

Deamination by A3B

Polymerase

Replication over uracil-

C>T transition

Translesion

Synthesis-

Transitions or

Transversions

Mutagenic Mismatch

Repair-

Transitions or

Transversions

Fig 2. Outcomes of A3B Deamination

21

Fig 3. APOBEC3 Family of Cytosine Deaminases

A schematic representing the location of the 7 APOBEC3 family members in

tandem on chromosome 22. The cell representations on the right show the sub-

cellular localization of each family member corresponding to the color scheme.

22

Chromosome 22

22q13.1

APOBEC3 Family

A3A

A3B

A3C

A3DE

A3G

A3H

A3F


Sub-cellularLocalization


23

Fig.4PrecisionMedicineinCancerTimeline

Examples of Diagnostic, Prognostic, Treatment, and Metabolic components of

cancer care throughout recent history.

24

Fig

. 4 P

reci

sion

Med

icin

e in

Can

cer T

imel

ine

25

Fig5.B-cellDifferentiation The schematic represents B-cell differentiation and antibody maturation from left

to right. The lymphoid stem cell commits to the b-cell lineage and undergoes D-J

rearrangement of the IGH region as a Pro-B cell. Later in this stage the variable

region recombines with the newly formed DJ region. Entering the Pre-B cell

stage, the HC is displayed along with a surrogate LC. If successful binding of the

surrogate LC with the HC occurs, the LC is recombined and added to the CD79A

and CD79B components with the IgM stem at the surface of the immature B cell.

The naïve B cell travels to secondary lymph tissues to wait for antigen

stimulation. Upon antigen stimulation, the intercellular BCR intercellular

components are phosphorylated and a signaling cascade results in proliferative

and survival signals. Class switch recombination in the mature B cell results in

the formation of antibody isotypes. Receptors and molecules of the BCR pathway

are shown below as bars that extend across the differentiation stages where they

are most expressed.

26 Fi

g. 5

B-c

ell D

iffer

entia

tion

27

Chapter2

Mutation Processes in 293-Based Clones Overexpressing the

DNA Cytosine Deaminase APOBEC3B

Short title: APOBEC3B Mutation Signature in Human Cells

Monica K. Akre1, Gabriel J. Starrett1, Jelmar S. Quist2, Nuri A. Temiz1, Michael A.

Carpenter1, Andrew N. J. Tutt2, Anita Grigoriadis2, Reuben S. Harris1,3,*

1 Department of Biochemistry, Molecular Biology, and Biophysics,

Institute for Molecular Virology, Masonic Cancer Center, University of Minnesota,

Minneapolis, MN, USA, 55455

2 Breast Cancer Now Research Unit, Research Oncology, Guy’s Hospital, King’s

College London, London, UK, SE1 9RT

3 Howard Hughes Medical Institute, University of Minnesota, Minneapolis, MN,

USA, 55455

* [email protected]

Keywords: APOBEC3B (A3B); DNA cytosine deamination; DNA damage;

genomic mutagenesis; mismatch repair deficiency; somatic mutation

28

Overview

Molecular, cellular, and clinical studies have combined to demonstrate a

contribution from the DNA cytosine deaminase APOBEC3B (A3B) to the overall

mutation load in breast, head/neck, lung, bladder, cervical, ovarian, and other

cancer types. However, the complete landscape of mutations attributable to this

enzyme has yet to be determined in a controlled human cell system. We report a

conditional and isogenic system for A3B induction, genomic DNA deamination,

and mutagenesis. Human 293-derived cells were engineered to express

doxycycline-inducible A3B-eGFP or eGFP constructs. Cells were subjected to 10

rounds of A3B-eGFP exposure that each caused 80-90% cell death. Control

pools were subjected to parallel rounds of non-toxic eGFP exposure, and

dilutions were done each round to mimic A3B-eGFP induced population

fluctuations. Targeted sequencing of portions of TP53 and MYC demonstrated

greater mutation accumulation in the A3B-eGFP exposed pools. Clones were

generated and microarray analyses were used to identify those with the greatest

number of SNP alterations for whole genome sequencing. A3B-eGFP exposed

clones showed global increases in C-to-T transition mutations, enrichments for

cytosine mutations within A3B-preferred trinucleotide motifs, and more copy

number aberrations. Surprisingly, both control and A3B-eGFP clones also elicited

strong mutator phenotypes characteristic of defective mismatch repair. Despite

this additional mutational process, the 293-based system characterized here still

yielded a genome-wide view of A3B-catalyzed mutagenesis in human cells and a

system for additional studies on the compounded effects of simultaneous

29

mutation mechanisms in cancer.

Introduction

Cancer genome sequencing studies have defined approximately 30 distinct

mutation signatures (reviewed by (27,62–64)). Some signatures are large-scale

confirmations of established sources of DNA damage that have escaped repair.

The largest is water-mediated deamination of methyl-cytosine bases, which

manifest as C-to-T transitions in genomic 5’-CG motifs (65). This process

impacts almost all cancer types and accumulates as a function of age. Other well

known examples include ultraviolet radiation, UV-A and UV-B, which crosslink

adjacent pyrimidine bases and result in signature C-to-T transitions (66), and

tobacco mutagens such as nitrosamine ketone (NNK), which metabolize into

reactive forms that covalently bind guanine bases and result in signature G-to-T

transversions (67). These latter mutagenic processes are well known drivers of

skin cancer and lung cancer, respectively, but also contribute to other tumor

types. A lesser-known but still significant example of a mutagen is the dietary

supplement aristolochic acid, which is derived from wild ginger and related plants

and metabolized into reactive species that covalently bind adenine bases and

cause A-to-T transversions(68,69). Aristolochic acid mutation signatures are

evident in urothelial cell, hepatocellular, and bladder carcinomas. Other

confirmed mutation sources include genetic defects in recombination repair

(BRCA1, BRCA2, etc.), post-replication mismatch repair (MSH2, MLH1, etc.),

and DNA replication proofreading function, which manifest as microhomology-

mediated insertion/deletion mutations, repeat/microsatellite slippage mutations,

30

and transversion mutation signatures, respectively(27,65,70).

The largest previously undefined mutation signature in cancer is C-to-T

transitions and C-to-G transversions within 5’-TC dinucleotide motifs(24,65,71).

This mutation signature occurs throughout the genome, as well as less frequently

in dense clusters called kataegis. This signature is ascribable to the enzymatic

activity of members of the APOBEC family of DNA cytosine to uracil

deaminases(22–24,65,71,72). Human cells encode up to 9 distinct APOBEC

family members with demonstrated C-to-U editing activity, and 7/9 have been

shown to prefer 5’-TC dinucleotide motifs in single-stranded DNA substrates:

APOBEC1, APOBEC3A, APOBEC3B (A3B), APOBEC3C, APOBEC3D,

APOBEC3F, and APOBEC3H. In contrast, AID and APOBEC3G prefer 5’RC and

5’CC, respectively (R = purine; reviewed by(73,74)). The size and similarity of

this protein family, as well as the formal possibility that another DNA damage

source may be responsible for the same mutation signature(75), have made DNA

sequencing data and informatics analyses open to multiple interpretations.

However, independent (14,22) and subsequent (5,23,28,72,76–80)

studies indicate that at least one DNA deaminase family member, A3B, has a

significant role in causing these types of mutations in cancer. First, A3B localizes

to the nucleus throughout the cell cycle except during mitosis when it appears

excluded from chromatin(14). Second, A3B is upregulated in breast cancer cell

lines and primary tumors at the mRNA, protein, and activity levels(5,22,81).

Third, endogenous A3B is the only detectable deaminase activity in nuclear

extracts of many cancer cell lines representing a broad spectrum of cancer types

31

(breast, head/neck, lung, ovarian, cervix, and bladder (5,22,81)). Fourth,

endogenous A3B is required for elevated levels of steady state uracil and

mutation frequencies in breast cancer cell lines(22). Fifth, overexpressed A3B

induces a potent DNA damage response characterized by g-H2AX and 53BP1

accumulation, multinuclear cell formation, and cell cycle deregulation(22,28,76).

Sixth, A3B levels correlate with overall mutation loads in breast and head/neck

tumors (22,77). Seventh, the biochemical deamination preference of recombinant

A3B, 5’TCR, is similar to the actual cytosine mutation pattern observed in breast,

head/neck, lung, cervical, and bladder cancers(5,22,23). Eighth, human

papillomavirus (HPV) infection induces A3B expression in several human cell

types, providing a link between viral infection and the observed strong APOBEC

mutation signatures in cervical and some head/neck and bladder cancers(82–

84). Ninth, the spectrum of oncogenic mutations in PIK3CA is biased toward

signature A3B mutation targets in HPV-positive head/neck cancers(77). Finally,

high A3B levels correlate with poor outcomes for estrogen receptor-positive

breast cancer patients (79,80,85).

Despite this extensive and rapidly growing volume of genomic,

molecular, and clinical information on A3B in cancer, the full breadth of this

enzyme’s activity on the human genome has yet to be determined. Here we

report further development of a human 293 cell-based system for conditional

expression of human A3B. The results reveal, for the first time in a human cell

line, the genomic landscape of A3B mutagenesis.

32

Materials and Methods

Cell Lines

We previously reported T-REx-293 cells that conditionally express A3B (22).

However, the mother, daughter, and granddaughter lines described here are new

in order to ensure a single cell origin and have all of the controls derived in

parallel. T-REx-293 cells were cultured in high glucose DMEM (Hyclone)

supplemented with 10% FBS and 0.5% Pen/Strep. Single cell derived mother

lines, A and C, were obtained by limiting dilution in normal growth medium.

These mother clones were transfected with linearized pcDNA5/TO-A3Bintron-

eGFP (A3Bi-eGFP) or pcDNA5/TO-eGFP vectors (22,86), selected with 200

µg/mL hygromycin, and screened as described in the main text to identify drug-

resistant daughter clones capable of Dox-mediated induction of A3Bi-eGFP or

eGFP, respectively. The encoded A3B enzyme is identical to “isoform a” in

GenBank (NP_004891.4). GFP flow cytometry was done using a FACSCanto II

instrument (BD Biosciences).

Immunoblots

Whole cell lysates were prepared by suspending 1x106 cells in 300µL 10x

reducing sample buffer (125mM Tris pH 6.8, 40% Glycerol, 4%SDS, 5% 2-

mercaptoethanol and 0.05% bromophenol blue). Soluble proteins were

fractionated by 4% stacking and 12% resolving SDS PAGE, and transferred to

PVDF membranes using a wet transfer BioRad apparatus. Membranes were

blocked for 1 hr in 4% milk in PBS with 0.05% sodium azide. Primary antibody

33

incubations, anti-GFP (JL8-BD Clontech) and anti-b-actin (Cell Signaling) were

done in at a 1:1000 dilution in 4% milk diluted in PBST, and incubation conditions

ranged from 4-8 degrees C for 2-16 hrs. Membranes were then washed 3 times

for 5 minutes in PBST. Secondary antibody incubations, anti-mouse 680

(1:20000) and anti-rabbit 800 (1:20000), were done in 4% milk diluted in PBST

with 0.01% SDS, and incubation conditions ranged from 4-8 degrees C for 2-16

hrs. The resulting membranes were washed 3 times for 5 minutes in PBST and

imaged using Licor instrumentation (Odyssey).

DNA Deaminase Activity Assays

This assay was adapted from published procedures (81,87). Whole-cell extracts

were prepared from 1x106 cells by sonication in 200µL HED buffer (25mM

HEPES, 5mM EDTA, 10% glycerol, 1mM DTT, and one tablet protease inhibitor-

Roche per 50mL HED buffer). Debris was removed by a 30 min maximum speed

spin in a tabletop micro-centrifuge at 4 degrees C. The supernatant was then

used in 20µL deamination reactions that contained the following: 1µL of 4pM

fluorescently-labeled 43-mer oligo (5’-

ATTATTATTATTCGAATGGATTTATTTATTTATTTATTTATTT-fluorescein)

containing a single interior 5’-TC substrate, 9.25µL UDG (NEB), 0.25µL RNase,

2µL 10x UDG buffer (NEB), 16.5µL lysate. Reactions were incubated at 37

degrees C for 1h. 2µL 1M NaOH was added and reaction was heated to 95

degrees C in a thermocycler for 10 min. 22µL of 2x formamide loading buffer was

34

added to each sample. 5µL of each reaction was fractionated on a 15% TBE

Urea Gel and imaged using a SynergyMx plate reader (BioTek).

Differential DNA Denaturation (3D) PCR Experiments

This assay was adapted from published procedures (22,88). Genomic DNA was

extracted from samples using a PureGene protocol (Gentra) and quantified using

Nanodrop instrumentation (ThermoFisher Scientific). 20 ng of genomic DNA was

subjected to one round of normal high denaturation temperature PCR using Taq

Polymerase (Denville) and primers for MYC (5’-ACGTTAGCTTCACCAACAGG

and 3’TTCATCAAAAACATCATCATCCAG) or TP53

(5’GAGCTGGAGCTTAGGCTCCAGAAAGGACAA and

3’TTCCTAGCACTGCCCAACAACACCAGC). 383 bp and 376 bp PCR products

were purified and quantified using qPCR with nested primer sets and SYBR

Green detection (Roche 480 LightCycler;

5’ACGAGGAGGAGAACTTCTACCAGCA and

3’TTCATCTGCGACCCGGACGACGAGA for MYC and

5’TTCTCTTTTCCTATCCTGAGTAGTGGTAA and

3’TTATGCCTCAGATTCACTTTTATCACCTTT for TP53). Equivalent amounts of

each PCR product were then used for 3D-PCR using the same nested PCR

primer sets. The resulting 291 and 235 bp products were fractionated by agarose

gel electrophoresis, purified using QIAEX II (Qiagen), cloned into a pJet vector

(Fermentas), and subjected to sequencing (GENEWIZ). Alignments and mutation

calls were done with Sequencher (Gene Codes Corporation).

35

SNP Analysis.

Granddaughter clones were established by limiting dilution after the final pulse

round. Genomic DNA was prepared using the Gentra PureGene kit (Qiagen,

Valencia, CA, quantified by agarose gel staining with ethidium bromide and by

NanoDrop measurements (Thermo Scientific, Wilmington, DE), and subjected to

SNP analyses by Source BioScience (Cambridge, UK) using the Human

OmniExpress-24v1-0 BeadChip (Illumina, San Diego, CA). Raw data were pre-

processed in GenomeStudio using the Genotyping Module (Illumina, San Siego,

CA). Genotype clustering was performed using the humanomniexpress_24v1-

0_a cluster file, whereby probes with a GenCall score below 0.15, indicating low

genotyping reliability, were discarded. All samples passed quality control as

assessed by call rates and frequencies. Genotypes for a total of 716,503 probes

were used for further analyses.

By comparing the genotypes of the granddaughter clones with their

respective mother clones, six classes of base substitutions could be determined

(C-to-T, C-to-G, C-to-A, T-to-G, T-to-C, and T-to-A). For example, a C-to-T

transition occurred if the C/C genotype of the mother clone changed to a C/T

genotype in the granddaughter clone. Given the design of some microarray

probes (i.e., some probes detect the Watson-strand rather than the Crick-strand),

a change from a G/G in the mother clone to a G/A genotype in the granddaughter

clone was also scored as a C-to-T transition.

36

Chromosomal abnormalities in the genomes of granddaughter clones were

identified with Nexus Copy Number 7.5 software (BioDiscovery, Hawthorne, CA),

using the matched mother clone as a reference. SNPRank segmentation was

applied and the segmented copy number data were further processed with the

Tumor Aberrations Prediction Suite (TAPS) to obtain allele-specific copy number

profiles (89). All analyses were performed using the R statistical environment

(http://www.R-project.org). The number of copy number alterations in the A3B-

eGFP pulsed clones were determined based on the difference between the

segment copy number counts of the A3B-eGFP pulsed clones and the eGFP

pulsed clones. Segments which the eGFP pulsed granddaughter clones were not

identical or had CN of 0 were excluded. These were subsequently binned by

copy number loss or gain. All SNP data sets have been deposited in the NCBI

GEO database under accession code GSE78710.

Whole Genome Sequencing (WGS)

Library preparation and sequencing was performed by the Beijing Genome

Institute (BGI) on the Illumina X Ten platform to an average of 34.5 ± 2.8 fold

coverage using purified DNA from Pulse 10 subclone extractions described in the

SNP methods. Sequences were aligned to the hg19 reference genome using

BWA. PCR duplicates were marked and removed with Picard-tools (Broad).

Somatic mutation calling was conducted using mpileup (SamTools), VarScan2

(WashU), and MuTect (Broad). Mutations detected by both VarScan2 and

MuTect were kept as true somatic mutations. VarScan2 was run using

37

procedures describe by de Bruin and coworkers(78). MuTect was run using

default parameters. Alignments from CG1 and CG2 were used as “normal”

controls for CA1 and CA3. Alignment from AG3 was used as the as “normal”

control for AA3. CG1 and CG2 were used as normals for each other in order to

determine their somatic mutations. Somatic mutations that were called against

multiple “normal” genomes were merged to increase detection rates by

overcoming regions of poor sequence coverage unique to either “normal”

genome. Variants occurring at an allele frequency greater than 0.5 or falling into

repetitive regions or those with consistent mapping errors were removed as

described (78). Somatic indels were called by VarScan2 and filtered using the

same methods described above. Separation of mutation signatures present in

our WGS data was performed by the Somatic Signatures R package using

nsNMF decomposition instead of Brunet NMF decomposition as described by

Covington and colleagues (90). Mutation strand asymmetries were analyzed

using somatic mutations from all samples and the AsymTools MatLab software

(90). All raw sequences are available from NCBI SRA under project number,

PRJNA312357.

Results

System for conditional A3B expression

Previous studies have demonstrated that A3B over-expression induces a strong

DNA damage response resulting in cell cycle aberrations and eventual cell death

(14,22,28,76,86). To be able to control the degree of A3B-induced genotoxicity,

38

we built upon our prior studies (22) by establishing a single cell-derived isogenic

system for conditional and titratable expression of this enzyme. T-REx-293 cells

were subcloned to establish an isogenic “mother” line, which was then

transfected stably with a doxycycline (Dox) inducible A3B-eGFP construct or with

an eGFP vector as a negative control. The resulting “daughter” clones were

screened by flow cytometry to identify those that were non-fluorescent without

Dox (i.e., non-leaky) and uniformly fluorescent with Dox treatment (Fig 1A).

Daughter clones were also screened for Dox-inducible overexpression of A3B-

eGFP or eGFP by anti-GFP immunoblotting (Fig 1B). A3B-eGFP clones were

uniformly GFP-negative without Dox treatment, but eGFP only clones showed a

low level of leaky expression possibly related to greater protein stability. As

additional confirmation, the functionality of the induced A3B-eGFP protein was

tested using an in vitro ssDNA deamination assay using whole cell extracts (87).

As expected, only extracts from Dox-treated A3B-eGFP cells elicited strong

ssDNA C-to-U editing activity as evidenced by the accumulation of the

deaminated and hydrolytically cleaved reaction products (labeled P in Fig 1C;

see Methods for details). Nearly identical results were obtained with a parallel

set of independently derived daughter clones (Fig 1D-E).

Iterative rounds of A3B exposure

To establish reproducible A3B induction conditions, a series of cytotoxicity

experiments was done using a range of Dox concentrations. 10,000 T-REx-293

A3B-eGFP cells were plated in 10 cm plates in triplicate, treated with 0, 1, 4, or

39

16 ng/mL Dox, incubated 14 days to allow time for colony formation, and

quantified by crystal violet staining. As expected, higher Dox concentrations led

to greater levels of toxicity (Fig 2A, B). Interpolation from a best-fit logarithmic

curve indicated that 2 ng/mL Dox (C-series daughter clone) or 1 ng/mL Dox (A-

series daughter clone) would cause 80-90% cytotoxicity, and this concentration

was selected for subsequent experiments. Taken together with the measured

doubling times of daughter clones, each A3B-eGFP induction series was

estimated to span 7 days (represented in the workflow schematic in Fig 2C).

Each T-REx-293 A3B-eGFP daughter clone was then subjected to 10

rounds of A3B-eGFP induction and recovery (Fig 2C). Iterative exposures to

A3B-eGFP were expected to generate randomly dispersed mutations throughout

the genome. Ten rounds of A3B-eGFP induction were chosen as a sufficient

regimen for the cells to accumulate readily detectable levels of somatic mutation

as a proof-of-concept for this inducible system. Moreover, this approach left open

the option to go back and characterize an intermediate round, or pursue

additional rounds should analyses require less or more mutations, respectively.

A potential pitfall of this experimental approach is the possibility of

selecting cells that have inactivated the A3B expression construct or the capacity

for induction to avoid the cytotoxic effects of overexpressing this DNA

deaminase. Aliquots of cells from each pulse series were therefore periodically

tested by flow cytometry for A3B-eGFP inducibility, western blot for protein

expression, and ssDNA deamination assays for enzymatic activity (e.g., Fig 1).

Even after the tenth induction series, the A3B-eGFP daughter clones performed

40

similar to original daughter cultures as well as to daughter cultures that had been

grown continuously in parallel to the Dox-exposed experimental cultures and

diluted to mimic the population dynamics caused by each A3B-eGFP exposure

(e.g., Fig 1). These observations indicate that, despite negative selection

pressure imposed by A3B-eGFP mediated DNA damage, resistance or escape

mechanisms did not become overt.

Targeted DNA sequencing provides evidence for A3B mutagenesis

Next, target gene 3D-PCR and sequencing were used to determine if the cells

within each daughter culture had accumulated detectable levels of mutation after

10 rounds of A3B-eGFP exposure. 3D-PCR is a technique that enables the

preferential recovery of DNA templates with C-to-T transitions and/or C-to-A

transversions, because these mutations cause reduced hydrogen bonding

potential and yield DNA molecules that can be amplified at PCR denaturation

temperatures lower than those required to amplify the original non-mutated

sequences (11,22,91). MYC and TP53 were selected as target genes for this

analysis because prior work by our lab with transiently over-expressed A3B and

by others with related A3 family members has demonstrated that these genomic

regions are susceptible to enzyme-catalyzed deamination (19,22,88,92–97).

The 3D-PCR and DNA sequencing analyses revealed substantially more

mutations in MYC and TP53 in A3B-eGFP exposed cells in comparison to

controls (Fig 2D-G). For instance, in the C-series daughter clone 43 mutations,

mostly C-to-T transitions, were evident in MYC amplicons from A3B-eGFP

41

exposed cultures, whereas only 8 mutations were found in a similar number of

control amplicons (mutation plot on left side of Fig 2D; p=0.00036, Student's two-

tailed t-test). The mutation load per amplicon was also higher (pie graph on right

side of Fig 2D). Similar results were obtained for TP53 (Fig 2E; p=0.11,

Student's two-tailed t-test), as well as for both MYC and TP53 in a parallel set of

independently derived A-series daughter clones (Fig 2F, 2G; p<0.0001 and

p<0.0001, respectively, Student's two-tailed t-test). Taken together, these results

provided strong confirmations that 10 rounds of A3B-eGFP exposure caused

increased levels of genomic DNA mutagenesis.

Genome wide mutation analyses

The experiments described used pools of cells and, due to the largely stochastic

nature of the A3B mutational process and the duration of the pulse series, each

pool would be expected to manifest extreme genetic heterogeneity. This

complexity would constrain a standard deep sequencing approach by enabling

only the earliest arising mutations to be detected in the pool because most

subsequent mutations would persist at frequencies too low for reliable detection.

To reduce this complexity to a manageable level and be able to investigate the

mutational history of a single cell exposed to iterative rounds of either A3B-eGFP

or eGFP, we used limiting dilution to generate “granddaughter” subclones from

the tenth generation daughter pools. The strength of this strategy is that any new

base substitution in a single daughter cell, which occurred between the time the

daughter clone was originally generated until the recovery period following the

42

tenth Dox treatment, would be fixed in the granddaughter clonal population at a

predictable allele frequency depending on local chromosome ploidy (i.e., new

mutations would be expected at 50% in diploid regions, 33% in triploid regions,

25% in tetraploid regions, etc., of the 293 cell genome).

The dynastic relationship between mother, daughter, and granddaughter

clones is shown in Fig 3A. To provide initial estimates of the overall level of new

base substitution mutations, genomic DNA was extracted from each

granddaughter clone and subjected to single nucleotide polymorphism (SNP)

analysis using the Illumina OmniExpress Bead Chip. A base substitution

mutation was defined as a clear SNP difference between each daughter clone

and her respective granddaughter. These analyses revealed a wide range of

SNP alterations among granddaughter clones, ranging from a low of <500 in the

C-series eGFP expressing granddaughter subclone CG1 to a high of over 8,000

in the A-series A3B-eGFP expressing granddaughter subclone AA3 (Fig 3B).

This extensive variability was expected based on the sublethal Dox concentration

used in each exposure round, the randomness of granddaughter clone selection,

and the stochastic nature of the mutation processes. Nevertheless, A3B-eGFP

exposed granddaughter clones had an average of 3.4-fold more new cytosine

mutations than the eGFP controls (averages shown by dashed vertical lines in

Fig 3B). Sanger sequencing of cloned PCR products was used to confirm

several distinct SNP alterations and provide an orthologous validation of this

array-based approach (e.g., representative chromatograms of mutations in

granddaughter CA1 versus CG2 in Fig 3C). In addition, hundreds more genomic

43

copy number alterations were evident in A3B-eGFP exposed granddaughters in

comparison eGFP controls (Fig 3D). Interestingly, the overall number of copy

number alterations appears to correlate positively with the overall number of

cytosine mutations, suggesting that many A3B-catalyzed genomic DNA

deamination events are likely processed into DNA breaks and result in larger-

scale copy number aberrations (Fig 3E).

A3B mutational landscape by whole genome sequencing

Next, whole genome sequencing (WGS) was done to assess the mutation

landscape for 3 A3B-eGFP exposed and 3 eGFP control granddaughter clones

from two distinct biological replica experiments (granddaughters in Fig. 3A).

Samples were sequenced using the Illumina X Ten platform at the Beijing

Genome Institute. Approximately 700 million 150 bp paired-end reads were

generated for each genome, with an average read depth of 34.5 ± 2.8 (SD) per

locus. Reads were aligned against the hg19 genome with BWA and somatic

mutations were called using both VarScan2 (Washington University, MO) and

MuTect (Broad Institute, MA), with the intersection of the results these two

methods identifying unambiguous mutations for further analysis (98,99).

Using this conservative approach for mutation identification, a total of

6741, 3496, and 3530 somatic mutations occurred at cytosines in granddaughter

clones that had been subjected to 10 rounds of A3B-eGFP pulses in comparison

to only 910 and 1531 cytosine mutations in the eGFP controls, consistent with

the results of the SNP analyses described above (p=0.018, Student’s t-test; Fig

44

4A). In particular, the A3B-eGFP pulsed granddaughter clones had higher

proportions of C-to-T mutations than the eGFP controls, 59%, 52%, and 54%

versus 36% and 47%, respectively (red slices in pie graphs in Fig 4B). The A3B-

eGFP pulsed granddaughter clones also had higher proportions of mutations at

A/T base pairs suggesting that genomic uracil lesions introduced by A3B may be

processed by downstream error-prone repair processes analogous to those

involved in AID-dependent somatic hypermutation of immunoglobulin genes (6)

(Fig 4A).

However, despite finding significantly higher base substitution mutation

loads in A3B-eGFP pulsed granddaughter clones, the overall distributions of

cytosine mutations within the 16 possible trinucleotide contexts appeared visually

similar for the A3B-eGFP and eGFP controls (histograms comparing the absolute

frequencies of cytosine mutations with the 16 possible trinucleotide contexts are

shown in Fig 4C). This result was initially surprising because we had expected

obvious differences between the A3B-induced mutation spectrum and that

attributable to other mechanisms, particularly within 5’TC contexts. However, a

closer inspection of the eGFP control data sets strongly indicated that this 293-

based system is defective in post-replication mismatch repair. For instance, the

eGFP controls had large numbers base substitution mutations (predominantly C-

to-A, C-to-T, and T-to-C) as well as hallmark microsatellite instabilities consistent

with reported mutation spectra in mismatch repair defective tumors (65,100).

Moreover, each eGFP control had over 10,000 insertion/deletion mutations

ranging in size from 1 to 46 base pairs (constrained by the length of the Illumina

45

sequencing reads).

Therefore, to distinguish the A3B-eGFP mutational contribution from that

caused by mismatch repair deficiency, we used NMF decomposition via the

Somatic Signatures R package to extract mutational signatures from all

granddaughter clones (Methods). Extracted signature 1 (E1) contained minimal

C-to-A mutations and the overall proportion of C-to-T mutations was increased

greatly compared to the raw profiles observed for each sample. This contribution

of this signature to the overall mutation profile was specifically enriched in the

A3B-eGFP pulsed granddaughter clones, contributing to about 50% of all

mutations (Fig 4E). Notably, this signature shows significant enrichments for C-

to-T mutations within 5’TCG motifs, which are biochemically preferred by

recombinant A3B enzyme (5,22,23) (Fishers exact test for granddaughter clones

CA1/CA3: TCA, p=0.42/0.93; TCC, p=1.00/0.96; TCG, p < 0.0001/0.0001; and

TCT, p=1.00/0.46). Moreover, strong enrichments for C-to-G transversion

mutations were evident for cytosine mutations within TCW contexts (W=A or T) in

E1 in comparison to other trinucleotide combinations (p=0.027, Student’s t-test).

C-to-G transversions are hallmark A3B-mediated mutations as other cytosine-

biased mutational processes such as ageing (spontaneous deamination of

methyl-cytosines in 5’CG motifs) and UV-light (polymerase-mediated bypass of

cross-linked pyrimidine bases) primarily result in C-to-T transitions (26,63).

Extracted signatures 2 (E2) and 3 (E3) were characterized by larger proportions

of C-to-A mutations occurring independently of trinucleotide motif. Upon

clustering our extracted signatures with previously defined mutation signatures

46

(65), E2 and E3 appeared most similar to signature 16, which currently has no

known etiology. Unexpectedly our APOBEC-mediated signature, E1, clustered

most closely to signature 5, also of unknown etiology, rather than signatures 2 or

13, which are normally attributed to APOBEC-mediated mutagenesis. These

WGS data suggest that the “APOBEC” signature may be more complex than

anticipated from prior studies. Nevertheless, our WGS studies demonstrate

increased genome-wide mutagenesis attributable to A3B, even over additional

pre-existing mutational process in this human 293 cell-based system.

Discussion

A3B is emerging as a significant source of somatic mutation in many different

cancer types (reviewed by (27,62–64) and see Introduction for references to

primary literature). Here, we further develop a 293-based cellular system for

conditional, Dox-mediated expression of A3B. The system was validated using

flow cytometry, immunoblotting, enzyme activity assays, and, most importantly,

three complementary mutation detection methods (3D-PCR, SNP microarray,

and WGS). Our results demonstrate higher levels of cytosine-focused mutations

in A3B-eGFP expressing cells, in comparison to eGFP controls. In particular, C-

to-T transition mutations and C-to-G transversion mutations in A3B preferred

trinucleotide motifs predominated after the composite mutation spectra were

extracted into 3 separate signatures. These studies combine to fortify the

conclusion that A3B is a potent human genomic DNA mutagen.

An unexpected outcome of our studies was the discovery of a probable

47

defect in post-replication mismatch repair in the 293-based system. The

mismatch repair-defective phenotype is clear with hallmark microsatellite

instabilities and base substitution mutation biases. However, the molecular

nature of this defect is not obvious and may be genetic and/or epigenetic. For

instance, the WGS data show 6 exonic and over 100 intronic alterations to

mismatch repair and related genes that could induce a mismatch repair defective

phenotype. Our results are consistent with a prior WGS study that found 1000’s

of mutational differences between 6 different 293-derived cell lines, as well as

significant down-regulation of MLH1 and MLH3 in a subset of the lines (101). Our

studies are also consistent with at least two additional prior reports characterizing

the related 293T cell line as mismatch repair defective (102,103). Regardless of

the precise molecular explanation(s), given the large number of labs worldwide

that rely upon 293 or 293-derived cell lines, the results presented here are likely

to be helpful for informing future experimental designs using this system.

Despite a compelling case for A3B in cancer mutagenesis (key results

cited in the Introduction), the overall APOBEC mutation signature in cancer

cannot be explained by A3B alone, because it is still evident in breast cancers

lacking the entirety of the A3B gene due to a common deletion polymorphism

(104). One or more of the other APOBEC family members with an intrinsic

preference for 5’TC dinucleotide substrates may be responsible. For instance, a

leading candidate is A3A due to high catalytic activity in biochemical assays,

nuclear/cell-wide localization in some cell types, propensity to induce a DNA

damage response and cell death upon overexpression, and the resemblance of

48

its mutation signature in model systems to the observed APOBEC signature in

many cancers (11,12,14,22,28,87,94,105–110). A3A gene expression may also

be derepressed as a side-affect of the A3B gene deletion(111). Additional studies

will be needed to unambiguously delineate the identities of the full repertoire of

cancer-relevant APOBEC3 enzymes, quantify their relative contributions to

mutation in each cancer type, and build upon this fundamental knowledge to

improve cancer diagnostics and therapeutics.

Acknowledgements

We thank Emily Law for providing A3B-eGFP and eGFP constructs and several

Harris and Grigoriadis lab members for helpful comments.

49

Fig1.AconditionalsystemforA3Bexpression.

(A) Representative flow cytometry data for T-REx-293 A3B-eGFP and eGFP

daughter cultures 24 hrs after Dox treatment (n=3; mean +/- SD of technical

replicates).

(B) Representative anti-GFP immunoblot of T-REx-293 A3B-eGFP and eGFP

daughter cultures 24 hrs after Dox treatment.

(C) Representative DNA cytosine deaminase activity data of whole cell extracts

from T-REx-293 A3B-eGFP and eGFP daughter cultures 24 hrs after Dox

treatment.

(D, E, F) Biological replicate data using A-series daughter clones of the

experiments described in panels A, B, and C, which used C-series daughter

clones.

50

B

Cβ-actin

GFP

β-actin

GFP

0 2 10000 2 1000

A3B-eGFP eGFP

0 21000 0 2

1000

S

P

A

E

F

D

0

50

100

25

75

25

75

0 2 10000 2 1000

0 1 10000 1 1000

Dox ng/mL Dox ng/mL

Dox ng/mL0 1 1000

0 1 1000

A3B-eGFP eGFP

Dox ng/mL

Dox ng/mL

A3B-eGFP eGFP A3B-eGFP eGFP

A3B-eGFP eGFP A3B-eGFP eGFP

S

P

Dox ng/mL

70 kDa

30 kDa

42 kDa

70 kDa

30 kDa

42 kDa

43 nt

34 nt

43 nt

34 nt

50

100

0 110

00 0 110

000

C-series Daughter Clones A-series Daughter Clones

% G

FP+

% G

FP+

51

Fig2.A3Binductionoptimizationandtargetedsequencingresults.

Fig2.A3Binductionoptimizationandtargetedsequencingresults (A, B) Dose response curves indicating the relative colony forming efficiency

(viability index) of T-REx-293 A3B-eGFP daughter clones treated with the

indicated Dox concentrations (n=3; mean viability +/- SD of biological replicates).

The dotted lines show the Dox concentration required to induce 80% cell death

(2 or 1 ng/mL for C- and A-series daughter clones, respectively).

(C) A schematic of the experimental workflow depicting the viability index of a

population of cells induced to express A3B-eGFP and recover over time. Dox

treatment occurs on day 1, maximal death is observed on days 3 or 4, and each

population typically rebounds to normal viability levels by days 6 or 7.

(D-G) A summary of the base substitution mutations observed in MYC (241 bp)

and TP53 (176 bp) by 3D-PCR analysis of genomic DNA after 10 rounds of A3B-

eGFP or eGFP exposure. Red, blue, and black columns represent the absolute

numbers of C-to-T, C-to-A, and other base substitution types in sequenced 3D-

PCR products, respectively. Asterisks indicate cytosine mutations occurring in 5’-

TC dinucleotide motifs. The adjacent pie graphs summarize the base substitution

mutation load for each 3D-PCR amplicon. The number of sequences analyzed is

indicated in the center of each pie graph.

52

A3B-eGFP Exposed C-series Daughter

C-to-T

C-to-A

Other

TC context

C-to-T

C-to-A

Other

TC context

MYC

CB

D

A3B-eGFP Exposed C-series Daughter

eGFP Exposed C-Series Daughter

TP53

F

A3B-

Induced

Death

Growth

and

Recovery

A1 or 2ng/mL Dox

H

J

17

22

15

50 bp

50 bp

0 5 10 15

0

25

50

75

100

Dox (ng/mL) Dox (ng/mL)

Via

bil

ity

In

de

x

Via

bil

ity

In

de

x

0

25

50

75

100

Days

eGFP Exposed C-series Daughter

A3B-eGFP Exposed A-series Daughter

A3B-eGFP Exposed A-series Daughter

eGFP Exposed A-series Daughter

eGFP Exposed A-series Daughter

19

0

1

2

3

4

≥5

Mutations/

Sequence

0

1

2

3

4

≥5

Mutations/

Sequence

2ng/mL Dox

1ng/mL Dox

MYC

TP53

20

22

25

25

0 05 10 15

0

25

50

75

100

Via

bil

ity

In

de

x

C-series Titration A-series Titration

53

Fig3.SNPanalysestoestimatenewmutationaccumulation.

(A) A dynastic tree illustrating the relationship between mother, daughter, and

granddaughter clones used for SNP and WGS experiments. The red, dashed box

around the daughter clones denotes 10 cycles of Dox-treatment.

(B) A histogram summarizing the SNP alterations observed in granddaughter

clones by microarray hybridization. Red, blue, and black colors represent C-to-T,

C-to-A, and C-to-G mutations, respectively.

(C) Sanger sequencing chromatograms confirming representative cytosine

mutations predicted by SNP analysis. The left chromatogram shows a G-to-A

transition (C-to-T on the opposite strand) and the right chromatogram a C-to-G

transversion.

(D) A histogram plot of the total number of copy number (CN) alterations in the

indicated categories in A3B-eGFP exposed granddaughter clones in comparison

to eGFP exposed controls.

(E) A dot plot and best-fit line of data in panel B versus data in panel D.

55

Figure4.SummaryofsomaticmutationsdetectedbyWGS.

(A) Stacked bar graphs representing total number of C/G and T/A context

somatic mutations in each granddaughter subclone (black and white bars,

respectively). A1 and A3 are A3B-eGFP exposed subclones, and G1 and G2 are

eGFP exposed controls.

(B) Pie charts representing the proportion of each type of cytosine mutation

across the genome. Red, blue, and black wedges represent C-to-T, C-to-A, and

C-to-G mutations, respectively. A1 and A3 are A3B-eGFP exposed subclones,

and G1 and G2 are eGFP exposed controls.

(C) Stacked bar graphs representing the absolute frequency of C-context somatic

trinucleotide mutations of each subclone from the B panel.

(D) Stacked bar graphs representing the extracted mutation signatures from

WGS data.

(E) The proportion that each extracted mutation signature contributes to each

granddaughter clone.

56

A3B-eGFP eGFP

Num

ber o

f som

atic

mut

atio

ns

A

B C

E

D

ACAACCACGACTCCACCCCCGCCTGCAGCCGCGGCTTCATCCTCGTCT

0.000.020.040.060.080.10


0.000.020.040.060.080.10

CG1

CG2

AA3

36%

22%

42%

47%

20%

33%


0.000.020.040.060.080.10CA3

52%

19%

29%

3530

910

1531


0.000.020.040.060.080.10

CA1

54%

19%

27%

3496

Prop

ortio

n of

tota

l mut

atio

ns

CG1 - eGFP Exposed

CG2 - eGFP Exposed

CA3 - A3B-eGFP Exposed

AA3 - A3B-eGFP Exposed

CA1 - A3B-eGFP Exposed


0.000.020.040.060.080.10

AA3

CA1

CA3

CG1

CG2

0

5000

10000

15000T/A

C/G

6741


0.00.10.20.30.40.5

0.00

0.25

0.50

0.75

1.00

E1

E2

E3

AA3

CA1

CA3

CG1

CG2


0.00.10.20.30.40.5

Extracted Signature 3 (E3)



0.00.10.20.30.40.5


Con

trib

utio

n to

sig

natu

re

Sign

atur

e pr

opor

tion

59%

17%

24%

57

Chapter 3 Development of Expression-based Biomarkers of

Dasatinib Response in Hematologic Malignancies

Running Title: Development of Dasatinib-Response Biomarkers in B-cell

Malignancies

Monica K. Akre1, Mitra, Amit1, Wen Wang2, Chad L. Myers2, Brian Van Ness1*

*[email protected]

1 Department of Molecular, Cellular, Developmental Biology, and Genetics,

Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA, 55455

2 Department of Computer Science and Engineering, University of Minnesota,

Minneapolis, MN, USA, 55455

Keywords: B-cell Receptor (BCR) Pathway, Genomics of Drug Sensitivity in

Cancer (GDSC), B-cell malignancies, Dasatinib

Overview

Personalized medicine includes the identification of the biomarkers and/or

expression profiles important in the treatment of cancer. Cancer cell lines offer a

representative diversity of molecular processes involved in both malignant

58

phenotypes as well as therapeutic responses. Coupling cancer line expression

with high-throughput drug screens allows for examination of perturbed pathways

that influence drug sensitivities. Here we use The Sanger Institute’s Genomics of

Drug Sensitivity in Cancer to develop a bioinformatic and pathway analysis

approach that identifies the BCR pathway as an important biomarker in response

to the tyrosine kinase inhibitor, dasatinib, used in hematologic cancer therapies.

This approach establishes a process by which data from cell line repositories can

be used to identify biomarkers associated with drug response in the treatment of

cancers.

Introduction

Despite improvements in cancer therapies, wide variation in tumor

response to treatment is a major limitation in achieving consistent therapeutic

effect(1,2). The variation is likely represented by tumor diversity. Previously, we

demonstrated computational approaches to define response and resistance

based on gene expression profiles in myeloma, a plasma cell malignancy (3).

This approach took advantage of a large panel of myeloma cell lines

representing a wide diversity of response to therapeutic drugs used in the

clinic(4).

Numerous collections of cancer cell lines provide additional opportunities

to characterize the genetic signatures that may distinguish response and

resistance to a variety of drugs. The NCI-60, Cancer Cell Line Encylopedia

(CCLE), and the Genomics of Drug Sensitivity in Cancer (GDSC) provide a

wealth of information that includes genomic sequences, mutational status, gene

59

expression, as well as response to panels of drugs used in cancer therapies (5–

7). This offers a unique opportunity to apply approaches that may provide genetic

profiles that define response and resistance, with the potential to apply these as

predictors in clinical decisions.

The GDSC is a collection of 1,047 cell lines from diverse tumor types that

have been tested with 265 drugs (5). The data collection includes DNA

sequence, mutation status, and gene expression data that we have used to

develop a pipeline of computational approaches that predict response and

resistance. Drug response is determined by a 9 step 2-fold serial dilution of drug

concentration and measuring cell viability. From these, two quantitative values

are provided: the drug concentration required to reduce viability by 50% (IC50)

and the area under the dose-response survival curve (AUSC). Gene expression

data is available from the Affymetrix U219 gene array platform.

Because expression patterns may vary widely simply based on tissue

specificity, we chose to limit our initial analysis to B cell malignancies,

represented by 71 cell lines derived from leukemias, lymphomas, and myelomas.

Our approach comprises a series of steps in which we classify response and

resistance, develop a differential classification profile of gene expression

patterns, identify features by pathway analysis, and validate on cell lines and

reported clinical outcomes (Fig 1). We demonstrate this approach to stratify and

predict response to the protein tyrosine kinase inhibitor, dasatinib.

Dasatinib is a multi-target kinase inhibitor that has affinity for about 50

kinases and is most widely used to manage chronic myelogenous leukemia (8–

60

12). Dasatinib’s ability to reversibly and competitively inhibit the ATP binding site

of kinases make its potential applications wide in scope. However, therapeutic

applications have been reported in multiple cancers with significant variation in

response(13–16). For this report, we develop an approach that identifies a 5

gene signature distinguishing dasatinib response and resistance.

Methods

Expression Analysis

Robust-Multi-Array Average (RMA) normalized expression data and drug

response data was downloaded directly from the GDSC website

(www.cancerrx.org).

CCLE Robust-Multi-Array Average (RMA) normalized expression data

was downloaded from Genomicscape.com. Multiple ENSEMBL IDs mapping to

the same gene region were averaged.

Normalization of CCLE values GDSC

There are 48 cell lines of B-cell origin that are in both the GDSC and the

CCLE. These were used to determine the best method of normalization. Lowess

plot comparison between these 48 lines showed the best concordance between

the two sets. First, the natural log of each CCLE intensitiy value was subtracted

from that value, then the set was quantile normalized to the GDSC. The

Bioconductor limma package normalizeQuantile function was modified to adjust

the CCLE values to their GDSC “twin” lines(17).

R scripts available by request.

Statistical Analysis

61

The Significant Analysis of Microarray (sam) function of the siggenes R

package was used in differential expression analysis (18,19). The thresholds

were varied until the number of genes exceeded 200, yet still within an FDR of

0.10.

Responders and Non-Responders expression values for each gene in the

signature were compared using the unpaired, nonparametric, Mann-Whitney test

in GraphPad (Prism).

Ordinary one-way ANOVA was performed in GraphPad (Prism) on all

genes of signature using 4 groupings based on response values: Responders

(AUSC<0.75), Partial Responders (AUSC 075-0.85) and Limited Responders

(AUSC 0.85-0.98), and Non-Responders (AUSC>0.98).

The cor() function of the R stats package was used to find Pearson

correlation coefficients using all expression values between cell lines (20).

All heatmaps, including the Pearson correlation coefficients were rendered

using the pheatmap package (21).

Tissue Culture

All cell lines were maintained in RPMI-1640 (Lonza) supplemented with

10% FBS (Invitrogen Life Technology), 1X Antibiotic/Antimycotic (Gibco), 1X L-

Glutamine (Gibco), and 1ng/mL of human IL-6. Incubators were humidified and

maintained at 37° Centigrade with a 5% CO2 content.

Cells were plated at a concentration of 5x105/mL on day0. On day 1, a

dilution series of concentrations (2-fold dilutions from the max dose of 5.12µM)

as well as a DMSO vehicle control were administered in triplicate. On day 4 (72

62

hours after treatment) cell viability was measured by Cell Titer-Glo Luminescent

cell viability assay according to manufacturer's instructions (Promega) and

luminescence was read and recorded using Synergy 2 Microplate Reader

(Biotek).

Maximum viability assigned, as 100% for IC50 or 1 for AUSC calculations,

was normalized to untreated controls. IC50values were estimated by calculating

the nonlinear regression using the inhibitor-normalized response equation

(variable slope) in GraphPad (Prism).

As done in the calculation of AUSC in the GDSC, wells containing media,

drug, but no cells were used to calculate the value for normalizing maximum

response, which corresponds to a 0 value. AUSC used the concentrations that

overlapped with GDSC doses and substituted the first lowest concentration within

the GDSC dilution series for the lower 2 doses tested in lab. AUSC were

calculated using GraphPad (Prism).

Results

Classification of Response and Resistance.

The first step in the process is shown in Figure 2, in which we arranged

the 71 B cell lines by response, using both Area Under the Survival Curve

(AUSC) and IC50. The distribution of response favored non-response, with only

14 lines showing a strong response to low doses, and 11 lines showing

essentially no response. Specifically, we classified Responders as lines that

show an AUSC <0.75, and an IC50 value < maximum drug concentration divided

by 4 and Non-Responders were defined as having an AUSC>0.98 and an IC50 >

63

maximum dose tested. This resulted in the 14 strong Responder lines versus 11

highly resistant (Non-Responder) lines, representing the extreme ends of

response and resistance. Our rationale was that underlying this wide separation

of response may be a common expression signature or pathway(s) that can

serve as a predictive biomarker.

We applied Pearson correlation coefficient analysis on all 71 cell lines

(see methods). Notably, Responders shared more similarity than Non-

Responders (rank-sum p = 4.97x10-10) (Fig 2B,C). What is apparent is the Non-

Responders, as a group, are far more diverse in gene expression than those

cells that are within the Responder group. Thus, we directed our attention to the

features that define the Responders that are distinct from the gene expression of

the diverse Non-Responders.

Differential Gene Expression and Feature Selection

Differential gene expression between the Responder and Non-Responder

lines was performed using Significance Analysis of Microarray (see methods),

with a False Discovery Rate limited to 10%. This resulted in 228 genes to further

analyze for their relevance to the dasatinib response.

Ingenuity Pathway Analysis (IPA) (Qiagen) uses an extensive, curated

literature bank to identify molecule interaction and pathway regulation (23). Data

is uploaded and populates the knowledge base interactions. Significant p-values

are generated when genes fall within a pathway in a non-random manner.

Further, pathway activation scoring is achieved by similarly populating relevant

pathways and the directionality of the gene expression. Analysis of major

64

canonical pathways provides both the p-value (likelihood of true positive) and z-

score (directional impact of activation/de-activation).

Pathway analysis was conducted using IPA on the 228 differentially

expressed genes and their log-fold ratio data of Responders relative to Non-

Responders. Notably, the top canonical pathway with a significant z-score (z-

score=2, which is 2 standard deviations from the mean) was the B-cell-receptor

(BCR) pathway (p-value=0.013). Using the log-fold ratio scores of Responders

relative to Non-Responders, molecules of the BCR pathway reflected a uniquely

activated pattern in the Responders(Fig 3).

BCR and B-cell Development

The BCR pathway has been demonstrated to be active at different stages

of B cell development (24–26). Activation of various oncogenes and tumor

suppressor genes gives rise to the malignancies at different stages of B-cell

development (Fig. 4). The distribution of response to dasatinib along the B-cell

differentiation path indicates cancers arising from a pre- or early- B-cell may be

more likely to respond than B cells or plasma cells later in development that

notably have decrease expression of BCR pathway genes.

Evaluation of the Genes in the BCR Pathway

We further characterized the differentially expressed genes of the BCR

pathway between the Responders (high expression) and Non-Responders (low

expression). The genes from the BCR pathway were analyzed using t-test

between Responders and Non-Responders. Genes that were significant (p-

value <0.05) were further analyzed.

65

Mann-Whitney tests of individual genes of the BCR pathway were

performed. We reasoned that effective drug-response predictors would also

show a linear trend between Responders, intermediate groups, and Non-

Responders.

Intermediate response groups were included to identify an expression

trend across the full range of responses. These groupings included cell lines with

a partial response (AUSC between 0.75-0.85) and limited response (AUSC

between 0.85-0.98). The 5 genes displaying the best separations were chosen

for use in a predictive scoring system described below. Significant p-values

between adjacent groupings were not observed. However, ANOVAs were

performed using all 4 groupings and indicated 4 of the 5 genes showed highly

significant differences between the Responders and Non-Responders (ANOVA

p-values listed underneath Mann-Whitney test p-values in Fig 5), and trends of

decreasing expression across the increasing resistance groupings. The

expression of the 5 genes is shown as a heatmap comparing the strong

Responders and the highly resistant Non-Responders (Fig 6). Notably, CD19

alone showed a significant association with response (p<0.0001).

Validation

Independent Cell Lines

Eleven cell lines not included in the differential expression analysis had

available gene expression data (8 from CCLE, 3 from GDSC). These were tested

in-house for dasatinib response. The CCLE/GDSC expression platforms were

then normalized to one another to obtain comparable values (not shown). During

66

the validation phase, we used the AUSC as the sole metric indicating response.

The binary classification of these lines based on the 5 gene signature into

Responder (AUSC<0.8) and Non-Responder (AUSC>0.8) demonstrated the

previously modeled BCR expression associated with response in 9 of the 11

lines (Fig 7).

An algorithm was developed from the averaging the expression values of

the Responders and Non-Responders, we refer to these as Response Averages

(RA). Lines that had expression lower than the RA of CD19 were immediately

binned as a Non-Responder. Lines that had expression higher than that of the

CD19 RA were given a score of “1” for each gene the expression exceeded the

RA of EBF1, PAX5, BTK, and BLNK. If a cell line’s score was less than 3, they

were binned as Non-Responders.

The discriminating genes were determined using the extreme Responders

versus the extreme Non-Responders. When applying the scoring system to the

intermediate responding lines (that were not included in the gene discrimination

modeling) 30 lines were correctly predicted for an accuracy rate of 67%. It should

be noted low CD19 does place 23 out of 23 of these cell lines in the Non-

Responder category. Thus, low CD19 is an effective discriminator of non-

response. Nevertheless, the 5 gene discrimination was very accurate in

distinguishing the highly sensitive from the highly resistant lines. Despite the

difficulty of determining the response of the equivocal intermediate group, the

sensitivity of the test set, training set, and intermediate group is 78.9% and

specificity is 74.6%.

67

Clinical Associations

MM lines (plasma cells) rarely express CD19, and show a low activation of

the BCR pathway. Thus, our BCR gene discrimination model would indicate

poor response. Indeed, a recent clinical study (NCT00429949) of relapsed,

refractory, or plateau phase MM patients was discontinued after using dasatinib

as a single agent in which a partial response occurred in only 1 of 21 enrolled in

the study.

Waldenström’s macroglobulinemia (WM) is also a plasma cell

malignancy, but in contrast to multiple myeloma, is CD19+, and has recently

been described to express an activated BCR pathway (27). Based on our 5 gene

signature we predicted these malignancies would respond to dasatinib; thus,

representing an independent validation set of Responders. Indeed, WM primary

patient lines exhibited good response to dasatinib in primary patient samples

(n=32)(27) supporting our findings that the expression of CD19 and 4 other

molecules of the BCR pathway are associated with dasatinib response. A

comparison of the 5 gene expression of the WM patients and MM cell lines

shows the stark difference in expression between these two plasma cell

malignancies (Sup Fig 1).

Bortezomib Resistance, CD19, and Collateral Dasatinib Sensitivity

Further supporting the findings here is work done in mantle cell lymphoma

(MCL). MCL is mid-stage B-cell malignancy and may, or may not, express CD19

(28,29). Two MCL lines were treated with low doses of Bortezomib over time to

better understand mechanisms of resistance in MCL (30). This acquired

68

bortezomib resistance (BTZ-R) was accompanied by a re-activation of the BCR

pathway, ie. these cells had increased expression and phosphorylation of BCR

components above parental lines. Notably, along with this re-activation, a

collateral increase in sensitivity to dasatinib was observed. The BTZ-R lines

responded to doses of dasatinib 10-fold less concentrated than parental lines.

These observations held true in mouse xenografts of the MCL line pairs treated

with dasatinib as well (30). These data further support that CD19 and the BCR

pathway are consistent biomarkers associated with B cell lineage cells response

to dasatinib.

To further explore the possibility of acquired bortezomib resistance

inducing a collateral sensitivity to dasatinib, we tested the sensitivity of a

myeloma cell line pair. U266-P and U266-VR (Velcade resistant) was developed

in our laboratory and had RNAseq data available (130). The U266-VR line had a

2.8-fold increase in FPKM reads for CD19. In addition the U266-P line had a

dasatinib AUSC value of 0.91, whereas U266-VR had an AUSC of 0.73. This

supports the observations of Kim, et al. and also lays the groundwork for

exploring the bortezomib-dasatinib relationship further in multiple myeloma.

Discussion

We describe the dasatinib response of B-cell malignancies as two

extremes: Responders and Non-Responders. From this binary categorization,

comparisons were made between the groups. Differential expression revealed a

set of genes, that, when uploaded to IPA, was most significant for the BCR

pathway. The Responders had an activated pathway and conversely, the Non-

69

Responders had a de-activated BCR pathway. Five genes: CD19, EBF1, BTK,

BLNK, and PAX5, of this pathway were used to sort the 25 original cell lines into

their Responder/Non-Responder groupings.

Eleven independent cell lines were likewise binned according to their

expression values for each of the five genes. Ten of the 11 were appropriately

categorized. WM patient primary samples have expression patterns indicative of

response. As a malignancy, WM primary cell lines are responsive to dasatinib

(27). The expression of the 5 BCR genes, and particularly CD19, showed

effective discrimination of the strong responders and highly resistant cell lines.

We show here that cell line expression and drug response can be

interrogated through differential expression and pathway analysis to find

meaningful relationships and identify biomarkers of drug response. However,

there are potential limitations. First, the modeling was done with a very limited

number of cells, at the extremes of response. Despite this, the associations

identified in the limited modeling set provided a robust association with actual

clinical outcomes. The predictive power in a clinical setting may not be accurate

in identifying partial responses, nor the increased efficacies that often

accompany combination therapies. However, we note myelomas with emerging

proteasome inhibitor resistant populations that may contribute to relapses, may

respond to dasatinib, and possibly benefit from its use in combination therapies.

The modeling also takes into account only the diversity of tumor cells, without

consideration of variations in microenvironmental influences or patient variations

70

in drug metabolism or distribution. Yet, the patient correlations remained

significant to predict response across the B cell developmental pathway.

This is just one example of the use of available data bases to develop

response signatures. Similar approaches may provide gene signatures across

many other drugs within the GDSC, CCLE, or similar large cell line data bases.

Ultimately, expression signatures may add important biomarkers that better direct

therapeutic approaches,

Acknowledments:

We thank Zachary Hunter and Steven Treon ,M.D.,Ph.D. for supportive guidance

with RNAseq data of the Waldenström’s data set. We thank the Department of

Genetics, Cell Biology & Development for support for MA. AM was supported by

Minnesota State Partnership Grant.

.

71

Fig 1. Workflow

A) Cell lines representing B cell malignancies and showing strong response

and strong resistance to disatinib were chosen.

B) Differential expression using sig.genes with an FDR of <0.10 was used to

describe potential genetic biomarkers of response.

C) Gene lists and fold change expression were analyzed using Ingenuity

Pathway Analysis.

D) Mann-Whitney tests were performed on extreme Responders/Non-

Responders followed by ANOVA analysis to determine expression trends

across response.

E) Statistically significant genes defined a gene signature and were used to

separate cell lines by response.

F) Untested lines and clinical samples were used to validate the gene

signature.

72

Feat

ure

Sel

ectio

nC

lass

ifica

tion

Gen

eE

valu

atio

nD

iffer

entia

tion

Valid

atio

n

Line

1Li

ne 2

Fig

1. W

orkf

low

**

**

Gen

e S

epar

atio

n

A)

B)

C)

D)

E)

F)

73

Fig. 2 Diversity of Dasatinib Response and Expression

Correlation

A) The 71 B-cell malignancies found in the GDSC with dasatinib data are plotted

along the x-axis in increasing order of AUSC value. Both log2(IC50) values

(black boxes, scaled on the primary axis) and AUSC scores (gray circles, scaled

on secondary axis) are represented.

B-C) Pearson correlation coefficients calculated using all available 17,419 gene

expression observations within the GDSC. Perfect correlation coefficients of 1

can be seen along the diagonal as each cell line is compared to itself. The lowest

correlation coefficients are found between Non-Responders indicative of their

expression diversity.

75

Fig 3. BCR Pathway Activated in Extreme Responders

Genes of the BCR pathway represented as expression ratios of highly sensitive

Responders relative to Non-Responders. The darker the red-colored molecule

represents a greater fold-change between groups.

76

PlasmaMembrane

NuclearEnvelope

Fig 3. BCR Pathway Activated in Extreme Responders

77

Fig 4. B-cell Differentiation Stages with Malignancy and CD19

Expression of Cell Lines

Maturation of B-cells is represented from left to right. Malignancies arising from

corresponding stages of B-cell development are depicted along with cell lines

representing those malignancies in the same vertical axis. Next to cell line names

are either a red dot (Responders) or blue dot (Non-Responders) along with their

corresponding log2 fluorescence intensity of CD19, the marker found most

associated with dasatinib response.

78

Man

tle C

ell L

ymph

oma

Burk

itt Ly

mph

oma

Diffu

se L

arge

B-c

ell L

ymph

oma

Follic

ular L

ymph

oma

Mult

iple

Mye

loma

Chro

nic L

ymph

ocyti

c Leu

kem

ia

GA

-10

SU

-DH

L-6

SU

-DH

L-16

Fara

ge

SU

P-B

15

KA

RPA

S-2

31

NU

-DU

L--1

DO

HH

-2

SU

-DH

L-4

KA

RPA

S-4

22

MLM

A

NA

LM-6

697

MH

H-P

RE

B-1

SK-M

M-2

BC

-1

EB

2TU

R

RP

MI-8

226

WS

U-N

HL

RL

WS

U-D

LCL2

SC

C-3

L-36

3IM

-9

Hairy

Cell

Leu

kem

ia

GCB-

like

Lym

phom

a

Acut

e Ly

mph

ocyti

c Leu

kem

ia2345678

Fig

4. B

-cel

l Diff

eren

tiatio

n St

ages

with

Mal

igna

ncy

and

CD

19 E

xpre

ssio

n of

Cel

l Lin

es

79

Fig 5. Statistically Significant Signature Genes

Each Gene in the 5 gene signature is represented with cell lines binned

according to AUSC and their log2 expression values. The discrimination of

categorical response is depicted in the graph below as the following 4 categories

with AUSC values following parenthetically: Responder (0-0.75), partial

Responder (0.75-0.85), limited Responder (0.85-0.98), and Non-Responder

(>0.98). Mann-Whitney tests p values assessing the Responders versus Non-

Responders are listed above the p values for ANOVA across all categories.

80

Fig 5. Statistically Significant Signature Genes

0

2

4

6

8

10lo

g2(F

I)**** p<0.0001**** p<0.0001

CD19

Responder

Partial

Res

ponder

Limite

d Res

ponder

Non-Res

ponder

AU

SC

0

0.5

1

2.5

3.0

3.5

4.0

4.5

5.0

log2

(FI)

PAX5**** p<0.0001**** p<0.0001

0

5

10

15

log2

(FI)

EBF1**p=0.0013**p=0.0042

Responder

Partial

Res

ponder

Limite

d Res

ponder

Non-Res

ponder

AU

SC

0

0.5

1

Responder

Partial

Res

ponder

Limite

d Res

ponder

Non-Res

ponder

AU

SC

0

0.5

1

0

5

10

15

log2

(FI)

BTK**p=0.0077ns p=0.0522

Responder

Partial

Res

ponder

Limite

d Res

ponder

Non-Res

ponder

AU

SC

0

0.5

10

5

10

15

log2

(FI)

BLNK**p=0.0013**p=0.0085

Responder

Partial

Res

ponder

Limite

d Res

ponder

Non-Res

ponder

AU

SC

0

0.5

1

81

Fig 6. Gene expression Signature of Extreme Response Unsupervised clustering of the gene expression signature that discriminates

Responders from Non-Responders Cell lines are listed along the x-axis while the

5 genes most associated with dasatinib response are on the y-axis. Expression

values are represented as scaled as z-scores of the log2 transformed

fluorescence intensities. The 14 extreme Responders are boxed in red on the left

of the heatmap, the 11 extreme Non-Responders are boxed in blue on the right.

The dynamic ranges of each gene in the signature is not always reflective of its

contribution to identifying response as can be seen in the case of PAX5. This

gene is highly significant in differentiating response, but its absolute values vary

subtly, but significantly (see Figure 5, p<.0001).

82

FarageNU−D

UL−1

MLM

ADOHH−2

NALM

−6KARPA

S−422

SUP−B

15697SU−D

HL−6

GA−10

SU−D

HL−16

KARPA

S−231

MHH−P

REB−1

SU−D

HL−4

WSU−N

HL

IM−9

RL

EB2

WSU−D

LCL2

TUR

RPMI−8226

SK−M

M−2

L−363BC−1

SCC−3

PAX5

CD19

EBF1

BTK

BLNK

−3

−2

−1

0

1

2

Fig 6. 5 Gene Expression Signature of Extreme Response

Extreme Responders Extreme Non-Responders

83

Fig 7. Validation of Lines With 5 Gene Signature Training and Test Lines are evaluated based on their expression values relative

to the extreme responding lines. Lines with expression values less that the

average of the Responders and Non-Responders of the training set are binned

into Responder and Non-Responder. Test set of 11 cell lines had 1 true positive,

2 false positives, 8 true negatives, and no false negatives.

84

CD19LOW

HIGH

SU-DHL-4

WSU-NHL

BC-1

EB2TUR

RL

WSU-DLCL2

SCC-3

SK-MM-2RPMI-8226L-363IM-9

SUP-B15

NALM-6697

GA-10

SU-DHL-6

SU-DHL-16

Farage

KARPAS-231

NU-DUL--1DOHH-2

KARPAS-422

MHH-PREB-1MLMA

KHM-1B

KMS-34KMM-1KMS-18

BL-41

U-266KMS-28 BMNCI-H929KMS-26KMS-11

Mino

BLNKBTKEBF1PAX5

Fig 7. Separation of Cell Lines with 5 Gene Signature

Training Set

Training Set

Test Set

Test Set

85

Supplementary Fig 1. Expression Comparison between multiple

myeloma and Waldenström’s macroglobulinemia

Sixteen MM cell line and 57 WM primary patient samples had RNAseq available

for comparison of the 5 genes associated with dasatinib response. Each

sample’s gene expression value is represented as a fraction of the total number

of reads for that sample

86

MM WM-0.0001

0.0000

0.0001

0.0002

0.0003

0.0004

0.0005

CD19G

ene

Rea

ds/T

otal

Rea

ds

MM WM-0.0001

0.0000

0.0001

0.0002

0.0003

BLNK

Gen

e R

eads

/Tot

al R

eads

MM WM-0.0001

0.0000

0.0001

0.0002

0.0003

0.0004

BTK

Gen

e R

eads

/Tot

al R

eads

MM WM-0.0001

0.0000

0.0001

0.0002

0.0003

EBF1

Gen

e R

eads

/Tot

al R

eads

MM WM-0.0005

0.0000

0.0005

0.0010

0.0015

0.0020

PAX5

Gen

e R

eads

/Tot

al R

eads

A

C

D

EB

Supplementary Figure 1. Expression Comparison between multiple myeloma and Waldenström’s macroglobulinemia

87

Chapter4

Discussion and Future Aims

My thesis work can be described by two words: differences and

similarities. In Chapter 2, I described an endogenous mutagenic process that

contributes heterogeneity--differences. In Chapter 3, I describe a 5 gene

expression signature predictive of dasatinib sensitivity. Non-Responders have

low expression of these genes, Responders have higher expression—similarities.

Mutational Processes

In Chapter 2, I introduced a mechanism of endogenous mutagenesis that

contributes to cancer heterogeneity. Our work proved that A3B overexpression

mutates genomic DNA in human cell lines. The extent of the phenotypic changes

that result from this process is an area of active research. This work showed A3B

mutagenesis in human genomic DNA occurs most often at the sequence context

previously described in in vitro work (22).

Future Directions

Comparison of the whole genome sequences of the control clones

provided evidence of a background mutagenic process in the T-REx-293 lines

used. The 293 line and its derivatives (such as the T-REx-293 used in this study)

are widely used. Researchers should be aware of this feature of the 293 line as it

may be an inappropriate choice for some studies. For example, 293 lines are not

ideal in studies that assume no microsatellite instability.

Future study to better understand the source and extent of the apparent

mismatch repair deficiency is also be important for repair research. Down-

88

regulation of the mis-match repair genes, MLH1 and MLH3 have been found in

293-derived lines previously (101). The WGS revealed mutations of mismatch

repair genes in all clones indicating the original line procured commercially also

contained the mutations. Are these mutated genes still functional? Does the

down-regulation of either or both MLH1 and MLH3 cause the phenotype?

Transfection experiments of MLH1, MH3, and the mutated versions of mismatch

repair genes in our studies could rescue the defect and answer the above

questions.

Predictive Expression Profiles

In Chapter 3, we used pathway analysis to describe the expression of 5

genes of the BCR pathway as important for response to dasatinib. Pathway

analysis identified the BCR as the top canonical dysregulated. Study of

heatmaps of the BCR components indicates that not all responders have

upregulation of all BCR components, nor to the same degree (Fig 1). But taken

together, the dysregulation was significant and consistent – providing novel

understanding of factors that influence response.

Future Directions

Dasatinib is a multi-kinase inhibitor and can interact with multiple

molecules of the BCR pathway (139). How does BCR pathway activation

increase cell sensitivity to dasatinib? Is it a causal relationship or is BCR pathway

activation a concomitant event? These questions can be addressed with

transfection of dasatinib resistant lines and knock-down experiments of dasatinib

sensitive lines.

89

Importance of Pathway Analysis

Differential expression does not always identify a dysregulated pathway.

Each gene’s expression is evaluated independently of the networks or pathways

in which it biologically connected. For example, a more restrictive FDR of 5%

found 44 genes differentially expressed between dasatinib Responders and Non-

Responders, but only 2 of the 5 genes of the predictive response signature

(PAX5 and CD19) were among those genes. After identifying the BCR pathway

activation as important in dasatinib response, all genes of the pathway were re-

examined. This critical step identified the importance of BTK, which was not in

the original 228 genes, in dasatinib response. This illustrates differential

expression alone leaves out relationships between genes that are necessary to

make clear what is biologically relevant.

Personal Future Directions

Large databases like the GDSC are resources, but underutilized. My work

creates a workflow for identification of expression profiles of response (Ch 3, Fig

1). In the future, I will preform similar analysis in the GDSC and CCLE to gene

expression profiles for more drugs. I want to expand my analysis to include all

265 drugs that show efficacy in B-cell malignancies. My goal is to create a

hematologic-oncology expression chip that will direct physicians and patients to

the most effective treatments for the patient’s specific malignancy (Fig 2).

90

Fig1.GenesoftheBCRPathway

Responsive cell lines are represented on the left-most portion of the heatmap,

boxed in red. Non-Responders are on the right, boxed in blue. The heatmap

represents log2 (fluorescence intensity) values for 21 genes of the BCR pathway.

This illustrates the patchwork nature of expression in response. This difficulty is

overcome with pathway analysis when the genes that work together in a system

are examined together as well.

91

697

DOHH−2

KARPA

S−422

MHH−P

REB−1

MLMA

NALM

−6

SUP−B

15

Farage

GA−10

KARPA

S−231

NU−D

UL−1

SU−D

HL−16

SU−D

HL−4

SU−D

HL−6

IM−9

SK−M

M−2

RPMI−8226

EB2

TUR

WSU−N

HL

RL

BC−1

SCC−3

L−363

WSU−D

LCL2

PDK1GAB2BCRPAX5ABL1GAB1PTENTCF3CSKBLNKPLCG2GRB2LYN

BTKEBF1CD22CD79ABADFOXO1CD79BCD19

4

6

8

10Fig 1. Genes of the BCR Pathway

Responders Non-Responders

92

Fig2.PotentialofExpressionProfilinginPrecisionMedicine

The outline is a model for using expression signatures of response and non-

response for each drug (left panel). Patient Samples are assayed for their tumor

profile (middle panel). Patient profiles are matched to responsive profiles of

drugs. These matches result in the patient receiving therapy most likely to be

effective.

94

Bibliography

1. Boveri T. Concerning the Origin of Malignant Tumours by Theodor Boveri.

Translated and annotated by Henry Harris. J Cell Sci [Internet].

2008;121(Supplement 1):1–84. Available from:

http://jcs.biologists.org/cgi/doi/10.1242/jcs.025742

2. Rous P. a Sarcoma of the Fowl Transmissible By an Agent Separable

From the Tumor Cells. J Exp Med [Internet]. 1911;13(4):397–411.

Available from: http://www.jem.org/cgi/doi/10.1084/jem.13.4.397

3. Rusch HP, Baumann CA. Tumor Production in Mice with Ultraviolet

Irradiation. Am J Cancer [Internet]. 1939 Jan 1;35(1):55 LP-62. Available

from: http://cancerres.aacrjournals.org/content/35/1/55.abstract

4. Casellas R, Nussenzweig A, Wuerffel R, Pelanda R, Reichlin A, Suh H, et

al. Ku80 is required for immunoglobulin isotype switching. EMBO J.

1998;17(8):2404–11.

5. Leonard B, Hart SN, Burns MB, Carpenter MA, Temiz NA, Rathore A, et al.

APOBEC3B upregulation and genomic mutation patterns in serous ovarian

carcinoma. Cancer Res. 2013;73(24):7222–31.

6. Di Noia JM, Neuberger MS. Molecular Mechanisms of Antibody Somatic

Hypermutation. Annu Rev Biochem [Internet]. 2007;76(1):1–22. Available

from:

http://www.annualreviews.org/doi/10.1146/annurev.biochem.76.061705.09

0740

7. Casellas R, Basu U, Yewdell WT, Chaudhuri J, Robbiani DF, Di Noia JM.

95

Mutations, kataegis and translocations in B cells: understanding AID

promiscuous activity. Nat Rev Immunol [Internet]. 2016;16(3):164–76.

Available from: http://www.nature.com/doifinder/10.1038/nri.2016.2

8. Conticello SG. The AID/APOBEC family of nucleic acid mutators. Genome

Biol [Internet]. 2008;9(6):229. Available from:

http://genomebiology.biomedcentral.com/articles/10.1186/gb-2008-9-6-229

9. Salter JD, Bennett RP, Smith HC. The APOBEC Protein Family: United by

Structure, Divergent in Function. Vol. 41, Trends in Biochemical Sciences.

2016. p. 578–94.

10. Refsland EW, Harris RS. The APOBEC3 family of retroelement restriction

factors. Vol. 371, Current topics in microbiology and immunology. 2013. p.

1–27.

11. Stenglein MD, Burns MB, Li M, Lengyel J, Harris RS. APOBEC3 proteins

mediate the clearance of foreign DNA from human cells. Nat Struct Mol

Biol [Internet]. 2010;17(2):222–9. Available from:

http://www.nature.com/doifinder/10.1038/nsmb.1744

12. Chen H, Lilley CE, Yu Q, Lee D V., Chou J, Narvaiza I, et al. APOBEC3A

is a potent inhibitor of adeno-associated virus and retrotransposons. Curr

Biol. 2006;16(5):480–5.

13. Harris RS, Liddament MT. Retroviral restriction by APOBEC proteins. Nat

Rev Immunol [Internet]. 2004;4(11):868–77. Available from:

http://www.nature.com/doifinder/10.1038/nri1489

14. Lackey L, Law EK, Brown WL, Harris RS. Subcellular localization of the

96

APOBEC3 proteins during mitosis and implications for genomic DNA

deamination. Cell Cycle. 2013;12(5):762–72.

15. Kinomoto M, Kanno T, Shimura M, Ishizaka Y, Kojima A, Kurata T, et al. All

APOBEC3 family proteins differentially inhibit LINE-1 retrotransposition.

Nucleic Acids Res. 2007;35(9):2955–64.

16. Bogerd HP, Wiegand HL, Hulme AE, Garcia-Perez JL, O’Shea KS, Moran

J V., et al. Cellular inhibitors of long interspersed element 1 and Alu

retrotransposition. Proc Natl Acad Sci [Internet]. 2006;103(23):8780–5.

Available from: http://www.pnas.org/cgi/doi/10.1073/pnas.0603313103

17. Muckenfuss H, Hamdorf M, Held U, Perkovic M, Löwer J, Cichutek K, et al.

APOBEC3 proteins inhibit human LINE-1 retrotransposition. J Biol Chem.

2006;281(31):22161–72.

18. Lackey L, Demorest ZL, Land AM, Hultquist JF, Brown WL, Harris RS.

APOBEC3B and AID have similar nuclear import mechanisms. J Mol Biol.

2012;419(5):301–14.

19. Ramiro AR, Jankovic M, Eisenreich T, Difilippantonio S, Chen-Kiang S,

Muramatsu M, et al. AID is required for c-myc/IgH chromosome

translocations in vivo. Cell. 2004;118(4):431–8.

20. Potter M, Wiener F. Plasmacytomagenesis in mice: Model of neoplastic

development dependent upon chromosomal translocations. Vol. 13,

Carcinogenesis. 1992. p. 1681–97.

21. Küppers R, Dalla-Favera R. Mechanisms of chromosomal translocations in

B cell lymphomas. Oncogene [Internet]. 2001;20(40):5580–94. Available

97

from:

http://www.nature.com/doifinder/10.1038/sj.onc.1204640%0Ahttp://www.nc

bi.nlm.nih.gov/pubmed/11607811

22. Burns MB, Lackey L, Carpenter MA, Rathore A, Land AM, Leonard B, et al.

APOBEC3B is an enzymatic source of mutation in breast cancer. Nature

[Internet]. 2013;494(7437):366–70. Available from:

http://www.nature.com/doifinder/10.1038/nature11881

23. Burns MB, Temiz NA, Harris RS. Evidence for APOBEC3B mutagenesis in

multiple human cancers. Nat Genet [Internet]. 2013;45(9):977–83.

Available from: http://www.nature.com/doifinder/10.1038/ng.2701

24. Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD,

Raine K, et al. Mutational processes molding the genomes of 21 breast

cancers. Cell. 2012;149(5):979–93.

25. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S,

Biankin A V., et al. Signatures of mutational processes in human cancer.

Nature [Internet]. 2013;500(7463):415–21. Available from:


26. Harris RS, Stephens P, Tarpey P, Davies H, Loo P, Greenman C, et al.

Molecular mechanism and clinical impact of APOBEC3Bcatalyzed

mutagenesis in breast cancer. Breast Cancer Res 2015 171 [Internet].

2015;17(1):400404. Available from: http://breast-cancer-

research.com/content/17/1/8%5Cnhttps://breast-cancer-

research.biomedcentral.com/articles/10.1186/s13058-014-0498-3

98

27. Helleday T, Eshtad S, Nik-Zainal S. Mechanisms underlying mutational

signatures in human cancers. Nat Rev Genet [Internet]. 2014;15(9):585–

98. Available from: http://www.nature.com/doifinder/10.1038/nrg3729

28. Taylor BJM, Nik-Zainal S, Wu YL, Stebbings LA, Raine K, Campbell PJ, et

al. DNA deaminases induce break-associated mutation showers with

implication of APOBEC3B and 3A in breast cancer kataegis. Elife.

2013;2013(2).

29. Kidd JM, Newman TL, Tuzun E, Kaul R, Eichler EE. Population

stratification of a common APOBEC gene deletion polymorphism. PLoS

Genet. 2007;3(4):0584–92.

30. Long J, Delahanty RJ, Li G, Gao YT, Lu W, Cai Q, et al. A common

deletion in the APOBEC3 genes and breast cancer risk. J Natl Cancer Inst.

2013;105(8):573–9.

31. David A Hungerford PCN. A minute chromosome in human chronic

granulocytic leukemia. Science (80- ). 1960;132:1457–501.

32. Ren R. Mechanisms of BCR–ABL in the pathogenesis of chronic

myelogenous leukaemia. Nat Rev Cancer [Internet]. 2005;5(3):172–83.

Available from: http://www.nature.com/doifinder/10.1038/nrc1567

33. Lemonick M, Park A, Cray D GC. New Hope for Cancer. Time.

2001;157(21):62.

34. Druker BJ, Guilhot F, O’Brien SG, Gathmann I, Kantarjian H, Gattermann

N, et al. Five-year follow-up of patients receiving imatinib for chronic

myeloid leukemia. N Engl J Med [Internet]. 2006;355(23):2408–17.

99

Available from: http://www.ncbi.nlm.nih.gov/pubmed/17151364

35. Döhner H, Stilgenbauer S, Benner A, Leupolt E, Kröber A, Bullinger L, et

al. Genomic aberrations and survival in chronic lymphocytic leukemia. N

Engl J Med [Internet]. 2000;343(26):1910–6. Available from:

http://www.ncbi.nlm.nih.gov/pubmed/11136261%5Cnhttp://www.nejm.org/d

oi/pdf/10.1056/NEJM200012283432602

36. Slamon DJ. Studies of the HER-2/neu Proto-oncogene in Human Breast

Cancer. Cancer Invest [Internet]. 1990;8(2):253–4. Available from:

http://www.tandfonline.com/doi/full/10.3109/07357909009017573

37. Molina MA, Codony-Servat J, Albanell J, Rojo F, Arribas J, Baselga J.

Trastuzumab (Herceptin), a humanized anti-HER2 receptor monoclonal

antibody, inhibits basal and activated HER2 ectodomain cleavage in breast

cancer cells. Cancer Res. 2001;61(12):4744–9.

38. Carter P, Presta L, Gorman CM, Ridgway JB, Henner D, Wong WL, et al.

Humanization of an anti-p185HER2 antibody for human cancer therapy.

Proc Natl Acad Sci U S A [Internet]. 1992;89(May):4285–9. Available from:

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed

&dopt=Citation&list_uids=1350088%5Cnhttp://www.ncbi.nlm.nih.gov/pmc/a

rticles/PMC49066/pdf/pnas01084-0075.pdf

39. De Santes K, Slamon D, Anderson SK, Shepard M, Fendly B, Maneval D,

et al. Radiolabeled antibody targeting of the HER-2/neu oncoprotein.

Cancer Res [Internet]. 1992;52(7):1916–23. Available from:

http://www.ncbi.nlm.nih.gov/pubmed/1348016

100

40. Shepard HM, Lewis GD, Sarup JC, Fendly BM, Maneval D, Mordenti J, et

al. Monoclonal antibody therapy of human cancer: Taking the HER2

protooncogene to the clinic. J Clin Immunol. 1991;11(3):117–27.

41. Demetri GD, von Mehren M, Blanke CD, Van den Abbeele AD, Eisenberg

B, Roberts PJ, et al. Efficacy and safety of imatinib mesylate in advanced

gastrointestinal stromal tumors. N Engl J Med [Internet]. 2002;347(7):472–

80. Available from:

https://www.researchgate.net/publication/11206772_Efficacy_and_Safety_

of_Imatinib_Mesylate_in_Advanced_Gastrointestinal_Stromal_Tumors

42. Din OS, Woll PJ. Treatment of Gastrointestinal Stromal Tumor: Focus on

Imatinib Mesylate. Ther Clin Risk Manag [Internet]. 2008;4(1):149–62.

Available from:

http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2503651&tool=p

mcentrez&rendertype=abstract

43. Croxtall JD, McKeage K. Trastuzumab: In HER2-positive metastatic gastric

cancer. Vol. 70, Drugs. 2010. p. 2259–67.

44. Sawaki A, Ohashi Y, Omuro Y, Satoh T, Hamamoto Y, Boku N, et al.

Efficacy of trastuzumab in Japanese patients with HER2-positive advanced

gastric or gastroesophageal junction cancer: A subgroup analysis of the

Trastuzumab for Gastric Cancer (ToGA) study. Gastric Cancer.

2012;15(3):313–22.

45. Kim SY, Kim HP, Kim YJ, Oh DY, Im SA, Lee D, et al. Trastuzumab inhibits

the growth of human gastric cancer cell lines with HER2 amplification

101

synergistically with cisplatin. Int J Oncol. 2008;32(1):89–95.

46. Hoskins JM, Carey LA, McLeod HL. CYP2D6 and tamoxifen: DNA matters

in breast cancer. Nat Rev Cancer [Internet]. 2009;9(8):576–86. Available

from: http://www.nature.com/doifinder/10.1038/nrc2683

47. Jin Y, Desta Z, Stearns V, Ward B, Ho H, Lee KH, et al. CYP2D6

genotype, antidepressant use, and tamoxifen metabolism during adjuvant

breast cancer treatment. J Natl Cancer Inst. 2005;97(1):30–9.

48. Coons AH, Creech HJ, Jones RN. Immunological Properties of an Antibody

Containing a Fluorescent Group. Exp Biol Med. 1941;47(2):200–2.

49. Baselga J, Tripathy D, Mendelsohn J, Baughman S, Benz CC, Dantis L, et

al. Phase II Study of Weekly Intravenous Recombinant Humanized Anti-

p185HER2 Monoclonal Antibody in Patients with HER2/neu-

Overexpressing Metastatic Breast Cancer. J Clin Oncol. 1996;14(3):737–

44.

50. Morris S, Kirstein M, Valentine M, Dittmer K, Shapiro D, Saltman D, et al.

Fusion of a kinase gene, ALK, to a nucleolar protein gene, NPM, in non-

Hodgkin’s lymphoma. Science (80- ) [Internet]. 1994;263(5151):1281–4.

Available from:

http://www.sciencemag.org/cgi/doi/10.1126/science.8122112

51. Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S, et al.

Identification of the transforming EML4–ALK fusion gene in non-small-cell

lung cancer. Nature [Internet]. 2007;448(7153):561–6. Available from:


102

52. Sukov WR, Hodge JC, Lohse CM, Akre MK, Leibovich BC, Thompson RH,

et al. ALK alterations in adult renal cell carcinoma: frequency,

clinicopathologic features and outcome in a large series of consecutively

treated patients. Mod Pathol [Internet]. 2012;25(11):1516–25. Available

from: http://www.ncbi.nlm.nih.gov/pubmed/22743654

53. Rivlin N, Brosh R, Oren M, Rotter V. Mutations in the p53 Tumor

Suppressor Gene: Important Milestones at the Various Steps of

Tumorigenesis. Genes Cancer [Internet]. 2011;2(4):466–74. Available

from: http://gan.sagepub.com/lookup/doi/10.1177/1947601911408889

54. Posa I, Carvalho S, Tavares J, Grosso AR. A pan-cancer analysis of MYC-

PVT1 reveals CNV-unmediated deregulation and poor prognosis in renal

carcinoma. Oncotarget [Internet]. 2016;7(30). Available from:


55. Cline MS, Craft B, Swatloski T, Goldman M, Ma S, Haussler D, et al.

Exploring TCGA Pan-Cancer Data at the UCSC Cancer Genomics

Browser. Sci Rep [Internet]. 2013;3(1):2652. Available from:

http://www.nature.com/articles/srep02652

56. Zack TI, Schumacher SE, Carter SL, Cherniack AD, Saksena G, Tabak B,

et al. Pan-cancer patterns of somatic copy number alteration. Nat Genet


http://www.nature.com/doifinder/10.1038/ng.2760

57. Cancer Genome Atlas Research Network, Weinstein JN, Collisson E a,

Mills GB, Shaw KRM, Ozenberger B a, et al. The Cancer Genome Atlas

103

Pan-Cancer analysis project. Nat Genet [Internet]. 2013;45(10):1113–20.

Available from:


mcentrez&rendertype=abstract%5Cnhttp://www.ncbi.nlm.nih.gov/pubmed/2

4071849%5Cnhttp://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=P

MC3919969

58. Yang W, Soares J, Greninger P, Edelman EJ, Lightfoot H, Forbes S, et al.

Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic

biomarker discovery in cancer cells. Nucleic Acids Res. 2013;

59. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S,

et al. The Cancer Cell Line Encyclopedia enables predictive modelling of

anticancer drug sensitivity. Nature [Internet]. 2012;483(7391):603–7.

Available from: http://dx.doi.org/10.1038/nature11003

60. Shoemaker RH. The NCI60 human tumour cell line anticancer drug screen.

Nat Rev Cancer [Internet]. 2006;6(10):813–23. Available from:

http://www.nature.com/doifinder/10.1038/nrc1951

61. Pieper K, Grimbacher B, Eibel H. B-cell biology and development. Vol. 131,

Journal of Allergy and Clinical Immunology. 2013. p. 959–71.

62. Roberts SA, Gordenin DA. Hypermutation in human cancer genomes:

footprints and mechanisms. Nat Rev Cancer [Internet]. 2014;14(12):786–

800. Available from: http://www.nature.com/doifinder/10.1038/nrc3816

63. Swanton C, McGranahan N, Starrett GJ, Harris RS. APOBEC Enzymes:

Mutagenic Fuel for Cancer Evolution and Heterogeneity. Vol. 5, Cancer

104

discovery. 2015. p. 704–12.

64. Henderson S, Fenton T. APOBEC3 genes: Retroviral restriction factors to

cancer drivers. Vol. 21, Trends in Molecular Medicine. 2015. p. 274–84.

65. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio S a JR, Behjati S,

Biankin A V, et al. Signatures of mutational processes in human cancer.

Nature [Internet]. 2013;500:415–21. Available from:



66. Cleaver JE, Crowley E. UV damage, DNA repair and skin carcinogenesis.

Front Biosci [Internet]. 2002;7:d1024-43. Available from:


67. Hecht SS. Lung carcinogenesis by tobacco smoke. Int J Cancer.

2012;131(12):2724–32.

68. Poon SL, Pang S-T, McPherson JR, Yu W, Huang KK, Guan P, et al.

Genome-wide mutational signatures of aristolochic acid and its application

as a screening tool. Sci Transl Med [Internet]. 2013;5(197):197ra101.


69. Poon S, McPherson JR, Tan P, Teh B, Rozen SG. Mutation signatures of

carcinogen exposure: genome-wide detection and new opportunities for

cancer prevention. Genome Med [Internet]. 2014;6(3):24. Available from:

http://genomemedicine.biomedcentral.com/articles/10.1186/gm541

70. Lord CJ, Tutt ANJ, Ashworth A. Synthetic Lethality and Cancer Therapy:

Lessons Learned from the Development of PARP Inhibitors. Annu Rev

105

Med [Internet]. 2015;66(1):455–70. Available from:

http://www.annualreviews.org/doi/10.1146/annurev-med-050913-022545

71. Roberts SA, Sterling J, Thompson C, Harris S, Mav D, Shah R, et al.

Clustered Mutations in Yeast and in Human Cancers Can Arise from

Damaged Long Single-Strand DNA Regions. Mol Cell. 2012;46(4):424–35.

72. Roberts SA, Lawrence MS, Klimczak LJ, Grimm SA, Fargo D, Stojanov P,

et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread

in human cancers. Nat Genet [Internet]. 2013;45(9):970–6. Available from:


73. Albin JS, Harris RS. Interactions of host APOBEC3 restriction factors with

HIV-1 in vivo: implications for therapeutics. Expert Rev Mol Med [Internet].

2010;12:e4. Available from:

http://www.journals.cambridge.org/abstract_S1462399409001343

74. Harris RS, Dudley JP. APOBECs and virus restriction. Vols. 479–480,

Virology. 2015. p. 131–45.

75. Bacolla A, Cooper DN, Vasquez KM. Mechanisms of base substitution

mutagenesis in cancer genomes. Vol. 5, Genes. 2014. p. 108–46.

76. Shinohara M, Io K, Shindo K, Matsui M, Sakamoto T, Tada K, et al.

APOBEC3B can impair genomic stability by inducing base substitutions in

genomic DNA in human cells. Sci Rep [Internet]. 2012;2. Available from:

http://www.nature.com/articles/srep00806

77. Henderson S, Chakravarthy A, Su X, Boshoff C, Fenton TR. APOBEC-

Mediated Cytosine Deamination Links PIK3CA Helical Domain Mutations

106

to Human Papillomavirus-Driven Tumor Development. Cell Rep.

2014;7(6):1833–41.

78. de Bruin EC, McGranahan N, Mitter R, Salm M, Wedge DC, Yates L, et al.

Spatial and temporal diversity in genomic instability processes defines lung

cancer evolution. Science (80- ) [Internet]. 2014;346(6206):251–6.

Available from:

http://www.sciencemag.org/cgi/doi/10.1126/science.1253462

79. Sieuwerts AM, Willis S, Burns MB, Look MP, Gelder MEM Van, Schlicker

A, et al. Elevated APOBEC3B Correlates with Poor Outcomes for

Estrogen-Receptor-Positive Breast Cancers. Horm Cancer. 2014;5(6):405–

13.

80. Cescon DW, Haibe-Kains B, Mak TW. APOBEC3B expression in breast

cancer reflects cellular proliferation, while a deletion polymorphism is

associated with immune activation. Proc Natl Acad Sci U S A [Internet].

2015;112(9):2841–6. Available from:

http://www.ncbi.nlm.nih.gov/pubmed/25730878%5Cnhttp://www.pubmedce

ntral.nih.gov/articlerender.fcgi?artid=PMC4352793

81. Leonard B, McCann JL, Starrett GJ, Kosyakovsky L, Luengas EM, Molan

AM, et al. The PKC/NF- B Signaling Pathway Induces APOBEC3B

Expression in Multiple Human Cancers. Cancer Res [Internet].

2015;75(21):4538–47. Available from:

http://cancerres.aacrjournals.org/cgi/doi/10.1158/0008-5472.CAN-15-2171-

T

107

82. Vieira VC, Leonard B, White EA, Starrett GJ, Temiz NA, Lorenz LD, et al.

Human papillomavirus E6 triggers upregulation of the antiviral and cancer

genomic DNA deaminase APOBEC3B. MBio. 2014;5(6).

83. Warren CJ, Xu T, Guo K, Griffin LM, Westrich JA, Lee D, et al. APOBEC3A

Functions as a Restriction Factor of Human Papillomavirus. J Virol


http://jvi.asm.org/lookup/doi/10.1128/JVI.02383-14

84. Mori S, Takeuchi T, Ishii Y, Kukimoto I. Identification of APOBEC3B

promoter elements responsible for activation by human papillomavirus type

16 E6. Biochem Biophys Res Commun. 2015;460(3):555–60.

85. Periyasamy M, Patel H, Lai CF, Nguyen VTM, Nevedomskaya E, Harrod A,

et al. APOBEC3B-Mediated Cytidine Deamination Is Required for Estrogen

Receptor Action in Breast Cancer. Cell Rep. 2015;13(1):108–21.

86. Land AM, Wang J, Law EK, Aberle R, Kirmaier A, Krupp A, et al.

Degradation of the cancer genomic DNA deaminase APOBEC3B by SIV

Vif. Oncotarget [Internet]. 2015;6(37). Available from:

http://www.oncotarget.com/abstract/5483

87. Carpenter MA, Li M, Rathore A, Lackey L, Law EK, Land AM, et al.

Methylcytosine and normal cytosine deamination by the foreign DNA

restriction enzyme APOBEC3A. J Biol Chem. 2012;287(41):34801–8.

88. Suspene R, Aynaud M-M, Guetard D, Henry M, Eckhoff G, Marchio A, et

al. Somatic hypermutation of human mitochondrial and nuclear DNA by

APOBEC3 cytidine deaminases, a pathway for DNA catabolism. Proc Natl

108

Acad Sci [Internet]. 2011;108(12):4858–63. Available from:

http://www.pnas.org/cgi/doi/10.1073/pnas.1009687108

89. Rasmussen M, Sundstrom M, Goransson Kultima H, Botling J, Micke P,

Birgisson H, et al. Allele-specific copy number analysis of tumor samples

with aneuploidy and tumor heterogeneity. Genome Biol [Internet].

2011;12(10):R108. Available from:


90. Gehring JS, Fischer B, Lawrence M, Huber W. SomaticSignatures:

Inferring mutational signatures from single-nucleotide variants.

Bioinformatics. 2015;31(22):3673–5.

91. Suspène R, Henry M, Guillot S, Wain-Hobson S, Vartanian JP. Recovery

of APOBEC3-edited human immunodeficiency virus G→A hypermutants by

differential DNA denaturation PCR. J Gen Virol. 2005;86(1):125–9.

92. Liu M, Duke JL, Richter DJ, Vinuesa CG, Goodnow CC, Kleinstein SH, et

al. Two levels of protection for the B cell genome during somatic

hypermutation. Nature [Internet]. 2008;451(7180):841–5. Available from:


93. Nilsen H, An Q, Lindahl T. Mutation frequencies and AID activation state in

B-cell lymphomas from Ung-deficient mice. Oncogene [Internet].

2005;24(18):3063–6. Available from:

http://www.nature.com/doifinder/10.1038/sj.onc.1208480

94. Mussil B, Suspène R, Aynaud MM, Gauvrit A, Vartanian JP, Wain-Hobson

S. Human APOBEC3A Isoforms Translocate to the Nucleus and Induce

109

DNA Double Strand Breaks Leading to Cell Stress and Death. PLoS One.

2013;8(8).

95. Rucci F, Cattaneo L, Marrella V, Sacco MG, Sobacchi C, Lucchini F, et al.

Tissue-specific sensitivity to AID expression in transgenic mouse models.

Gene. 2006;377(1–2):150–8.

96. Komori J, Marusawa H, Machimoto T, Endo Y, Kinoshita K, Kou T, et al.

Activation-induced cytidine deaminase links bile duct inflammation to

human cholangiocarcinoma. Hepatology. 2008;47(3):888–96.

97. Matsumoto Y, Marusawa H, Kinoshita K, Endo Y, Kou T, Morisawa T, et al.

Helicobacter pylori infection triggers aberrant expression of activation-

induced cytidine deaminase in gastric epithelium. Nat Med [Internet].

2007;13(4):470–6. Available from:

http://www.nature.com/doifinder/10.1038/nm1566

98. Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, et al.

VarScan 2: Somatic mutation and copy number alteration discovery in

cancer by exome sequencing. Genome Res. 2012;22(3):568–76.

99. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C,

et al. Sensitive detection of somatic point mutations in impure and

heterogeneous cancer samples. Nat Biotechnol [Internet]. 2013;31(3):213–

9. Available from: http://www.nature.com/doifinder/10.1038/nbt.2514

100. Boland CR, Goel A. Microsatellite Instability in Colorectal Cancer.

Gastroenterology. 2010;138(6).

101. Lin Y-C, Boone M, Meuris L, Lemmens I, Van Roy N, Soete A, et al.

110

Genome dynamics of the human embryonic kidney 293 lineage in

response to cell biology manipulations. Nat Commun [Internet].

2014;5:4767. Available from:

http://www.nature.com/doifinder/10.1038/ncomms5767

102. Roesner LM, Mielke C, Fähnrich S, Merkhoffer Y, Dittmar KEJ, Drexler HG,

et al. Stable expression of MutL?? in human cells reveals no specific

response to mismatched DNA, but distinct recruitment to damage sites. J

Cell Biochem. 2013;114(10):2405–14.

103. Trojan J, Zeuzem S, Randolph A, Hemmerle C, Brieger A, Raedle J, et al.

Functional analysis of hMLH1 variants and HNPCC-related mutations

using a human expression system. Gastroenterology. 2002;122(1):211–9.

104. Nik-Zainal S, Wedge DC, Alexandrov LB, Petljak M, Butler AP, Bolli N, et

al. Association of a germline copy number polymorphism of APOBEC3A

and APOBEC3B with burden of putative APOBEC-dependent mutations in

breast cancer. Nat Genet [Internet]. 2014;46(5):487–91. Available from:


105. Thielen BK, McNevin JP, McElrath MJ, Hunt BVS, Klein KC, Lingappa JR.

Innate immune signaling induces high levels of TC-specific deaminase

activity in primary monocyte-derived cells through expression of

APOBEC3A isoforms. J Biol Chem. 2010;285(36):27753–66.

106. Hultquist JF, Lengyel JA, Refsland EW, LaRue RS, Lackey L, Brown WL,

et al. Human and Rhesus APOBEC3D, APOBEC3F, APOBEC3G, and

APOBEC3H Demonstrate a Conserved Capacity To Restrict Vif-Deficient

111

HIV-1. J Virol [Internet]. 2011;85(21):11220–34. Available from:

http://jvi.asm.org/cgi/doi/10.1128/JVI.05238-11

107. Aynaud MM, Suspène R, Vidalain PO, Mussil B, Guétard D, Tangy F, et al.

Human tribbles 3 protects nuclear DNA from cytidine deamination by

APOBEC3A. J Biol Chem. 2012;287(46):39182–92.

108. Land AM, Law EK, Carpenter MA, Lackey L, Brown WL, Harris RS.

Endogenous APOBEC3A DNA cytosine deaminase is cytoplasmic and

nongenotoxic. J Biol Chem. 2013;288(24):17253–60.

109. Shee C, Cox BD, Gu F, Luengas EM, Joshi MC, Chiu LY, et al. Engineered

proteins detect spontaneous DNA breakage in human and bacterial cells.

Elife. 2013;2:e01222.

110. Chan K, Roberts SA, Klimczak LJ, Sterling JF, Saini N, Malc EP, et al. An

APOBEC3A hypermutation signature is distinguishable from the signature

of background mutagenesis by APOBEC3B in human cancers. Nat Genet



111. Caval V, Suspène R, Shapira M, Vartanian J-P, Wain-Hobson S. A

prevalent cancer susceptibility APOBEC3A hybrid allele bearing

APOBEC3B 3′UTR enhances chromosomal DNA damage. Nat Commun

[Internet]. 2014;5:5129. Available from:

http://www.nature.com/doifinder/10.1038/ncomms6129

112. Vangsted A, Klausen TW, Vogel U. Genetic variations in multiple myeloma

II: Association with effect of treatment. Vol. 88, European Journal of

112

Haematology. 2012. p. 93–117.

113. Fonseca R, Bergsagel PL, Drach J, Shaughnessy J, Gutierrez N, Stewart

AK, et al. International Myeloma Working Group molecular classification of

multiple myeloma: spotlight review. Leukemia [Internet].

2009;23(12):2210–21. Available from:

http://www.nature.com/doifinder/10.1038/leu.2009.174

114. Mitra AK, Mukherjee UK, Harding T, Jang JS, Stessman H, Li Y, et al.

Single-cell analysis of targeted transcriptome (SCATTome) predicts drug

sensitivity of single cells within human myeloma tumors. Leukemia

[Internet]. 2015;(October 2015):1094–102. Available from:


115. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et

al. A Landscape of Pharmacogenomic Interactions in Cancer. Cell. 2015;

116. Lombardo LJ, Lee FY, Chen P, Norris D, Barrish JC, Behnia K, et al.

Discovery of N-(2-chloro-6-methylphenyl)-2-(6-(4-(2-hydroxyethyl)-

piperazin-1-yl)-2-methylpyrimidin-4-ylamino)thiazole-5-carboxamide (BMS-

354825), a dual Src/Abl kinase inhibitor with potent antitumor activity in

preclinical assays. J Med Chem. 2004;47(27):6658–61.

117. Chen R, Chen B. The role of dasatinib in the management of chronic

myeloid leukemia. Vol. 9, Drug Design, Development and Therapy. 2015.

p. 773–9.

118. Keating GM. Dasatinib: A Review in Chronic Myeloid Leukaemia and Ph+

Acute Lymphoblastic Leukaemia. Drugs [Internet]. 2017;77(1):85–96.

113

Available from: http://link.springer.com/10.1007/s40265-016-0677-x

119. Wei G, Rafiyath S, Liu D. First-line treatment for chronic myeloid leukemia:

dasatinib, nilotinib, or imatinib. J Hematol Oncol. 2010;3:47.

120. Kantarjian H, Shah NP, Hochhaus A, Cortes J, Shah S, Ayala M, et al.

Dasatinib versus imatinib in newly diagnosed chronic-phase chronic

myeloid leukemia. N Engl J Med [Internet]. 2010;362(24):2260–70.


121. Dunn EF, Iida M, Myers RA, Campbell DA, Hintz KA, Armstrong EA, et al.

Dasatinib sensitizes KRAS mutant colorectal tumors to cetuximab.

Oncogene [Internet]. 2011;30(5):561–74. Available from:

http://dx.doi.org/10.1038/onc.2010.430

122. Xiao J, Xu M, Hou T, Huang Y, Yang C, Li J. Dasatinib enhances antitumor

activity of paclitaxel in ovarian cancer through Src signaling. Mol Med Rep


http://www.ncbi.nlm.nih.gov/pubmed/25975261%5Cnhttp://www.pubmedce

ntral.nih.gov/articlerender.fcgi?artid=PMC4526065

123. Le XF, Mao W, He G, Claret FX, Xia W, Ahmed AA, et al. The role of

p27Kip1 in dasatinib-enhanced paclitaxel cytotoxicity in human ovarian

cancer cells. J Natl Cancer Inst. 2011;103(18):1403–22.

124. Montero JC, Seoane S, Ocaña A, Pandiella A. Inhibition of Src family

kinases and receptor tyrosine kinases by dasatinib: Possible combinations

in solid tumors. Vol. 17, Clinical Cancer Research. 2011. p. 5546–52.

125. Smyth GK. limma: Linear Models for Microarray Data Bioinformatics and

114

Computational Biology Solutions Using R and Bioconductor. In:

Bioinformatics and Computational Biology Solutions Using R and

Bioconductor. 2005. p. 397–420.

126. Chu G, Li J, Narasimhan B, Tibshirani R, Tusher V. SAM - Significance

Analysis of Microarrays - Users guide and technical document. Policy.

2011;1–42.

127. Schwender H, Krause A, Ickstadt K. Identifying interesting genes with

siggenes. RNews. 2006;6:45–50.

128. Seal HL. Studies in the History of Probability and Statistics. XV The

historical development of the Gauss linear model. Biometrika [Internet].

1967;54(1–2):1–24. Available from:

http://biomet.oxfordjournals.org/content/54/1-2/1.abstract

129. Kolde R. Package `pheatmap’. Bioconductor. 2012;1–6.

130. Mitra AK, Harding T, Mukherjee U, Jang J, Li Y, HongZheng R, et al. A

gene expression signature distinguishes innate response and resistance to

proteasome inhibitors in multiple myeloma. Blood Cancer J.

2017;(February):1–10.

131. Krämer A, Green J, Pollard J, Tugendreich S. Causal analysis approaches

in Ingenuity Pathway Analysis. Bioinformatics [Internet]. 2014;30(4):523–

30. Available from:



132. Irish JM, Czerwinski DK, Nolan GP, Levy R. Altered B-cell receptor

115

signaling kinetics distinguish human follicular lymphoma B cells from

tumor-infiltrating nonmalignant B cells. Blood. 2006;108(9):3135–42.

133. Irish JM, Myklebust JH, Alizadeh AA, Houot R, Sharman JP, Czerwinski

DK, et al. B-cell signaling networks reveal a negative prognostic human

lymphoma cell subset that emerges during tumor progression. Proc Natl

Acad Sci [Internet]. 2010;107(29):12747–54. Available from:

http://www.pnas.org/cgi/doi/10.1073/pnas.1002057107

134. Blix ES, Irish JM, Husebekk A, Delabie J, Forfang L, Tierens AM, et al.

Phospho-specific flow cytometry identifies aberrant signaling in indolent B-

cell lymphoma. BMC Cancer [Internet]. 2012;12(1):478. Available from:

http://bmccancer.biomedcentral.com/articles/10.1186/1471-2407-12-478

135. Argyropoulos K, Vogel R, Ziegler C, Altan-Bonnet G, Velardi E, Calafiore

M, et al. Clonal B cells in Waldenström’s macroglobulinemia exhibit

functional features of chronic active B-cell receptor signaling. Leukemia.

2016;30(108):1116–25.

136. Molot RJ, Meeker TC, Wittwer CT, Perkins SL, Segal GH, Masih AS, et al.

Antigen expression and polymerase chain reaction amplification of mantle

cell lymphomas. Blood. 1994;83(6):1626–31.

137. Ginaldi L, De Martinis M, Matutes E, Farahat N, Morilla R, Catovsky D.

Levels of expression of CD19 and CD20 in chronic B cell leukaemias. J

Clin Pathol [Internet]. 1998;51(5):364–9. Available from:

http://jcp.bmj.com/cgi/doi/10.1136/jcp.51.5.364

138. Kim A, Seong KM, Kang HJ, Park S, Lee S-S. Inhibition of Lyn is a

116

promising treatment for mantle cell lymphoma with bortezomib resistance.

Oncotarget [Internet]. 6(35). Available from:

www.impactjournals.com/oncotarget

139. Shi H, Zhang CJ, Chen GYJ, Yao SQ. Cell-based proteome profiling of

potential Dasatinib targets by use of affinity-based probes. J Am Chem

Soc. 2012;134(6):3001–14.

genetic analysis of mutation and response in cancer

Documents