tumor cells may harbor thousands of genetic lesions including point mutations, somatic copy-number...


Upload: edith-armstrong

Post on 02-Jan-2016


Page 1: Tumor cells may harbor thousands of genetic lesions including point mutations, somatic copy-number alterations, and translocations that localize to hundreds

• Tumor cells may harbor thousands of genetic lesions, including point mutations, somatic copy-number alterations (SCNAs), and translocations, that localize to hundreds or even thousands of genes

• A pivotal challenge in cancer genomics is to identify the small subset of altered genes (so-called drivers) that directly contribute to tumor fitness and progression; most altered genes are so-called passengers, whose alteration confers no advantage on the tumor

• Even as data from cancer genomes accumulate, the identification of actionable driver genes remains a crucial limitation to therapeutic development

- Only a small subset of established driver genes are druggable given the current pharmacological state of the art

- When a driver is druggable, it may occur in a very small fraction of patients, limiting its clinical utility

 

Background


• Most recent driver discovery efforts have focused on point mutations, which directly indicate the target genes by virtue of their precise location; less progress has been made with respect to SCNAs

• SCNAs affect a larger fraction of the genome in cancers than does any other type of somatic genetic alteration, and they have critical roles in activating oncogenes and inactivating tumor suppressors

• The ability to discern drivers from copy-number alterations promises to dramatically expand the set of therapeutic targets

Background

SCNA pattern in 4,934 cancers from TCGA

Significantly recurrent focal SCNAs were observed in 140 regions, including 102 without known oncogene or tumor suppressor gene targets and 50 with significantly mutated genes


Frequency of alteration in the TCGA breast cancer data (alterations with >5% population frequency)

• The increased frequency of recurring SCNAs relative to point mutations (87 SCNA regions versus six mutated genes with >5% population frequency) highlights the need for methods to pinpoint drivers within these regions

• The most recurrent genetic lesions in breast cancer are SCNAs, often driven by inactivation of DNA repair genes such as BRCA1/2. Moreover, HER2, one of the most therapeutically targeted drivers in breast cancer, is primarily dysregulated by copy-number amplification


Whole genome RNA interference screen


Schematic of the pipeline for identifying candidate driver genes

i. Identify regions of focal SCNAs

ii. Identify driver genes within each region by integrating functional screens and other data using a Bayesian transfer-learning framework


Classic vs Helios approach

Standard classification

- Relies on an initial list of labeled examples: known drivers and passengers

- The list of known oncogenic drivers is relatively small and strongly biased toward kinases and extreme phenotypes

Helios

- Assumption: the driver gene is more likely to be near the most frequently amplified segment (defined as the peak) of the ISAR region

- To classify genes as either drivers or passengers, use a hierarchical Bayesian mixture model

- Automatically learns the weights of features directly from the data by leveraging information among features

Copy number model

Applied ISAR to 785 cancer samples (TCGA): 3083 significantly amplified regions found


hierarchical Bayesian mixture model of Helios

• Iterates between two stages until convergence:

i. Learning the parameters that distinguish passengers from drivers on the basis of their SCNA profile and the additional genomic data

ii. Re-computing the probability that each gene is a driver using the parameters determined in stage i

• In each iteration, Helios learns a better classification of drivers and passengers, which in turn is used to learn better parameters, until convergence

w: parameters for the different data sources (x)

l: parameters for the SCNA data

Tn = 1: driver; Tn = 0: passenger
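The two-stage iteration described above has the shape of an expectation-maximization (EM) loop. The sketch below is only a toy illustration of that loop, not the Helios model itself: it assumes each gene is summarized by a single hypothetical score and that passenger and driver scores follow two Gaussians, whereas Helios combines SCNA profiles with several data sources. The names `gaussian_pdf`, `em_two_component`, and `toy_scores` are invented for this example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_component(scores, n_iter=200, tol=1e-9):
    """Toy EM for a 1-D mixture of two Gaussians (passenger vs. driver).

    Assumption for illustration only: one score per gene, higher for drivers.
    Returns (driver_probs, params).
    """
    lo, hi = min(scores), max(scores)
    mean_p, mean_d = lo, hi                      # start: passengers low, drivers high
    var_p = var_d = max((hi - lo) ** 2 / 4.0, 1e-6)
    prior_d = 0.5                                # initial P(Tn = 1)
    probs = [0.5] * len(scores)

    for _ in range(n_iter):
        # Re-compute the probability that each gene is a driver (E-step).
        new_probs = []
        for x in scores:
            d = prior_d * gaussian_pdf(x, mean_d, var_d)
            p = (1.0 - prior_d) * gaussian_pdf(x, mean_p, var_p)
            new_probs.append(d / (d + p) if d + p > 0.0 else 0.5)

        # Re-learn the parameters from the soft assignments (M-step).
        w_d = max(sum(new_probs), 1e-12)
        w_p = max(len(scores) - sum(new_probs), 1e-12)
        mean_d = sum(q * x for q, x in zip(new_probs, scores)) / w_d
        mean_p = sum((1.0 - q) * x for q, x in zip(new_probs, scores)) / w_p
        var_d = max(sum(q * (x - mean_d) ** 2 for q, x in zip(new_probs, scores)) / w_d, 1e-6)
        var_p = max(sum((1.0 - q) * (x - mean_p) ** 2 for q, x in zip(new_probs, scores)) / w_p, 1e-6)
        prior_d = min(max(w_d / len(scores), 1e-6), 1.0 - 1e-6)

        # Stop when the driver probabilities no longer change.
        converged = max(abs(a - b) for a, b in zip(new_probs, probs)) < tol
        probs = new_probs
        if converged:
            break

    params = {"mean_p": mean_p, "var_p": var_p,
              "mean_d": mean_d, "var_d": var_d, "prior_d": prior_d}
    return probs, params


# Made-up scores: five passenger-like genes followed by three driver-like genes.
toy_scores = [0.10, 0.20, 0.15, 0.05, 0.12, 2.90, 3.10, 3.00]
driver_probs, fitted = em_two_component(toy_scores)
```

On well-separated toy scores like these, the driver probabilities move toward 0 for the low-scoring genes and toward 1 for the high-scoring ones, which mirrors the "better classification, then better parameters" cycle on the slide.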


Data sets used for Helios

Primary tumor data from "Comprehensive molecular portraits of human breast tumors", Nature (2012) 490, 61-70

- Copy number: Affymetrix 6.0 SNP arrays (n=785)

- Illumina HiSeq RNA sequencing (n=732)

- Whole-genome sequencing (n=507)

Cell line shRNA screens (n=29) from "Essential gene profiles in breast, pancreatic, and ovarian cancer cells", Cancer Discovery (2012) 2, 172-189

Cell line data from "The Cancer Cell Line Encyclopedia enables predictive modeling of anticancer drug sensitivity", Nature (2012) 483, 603-607

- Copy number: Affymetrix 6.0 SNP arrays (n=27); mRNA: Affymetrix U133 Plus 2.0 arrays (n=27)


Helios integrates data from functional screens based on oncogene addiction

[Figure labels: proto-oncogene, oncogene; 17q12 region, 14q13 region]


Helios Identifies candidate drivers of Breast Cancer

Top 10 Helios genes: FOXO1, PIK3CA, CCND1, CDK4, MYB, ERBB2, IGFR, BCL2, ESR1 +

Gold standard set of 330 genes, built from:

- The set of known amplified oncogenes from "The landscape of somatic copy-number alteration across human cancers", Nature, 463 (2010), pp. 899–905

- The set of genes related to breast cancer according to the University of Copenhagen DISEASES database ("DISEASES: text mining and data integration of disease-gene associations", bioRxiv, 2014) with score >2.5

- Genes categorized as tumor suppressors according to UniProt ("Update on activities at the Universal Protein Resource (UniProt)", Nucleic Acids Res., 41 (2013)) were filtered out
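The construction of the gold standard described above can be sketched with plain set operations. This is a minimal sketch under stated assumptions: the slide does not say how the two sources are combined, so a union is assumed here, and the function name `build_gold_standard`, the placeholder gene names, and the scores are all invented for illustration rather than taken from the study.

```python
def build_gold_standard(amplified_oncogenes, disease_scores, tumor_suppressors, min_score=2.5):
    """Combine two gene sources and drop tumor suppressors.

    Assumed rule (not confirmed by the source): union of the amplified
    oncogenes and the disease-associated genes with score > min_score,
    minus any gene annotated as a tumor suppressor.
    """
    disease_genes = {gene for gene, score in disease_scores.items() if score > min_score}
    return (set(amplified_oncogenes) | disease_genes) - set(tumor_suppressors)


# Placeholder inputs; gene names and scores are made up for the example.
amplified = {"GENE_A", "GENE_B"}
scores = {"GENE_B": 3.1, "GENE_C": 2.9, "GENE_D": 1.0, "GENE_E": 4.2}
suppressors = {"GENE_E"}

gold = build_gold_standard(amplified, scores, suppressors)
print(sorted(gold))  # ['GENE_A', 'GENE_B', 'GENE_C']
```

GENE_D is excluded by the score threshold and GENE_E by the tumor suppressor filter, which is the same two-step logic the slide lists.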


Helios Identifies candidate drivers of Breast Cancer

No single feature, but rather a combination of features, identifies the top-scoring gene in each region


Candidate Selection for Systematic In Vitro Validation of Helios-Predicted Genes

Validated genes

• ISAR score >5.5 (17 regions)

• 7 of 17 regions: top Helios gene

Bona fide breast oncogene

• 10 additional genes: unknown

Unbiased validation

• 10/12 (83%) validated in all of the experiments

• 7/8 (88%) validated in top-scoring regions

Positive control

Negative control


RSF-1

• Overexpression of a chromatin remodeling factor, RSF-1/HBXAP, correlates with aggressive oral squamous cell carcinoma

Am. J. Pathol., 178 (2011), pp. 2407–2415

• Rsf-1 overexpression correlates with poor prognosis and cell proliferation in colon cancer

Tumour Biology, 33 (2012), pp. 1485–1491

• Rsf-1 is overexpressed in non-small cell lung cancers and regulates cyclinD1 expression and ERK activity

Biochem. Biophys. Res. Commun., 420 (2012), pp. 6–10

• Amplification of a chromatin remodeling gene, Rsf-1/HBXAP, in ovarian carcinoma

Proc. Natl. Acad. Sci. USA, 102 (2005), pp. 14004–14009

• Rsf-1 overexpression in human prostate cancer, implication as a prognostic marker

Tumour Biology, 35 (2014), pp. 5771–5776

• RSF1 functions as a transcription co-activator when associated with hepatitis B virus X protein (HBX)

• RSF1 functions in chromatin remodeling and spacing when associated with SNF2H. At the cellular level, the Rsf1/SNF2H complex participates in chromatin remodeling by mobilizing nucleosomes in response to a variety of growth-modifying signals and environmental cues

• An amplicon containing RSF-1 was recently associated with a breast cancer subtype bearing one of the worst clinical prognoses

• Although high expression levels of RSF-1 have been associated with poor prognosis in several malignancies, its involvement in breast cancer pathogenesis has not yet been explicitly demonstrated


High expression levels of RSF-1 promote Tumorigenesis

MCF-10A-TM


RSF1 expression signature

[Figure: RSF1 up-regulated and down-regulated genes; luminal versus basal]


RSF-1 Alteration promotes metastasis


Conclusion

Helios is a method that integrates data from primary tumors with functional assays on cell lines to prioritize candidate drivers

• The high sensitivity and specificity of Helios enabled the first reported systematic validation of an algorithm designed to identify driver genes

• Helios's performance was confirmed by a success rate of 10/12 candidates in an anchorage-independent growth assay, successfully characterizing several regions for which there was no previously implicated driver

• Importantly, because the authors selected the genes for validation based on their amplification significance (ISAR score) rather than their Helios score, they expect that this success rate will extend to additional regions that have equally strong Helios scores

• Moreover, many of these genes are amplified in additional epithelial cancers (e.g., C6orf203, NIT1, ZNF652), suggesting possible drivers in those cancers as well

• Such data sets continue to accelerate drug development and to yield deep insights into oncogenesis

• Helios can be viewed as an accurate in silico screen for drivers. As such, it can be applied to additional cancer types and data types to accelerate the identification of cancer drivers


The landscape of Driver Mutations in Breast cancer

Scoring >5.5

• A previous study could assign each tumor a median of two established drivers

• Adding the Helios validated genes increases this number to a median of three drivers per tumor

• Adding all predicted drivers with a high Helios score further expands this number to a median of five drivers in each tumor. Thus Helios has substantially expanded the set of high-confidence drivers in breast cancer.


Helios modeling approach

ISAR

Modeling copy number

Helios algorithm


soft agar colony formation assay


Number and distribution of driver gene mutations in tumor types

NSCLC Hepatocellular carcinoma

Breast cancer


Helios modeling approach

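Hierarchical Bayesian model: a modeling structure that uses many kinds of information to explain unknown or unobservable effects and to infer the large number of latent variables or parameters that describe complex relationships. Bayesian models provide a flexible framework, so their parameters and variables can cover a broader class of model structures than traditional approaches.

N genes are classified. For gene n, once the value of Tn is given, w and l are independent. This allows the model to be fit efficiently with the expectation-maximization (EM) algorithm.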