Cancer Genome Analysis
Tumors
• Cancer cells
  – Reproduce in defiance of the normal restraints on cell growth and division
  – Invade and colonize territories normally reserved for other cells
• Types of cancers
  – Carcinomas: cancers arising from epithelial cells
  – Sarcomas: cancers arising from connective tissue or muscle cells
  – Leukemias and lymphomas: cancers derived from white blood cells and their precursors
Development of Cancer Cells
• Agents that trigger carcinogenesis
  – Chemical carcinogens (cause local DNA alterations)
  – Radiation such as x-rays (cause chromosome breaks and translocations) and UV light (causes DNA base alterations)
  – Viruses: hepatitis B and hepatitis C viruses for liver cancer
Carcinogenesis
• Stages of progression in the development of cancer of the epithelium of the uterine cervix.
Cancer-Causing Genes
• Oncogenes
  – Mutations that confer gain of function to oncogenes can promote cancer
  – Mutations with growth-promoting effects on the cell
  – Often heterozygous
• Tumor suppressor genes
  – Mutations that confer loss of function can contribute to cancer
  – Typically homozygous
• DNA maintenance genes
  – Indirect effects on cancer development by not repairing DNA or correcting mutations
Driver and Passenger Mutations
• Driver mutations
  – Causally implicated in oncogenesis
  – Give a growth advantage to cancer cells; positively selected in the microenvironment of the tissue
  – E.g., mutations that deactivate tumor suppressor genes
• Passenger mutations
  – Somatic mutations with no functional consequences
  – Do not give a growth advantage to cancer cells
Identifying Driver Mutations
• Typically involves sequencing tumor DNA and the matched normal DNA
• Comparison with the reference genome and other known DNA polymorphisms to filter out benign mutations
• Signatures of driver mutations
  – Mutations observed frequently across tumors are likely to be driver mutations. But what about tumor heterogeneity?
  – Mutations that cluster in a subset of genes (e.g., oncogenes). Passenger mutations are more randomly distributed across genomes
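The filtering and recurrence idea on this slide can be sketched in a few lines. This is only an illustrative sketch, not a real variant-calling pipeline: the function names, the variant identifiers, and the `gene_of` lookup are all hypothetical, and variants are treated as plain hashable labels.

```python
from collections import Counter


def somatic_mutations(tumor, normal, known_polymorphisms=frozenset()):
    """Variants seen in the tumor but not in the matched normal
    or in a set of known (presumed benign) polymorphisms."""
    return set(tumor) - set(normal) - set(known_polymorphisms)


def gene_recurrence(samples, gene_of, known_polymorphisms=frozenset()):
    """Count, per gene, how many tumors carry at least one somatic mutation in it.

    samples: iterable of (tumor_variants, normal_variants) pairs.
    gene_of: hypothetical mapping from variant label to gene name.
    """
    counts = Counter()
    for tumor, normal in samples:
        somatic = somatic_mutations(tumor, normal, known_polymorphisms)
        # Each tumor contributes at most once per gene.
        counts.update({gene_of[v] for v in somatic if v in gene_of})
    return counts


# Toy input with made-up labels (not real variants or genes).
toy_samples = [
    ({"v1", "v2", "g1"}, {"g1"}),  # g1 is germline: also present in the normal
    ({"v1", "v3"}, set()),
    ({"v4", "g1"}, {"g1"}),
]
toy_gene_of = {"v1": "GENE_A", "v2": "GENE_B", "v3": "GENE_A",
               "v4": "GENE_C", "g1": "GENE_D"}

recurrence = gene_recurrence(toy_samples, toy_gene_of)
print(recurrence.most_common())
```

Genes at the top of the ranking are candidates for harboring driver mutations. As the slide notes, tumor heterogeneity and rare drivers limit what a simple frequency count can show.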
Challenges
• Somatic mutations in both genomes (SNPs, CNVs, indels, chromosomal rearrangements, etc.) and epigenomes can be positively selected (drivers)
• Different cancer types have different mutation rates. A mutator phenotype may or may not be present.
• Infrequently occurring driver mutations are hard to identify.
Challenges
• Computational challenges unique to cancer genome analysis
  – Sequence alignment and assembly can be significantly more challenging because of highly rearranged chromosomes and high variation across cancer genomes
  – Somatic mutation calling is more challenging
    • The impurity of the sample
      – Normal genomes have allele copies of 0, 1, or 2
      – Cancer genomes can appear to have fractional allele copies rather than exactly 0, 1, or 2
    • Most somatic mutations are rare
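One way to see why sample impurity yields fractional allele copies is to compute the expected fraction of reads carrying a mutation in a mix of tumor and normal cells. The sketch below uses a commonly used mixture model, which is not taken from the slides. The function name and parameters are hypothetical, and it assumes a single tumor clone and a diploid normal genome.

```python
def expected_vaf(purity, tumor_copies, mutated_copies, normal_copies=2):
    """Expected variant allele fraction at a somatic site.

    purity: fraction of cells in the sample that are tumor cells (0 to 1).
    tumor_copies: total copies of the locus in each tumor cell.
    mutated_copies: copies in each tumor cell that carry the mutation.
    normal_copies: copies of the locus in each normal cell (assumed 2).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0 <= mutated_copies <= tumor_copies:
        raise ValueError("mutated_copies must be between 0 and tumor_copies")
    mutant = purity * mutated_copies
    total = purity * tumor_copies + (1.0 - purity) * normal_copies
    if total == 0:
        raise ValueError("locus has no copies in the sample")
    return mutant / total


# A heterozygous mutation in a pure diploid tumor is seen in half of the reads,
# but in a sample that is only 60% tumor the expected fraction drops.
print(expected_vaf(1.0, 2, 1))
print(expected_vaf(0.6, 2, 1))
```

Under this model, a caller that expects allele fractions near 0, 0.5, or 1 will tend to miss true somatic mutations in impure samples.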
Breast Cancer Genomes and Subtypes
Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70. 2012.
Sorting Intolerant From Tolerant (SIFT)
• A tool that uses sequence homology to predict whether an amino acid substitution affects protein function
• Assuming that important amino acids are conserved in the protein family, changes at well-conserved positions tend to be predicted as deleterious.
• Given a protein sequence, SIFT
  – Chooses related proteins
  – Obtains an alignment of these proteins with the query
  – Based on the amino acids appearing at each position in the alignment, calculates the probability that an amino acid at a position is tolerated, conditional on the most frequent amino acid being tolerated
• Classifies a substitution as tolerated or deleterious
SIFT: predicting amino acid changes that affect protein function. Nucl. Acids Res. (2003) 31 (13): 3812-3814.
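The scoring step can be illustrated with a simplified sketch. This is not SIFT's actual implementation, which also weights sequences and uses pseudocounts. Here the score of an amino acid at a position is its observed frequency in the alignment column divided by the frequency of the most common amino acid there. The function names are hypothetical, and the 0.05 cutoff is the value commonly quoted for SIFT, used here as an assumption.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tolerance_scores(column):
    """Score each amino acid at one alignment position.

    column: string of residues observed at this position across the aligned
    proteins (gaps and unknown characters are ignored).
    Returns frequencies scaled so that the most frequent residue scores 1.0.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    if not counts:
        raise ValueError("column contains no amino acids")
    top = max(counts.values())
    return {aa: counts[aa] / top for aa in AMINO_ACIDS}


def classify(column, substituted, cutoff=0.05):
    """Label a substitution to `substituted` at this position."""
    score = tolerance_scores(column)[substituted]
    return "tolerated" if score >= cutoff else "deleterious"


# Toy alignment column: leucine dominates, isoleucine and valine also occur.
toy_column = "LLLLLLLLIV"
print(classify(toy_column, "I"))  # observed at this position
print(classify(toy_column, "D"))  # never observed at this position
```

Without pseudocounts, any amino acid that is absent from the column scores zero, so this sketch is much harsher on small alignments than the real tool would be.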
PolyPhen
• Software for predicting damaging effects of missense mutations
  – Prediction based on
    • Eight sequence-based features
    • Three structure-based features
  – Naïve Bayes classifier
  – Training dataset 1
    • Positive examples: 3,155 damaging alleles annotated in the UniProt database as causing human Mendelian diseases and affecting protein stability or function
    • Negative examples: 6,321 differences between human proteins and their closely related mammalian homologs
  – Training dataset 2
    • Positive examples: 13,032 human disease-causing mutations from UniProt
    • Negative examples: 8,946 human nonsynonymous SNPs without annotated involvement in disease
A method and server for predicting damaging missense mutations. Nature Methods 7, 248-249 (2010).
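To make the classifier step concrete, here is a minimal naïve Bayes sketch trained on positive (damaging) and negative (benign) examples. It is not PolyPhen's code. The two binary features and the toy training rows are hypothetical stand-ins for the eleven real features, and Laplace smoothing is an assumption of this sketch.

```python
import math


def train_naive_bayes(examples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    examples: list of tuples of 0/1 feature values.
    labels: list of 0 (benign) or 1 (damaging), one per example.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    n_features = len(examples[0])
    model = {}
    for cls in (0, 1):
        rows = [x for x, y in zip(examples, labels) if y == cls]
        prior = (len(rows) + alpha) / (len(examples) + 2 * alpha)
        likelihoods = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (prior, likelihoods)
    return model


def prob_damaging(model, x):
    """Posterior probability that a mutation with features x is damaging."""
    log_post = {}
    for cls, (prior, likelihoods) in model.items():
        lp = math.log(prior)
        for xj, pj in zip(x, likelihoods):
            lp += math.log(pj if xj else 1.0 - pj)
        log_post[cls] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return math.exp(log_post[1] - m) / z


# Hypothetical features: (position is conserved, residue is buried in structure)
toy_x = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
toy_y = [1, 1, 1, 0, 0, 0]
toy_model = train_naive_bayes(toy_x, toy_y)
print(prob_damaging(toy_model, (1, 1)))
print(prob_damaging(toy_model, (0, 0)))
```

The naïve independence assumption lets the sequence-based and structure-based features be combined by multiplying per-feature likelihoods, which keeps training simple even with a few thousand labeled examples.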
Summary
• Understanding the genetics of cancer
  – Both germline polymorphisms and somatic mutations can contribute to triggering tumorigenesis
  – Determine driver and passenger mutations
    • Frequently occurring mutations are often declared driver mutations
• SIFT and PolyPhen for evaluating the functional effects of mutations