whole genome polymorphism analysis of regulatory elements in breast cancer

Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer AAGTCGGTGATGATTGGGACTGCTCT[C/T]AACACAAGCGAGATGAAGAAACTGA Jacob Biesinger Dr. Garry Larson City of Hope


Post on 11-Feb-2016



TRANSCRIPT

Page 1: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

AAGTCGGTGATGATTGGGACTGCTCT[C/T]AACACAAGCGAGATGAAGAAACTGA

Jacob Biesinger

Dr. Garry Larson

City of Hope

Page 2: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Topics Covered Today

Cancer and Gene Regulation

Combining Data: Bioinformatics

Progress So Far

Molecular Cause of Genetic Disease

Page 3: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Single Nucleotide Polymorphisms and Genetic Disease

SNPs in coding regions: Sickle Cell Anemia

Glu Pro Phe Ser Thr STOP

Val Pro Phe Ser Thr STOP

http://medicine.osu.edu/lend/Portfolios/0506/AR Port/files/SICKLE CELL WEBSITE/whatissickle.htm

Genetic disease may also be caused by differential expression of vital proteins

[Figure labels: Protein Coding Region; Untranslated region; Promoter Binding Mechanism; Micro RNA Binding Mechanism. Sequence labels: ATGCCGGCTTACCATA, TCTACCTAAATCCGGTA, TCTACCTAAATCCGGTATGCCGGCTTACCATAAT, TGTAGA, T]

Chunky sheep from miRNA binding site destruction

Nature Rev. Genet. 5, 202–212 (2004)

Page 4: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Breast Cancer Expression

Tumor expression patterns are extremely divergent from normal cells

Could SNPs in regulatory regions of genes associated with breast cancer explain their overexpression in tumors?

http://genome-www.stanford.edu/breast_cancer/cell_line_review2001/images/figure2.html

Normal Breast Expression

Breast Tumor Expression

Page 5: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Statistical Search for Dysregulated Genes

Expression patterns in breast cancers give two categories: estrogen receptor positive (ER+) and estrogen receptor negative (ER-)

A recent meta-analysis pooled tumor expression data from 9 studies and >15,000 genes

Top 1% ER+ > ER-: 150 genes

Top 1% ER+ < ER-: 150 genes

[Figure: plot of normalized expression difference between ER+ and ER- against consistency across studies]
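The top and bottom 1% selection on this slide can be sketched in a few lines of Python. This is only an illustration: the slides do not say how the meta-analysis scored each gene, so the `scores` mapping (gene name to normalized ER+ minus ER- expression difference) and the function name are assumptions.

```python
def top_and_bottom_percent(scores, percent=1.0):
    """Return (up, down): genes in the top and bottom `percent` of scores.

    `scores` maps a gene name to a hypothetical normalized ER+ minus ER-
    expression difference; higher means more expressed in ER+ tumors.
    """
    # Rank genes from the most ER+-biased to the most ER--biased.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Number of genes in each tail; at least one gene is always returned.
    n = max(1, round(len(ranked) * percent / 100))
    up = [gene for gene, _ in ranked[:n]]
    down = [gene for gene, _ in ranked[-n:]]
    return up, down
```

With roughly 15,000 genes, a 1% cut gives the 150 genes per tail quoted on the slide.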

Page 6: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Regulation Motifs

Which TF binding sites exist in our selected genes?

A recent study identified motifs conserved in regulatory regions across 4 organisms

[Figure label: lymphocyte transmembrane adaptor 1]

Promoter motifs:

123 known motifs

174 phylogenetically conserved

Downstream motifs:

273 conserved 3’ UTR

343 conserved miRNA 6mer

368 conserved miRNA 7mer

Page 7: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Motif Search

Use Python and UCSC Genome Browser to:

Get promoter region DNA (2 kb upstream of the transcription start site (TSS) plus at most 2 kb downstream of the TSS, limited by the translation start)

Get 3’ untranslated region RNA

Search for motifs on + and – strand
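The promoter window rule above (2 kb upstream of the TSS, up to 2 kb downstream, cut off at the translation start) might be computed as in the sketch below. The function name, argument names, and the simplified coordinate handling are assumptions for illustration; the original scripts are not shown in the slides, and 0-based versus 1-based conventions are glossed over here.

```python
def promoter_window(tss, translation_start, strand, upstream=2000, downstream=2000):
    """Return (start, end) genomic coordinates of the promoter window.

    All positions are genomic coordinates on the + strand. For a gene on
    the - strand, "upstream" means higher coordinates and the translation
    start lies at or below the TSS.
    """
    if strand == "+":
        start = max(0, tss - upstream)
        # Downstream extent is capped by the translation start.
        end = min(tss + downstream, translation_start)
    elif strand == "-":
        end = tss + upstream
        start = max(0, tss - downstream, translation_start)
    else:
        raise ValueError("strand must be '+' or '-'")
    return start, end
```

For example, a + strand gene with its TSS at 10000 and translation start at 10500 yields the window (8000, 10500), while the same TSS with a translation start at 15000 yields (8000, 12000).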

Results for Top 1% up and down:

9559 3’ UTR hits

42846 6mer hits

11719 7mer hits

22206 known motif hits

23475 phylo motif hits
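A search on both strands, as described on this slide, can be done by scanning the sequence and its reverse complement. The sketch below is a minimal illustration, not the code used in the study; it assumes motifs are written as consensus strings with IUPAC ambiguity codes, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes translated to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand) hits of `motif` in `seq`.

    `start` is the 0-based position of the leftmost matched base on the
    + strand, for hits on either strand. Overlapping hits are reported.
    A palindromic motif is reported once per strand.
    """
    seq = seq.upper()
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets finditer report overlapping matches.
    pattern = re.compile("(?=(" + body + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(motif)
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert a reverse-complement offset back to + strand coordinates.
        hits.append((n - m.start() - k, "-"))
    return sorted(hits)
```

Reporting minus-strand hits in + strand coordinates keeps every hit comparable with SNP positions, which are given on the reference strand.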

Page 8: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

SNP Databases

SNP information is coming from two databases:

HapMap: four groups (270 total people) genotyped for the same SNPs

CGEMS: breast cancer association study, complete with p-values. A latecomer to our study (June 2007)

HapMap ~4 million

CGEMS ~550k

Page 9: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Mapping SNPs

HapMap ~4 million

CGEMS~550k

Gene Promoters and 3’ UTR

Motif Matches

Use MSSQL 2003 and Python (pymssql) to perform a join of dbSNP, HapMap and CGEMS SNPs with regulatory motifs
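The join of SNP positions with motif matches amounts to a range join: a SNP belongs to a motif hit when it lies on the same chromosome and between the hit's start and end. The sketch below shows the idea with Python's built-in `sqlite3` module standing in for the SQL Server and pymssql setup named on the slide. The table names, column names, and function name are hypothetical; the actual schema is not given in the slides.

```python
import sqlite3


def snps_in_motifs(snps, motif_hits):
    """Return (rs_id, motif, chrom, pos) for every SNP inside a motif hit.

    snps:       iterable of (rs_id, chrom, pos)
    motif_hits: iterable of (motif, chrom, hit_start, hit_end), inclusive ends
    """
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE snp (rs_id TEXT, chrom TEXT, pos INTEGER)")
        con.execute(
            "CREATE TABLE motif_hit "
            "(motif TEXT, chrom TEXT, hit_start INTEGER, hit_end INTEGER)"
        )
        con.executemany("INSERT INTO snp VALUES (?, ?, ?)", snps)
        con.executemany("INSERT INTO motif_hit VALUES (?, ?, ?, ?)", motif_hits)
        # Range join: same chromosome, SNP position within the hit interval.
        rows = con.execute(
            """
            SELECT s.rs_id, m.motif, s.chrom, s.pos
            FROM snp AS s
            JOIN motif_hit AS m
              ON s.chrom = m.chrom
             AND s.pos BETWEEN m.hit_start AND m.hit_end
            ORDER BY s.chrom, s.pos, m.motif
            """
        ).fetchall()
    finally:
        con.close()
    return rows
```

At the scale quoted on the slides (millions of SNPs), an index on chromosome and position would likely be needed for such a join to run in reasonable time.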

Page 10: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Verify Motif Significance

How do we know that these motifs are significant?

Hypothesis: Due to negative selection, there will be fewer SNPs in motifs than in random areas within the same region.

Method: Contrast how many motifs have at least one SNP in them against how many of 100 random sequences from the same region have at least one SNP in them
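The counting method above can be sketched as follows: for each motif hit, check whether it contains a SNP, then draw 100 random windows of the same length from the same region and check each of those. This is a minimal sketch under assumptions: the slides do not specify how random sequences were drawn, so uniform sampling of window start positions, half-open coordinates, and all names here are illustrative.

```python
import bisect
import random


def has_snp(sorted_snps, start, end):
    """True if any SNP position lies in the half-open interval [start, end)."""
    i = bisect.bisect_left(sorted_snps, start)
    return i < len(sorted_snps) and sorted_snps[i] < end


def count_motif_vs_random(region, motif_hits, snps, n_random=100, seed=0):
    """Count motif hits and random windows with and without a SNP.

    region:     (start, end) of the promoter or UTR, half-open
    motif_hits: list of (start, end) motif intervals inside the region
    snps:       SNP positions inside the region
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sorted_snps = sorted(snps)
    region_start, region_end = region
    counts = {"motif_with": 0, "motif_without": 0,
              "random_with": 0, "random_without": 0}
    for m_start, m_end in motif_hits:
        key = "motif_with" if has_snp(sorted_snps, m_start, m_end) else "motif_without"
        counts[key] += 1
        length = m_end - m_start
        for _ in range(n_random):
            # Random window of the same length, fully inside the region.
            r_start = rng.randrange(region_start, region_end - length + 1)
            hit = has_snp(sorted_snps, r_start, r_start + length)
            counts["random_with" if hit else "random_without"] += 1
    return counts
```

Summing these counts over all regions gives a 2x2 table of the kind reported on the next slide.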

Page 11: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Motif Counting Results

Known motifs, Top 1%

          Motif with SNP   Motif without SNP   Total     1-sided p-value
Actual    97               18394               18491     0.000009494
Random    14630            1834470             1849100
Total     14727            1852864             1867591

Phylo motifs, Top 1%

          Motif with SNP   Motif without SNP   Total     1-sided p-value
Actual    130              19363               19493     0.001889
Random    16499            1913438             1929937
Total     16629            1932801             1949430

3’ UTR results not yet available

There is a significant difference between motifs and random sequences.
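A one-sided test on a 2x2 table like those above can be run with a hypergeometric lower tail (a one-sided Fisher exact test). The slides do not say which test produced the reported p-values, so the choice of test here is an assumption and the sketch is not expected to reproduce those values exactly. The function names are hypothetical.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_lower_tail(k_obs, n_draws, n_success, n_total):
    """P(X <= k_obs) for X ~ Hypergeometric(n_total, n_success, n_draws).

    For the motif table: n_total is all sequences (actual + random),
    n_success is all sequences with a SNP, n_draws is the number of actual
    motifs, and k_obs is the number of actual motifs with a SNP.
    """
    log_denom = log_comb(n_total, n_draws)
    # Smallest feasible count of successes among the draws.
    k_min = max(0, n_draws - (n_total - n_success))
    total = 0.0
    for k in range(k_min, k_obs + 1):
        log_p = (log_comb(n_success, k)
                 + log_comb(n_total - n_success, n_draws - k)
                 - log_denom)
        total += exp(log_p)
    return total


# Known motifs: 97 of 18491 actual motifs contain a SNP,
# against 14727 of 1867591 sequences overall.
p_known = hypergeom_lower_tail(97, 18491, 14727, 1867591)
```

Working with log-gamma values avoids overflow from the very large binomial coefficients involved. A small lower-tail probability means motifs contain SNPs less often than random windows, which is the direction the negative-selection hypothesis predicts.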

Page 12: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

CGEMS Results

A number of SNPs that fall within motifs are associated with Breast Cancer

Highest ranking was 1514 out of 550,000

Further analysis required to say if significant

Page 13: Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Thanks!

SoCalBSI mentors

City of Hope

Dr. Garry Larson

Dr. David Smith

Dr. Päl Sætrom

Cathryn Lundberg

All the SoCalBSI students!

Funded by: