Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer
AAGTCGGTGATGATTGGGACTGCTCT[C/T]AACACAAGCGAGATGAAGAAACTGA
Jacob Biesinger
Dr. Garry Larson
City of Hope
Topics Covered Today
Cancer and Gene Regulation
Combining Data: Bioinformatics
Progress So Far
Molecular Cause of Genetic Disease
ATGCCGGCTTACCATA TCTACCTAAATCCGGTA
TCTACCTAAATCCGGTATGCCGGCTTACCATAAT
http://medicine.osu.edu/lend/Portfolios/0506/AR Port/files/SICKLE CELL WEBSITE/whatissickle.htm
Single Nucleotide Polymorphisms and Genetic Disease
SNPs in coding regions: Sickle Cell Anemia
Glu Pro Phe Ser Thr STOP
Val Pro Phe Ser Thr STOP
Genetic disease may also be caused by differential expression of vital proteins
Protein Coding Region / Untranslated Region
TGTAGA
Promoter Binding Mechanism
Micro RNA Binding Mechanism
Chunky sheep from miRNA binding site destruction
Nature Rev. Genet. 5, 202–212 (2004)
Breast Cancer Expression
Tumor expression patterns are extremely divergent from normal cells
Could SNPs in regulatory regions of genes associated with breast cancer explain their overexpression in tumors?
http://genome-www.stanford.edu/breast_cancer/cell_line_review2001/images/figure2.html
Normal Breast Expression
Breast Tumor Expression
Statistical Search for Dysregulated Genes
Expression patterns in cancers fall into two categories: Estrogen Receptor positive (ER+) and negative (ER-)
A recent meta-analysis pooled tumor expression data from 9 studies and >15,000 genes
Top 1% ER+ > ER-: 150 genes
Top 1% ER+ < ER-: 150 genes
[Figure: normalized expression difference between ER+ and ER- plotted against consistency across studies]
Regulation Motifs
Which TF binding sites exist in our selected genes?
A recent study identified motifs conserved in regulatory regions across 4 organisms
[Figure label: lymphocyte transmembrane adaptor 1]
Promoter motifs: 123 known motifs, 174 phylogenetically conserved
Downstream motifs: 273 conserved 3' UTR, 343 conserved miRNA 6mer, 368 conserved miRNA 7mer
Motif Search
Use Python and UCSC Genome Browser to:
Get promoter region DNA (2 kb upstream of the transcription start site (TSS) plus at most 2 kb downstream of the TSS, limited by the translation start)
Get 3' untranslated region RNA
Search for motifs on + and - strand
Results for Top 1% up and down:
9559 3' UTR hits
42846 6mer hits
11719 7mer hits
22206 known motif hits
23475 phylo motif hits
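The promoter window and two-strand search described on this slide can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function names, the 0-based half-open coordinate convention, and the exact-string matching (real motifs are often degenerate) are all assumptions made for the example.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(tss, translation_start, strand="+",
                    upstream=2000, max_downstream=2000):
    """Hypothetical helper: promoter window as (start, end), 0-based half-open.

    Takes 2 kb upstream of the TSS plus up to 2 kb downstream,
    cut short where translation starts.
    """
    if strand == "+":
        return tss - upstream, min(tss + max_downstream, translation_start)
    # On the minus strand "downstream" means lower coordinates.
    return max(tss - max_downstream, translation_start), tss + upstream


def find_motif(seq, motif):
    """Return sorted (position, strand) hits for an exact motif on both strands.

    Minus-strand hits are found by searching for the motif's reverse
    complement in the forward sequence, so positions are always given
    in forward-strand coordinates. Overlapping hits are reported.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead lets overlapping occurrences match.
        for m in re.finditer("(?=%s)" % re.escape(pattern), seq):
            hits.append((m.start(), strand))
    return sorted(hits)
```

For example, `find_motif("ACGTTTGCAAC", "AAC")` reports one hit on each strand, and `promoter_window(10000, 10500)` stops the window at the translation start rather than 2 kb downstream.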
SNP Databases
SNP information is coming from two databases:
HapMap: four groups (270 total people) genotyped for the same SNPs
CGEMS: Breast Cancer association study, complete with p-values. A late-comer to our study (June 2007)
HapMap ~4 million
CGEMS ~550k
Mapping SNPs
HapMap ~4 million
CGEMS ~550k
Gene Promoters and 3’ UTR
Motif Matches
Use MSSQL 2003 and Python (pymssql) to perform a join of dbSNP, HapMap and CGEMS SNPs with regulatory motifs
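The kind of join described here can be illustrated with a small self-contained sketch. It swaps in Python's built-in sqlite3 module for MSSQL and pymssql so that it runs anywhere. The table layout, column names, gene names, and rs identifiers are all hypothetical and are not taken from the study's database.

```python
import sqlite3

# Hypothetical example rows: (gene, motif, chrom, start_pos, end_pos),
# with 0-based half-open motif intervals.
MOTIF_ROWS = [
    ("GENE_A", "motif_1", "chr1", 100, 108),
    ("GENE_B", "motif_2", "chr2", 500, 506),
]
# Hypothetical example rows: (rs_id, chrom, pos, source).
SNP_ROWS = [
    ("rs_example_1", "chr1", 103, "HapMap"),
    ("rs_example_2", "chr1", 108, "CGEMS"),   # just past the end of motif_1
    ("rs_example_3", "chr2", 505, "CGEMS"),
    ("rs_example_4", "chr2", 103, "HapMap"),  # right position, wrong chromosome
]


def snps_in_motifs(motif_rows, snp_rows):
    """Join SNPs to motif hits: same chromosome, position inside the motif."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE motif_hit (gene TEXT, motif TEXT, chrom TEXT,
                                start_pos INTEGER, end_pos INTEGER);
        CREATE TABLE snp (rs_id TEXT, chrom TEXT, pos INTEGER, source TEXT);
    """)
    conn.executemany("INSERT INTO motif_hit VALUES (?, ?, ?, ?, ?)", motif_rows)
    conn.executemany("INSERT INTO snp VALUES (?, ?, ?, ?)", snp_rows)
    rows = conn.execute("""
        SELECT m.gene, m.motif, s.rs_id, s.source
        FROM motif_hit AS m
        JOIN snp AS s
          ON s.chrom = m.chrom
         AND s.pos >= m.start_pos
         AND s.pos < m.end_pos
        ORDER BY m.gene, s.rs_id
    """).fetchall()
    conn.close()
    return rows
```

With the example rows, only the two SNPs that fall inside a motif on the matching chromosome are returned. At the scale of millions of SNPs, an index on chromosome and position would likely be needed.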
Verify Motif Significance
How do we know that these motifs are significant?
Hypothesis: Due to negative selection, there will be fewer SNPs in motifs than in random areas within the same region.
Method: Contrast how many motifs have at least one SNP in them against how many of 100 random sequences from the same region have at least one SNP in them
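One way to picture this counting method is the sketch below. For each motif it draws 100 random windows of the same length from the same region and counts how many windows contain at least one SNP. The function names, the uniform placement of random windows, and the fixed random seed are assumptions. The slides do not say how the random sequences were chosen, for example whether they were allowed to overlap real motifs.

```python
import bisect
import random


def has_snp(sorted_snps, start, end):
    """True if any SNP position lies in the half-open window [start, end)."""
    i = bisect.bisect_left(sorted_snps, start)
    return i < len(sorted_snps) and sorted_snps[i] < end


def count_with_controls(region, motif_windows, snp_positions,
                        n_random=100, seed=0):
    """Count motifs with a SNP and same-length random windows with a SNP.

    region        -- (start, end) of the promoter or UTR being analysed
    motif_windows -- list of (start, end) motif hits inside the region
    snp_positions -- SNP coordinates inside the region
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    snps = sorted(snp_positions)
    region_start, region_end = region

    actual_with = sum(has_snp(snps, s, e) for s, e in motif_windows)

    random_with = 0
    for s, e in motif_windows:
        length = e - s
        for _ in range(n_random):
            # Place a window of the motif's length uniformly in the region.
            r = rng.randint(region_start, region_end - length)
            random_with += has_snp(snps, r, r + length)

    return {
        "actual_with_snp": actual_with,
        "actual_total": len(motif_windows),
        "random_with_snp": random_with,
        "random_total": n_random * len(motif_windows),
    }
```

The four counts returned correspond to the "Actual" and "Random" rows of a 2x2 table like the ones on the results slide.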
Motif Counting Results
Known Top 1%   Motif with SNP   Motif without SNP   Total     1-Sided P-Value
Actual         97               18394               18491     0.000009494
Random         14630            1834470             1849100
Total          14727            1852864             1867591

Phylo Top 1%   Motif with SNP   Motif without SNP   Total     1-Sided P-Value
Actual         130              19363               19493     0.001889
Random         16499            1913438             1929937
Total          16629            1932801             1949430
3' UTR results not yet available
There is a significant difference between motifs and random sequences: a smaller fraction of motifs than of random sequences contains a SNP.
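For readers who want to check the direction and rough size of this effect from the counts, here is a minimal sketch using a one-sided two-proportion z-test. The slides do not state which test produced the reported p-values, so this is a stand-in and its output should not be expected to match them exactly. The function name is hypothetical.

```python
import math


def one_sided_fewer_snps(actual_with, actual_total, random_with, random_total):
    """One-sided two-proportion z-test.

    Tests whether the fraction of real motifs containing a SNP is lower
    than the fraction of random sequences containing a SNP.
    Returns (z, p), where p is the lower-tail probability of z.
    """
    p_actual = actual_with / actual_total
    p_random = random_with / random_total
    pooled = (actual_with + random_with) / (actual_total + random_total)
    se = math.sqrt(pooled * (1 - pooled)
                   * (1 / actual_total + 1 / random_total))
    z = (p_actual - p_random) / se
    # Standard normal CDF at z, written with the complementary error function.
    p = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p


# Counts taken from the "Actual" and "Random" rows of the results tables.
known_z, known_p = one_sided_fewer_snps(97, 18491, 14630, 1849100)
phylo_z, phylo_p = one_sided_fewer_snps(130, 19493, 16499, 1929937)
print("Known motifs: z = %.2f, p = %.2e" % (known_z, known_p))
print("Phylo motifs: z = %.2f, p = %.2e" % (phylo_z, phylo_p))
```

A negative z means real motifs carry SNPs less often than random sequences, which is the direction the negative-selection hypothesis predicts.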
CGEMS Results
A number of SNPs that fall within motifs are associated with Breast Cancer
Highest ranking was 1514 out of 550,000
Further analysis required to say if significant
Thanks!
SoCalBSI mentors
City of Hope
Dr. Garry Larson
Dr. David Smith
Dr. Pål Sætrom
Cathryn Lundberg
All the SoCalBSI students!
Funded by: