Effect of Single Nucleotide Polymorphism in Affymetrix probes Olivia Sanchez-Graillet Departments of Biological Sciences and Mathematical Sciences University of Essex (UK) December 2008

Page 1: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Effect of Single Nucleotide Polymorphism in Affymetrix probes

Olivia Sanchez-Graillet

Departments of Biological Sciences and Mathematical Sciences

University of Essex (UK)

December 2008

Page 2: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Single Nucleotide Polymorphisms (SNPs)

SNP: a single base pair that differs between one individual and another.

Polymorphism: a variation for which at least two variants have frequencies > 1% in a population.

Page 3: Effect of Single Nucleotide Polymorphism in Affymetrix probes

SNPs are the most common type of sequence variation between individuals.

SNPs are markers of phenotypes and diseases.

SNPs may alter gene expression and may or may not change the amino acid sequence.

Page 4: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Other common variations:

DIP: deletion/insertion polymorphism, e.g. -/T, C/-

STR: short tandem repeat (microsatellite) polymorphism, e.g. (CA)19/20/21/22/23/24/25/26

MIXED: cluster containing submissions from 2 or more allelic classes, e.g. -/AAA/AAAAA/AAAACCAAAAAAAAAAAAAAA

MNP: multiple nucleotide polymorphism with alleles of common length > 1, e.g. AAA/CCC
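As a rough illustration of how these allele notations differ, the following Python sketch assigns a class from the allele string alone. The function name `variation_class` and its rules are assumptions made for this example; they are a simplification that covers the examples on this slide, not the classification procedure of any SNP database.

```python
import re

# Hypothetical pattern for the repeat notation used on the slide, e.g. "(CA)19/20/21".
STR_PATTERN = re.compile(r"^\([ACGT]+\)\d+(/\d+)+$")

def variation_class(alleles):
    """Guess the variation class from an allele string such as 'G/A' or '-/T'."""
    if STR_PATTERN.match(alleles):
        return "STR"
    parts = alleles.split("/")
    if "-" in parts:
        # One gap allele plus one sequence allele: deletion/insertion.
        # A gap plus several different alleles is treated as MIXED here.
        return "DIP" if len(parts) == 2 else "MIXED"
    lengths = {len(p) for p in parts}
    if len(lengths) == 1:
        # All alleles share one length: 1 base is a SNP, more is an MNP.
        return "SNP" if lengths == {1} else "MNP"
    return "MIXED"

for example in ["G/A", "-/T", "(CA)19/20/21", "AAA/CCC"]:
    print(example, variation_class(example))
```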

Page 5: Effect of Single Nucleotide Polymorphism in Affymetrix probes

We are studying the relationships between probe intensities on Affymetrix GeneChips.

Affymetrix GeneChips contain thousands of probes.

Page 6: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Probes map to different exons. Because of alternative splicing, some of the exons may be upregulated whereas others may be downregulated. We therefore focus on probes within exons.

Page 7: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Probes mapping to the same exon should behave similarly.

What causes Affymetrix probes to behave as outliers with respect to other probes within a single exon?

Objective:

Study the impact of SNPs and other common variation upon Affymetrix probes on GeneChips.

Explore whether the existence of a SNP causes a probe to behave differently to other probes which map uniquely to a single exon.

Page 8: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Previous research on how SNPs might affect gene expression:

Allele A is over-expressed compared to allele B, or vice versa, or both alleles are equally expressed (Kumari et al., 2007).

Hybridization differences resulting from sequence variation might mislead the interpretation of data from individual genes, even if only a single probe is affected (Alberts et al., 2007).

In 15 of 25 probesets, SNPs caused a difference in hybridization. Not every SNP causes a difference in hybridization (Alberts et al., 2007).

When a SNP is located at the very beginning or end of a probe, it might have little or no effect on hybridization (Hughes et al., 2001).

Page 9: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Method:

A) Generation of exon heatmaps

B) Identification of probes containing SNPs.

C) Study of SNP-probes which are outliers.

Page 10: Effect of Single Nucleotide Polymorphism in Affymetrix probes

(A) Generation of exon heatmaps

1. CEL files are downloaded from the GEO database.

2. Calibration of microarray data:

Quality control: detection of spatial flaws.

Row Quantile Normalisation.

3. Correlate the intensities for groups of probes, using many thousands of GeneChip experiments.

Page 11: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Example flaw in CEL file

W. B. Langdon et al. (2008). A Survey of Spatial Defects in Homo Sapiens Affymetrix GeneChips. In IEEE/ACM Transactions on Computational Biology and Bioinformatics.

Page 12: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Probe correlations

The correlation in log intensities between Probe 9 and Probe 11 from probeset 208772_at, obtained from 5,638 HG-U133A GeneChips.

Page 13: Effect of Single Nucleotide Polymorphism in Affymetrix probes

[Figure: exon correlation heatmap. The number in each square is the correlation multiplied by 10 and rounded. Blue = low correlation; yellow = high correlation. Annotations: average intensity in GEO, standard deviation in GEO, relative probe position on exon, probe number on heatmap.]
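The correlation heatmap can be sketched in a few lines of Python with NumPy. This is a minimal illustration, assuming the intensities are held in an array with one row per GeneChip and one column per probe. The function names are hypothetical, the data are synthetic, and the use of the Pearson correlation is an assumption, since the slides do not name the correlation measure.

```python
import numpy as np

def probe_correlation_matrix(intensities):
    """Pearson correlation of log2 intensities between every pair of probes.

    intensities: array of shape (n_chips, n_probes) with positive values.
    """
    log_intensities = np.log2(np.asarray(intensities, dtype=float))
    # rowvar=False: columns (probes) are the variables, rows (chips) the observations.
    return np.corrcoef(log_intensities, rowvar=False)

def heatmap_labels(corr):
    """Correlation multiplied by 10 and rounded, as printed in each heatmap square."""
    # np.rint rounds exact halves to the nearest even integer.
    return np.rint(np.asarray(corr) * 10).astype(int)

# Synthetic demonstration data (not real GeneChip measurements):
# probes 0 and 1 follow a shared signal, probe 2 is independent.
rng = np.random.default_rng(0)
signal = rng.normal(8.0, 1.0, size=500)
chips = np.column_stack([
    2.0 ** (signal + rng.normal(0.0, 0.1, size=500)),
    2.0 ** (signal + rng.normal(0.0, 0.1, size=500)),
    2.0 ** rng.normal(8.0, 1.0, size=500),
])
corr = probe_correlation_matrix(chips)
labels = heatmap_labels(corr)
print(labels)
```

In this toy data the independent third probe receives low labels against the other two, which is the kind of pattern the heatmaps are meant to expose.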

Page 14: Effect of Single Nucleotide Polymorphism in Affymetrix probes

4. Unique mappings (alignments) of probes to individual exons (Sanchez-Graillet et al., 2008. Widespread existence of uncorrelated probe intensities from within the same probeset on Affymetrix GeneChips. In Journal of Integrative Bioinformatics, 5(2):98):

avoid cross-hybridization and multiple targeting.

sense direction (antisense is avoided).

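[Figure: probe-to-exon mappings. Probe 1 maps to exon 1 of transcript 1 (25 bases, 100% identity). Probe 2 maps to exon 2 of transcript 2 (25 bases, 100% identity) and to exon 3 of transcript 3 (25 bases, 96% identity). One mapping in the diagram is marked with an X.]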

Page 15: Effect of Single Nucleotide Polymorphism in Affymetrix probes

(B,C) Identification of probes containing SNPs and outlier SNP probes

Page 16: Effect of Single Nucleotide Polymorphism in Affymetrix probes

1. SNP data downloaded from Ensembl 48: 3' untranslated region (UTR), 5' UTR, and coding regions.

[Figure: chromosome 10 with Gene 1 and Gene 2 and transcripts 1 to 3, annotated with the 5' upstream, 3' UTR and 3' downstream regions.]

snp_id chrom_name chrom_position allele biotype trans_id gene_id
rs11000776 10 75213225 G/A 3downstream ENST00000372837 ENSG00000172586
rs11000776 10 75213225 G/A 3utr ENST00000372833 ENSG00000172586
rs11000776 10 75213225 G/A 5upstream ENST00000391642 ENSG00000212959

Page 17: Effect of Single Nucleotide Polymorphism in Affymetrix probes

2. Identification of exons with SNPs by using transcript information and chromosomal positions.

3. Selection of unique exons and probes:

Only unique exons with more than 4 probes.

SNP positions on the probes uniquely mapping to exons are obtained.
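The bookkeeping in steps 2 and 3 can be illustrated as follows. This is only a sketch under simplifying assumptions: probes align to the forward strand without gaps, the probe start and the SNP position use the same chromosome coordinate system, and offsets are counted from 0. The helper names and the probe start coordinate are hypothetical.

```python
def snp_offset_in_probe(probe_start, snp_position, probe_length=25):
    """Return the 0-based offset of a SNP within a probe, or None if it lies outside.

    probe_start and snp_position are chromosome coordinates on the same strand.
    """
    offset = snp_position - probe_start
    if 0 <= offset < probe_length:
        return offset
    return None

def exons_with_enough_probes(probes_by_exon, min_probes=5):
    """Keep only exons with more than 4 uniquely mapping probes (at least min_probes)."""
    return {
        exon: probes
        for exon, probes in probes_by_exon.items()
        if len(probes) >= min_probes
    }

# Hypothetical probe starting 14 bases before the SNP position shown in the table above.
print(snp_offset_in_probe(75213211, 75213225))
```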

Page 18: Effect of Single Nucleotide Polymorphism in Affymetrix probes

4. Identification of SNP-probes which are outliers:

The overall correlation matrix median (OMM) is compared with each SNP-probe median (SPM).

If OMM - SPM >= 0.15, the SNP-probe is considered an outlier.

Page 19: Effect of Single Nucleotide Polymorphism in Affymetrix probes

OMM = 0.87

SPM_9 = 0.21: difference 0.66 > 0.15 (SNP in an outlier probe)

SPM_8 = 0.84: difference 0.03 < 0.15 (SNP in a no-outlier probe)
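A minimal Python sketch of this threshold test follows. It assumes that the OMM is the median of all off-diagonal entries of an exon's probe correlation matrix and that each SPM is the median of one probe's correlations with the other probes; the slides do not spell out these definitions, and the function names are hypothetical.

```python
import numpy as np

def is_outlier(omm, spm, threshold=0.15):
    """A probe is flagged when its median falls at least `threshold` below the OMM."""
    return (omm - spm) >= threshold

def outlier_probes(corr, threshold=0.15):
    """Return (OMM, per-probe medians, boolean outlier flags) for a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)  # mask that excludes self-correlations
    omm = np.median(corr[off_diagonal])
    spm = np.array([np.median(corr[i, off_diagonal[i]]) for i in range(n)])
    return omm, spm, is_outlier(omm, spm, threshold)

# Values from the worked example on the slide.
print(is_outlier(0.87, 0.21))  # SPM_9
print(is_outlier(0.87, 0.84))  # SPM_8
```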

Page 20: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Results

Page 21: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE00001454795, HG_U133_Plus_2

[Figure: exon heatmap with five probes labelled O, N, N, N, N (O = outlier, N = no-outlier).]

Page 22: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE00001191156, HG_U95A

SNP in overlapped probes.

The same SNP is in outlier probes and no-outlier probes.

probe_position_heatmap probe_id snp_id snp_position allele sequence outlier
10 1045_s_at-109-625 rs45612038 14 T/C CTTCAAGAGCATCATGAAGAAGAGT O
9 1045_s_at-237-557 rs45612038 16 T/C ACCTTCAAGAGCATCATGAAGAAGA O
8 1045_s_at-357-497 rs45612038 18 T/C AGACCTTCAAGAGCATCATGAAGAA N
7 1045_s_at-586-137 rs45612038 20 T/C TGAGACCTTCAAGAGCATCATGAAG N
6 1045_s_at-233-503 rs45612038 23 T/C ATATGAGACCTTCAAGAGCATCATG N
5 1045_s_at-153-611 rs45612038 25 T/C ACATATGAGACCTTCAAGAGCATCA N

Page 23: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE0000129003, HG_U133A

SNPs in only no-outlier probes

snp_id probe_id probe_position_heatmap snp_position_probe allele sequence outlier
rs11038 221667_s_at-512-441 10 13 A/G GTTTATGATCTGACCTAGGTCCCCC N
rs6413487 221667_s_at-570-641 9 7 C/G TAAGGACGCTGGGAGCCTGTCAGTT N

Page 24: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE00001416163, HG_U133A (5,374 CEL files)

SNP in only outlier probes

snp_id probe_id probe_position_heatmap snp_position_probe allele sequence outlier
rs13505 219768_at-2-233 8 24 C/A CTGAATTTAGATCTCCAGACCCTGC O
rs13505 219768_at-602-267 9 4 C/A CCTGCCTGGCCACAATTCAAATTAA O

Page 25: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE00001416163, HG_U133_Plus_2 (2,572 CEL files)

SNP in both outlier and no-outlier probes

snp_id probe_id probe_position_heatmap snp_position_probe allele sequence outlier
rs13505 219768_at-765-395 8 24 C/A CTGAATTTAGATCTCCAGACCCTGC N
rs13505 219768_at-507-443 9 4 C/A CCTGCCTGGCCACAATTCAAATTAA O

Page 26: Effect of Single Nucleotide Polymorphism in Affymetrix probes

ENSE00001416163, HG_U133A_2 (159 CEL files)

SNP in only no-outlier probes

snp_id probe_id probe_position_heatmap snp_position_probe allele sequence outlier
rs13505 219768_at-432-225 8 24 C/A CTGAATTTAGATCTCCAGACCCTGC N
rs13505 219768_at-534-259 9 4 C/A CCTGCCTGGCCACAATTCAAATTAA N

Page 27: Effect of Single Nucleotide Polymorphism in Affymetrix probes

~60,000 SNPs distributed in unique exons of ten array designs.

11% in unique exons in which all probes that contain the same SNP are outliers.

5% in which not all the probes containing the same SNP are outliers.

84% in which none of the probes are outliers.

These numbers may vary according to the Ensembl version used and the threshold chosen for outliers.

Page 28: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Cross-validation for HG_U133_Plus_2

Examination of SNP-Outlier Associations

            Outlier (Yes)       Outlier (No)         Total
SNP (Yes)   11.4% (n=1,788)     88.6% (n=13,869)     100%
SNP (No)    11.6% (n=17,231)    88.4% (n=131,035)    100%

Phi = -0.002
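For a 2x2 table with cells a, b in the first row and c, d in the second, the Phi coefficient is (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The sketch below applies this standard formula to the counts reported in the table; the function name is hypothetical.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 contingency table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Rows: SNP yes / SNP no. Columns: outlier yes / outlier no.
phi = phi_coefficient(1788, 13869, 17231, 131035)
print(round(phi, 3))  # prints -0.002
```

A value this close to zero indicates essentially no association between a probe containing a SNP and the probe being an outlier.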

Page 29: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Median differences and positions of SNPs on probes in HG_U133_Plus_2

Page 30: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Median differences and main alleles (A,C,T,G) found in SNPs in HG_U133_Plus_2

Page 31: Effect of Single Nucleotide Polymorphism in Affymetrix probes

We have identified other causes of outlier probes:

Probes containing a contiguous run of 4 or more guanines: formation of G-quadruplexes occurring on the surface of a GeneChip (Upton et al., BMC Genomics (in press)).

Probes located next to bright probes, such as at the edge of the GeneChip, are affected by blur.

Motifs or any other "problematic" subsequences.

Page 32: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Outlier SNP-probes in HG_U133_Plus_2 with "problematic" subsequences (PS): runs of G (>= 4), CCTCC, CCACC, GGTGG

[Figure: two pie charts, one for outlier probes and one for no-outlier probes, each split into "With PS" and "Without PS". The extracted values are 11% / 89% for one chart and 40% / 60% for the other.]
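Screening a probe sequence for the problematic subsequences listed on this slide could look like the following Python sketch. The function name and the dictionary keys are hypothetical, and the example sequence is invented for illustration.

```python
import re

# Motifs named on the slide: runs of 4 or more G, CCTCC, CCACC, GGTGG.
PROBLEMATIC_MOTIFS = {
    "G-run (>=4)": re.compile(r"G{4,}"),
    "CCTCC": re.compile(r"CCTCC"),
    "CCACC": re.compile(r"CCACC"),
    "GGTGG": re.compile(r"GGTGG"),
}

def problematic_subsequences(sequence):
    """Return the names of all listed motifs that occur in the probe sequence."""
    sequence = sequence.upper()
    # Each motif is searched separately so that overlapping motifs are all reported.
    return [name for name, pattern in PROBLEMATIC_MOTIFS.items()
            if pattern.search(sequence)]

# Invented 25-base sequence containing a G run that overlaps GGTGG.
print(problematic_subsequences("AAGGGGTGGAACTTCAAGAGCATCA"))
```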

Page 33: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Conclusions

Page 34: Effect of Single Nucleotide Polymorphism in Affymetrix probes

We have not found a common behaviour when SNPs are present in a probe.

SNPs do not seem to cause outliers in groups of probes representing individual exons.

SNPs may influence other biological events like alternative poly(A).

The genomic region where SNPs are found, the position of the SNP in a probe, the main allele, and the number of SNPs in a probe do not make a probe an outlier in the correlation heatmap.

Page 35: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Bioinformatics Group

Dr Andrew Harrison, Physics
Dr Berthold Lausen, Statistics
Dr Abdel Salhi, Mathematics
Professor Graham Upton, Statistics

Dr William Langdon, Physics and Computer Sc.
Dr Olivia Sanchez, Computer Sc.
Dr Maria Stalteri, Inorganic Chemistry & Bioinformatics

Jose Arteaga-Salas, Statistics
Rohmatul Fajriyah, Statistics
Abdelhak Kheniche, Pharmacology & Mathematics
Rahim Bux Khokhar, Mathematics
Zain-Ul-Abdin Khurho, Mathematics
Farhat Memon, Computer Sc.
Joanna Rowsell, Mathematics

Page 36: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Thank you!

Page 37: Effect of Single Nucleotide Polymorphism in Affymetrix probes

Adjacent probes within a cell on a GeneChip have the same sequence – a run of Guanines will result in closely packed DNA with just the right properties to form quadruplexes.