Introduction to the EMERALD Dataset

Ron Peterson
Anne Bergstrom Lucas, Agilent
Jean Lozach, Illumina
Marc Salit, NIST
Russ Wolfinger, SAS
Walter Liggett, NIST
Jean Thierry-Mieg, NCBI
Danielle Thierry-Mieg, NCBI

MicroArray Quality Control

Shippy, R. et al., Nature Biotechnology 24, 1123-1131 (2006)

Titration Working Group

Ambion Human Brain RNA
Stratagene Universal Human RNA

MAQC Phase II – Conduct a new titration experiment

Agilent performed a one-color analysis of the Phase I material.

More intermediate titrations were added, for a total of 19 samples.

A total of 80 samples were processed.

The experiment was split up and performed over three days.

The first day used a mixture of reagent kits, the second day a fresh kit, and the third day reagents from a colleague.

Ambion Brain (%)    HUR (%)
100                 0
99.5                0.5
99                  1
95                  5
90                  10
80                  20
75                  25
70                  30
60                  40
50                  50
40                  60
30                  70
25                  75
20                  80
10                  90
5                   95
1                   99
0.5                 99.5
0                   100
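The 19-point titration series above can be written down as data and sanity-checked. The following is a minimal sketch: the names `BRAIN_PCT`, `TITRATION`, and `expected_signal` are illustrative, and the linear-mixing formula is an assumption commonly made for titration designs, not something stated on the slide.

```python
# Brain percentages of the 19 titration points, as listed in the table above.
BRAIN_PCT = [100, 99.5, 99, 95, 90, 80, 75, 70, 60, 50,
             40, 30, 25, 20, 10, 5, 1, 0.5, 0]

# Each mixture is (brain %, HUR %); the two components always sum to 100.
TITRATION = [(b, 100 - b) for b in BRAIN_PCT]


def expected_signal(brain_signal, hur_signal, brain_pct):
    """Expected mixture signal for one probe.

    ASSUMPTION: signal is linear in the RNA fraction of each component.
    """
    fraction = brain_pct / 100.0
    return fraction * brain_signal + (1.0 - fraction) * hur_signal


if __name__ == "__main__":
    for brain, hur in TITRATION:
        # 10.0 and 20.0 are made-up pure-sample signals for illustration.
        print(brain, hur, expected_signal(10.0, 20.0, brain))
```

Comparing measured signals against such an expected curve is one way a titration series could be used to check array response, if the linearity assumption holds.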

Evaluation of New Agilent Titration.

[Figure: scatter plot of the arrays, with Discriminate Coordinate 1 on the horizontal axis and Discriminate Coordinate 2 on the vertical axis; points are labeled by number.]

Clustering of the arrays by amplify-label date: 3/2, 3/6, 4/10, 4/12.

Sample Information for the Biological vs. Technical Variation Study

Animal: Rattus norvegicus
Strain: Sprague Dawley Crl:CD(SD)
Age: 7-8 weeks
Sex: Male
Treatment: 20% propylene glycol/80% lactic acid containing 4.3% mannitol, pH 4.0
Duration: intravenous, once per week for 13 weeks
Number: 6

Group    Composition              Samples
A        100% Liver               1A, 2A, 3A, 4A, 5A, 6A
B        75% Liver, 25% Kidney    1B, 2B, 3B, 4B, 5B, 6B
C        25% Liver, 75% Kidney    1C, 2C, 3C, 4C, 5C, 6C
D        100% Kidney              1D, 2D, 3D, 4D, 5D, 6D

Study Design

Each sample was run in triplicate.

• E.g., 1-A-1, 1-A-2, 1-A-3, etc.

• Each sample was placed into a single well of a 96-well plate in a randomized pattern.

• Remaining wells were filled with samples made by pooling animals.

- 1-3A, B, C, D and 4-6A, B, C, D; single samples.

- 1-6A, B, C, D; each sample repeated 4 times on the plate.
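The well counts in this design add up to exactly one 96-well plate: 6 animals x 4 mixtures x 3 replicates = 72, plus 8 single pooled samples, plus 4 pooled samples x 4 repeats = 16. The sketch below enumerates the samples and assigns them to wells at random. The label format for the pooled samples, the well naming (rows A-H, columns 1-12), and the seeded shuffle are assumptions for illustration; the slides do not say how the randomization was actually done.

```python
import random

MIXTURES = ["A", "B", "C", "D"]   # 100% liver ... 100% kidney
ANIMALS = [1, 2, 3, 4, 5, 6]


def build_samples():
    """Enumerate every sample placed on one plate."""
    samples = []
    # Individual animals in triplicate: 6 x 4 x 3 = 72 wells.
    for animal in ANIMALS:
        for mix in MIXTURES:
            for rep in (1, 2, 3):
                samples.append(f"{animal}-{mix}-{rep}")
    # Pools of animals 1-3 and 4-6, one well each: 2 x 4 = 8 wells.
    for pool in ("1-3", "4-6"):
        for mix in MIXTURES:
            samples.append(f"{pool}{mix}")
    # Pool of animals 1-6, repeated 4 times: 4 x 4 = 16 wells.
    # The "-rep" suffix is a hypothetical label, not from the slides.
    for mix in MIXTURES:
        for rep in (1, 2, 3, 4):
            samples.append(f"1-6{mix}-{rep}")
    return samples


def randomize_plate(samples, seed=0):
    """Map well names (A1..H12) to samples in a random order."""
    wells = [f"{row}{col}" for row in "ABCDEFGH" for col in range(1, 13)]
    if len(samples) != len(wells):
        raise ValueError("sample count must match the 96 wells")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(wells, shuffled))


if __name__ == "__main__":
    layout = randomize_plate(build_samples())
    print(len(layout), "wells filled")
```

A fixed seed makes the layout reproducible, which would let the same randomized pattern be reused across duplicate plates.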

Three duplicate plates were produced, and one plate was processed on each of the following platforms:

• Affymetrix Rat Genome 230 2.0 arrays

• Agilent Whole Rat Genome Oligo Microarray (4x44K) [G4131F]

• Illumina RatRef-12 v1 Expression BeadChip

Affymetrix Study Design

Chips were processed at the Novartis Institutes for Biomedical Research.

The 96-well plate was processed on an Affymetrix GCAS robot.

The 96 chips (in lots of 48) were washed on 12 fluidics stations.

Each 48-chip lot was scanned on one of two Affymetrix scanners.

• Technical variation parameters:

- Sample plate location.

- Affymetrix chip lot.

- Fluidics station location.

- Scanner used.

- Day processed (8 of the 96 chips were processed on a different date by rehybridizing the hybridization mix on a new chip).

Agilent Study Design

Chips were processed at Agilent.

Two different technicians processed the arrays.

12 chips (48 arrays) were processed on 2 different days.

A single scanner was used.

• Technical variation parameters:

- Technician.

- Day processed.

- Chip and reagent lots.

- Substrate.

- Starting total RNA amount (400 ng vs 200 ng).

Illumina Study Design

Chips were processed at Asuragen (service provider).

Two different technicians processed the arrays.

8 chips (96 arrays) were processed on 4 different days with 4 different kits.

A single scanner was used.

• Technical variation parameters:

- Technician.

- Day processed.

- Chip and reagent lots.

- Location on the chip.

- cRNA yield.

Data Access Links

The data and supporting material are available from ArrayExpress:

Affymetrix

• http://www.ebi.ac.uk/microarray-as/aer/result?queryFor=Experiment&eAccession=E-TABM-536

Agilent

• http://www.ebi.ac.uk/microarray-as/aer/result?queryFor=Experiment&eAccession=E-TABM-555

Illumina

• http://www.ebi.ac.uk/microarray-as/aer/result?queryFor=Experiment&eAccession=E-TABM-554

Current Plans of MAQC Phase II Titration Group

Jean and Danielle Thierry-Mieg (NCBI) have done a complete annotation of the rat genes in AceView and identified the alternative transcripts tested on all three rat arrays (http://www.aceview.org/index.html?rat).

The mapping, available at ftp://ftp.ncbi.nlm.nih.gov/repository/acedb/rat, will be used to identify the groups of probes from the three array platforms testing the same transcripts and genes.
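Grouping probes that test the same gene across platforms can be sketched as a simple index. The row format `(platform, probe_id, gene)`, the probe identifiers, and the gene names below are all hypothetical; the actual layout of the AceView mapping files is not described in these slides.

```python
from collections import defaultdict

PLATFORMS = ("affymetrix", "agilent", "illumina")

# Made-up rows for illustration only; not real probe or gene identifiers.
EXAMPLE_ROWS = [
    ("affymetrix", "affy_probe_1", "GeneX"),
    ("agilent", "agil_probe_7", "GeneX"),
    ("illumina", "ilmn_probe_3", "GeneX"),
    ("affymetrix", "affy_probe_2", "GeneY"),
    ("agilent", "agil_probe_9", "GeneY"),
]


def group_probes_by_gene(rows):
    """Return {gene: {platform: [probe_id, ...]}} from mapping rows."""
    groups = defaultdict(lambda: defaultdict(list))
    for platform, probe_id, gene in rows:
        groups[gene][platform].append(probe_id)
    return groups


def genes_on_all_platforms(groups, platforms=PLATFORMS):
    """Genes that have at least one probe on every listed platform."""
    return sorted(
        gene for gene, by_platform in groups.items()
        if all(p in by_platform for p in platforms)
    )


if __name__ == "__main__":
    groups = group_probes_by_gene(EXAMPLE_ROWS)
    print(genes_on_all_platforms(groups))
```

Restricting a comparison to genes covered on every platform is one possible design choice; a transcript-level grouping would follow the same pattern with a transcript identifier as the key.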

We will use this correspondence to contrast the performance of the arrays in their ability to identify biological and technical variation.
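One common way to separate biological from technical variation for a single probe is a one-way random-effects variance decomposition, with animals as the biological factor and replicate wells as technical repeats. The sketch below assumes a balanced design (the same number of replicates per animal) and uses the standard ANOVA estimators; the slides do not state which analysis method the group planned to use, and the function name and input layout are illustrative.

```python
from statistics import mean


def variance_components(values):
    """Estimate (biological, technical) variance for one probe.

    values: list of rows, one row per animal, each row holding that
    animal's technical replicates (balanced design assumed).
    """
    n_animals = len(values)
    n_reps = len(values[0])
    if n_animals < 2 or n_reps < 2:
        raise ValueError("need at least 2 animals and 2 replicates")
    if any(len(row) != n_reps for row in values):
        raise ValueError("balanced design required")

    animal_means = [mean(row) for row in values]
    grand_mean = mean(animal_means)

    # Between-animal and within-animal sums of squares.
    ss_between = n_reps * sum((m - grand_mean) ** 2 for m in animal_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(values, animal_means) for x in row
    )

    ms_between = ss_between / (n_animals - 1)
    ms_within = ss_within / (n_animals * (n_reps - 1))

    technical = ms_within
    # The ANOVA estimate can go negative by chance; clip it at zero.
    biological = max((ms_between - ms_within) / n_reps, 0.0)
    return biological, technical
```

For the design described earlier, each row would hold the three replicate measurements of one animal within one mixture group. Additional technical factors (day, technician, chip lot) would call for a fuller mixed model than this sketch.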