
Design Considerations for Array CGH to Oligonucleotide Arrays

R. A. Baldocchi,1,4* R. J. Glynne,2,5 K. Chin,1 D. Kowbel,1 C. Collins,1 D. H. Mack,2,6 and J. W. Gray1,3

1University of California at San Francisco Cancer Center, San Francisco, California

2Eos Biotechnology, Inc., South San Francisco, California

3Lawrence Berkeley National Laboratory, Berkeley, California

4DakoCytomation California, Inc., Carpinteria, California

5Genomics Institute of the Novartis Research Foundation, San Diego, California

6Alta Partners, San Francisco, California

Received 5 March 2005; Revision Received 22 March 2005; Accepted 23 March 2005

Background: Representational oligonucleotide microarray analysis has been developed for detection of single nucleotide polymorphisms and/or for genome copy number changes. In this process, the intensity of hybridization to oligonucleotide arrays is increased by hybridizing a polymerase chain reaction (PCR)–amplified representation of reduced genomic complexity. However, hybridization to some oligonucleotides is not sufficiently high to allow precise analysis of that portion of the genome.

Methods: In an effort to identify aspects of oligonucleotide hybridization affecting signal intensity, we explored the importance of the PCR product strand to which each oligonucleotide is homologous and the sequence of the array oligonucleotides. We accomplished this by hybridizing multiple PCR-amplified products to oligonucleotide arrays carrying two sense and two antisense 50-mer oligonucleotides for each PCR amplicon.

Results: In some cases, hybridization intensity depended more strongly on the PCR amplicon strand (i.e., sense vs. antisense) than on the detection oligonucleotide sequence. In other cases, the oligonucleotide sequence seemed to dominate.

Conclusion: Oligonucleotide arrays for analysis of DNA copy number or for single nucleotide polymorphism content should be designed to carry probes to sense and antisense strands of each PCR amplicon to ensure sufficient hybridization and signal intensity. © 2005 International Society for Analytical Cytology

Key terms: comparative genomic hybridization; microarrays; multiplex polymerase chain reaction; genomic amplification; single nucleotide polymorphism

Hybridization of genomic DNA to arrayed DNA probes is now widely used to assess genome copy number changes and/or allelotype throughout complex genomes. In mammalian analyses, initial arrays for copy number analysis carried relatively large DNA probes prepared from yeast artificial chromosome (YAC) (1) and bacterial artificial chromosome (BAC) (2,3) clones. These array comparative genomic hybridization (array CGH) analyses provided sufficient measurement precision to detect gain or loss of single copies of a genomic segment in aneuploid cells or in populations contaminated by normal genomic DNA (3,4). Early analyses provided approximately megabase resolution with a few thousand probes on each array (5). Higher resolution arrays were developed for limited regions of the genome by adding overlapping BACs that covered regions of interest (6). Most recently, BAC arrays have been developed carrying more than 30,000 BACs, thus providing near contiguous coverage over the entire human genome (7).

BAC arrays provide remarkable measurement precision and resolution. However, they are limited in several aspects. First, production of arrays carrying large numbers of BACs is labor intensive and clone management issues sometimes lead to probe identity errors. In addition, these arrays provide no information about allelotype and the results cannot be linked to specific genes because each BAC probe typically spans several genes. Higher resolution can be obtained by replacing BACs with cDNAs (8), but the array manufacture and clone management problems remain and these analyses also yield no information about allelotype.

All of these limitations can be overcome by substituting oligonucleotides for cDNAs or BACs as array probes. The challenge in this approach is to achieve sufficiently high hybridization and signal intensity to allow precise measurements of allelotype or copy number. This has been overcome by reducing the complexity of the genomic DNA being hybridized. One approach, termed Representational Oligonucleotide Microarray Analysis (ROMA), reduces genomic complexity by preparing a polymerase chain reaction (PCR)–amplified representation of a fraction of the genome to be analyzed (9). This is accomplished by digesting with one or more restriction enzymes, ligating common PCR primer sequences to both ends of each restriction fragment, and amplifying the resulting complex mixture by PCR with the common primer sequence. Hybridization of this amplified mixture to an array carrying probes that match the amplified restriction fragments allows assessment of copy number (9) and allelotype (10). However, the hybridization efficiency and, hence, the measurement precision may vary considerably between probes so that some regions of the genome are analyzed with considerably more precision than others.

*Correspondence to: R. A. Baldocchi, DakoCytomation California, Inc., 6392 Via Real, Carpinteria, CA 93013.

E-mail: [email protected]

Contract grant sponsor: USPHS; Contract grant numbers: CA58207, CA 64602.

Published online 12 September 2005 in Wiley InterScience (www.interscience.wiley.com).

DOI: 10.1002/cyto.a.20161

© 2005 International Society for Analytical Cytology. Cytometry Part A 67A:129–136 (2005)

Our goal in this study was to develop a more robust oligonucleotide array CGH procedure by understanding some of the factors that influence signal intensity. We approached this by evaluating hybridization of reduced complexity genomic mixtures to arrays composed of 50-mer oligonucleotides. We produced reduced complexity probes by multiplex PCR amplification of selected regions of the genome. Duplicates of four separate oligonucleotide probes were placed on the array for each PCR amplicon: two homologous to the sense strand and two complementary sequences homologous to the antisense strand. Hybridization intensities were measured for each of the four probes for each amplicon. We evaluated the performance of this approach to oligo-array CGH by analyzing X-chromosome copy number in cell lines with one to five copies of the X chromosome and by comparing oligo-array CGH to BAC array CGH for assessment of genome copy number abnormalities on chromosome 20 in a well-characterized breast cancer cell line.

MATERIALS AND METHODS

Oligonucleotide Arrays

We developed oligonucleotide arrays carrying four 50-mer detection oligonucleotide probes for each PCR amplicon as illustrated in Figure 1. Two of the four probes were homologous to the amplicon sense strand. The other two probes were complementary to the first two. Sequences for the oligonucleotide probes were selected using Primer3 software (11). They were 50 nucleotides in length, did not overlap with the PCR primers, had melting temperatures (Tm) between 73°C and 95°C (at 500 mM NaCl, calculated as described previously) (12), and had free energies greater than −2.5 kcal/mol for hairpin loop structure. Oligonucleotide arrays were printed on Motorola (Surmodics) 3D-Link slides as described by the manufacturer with the exception that oligonucleotides were resuspended at 30 µM in 10 mM sodium phosphate buffer (pH 8.5). Each oligonucleotide was printed in replicate spots (Fig. 1) at ~150 µm center-to-center spacing and allowed to dry. The amino-modified 5′ ends of the printed oligonucleotides were covalently attached after printing by exposing the printed slides to 75% humidity. This humidification also serves to block inappropriate attachment of DNA in later steps.
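The pairing of sense and antisense probes described above can be illustrated with a minimal Python sketch. It assumes that each antisense probe is simply the reverse complement of its sense 50-mer, which is how we read "complementary to the first two". The function names `reverse_complement` and `probe_set` are hypothetical and are not taken from the original work. Tm and hairpin free-energy screening (done with Primer3 in the study) are not reproduced here.

```python
# Illustrative sketch only: derive antisense probes from sense-strand 50-mers.
# Assumption: the antisense probe is the exact reverse complement of the sense probe.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(COMPLEMENT)[::-1]


def probe_set(sense_probes: list[str], length: int = 50) -> dict[str, list[str]]:
    """Pair each sense probe with its complementary antisense probe.

    Raises ValueError if any probe does not have the expected length.
    """
    for probe in sense_probes:
        if len(probe) != length:
            raise ValueError(f"probe is {len(probe)} nt, expected {length}")
    return {
        "sense": [p.upper() for p in sense_probes],
        "antisense": [reverse_complement(p) for p in sense_probes],
    }
```

With two sense 50-mers per amplicon, `probe_set` returns the four probes per amplicon that the array design calls for.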

Multiplex PCR Primers

The primer design strategy used for multiplex PCR amplification is illustrated in Figure 1. One PCR primer pair was prepared for each region of the genome to be interrogated. Each primer was composed of a ~25-nucleotide, unique 3′ sequence and a 23-nucleotide, universal 5′ sequence (13). The universal primer sequences were T3 (5′-TAATACGACTCACTATAGGGAGA) and T7 (5′-AATTAACCCTCACTAAAGGGAGA) sequences described by Wang et al. (14). Only coding sequences were interrogated in this study but this is not necessary. PCR primers were selected using Primer3 software (11) to produce PCR products from 170 to 350 bp in length and to have (a) ~25-nucleotide locus-specific segments, (b) Tm between 60°C and 73°C at 100 mM NaCl (15), (c) differences in Tm between paired primers of lower than 5°C, and (d) free energies greater than 0 kcal/mol for 3′-end dimer formation and greater than −2.5 kcal/mol for hairpin loop structure.

FIG. 1. Method of multiplex and quantitative PCR. A schematic PCR amplicon illustrates the relative positions of primer pairs and detection oligonucleotides (probes), shown adjacent to their complementary sequences. The sequences of four probes per amplicon are flanked by those of composite PCR primers. The 5′-amino–modified probes were synthesized and printed as array elements onto 3D-Link (Surmodics) microscope slides. Genomic DNA was PCR amplified using a collection of 12 to 48 primer pairs that are composed of sequence-specific portions in tandem with T3 or T7 universal primer sequences. These universal primers were used in a subsequent PCR step. Test and reference genomic samples were separately amplified. Fluorescent nucleotides were used to separately label the PCR products in a Klenow reaction using random hexamers. Labeled products representing the test and reference samples were cohybridized to the arrays.
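The composite primer layout and the numeric selection criteria above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline. The two universal sequences are copied from the text and labeled as the text labels them. The helper names are hypothetical, Tm values are assumed to be computed elsewhere (for example by Primer3) and passed in, and which primer of a pair receives which universal tail is not stated in the text.

```python
# Illustrative sketch only. Universal 5' tails as given in the text (23 nt each).
UNIVERSAL_5P = {
    "T3": "TAATACGACTCACTATAGGGAGA",
    "T7": "AATTAACCCTCACTAAAGGGAGA",
}


def composite_primer(universal: str, locus_specific: str) -> str:
    """Join a universal 5' tail to a locus-specific 3' segment (written 5'->3')."""
    return UNIVERSAL_5P[universal] + locus_specific.upper()


def passes_pair_rules(amplicon_bp: int, tm_first: float, tm_second: float) -> bool:
    """Check the numeric criteria from the text for one primer pair.

    Assumptions: amplicon_bp is the PCR product length and both Tm values
    (in degrees C, at 100 mM NaCl) were computed by an external tool.
    Free-energy criteria for dimers and hairpins are not checked here.
    """
    size_ok = 170 <= amplicon_bp <= 350
    tm_ok = all(60.0 <= tm <= 73.0 for tm in (tm_first, tm_second))
    delta_ok = abs(tm_first - tm_second) < 5.0
    return size_ok and tm_ok and delta_ok
```

A 25-nucleotide locus-specific segment joined to either tail gives a 48-nucleotide composite primer, consistent with the ~25 + 23 nucleotide layout described above.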

Choice of Query Loci

Chromosome and array CGH analyses have shown that chromosome 20q is frequently amplified in ovarian and other cancers. For example, Suzuki et al. (16) showed that this region is present at increased copy number in ~70% of ovarian cancers. Thus, we developed an oligonucleotide array and multiplex PCR strategy to assess 33 known or predicted genes in this region. This analysis system also interrogated 5 loci on the X chromosome and 10 loci on other chromosomes. The X-linked loci were used to evaluate and optimize the sensitivity and linearity of oligo-array CGH. The 10 loci on other chromosomes were used to normalize ratio data. Sequence data for each locus were taken from the National Center for Biotechnology Information (NCBI) databases (http://www.ncbi.nlm.nih.gov/), including dbEST, Unigene, dbSTS, nr, and htgs, and from the University of California, Santa Cruz (UCSC) Human Genome Project Working Draft (http://genome.ucsc.edu/). Sequences used in this work were blasted against several of these databases three times throughout the course of the work, to test for uniqueness within the genome as the databases were rebuilt.

Cell Lines

We obtained female human genomic DNA samples representing the normal or aneuploid genotypes 45XO, 46XX, 47XXX, 48XXXX, and 49XXXXX from The Coriell Institute for Medical Research (NIGMS Coriell Cell Repositories; Camden, NJ, USA). The breast cancer cell line MCF7 was obtained from ATCC (Rockville, MD, USA).

Multiplex PCR

Twelve to 48 loci were PCR amplified from normal, aneuploid, and tumor genomic DNAs in a two-step amplification protocol. In the first step, 10-µl volumes containing 2 µl of 5× modified RDA buffer (17) (340 mM Tris-HCl, pH 8.8; 80 mM [NH4]2SO4, 20 mM MgCl2, 320 µM each dNTP; all from Sigma-Aldrich Corp., St. Louis, MO, USA; and 50 mM 2-mercaptoethanol from Promega, Madison, WI, USA), 0.038 U of AmpliTaq (Applied BioSystems, Foster City, CA, USA), 25 nM each primer, and 10 ng genomic DNA were assembled on ice in a cold (4°C) room. Template genomic DNA was pretreated by heat shearing (95°C for 5 min in 2 mM Tris-HCl, 0.2 mM ethylenediaminetetraacetic acid, pH 8.0) to enhance the PCR yield and uniformity. To minimize nonspecific amplification, the primer pools were heated to 65°C for 1 min and then snap-cooled on ice before their addition to the master mix. Primers and template genomic DNA were added in the cold room and the reactants were immediately inserted into a thermocycler (model 480, Perkin Elmer, Oak Brook, IL, USA) preheated to 80°C before the start of thermocycling. The parameters for amplification included an initial denaturation step of 95°C for 1 min, 14 to 22 cycles at 94°C for 30 s and at 62°C for 3 min, an extension step at 72°C for 7 min, and a 4°C soak. Three microliters of this reaction product was then added, in the cold room, to a second tube of PCR reactants that contained 6 µl of 5× modified RDA buffer, 0.12 U of AmpliTaq, and 500 nM each T3 and T7 primers (14) in a final volume of 30 µl. The second step of PCR was performed at 94°C for 30 s, 45°C for 1 min, and 72°C for 2 min, for 2 to 18 cycles. Primers used in this study can be obtained on request.

Successful PCR was assessed by 1.5% agarose gel electrophoresis in 0.5× Tris-Borate EDTA (TBE). Before electrophoresis, PCR products were purified using Qiagen PCR Purification Kit (Qiagen, Chatsworth, CA, USA). To 30 µl of PCR reaction, 500 µl of Qiagen "buffer PB" and 5 µl of 3 M sodium acetate (pH 5.4) were added before the first spin. The sodium acetate solution was added to adjust the pH to ~7, to enable higher yield of the PCR product. After elution from the column using 0.5× Qiagen "buffer EB," the concentrate was adjusted to 40 ng/µl and electrophoresed. Products of individual and pooled pairs of all primers were evaluated for yield and abundance relative to unwanted primer dimers.
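As a quick arithmetic check of the two-step reaction setup above, the short Python sketch below applies the usual dilution relation (final concentration = stock concentration × volume added / total volume). The function name `final_strength` is hypothetical. The buffer fold-strength, MgCl2 concentration, and volumes are the figures quoted in the text, and only volume ratios enter the calculation.

```python
# Illustrative sketch only: dilution arithmetic for the two PCR steps.
def final_strength(stock: float, volume_added: float, total_volume: float) -> float:
    """Concentration after dilution: stock * volume_added / total_volume.

    Both volumes must be expressed in the same unit.
    """
    if total_volume <= 0 or not 0 <= volume_added <= total_volume:
        raise ValueError("volumes must satisfy 0 <= volume_added <= total_volume, total > 0")
    return stock * volume_added / total_volume


# Step 1: 2 volumes of 5x buffer in a 10-volume reaction.
step1_buffer = final_strength(5, 2, 10)
# Step 2: 6 volumes of 5x buffer in a 30-volume reaction.
step2_buffer = final_strength(5, 6, 30)
# MgCl2 contributed by the 5x buffer (20 mM stock) in step 1, in mM.
step1_mgcl2_mM = final_strength(20, 2, 10)
# Fraction of the step 2 reaction made up of step 1 product (3 of 30 volumes).
carry_over_fraction = 3 / 30
```

Both steps come out at 1× buffer, and the first-step product makes up one tenth of the second-step reaction volume.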

PCR Product Labeling and Hybridization

Two hundred nanograms of PCR product was labeled using a Klenow reaction, including 1 µl 10× Random Priming Buffer (Amersham Biosciences, Arlington Heights, IL, USA), 1 µg random hexamers (Invitrogen, La Jolla, CA, USA), 25 µM each dATP, dCTP, and dGTP, 10 µM dTTP (Invitrogen, Carlsbad, CA, USA), and 30 µM of Cy5-dUTP or Cy3-dUTP (Amersham Biosciences) in a total volume of 9.4 µl. The reactants were heated in a PCR machine to 94°C for 1 min to denature the template. Upon cooling to 37°C in normal maximum ramp time, 0.4 µl of Klenow Fragment (3′→5′ exo−; Amersham Biosciences) was added to each tube, and the reaction was incubated at 37°C for 1 to 2 h. The labeled test (Cy5) and reference (Cy3) products were combined, applied in a 50-µl volume (adjusted with nuclease-free water) to Microspin G50 columns (Amersham Biotech), and spun according to the manufacturer's protocol. The incorporation of Cy-dUTPs was demonstrated by electrophoresis on mini (22 × 22 mm) agarose gels poured on microscope slides. Briefly, 3.0% 3:1 NuSieve (Sigma-Aldrich) gels were formed under a coverslip with a controlled thickness of 1 mm. Lanes were loaded with a total of 3.0 µl containing 0.6 µl 30% glycerol in nuclease-free water, 1.4 µl labeled sample, and 1.0 µl 0.4% warm (40°C) agarose in exonuclease-free water. The mini gels were run at 150 V for 12 min and, after removing the coverslip, were dried for 15 min at 80°C on a heat block. These dried gels were then scanned on the laser scanner. The relative abundance of incorporated and unincorporated fluorophores could thus be qualitatively and quantitatively monitored. The remainder of the purified, labeled probe was combined with 25 nmol of each dNTP, 1.5 µmol of sodium pyrophosphate, 2.3 µl 20× standard saline citrate (SSC), and 1 µg salmon sperm DNA before concentrating the probe by lyophilization. After adjusting the volume to 15 µl, 0.38 µl of 10% sodium dodecylsulfate was added and the sample was denatured at 95°C for 2 min and allowed to cool gradually to room temperature for 5 min. After blocking the surfaces of the 3D-Link slides in accordance with the manufacturer's protocol, the denatured sample was spun to collect the condensation, mixed with a pipette, applied to each array, and covered with a 22- × 22-mm coverslip. Hybridization was carried out in a custom-built humidified chamber (Protein Design Labs Inc., Fremont, CA, USA) in a 64°C water bath for 1 to 2 h. Stringent washing done at room temperature included 2 min in 3× SSC, 0.03% sodium dodecylsulfate, 5 min in 1× SSC, and 5 min in 0.2× SSC. Slides were dried by centrifugation (3 min at 800 rpm) before imaging with a laser scanner.

Analysis of Array Hybridization Results

Slides were scanned at 532 nm and at 635 nm using a GenePix 4000B laser scanner (Axon Instruments, Inc., Union City, CA, USA) and analyzed using the bundled GenePix Pro 3.0 segmentation software. Scanning was performed at a laser power setting of 30%, and the photomultiplier tube setting was adjusted independently for each laser channel and for each hybridization experiment, so that the brightest spots were 50% to 80% of saturation. Segmentation of spots was performed using the algorithms in this software, including alignment and spot-fitting features. Mean pixel intensity values were taken as the mean raw intensity of the population of the pixels in each segmented region containing a spot. We corrected for nonspecific background fluorescence by subtracting intensities of spots containing "null" oligonucleotides (expected not to hybridize to any sequences in the labeled probe). This subtraction of signal due to the nonspecific hybridization component of mean pixel intensity values gave null-subtracted intensity (NSI) values. If an NSI value was less than 50% greater than the median intensity of the null set, it was discarded as not informative. If informative, NSI values corresponding to a given locus from each of the fluorescence channels (where 635 nm represented the test samples and 532 nm represented the reference samples) were used to calculate a test/reference ratio describing the relative abundance for each locus. Ratios of NSI values were averaged among four replicate spots of detection oligonucleotides. Means and standard deviations of NSI ratios were compared between sets of replicate spots, alternate detection oligonucleotide sequences per locus, alternate sense and antisense strands for each detection oligonucleotide, and various loci.
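The NSI calculation described above can be sketched in Python as shown below. This is one possible reading of the procedure, not the authors' code. Two points are assumptions: the background subtracted in each channel is taken to be the median of the null spots, and a spot is treated as informative only if its NSI is at least 1.5 times that null median in both channels. The function name `nsi_ratio` and the example intensities are hypothetical.

```python
# Illustrative sketch only: null-subtracted intensity (NSI) ratios for one probe.
from statistics import mean, median, stdev


def nsi_ratio(test_raw, ref_raw, test_null, ref_null, min_excess=0.5):
    """Return (mean ratio, SD, number of informative spots), or None if no spot passes.

    test_raw / ref_raw: mean pixel intensities of replicate spots (635 nm / 532 nm).
    test_null / ref_null: intensities of the "null" oligonucleotide spots per channel.
    Assumption: background is the null-set median, and a spot is kept only if
    its NSI is at least (1 + min_excess) times that median in both channels.
    """
    test_bg, ref_bg = median(test_null), median(ref_null)
    ratios = []
    for t, r in zip(test_raw, ref_raw):
        t_nsi, r_nsi = t - test_bg, r - ref_bg
        informative = (
            t_nsi >= (1 + min_excess) * test_bg
            and r_nsi >= (1 + min_excess) * ref_bg
            and r_nsi > 0
        )
        if informative:
            ratios.append(t_nsi / r_nsi)
    if not ratios:
        return None
    sd = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), sd, len(ratios)


# Made-up intensities for four replicate spots; the fourth is too dim to be informative.
example = nsi_ratio(
    test_raw=[1110, 1310, 1210, 200],
    ref_raw=[600, 700, 650, 180],
    test_null=[100, 120, 110],
    ref_null=[90, 100, 110],
)
```

In this made-up example three of the four replicate spots pass the filter and each gives a test/reference ratio of 2. Other readings of the filter (for example, applying it to raw rather than null-subtracted intensity) would only change the `informative` condition.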

RESULTS

Our main goal in this study was to explore aspects of oligo-array CGH that influenced hybridization intensity and linearity to guide development of improved strategies for genome copy number and single nucleotide polymorphism analyses. We assessed the relative importance of oligonucleotide probe sequence and PCR amplicon sequence on hybridization intensity during oligonucleotide array CGH analysis of several different genes. In addition, we assessed the linearity of oligo-array CGH during analysis of X-chromosome sequences in cell lines carrying 1, 2, 3, 4, or 5 copies of the X chromosome. We compared oligo-array CGH and BAC array CGH for assessment of genome copy number along a region of chromosome 20 that is frequently amplified in breast and ovarian cancers.

We produced low complexity DNA for hybridization by multiplex PCR amplification of up to 48 different segments of the genome. We employed a two-step PCR procedure to balance representation among query sequences and to enhance linearity of the assay. The first PCR step involved a pool of 12 to 48 pairs of primers. This effectively amplified all of the PCR targets. However, because PCR primers tend to amplify with variable efficiency, complexity may become reduced with extended amplification. To prevent this, we limited the number of cycles in the first step and added a second step of PCR using modified T3 and T7 primers to produce sufficient material for labeling and hybridization.

In initial multiplex PCR experiments, primer-dimer formation to the exclusion of product of interest was a problem. We investigated the basis of primer-dimer formation by cloning and sequencing the primer-dimer products. These analyses revealed that primer dimers formed between primers with 3′-complementarity involving at least five nucleotides. Thus, some primers were redesigned so that none had 3′-complementarity of more than four nucleotides. We also found that the number of cycles using the multiplex primer pool affected primer-dimer formation. We found that using 20-cycle amplification with sequence-specific primers, before switching to amplification using the universal primers, increased the yield of desired product (mainly ~300- to 400-bp range) relative to that of a primer-dimer side product (~90 to 100 bp).

We adjusted the number of cycles in the second step of the PCR amplification with T3 and T7 primers to allow linear, quantitative analysis. We tested the linearity of amplification and oligo-array CGH analysis by comparing copy number estimates at seven loci on chromosome 20 with five loci on the X chromosome for a cell line with four copies of the X chromosome and two copies of chromosome 20. Figure 2 summarizes results derived from various cycle numbers in the second step. Best linearity of detection was achieved using DNA prepared by second-step amplification of 6 to 10 cycles.

We determined the extent of bias in hybridization and signal intensity due to the use of different dye-conjugated dUTPs by separately labeling two 200-ng aliquots of a multiplex PCR product with Cy5- or Cy3-dUTP. Labeled products were then mixed and cohybridized to arrayed oligonucleotides. We observed (data not shown) that fluorescent intensity variability associated with using alternate Cy3- and Cy5-dUTP conjugates was insignificant. We also observed some reaction-to-reaction variability during PCR as evidenced by starting with the same template and independently amplifying, labeling with only Cy3-dUTP, and hybridizing. Although this seemed to be a minor problem, we found that this could be diminished by pooling 10 separate multiplex PCR reactions before labeling. Because this pooling was deemed impractical, most of the data reported here were obtained without employing such pooling.
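The primer-dimer finding reported earlier in this section (dimers formed between primers whose 3′ ends were complementary over at least five nucleotides) suggests a simple screen of a primer pool. The Python sketch below is one simplified interpretation, assuming that only perfectly matched, antiparallel overlap of the two 3′ termini matters. The function names and the example primers are hypothetical, and real tools such as Primer3 score 3′ complementarity in a more detailed way.

```python
# Illustrative sketch only: flag primer pairs with 3'-terminal complementarity.
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(a: str, b: str, max_len: int = 12) -> int:
    """Largest k such that the last k bases of a and b pair perfectly, antiparallel."""
    a, b = a.upper(), b.upper()
    best = 0
    for k in range(1, min(len(a), len(b), max_len) + 1):
        if a[-k:] == reverse_complement(b[-k:]):
            best = k
    return best


def risky_pairs(pool: dict[str, str], threshold: int = 5):
    """List (name_a, name_b, overlap) for pairs, including self-pairs, at or above threshold."""
    hits = []
    for name_a, name_b in combinations_with_replacement(sorted(pool), 2):
        k = three_prime_overlap(pool[name_a], pool[name_b])
        if k >= threshold:
            hits.append((name_a, name_b, k))
    return hits


# Made-up primers: p1 and p2 have 3' ends that pair over five bases.
example_pool = {"p1": "TTTTTACCGT", "p2": "CCCCCACGGT", "p3": "AAAAAAAAAA"}
flagged = risky_pairs(example_pool)
```

Here only the p1/p2 pair is flagged. Primers flagged this way would be candidates for redesign so that no pair in the pool exceeds four complementary 3′ nucleotides.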

We investigated the extent to which hybridization and signal intensity varied between the sense and the complementary antisense oligonucleotide probes. In some cases, the signal intensity varied up to 16-fold between sense and antisense probes within one PCR amplicon, whereas the variability between alternate sense probes or alternate antisense probe sequences within a PCR amplicon was much less. Figure 3A, for example, shows this to be the case for PCR amplicons distributed along the gene, CDC25B. In other cases, hybridization intensities varied most between oligonucleotides complementary to the same strand of the PCR amplicon. Figure 3B shows this to be the case for sense and antisense probes for several amplicons distributed along the gene, ZNF217.

We assessed the linearity of oligo-array CGH carried out as described above by analyzing cell lines carrying from one to five copies of the X chromosome and two copies of chromosome 20. A karyotypically normal, 46,XX cell line was used as the reference in these experiments. The results are summarized in Figure 4. Seven chromosome 20 and five X chromosome measurements were averaged for this analysis. These analyses were part of a study in which 48 separate loci were analyzed simultaneously, i.e., hybridization probe was prepared using 48-plex PCR amplification. The estimated chromosome X copy number measured in this study increased monotonically with the number of X chromosomes. However, the magnitude of the estimated copy number increase was somewhat lower than expected at higher X-chromosome copy numbers. The reasons for this attenuation are not fully understood.

We compared the copy number measured using oligo-array CGH and BAC array CGH. BAC array CGH analyses were carried out as described previously (3). Figure 5 compares measurements of copy number at several loci along chromosome 20 for the breast cancer cell line, MCF7. This cell line was chosen for this comparison because chromosome 20 is highly amplified (18), and this amplicon is well characterized. Landmark gene loci are indicated for orientation. In general, oligo-array CGH and BAC array CGH estimates of copy number were concordant.

FIG. 2. Optimizing the linearity of the method depends on the determination of appropriate number of PCR cycles. We tested the ability of our method to detect the difference between four copies and two copies of the X chromosome. These data are depicted as the log2 of ratios of Cy5/Cy3 fluorescence. Values represent the relative copy number of seven chromosome 20 and five chromosome X loci. Data points are averages of four replicate spots, with standard deviation bars. Each locus has four such averages, representing two different 50-mer probes, and two different locations on the microscope slide where replicate sets were printed. Results shown are the result of (A) 20 + 2 cycles, (B) 20 + 6 cycles, (C) 20 + 10 cycles, and (D) 20 + 14 cycles. Our results demonstrate that the capacity to detect twofold differences in copy number is optimal in the range of approximately 20 + 6 to 20 + 10 cycles of PCR (approximately equivalent to 23 to 27 cycles in total; see Materials and Methods).
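For the linearity experiment described above, the ideal log2 test/reference ratio for a sample with n copies of the X chromosome, measured against a 46,XX reference, is log2(n/2). The Python sketch below tabulates these expected values and shows one way to summarize attenuation as a single slope. The slope measure is our illustrative choice, not a statistic reported in the paper, and the "observed" values are made up for demonstration only.

```python
# Illustrative sketch only: expected log2 ratios and a simple attenuation summary.
from math import log2


def expected_log2_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Ideal log2(test/reference) ratio for a locus with the given copy numbers."""
    if test_copies <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return log2(test_copies / reference_copies)


def compression_slope(expected, observed) -> float:
    """Least-squares slope through the origin of observed versus expected log2 ratios.

    A slope of 1 means no attenuation; values below 1 mean compressed ratios.
    """
    denom = sum(e * e for e in expected)
    if denom == 0:
        raise ValueError("expected ratios are all zero")
    return sum(e * o for e, o in zip(expected, observed)) / denom


copies = [1, 2, 3, 4, 5]
expected = [expected_log2_ratio(n) for n in copies]
# Hypothetical observations, uniformly compressed to 80% of the ideal values.
hypothetical_observed = [0.8 * e for e in expected]
slope = compression_slope(expected, hypothetical_observed)
```

With real data, a slope below 1 would reflect the kind of attenuation noted at higher X-chromosome copy numbers.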

DISCUSSION

Array CGH using YACs (1), BACs (2,3,19), cDNAs (8), and cloned genomic sequences (20) as probes has been used successfully to detect genome copy number changes in several species including humans. In general, array CGH resolution is limited by the density and genomic extents of probes on the arrays. Typically, measurement precision is higher with larger probes because the hybridization signals are more intense. However, the ultimate genomic resolution is lower with large probes because the probes interrogate an extended portion of the genome. Lucito et al. (9,20) demonstrated that signal intensity and, hence, measurement precision could be increased by decreasing the complexity of the DNA sample used for hybridization to the point where use of oligonucleotide arrays became possible. In their case, complexity was reduced by preparing a representation of the genome (e.g., by amplifying only fragments flanked by specific restriction sites). This increased signal intensity on oligonucleotide arrays enough that single copy number changes could be reliably detected. However, the signal intensity, measurement precision, and linearity varied along the genome. We assessed several aspects of probe and hybridization mixture design to optimize the process.

The most important influence on hybridization and signal intensity was revealed during analyses of hybridization to sense and antisense probes in the same PCR amplicon. In general, signal intensities among alternate sense or antisense probes were similar within a PCR amplicon. However, the signal sometimes varied dramatically between a sense probe and the complementary antisense probe within the same PCR amplicon. Because the concentration of hybridizing DNA is identical for all sense and antisense probes within one PCR amplicon, we conclude that the difference in hybridization and signal intensity between sense and antisense probes is due to differences in the conformation of the sense and antisense strands in the labeled DNA during hybridization. This is similar to the remarkable variation in hybridization intensity of RNA to oligonucleotides tiled along transcribed genes observed by Mir et al. (21). Our results suggest that oligonucleotide arrays should be designed to carry sense and antisense probes for each region of the genome for highest precision analyses of copy number or allelotype. This strategy will ensure that each locus will be interrogated with the highest precision. Of course, this comes at the expense of doubling the number of elements on the array but this becomes less of an issue as technologies are developed to increase the oligonucleotide densities. Further, oligonucleotides associated with suboptimal signal can be eliminated during second-generation array design.

FIG. 3. Variation in signal intensity between a sense and its complementary antisense probe can be substantial. Five genetic loci were used to assess the variability that exists when complementary oligonucleotides are used as probes during hybridization. Three different segments were analyzed per genetic locus, and each was hybridized with a multiplex PCR product labeled with only Cy3-dUTP. A, B: The results from two of these loci are depicted. The variation is quite large, up to a 16-fold difference in signal intensity between complementary probes.

FIG. 4. Copy number discrimination was determined using a series of aneuploid DNAs with one to five copies of chromosome X. These DNAs were amplified using 48 pairs of oligos, five of which amplified X loci, and by using 20 + 9 cycles of PCR. Fluorescence ratios were plotted as log2 values versus the ratio of the amount of chromosome X material to the amount of chromosome 20 material in each DNA sample (fluorescence ratios were adjusted by leave-one-out analysis). Data for autosomal loci are represented by open diamonds and chromosome X data by solid diamonds. Dashed line indicates where ideal chromosome X data might be plotted. Data points represent averages of locus averages (16 replicate spots per locus). There were 43 autosomal loci and five chromosome X loci. Error bars represent standard deviations of the averages of locus averages of replicate spots.
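The suggestion that suboptimal probes can be dropped in a second-generation design could be put into practice by comparing the mean signal of the sense and antisense probes for each amplicon. The Python sketch below is a hypothetical illustration of that idea. The function name, the data layout, and the example intensities are all assumptions and are not taken from the study.

```python
# Illustrative sketch only: compare strands within one amplicon and pick the brighter.
from statistics import mean


def strand_comparison(intensities: dict[str, list[float]]) -> tuple[str, float]:
    """Return (brighter strand, fold difference) for one PCR amplicon.

    intensities maps "sense" and "antisense" to background-corrected signal
    values of the probes on that strand. All mean values must be positive.
    """
    sense = mean(intensities["sense"])
    antisense = mean(intensities["antisense"])
    if min(sense, antisense) <= 0:
        raise ValueError("mean intensities must be positive to compute a fold difference")
    brighter = "sense" if sense >= antisense else "antisense"
    return brighter, max(sense, antisense) / min(sense, antisense)


# Made-up values chosen to give a 16-fold difference, the largest reported in the text.
best_strand, fold = strand_comparison({"sense": [1600, 1600], "antisense": [100, 100]})
```

A 16-fold difference corresponds to four log2 units, which is large compared with the one log2 unit that separates two from four copies of a locus.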

Other aspects of oligo-array CGH can be improved in cases in which the number of loci interrogated is limited as was the case in the present study. For example, the problem of primer-dimer formation during PCR amplification influenced hybridization linearity and could be minimized by designing all primers with minimal 3′-end complementarity (fewer than five contiguous nucleotides) with any other primer in the pool. Use of a two-step PCR amplification strategy also helped linearity. This consisted of ~20 cycles of amplification using a collection of all primer pairs, followed by 5 to 10 cycles using a pair of universal primers that amplified all PCR amplicons.

In conclusion, we have explored several aspects of oligo-array CGH that might influence copy number and/or single nucleotide polymorphism analyses. Conformational differences between sense and antisense strands of the PCR-amplified sequences during hybridization seem to have the largest effect so that sense and antisense probes should be tested for each PCR amplicon for best results during array design. When fully optimized, oligo-array CGH seems well suited to accurate assessment of genome copy number and/or allelotype across the genome (9,10,20) or in limited regions as described in the present report.

LITERATURE CITED

1. Solinas-Toldo S, Lampel S, Stilgenbauer S, Nickolenko J, Benner A, Dohner H, Cremer T, Lichter P. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer 1997;20:399–407.

2. Pinkel D, Segraves R, Sudar D, Clark S, Poole I, Kowbel D, Collins C, Kuo WL, Chen C, Zhai Y, Dairkee SH, Ljung BM, Gray JW, Albertson DG. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 1998;20:207–211.

3. Hodgson G, Hager JH, Volik S, Hariono S, Wernick M, Moore D, Nowak N, Albertson DG, Pinkel D, Collins C, Hanahan D, Gray JW. Genome scanning with array CGH delineates regional alterations in mouse islet carcinomas. Nat Genet 2001;29:459–464.

4. Snijders AM, Fridlyand J, Mans DA, Segraves R, Jain AN, Pinkel D, Albertson DG. Shaping tumor and drug resistant genomes by instability and selection. Oncogene 2003;22:4370–4379.

5. Snijders AM, Nowak N, Segraves R, Blackwood S, Brown N, Conroy J, Hamilton G, Hindle AK, Huey B, Kimura K, Law S, Myambo K, Palmer J, Ylstra B, Yue JP, Gray JW, Jain AN, Pinkel D, Albertson DG. Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet 2001;29:263–264.

6. Lapuk A, Volik S, Vincent R, Chin K, Kuo WL, de Jong P, Collins C, Gray JW. Computational BAC clone contig assembly for comprehensive genome analysis. Genes Chromosomes Cancer 2004;40:66–71.

FIG. 5. Comparison of oligo-array and BAC array CGH. This is a high-resolution analysis of the chromosome 20q amplicon assayed by both methods. In both assays, the same genomic DNA samples were analyzed, but in oligo CGH, the reference genome is represented by a pool of 10 replicate multiplex amplification reactions, using the 23 female DNA as template. Tester is MCF7 genomic DNA. BAC array data are indicated by solid diamonds and oligo-array data by open diamonds. Chromosome position was determined according to the December 2000 freeze of the UCSC compilation.


7. Ishkanian AS, Malloff CA, Watson SK, DeLeeuw RJ, Chi B, Coe BP, Snijders A, Albertson DG, Pinkel D, Marra MA, Ling V, MacAulay C, Lam WL. A tiling resolution DNA microarray with complete coverage of the human genome. Nat Genet 2004;36:299–303.

8. Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, Jeffrey SS, Botstein D, Brown PO. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 1999;23:41–46.

9. Lucito R, Healy J, Alexander J, Reiner A, Esposito D, Chi M, Rodgers L, Brady A, Sebat J, Troge J, West JA, Rostan S, Nguyen KC, Powers S, Ye KQ, Olshen A, Venkatraman E, Norton L, Wigler M. Representational oligonucleotide microarray analysis: a high-resolution method to detect genome copy number variation. Genome Res 2003;13:2291–2305.

10. Zhao X, Li C, Paez JG, Chin K, Janne PA, Chen TH, Girard L, Minna J, Christiani D, Leo C, Gray JW, Sellers WR, Meyerson M. An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays. Cancer Res 2004;64:3060–3071.

11. Rozen S, Skaletsky H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 2000;132:365–386.

12. Meinkoth J, Wahl G. Hybridization of nucleic acids immobilized on solid supports. Anal Biochem 1984;138:267–284.

13. Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, Kruglyak L, Stein L, Hsie L, Topaloglou T, Hubbell E, Robinson E, Mittmann M, Morris MS, Shen N, Kilburn D, Rioux J, Nusbaum C, Rozen S, Hudson TJ, Lander ES. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science 1998;280:1077–1082.

14. Wang BQ, Lei L, Burton ZF. Importance of codon preference for production of human RAP74 and reconstitution of the RAP30/74 complex. Protein Expr Purif 1994;5:476–485.

15. Breslauer KJ, Frank R, Blocker H, Marky LA. Predicting DNA duplex stability from the base sequence. Proc Natl Acad Sci USA 1986;83:3746–3750.

16. Suzuki S, Moore DH 2nd, Ginzinger DG, Godfrey TE, Barclay J, Powell B, Pinkel D, Zaloudek C, Lu K, Mills G, Berchuck A, Gray JW. An approach to analysis of large-scale correlations between genome changes and clinical endpoints in ovarian cancer. Cancer Res 2000;60:5382–5385.

17. Lisitsyn N, Wigler M. Cloning the differences between two complex genomes. Science 1993;259:946–951.

18. Volik S, Zhao S, Chin K, Brebner JH, Herndon DR, Tao Q, Kowbel D, Huang G, Lapuk A, Kuo WL, Magrane G, De Jong P, Gray JW, Collins C. End-sequence profiling: sequence-based analysis of aberrant genomes. Proc Natl Acad Sci USA 2003;100:7696–7701.

19. Albertson DG, Ylstra B, Segraves R, Collins C, Dairkee SH, Kowbel D, Kuo WL, Gray JW, Pinkel D. Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat Genet 2000;25:144–146.

20. Lucito R, West J, Reiner A, Alexander J, Esposito D, Mishra B, Powers S, Norton L, Wigler M. Detecting gene copy number fluctuations in tumor cells by microarray analysis of genomic representations. Genome Res 2000;10:1726–1736.

21. Mir KU, Southern EM. Determining the influence of structure on hybridization using oligonucleotide arrays. Nat Biotechnol 1999;17:788–792.
