arial 12-point - home - math - the university of utahpalais/pcr/proposals/phase ii_norichan... ·...


Post on 17-Jun-2018


Specific Aims

“Mutation scanning” techniques attempt to detect the presence of any sequence alteration in a fragment of DNA. When used with diploid DNA, mutation scanning commonly screens for differences between the two copies. The DNA fragments are generated by PCR and analyzed for completely matched hybrids called homoduplexes, and mismatched hybrids called heteroduplexes. All conventional scanning techniques require a separation step to detect heteroduplexes. In contrast, we have recently introduced a homogeneous scanning system that requires no separation steps or reagent additions after PCR (1). Certain DNA dyes can detect heteroduplexes in solution by subtle changes in melting behavior. These dyes do not inhibit PCR and are added before amplification. After PCR, heteroduplexes are detected by high-resolution melting curve analysis in less than 5 minutes. In Phase I, we identified a scanning dye compatible with standard fluorescein optics and demonstrated mutation scanning by high-resolution melting in 384-well format. Furthermore, we achieved >90% sensitivity of heterozygous SNP detection on 384-well plates, our milestone for proceeding to Phase II studies. Our Phase II specific aims focus on improving, extending and testing the capabilities of our highly parallel, homogeneous scanning system to enable a commercial launch.

1. Improve the sensitivity and convenience of mutation scanning by optimizing dye chemistry, hardware, software, and melting method to prepare the “LightScanner” as a competitive commercial product. Several candidate dyes will be systematically screened for PCR inhibition, heteroduplex detection and stability. Hardware modifications will include improving sample temperature homogeneity and increasing the resolution of both sample temperature and fluorescence measurements. The major software addition will be automatic heterozygote detection by a hierarchical clustering algorithm. The effects of ionic strength, pH, common PCR additives (DMSO, etc.), and cooling and heating rates will be determined.

2. Implement internal temperature controls to increase the sensitivity of homozygous variant detection. Homozygous variant detection depends on the absolute sample temperature. Well-to-well temperature homogeneity better than 0.5°C is difficult to achieve during a temperature ramp. An internal temperature control would monitor the temperature in the PCR solution of each well and be used to correct for any temperature difference between wells. Possible internal controls include DNA (linear double strand or hairpin) or a passive dye with a steep temperature coefficient that could be monitored at a second wavelength. Sensitivity of SNP homozygote detection, with and without internal temperature controls, would be monitored with the DNA toolbox.

3. Investigate clinical testing by scanning large genes for mutations. Testing for many genetic disorders is difficult because the causative mutations are often scattered all over the gene. We will test our scanning method on samples from two clinical disorders, paroxysmal nocturnal hemoglobinuria (PNH), and cystic fibrosis (CF).

4. Investigate options for multiplexing, including amplicon Tm and fluorescent color. Although highly parallel analysis decreases the need for multiplexing, repetitive screening applications may justify the initial effort of multiplex design. We will demonstrate the feasibility of multiplexing in mutation scanning by: a) combining amplicons of different melting temperatures, and b) amplifying different targets with labeled primers of unique spectra that can be differentiated by multicolor optics.
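The well-to-well correction proposed in Aim 2 can be illustrated with a minimal sketch. It assumes, hypothetically, that each well reports an apparent melting temperature (Tm) for its internal control and that a single additive offset per well is adequate. The function name and all numbers are illustrative only and are not part of any existing instrument software.

```python
from statistics import mean

def well_offsets(control_tms, reference_tm=None):
    """Offset (deg C) of each well's internal-control Tm from a reference.

    If no reference is supplied, the plate mean of the control Tms is used.
    """
    if reference_tm is None:
        reference_tm = mean(control_tms)
    return [tm - reference_tm for tm in control_tms]

# Hypothetical apparent Tms of the internal control in three wells
control_tms = [70.2, 70.0, 69.8]
# Hypothetical apparent Tms of the same amplicon in the same wells
amplicon_tms = [84.7, 84.5, 84.3]

offsets = well_offsets(control_tms)
# Subtract each well's offset to place all wells on a common temperature scale
corrected_tms = [tm - off for tm, off in zip(amplicon_tms, offsets)]
print([round(tm, 2) for tm in corrected_tms])
```

In this toy case the three wells appear to differ by 0.4°C before correction and agree after it, which is the kind of improvement that homozygous variant detection would require.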

Our long-term objective is to further develop homogeneous nucleic acid techniques, integrating sample preparation, amplification and analysis into cost-effective research and clinical solutions.


Significance

Most genetic diseases are complex. Many different sequence alterations in the same or different genes result in the same disease phenotype. The initial hope that most human diseases are caused by a limited number of sequence alterations has turned out not to be true. Many genes are often involved, and many different mutations within a gene may cause the same or similar disease patterns. The future of genetic testing will require highly parallel analysis of many coding regions within the same gene and in many different genes.

The Human Genome Project has succeeded in sequencing most regions of human DNA. Work to identify the genes and sequence alterations associated with disease continues at a rapid pace. Usually, linkage studies associate phenotype with genetic markers like simple sequence repeats (SSRs) or single nucleotide polymorphisms (SNPs) to identify candidate genes. Then, specific sequence alterations including SNPs, insertions, and deletions that cause missense, frameshift, or splicing mutations pinpoint the gene and the spectrum of responsible mutations.

However, even when the genetic details become known, it is difficult to use this knowledge in routine medical practice because the methods to analyze DNA are expensive and complex. Only when costs are significantly lowered and the methods dramatically simplified will DNA analysis be used in everyday clinical practice for effective disease detection and better treatment.

Sequencing is the gold standard for identifying sequence variation. Why not just sequence everything? Sequencing can be commercially feasible for genetic analysis: Myriad Genetics, Inc. routinely sequences BRCA1 and BRCA2 for clients worldwide. The only drawbacks are cost (approaching $3,000 per individual) and complexity. Standard sequencing requires the following steps: 1) amplification by PCR, 2) cleanup of the PCR product, 3) addition of cycle sequencing reagents, 4) cycle sequencing for dideoxy termination, 5) cleanup of the termination products, and 6) separation on a DNA sequencer. This complexity can be automated, as it has been at Myriad and in all large sequencing centers. However, 90-99% of sequences come back normal. A simple method to identify sequences as normal could eliminate most of the time, cost, and effort of sequencing. These methods of screening DNA for abnormalities are known as “scanning” methods. Once an abnormality is identified, then and only then need it be sequenced or otherwise genotyped. Mutation scanning is in contrast to “genotyping”, which focuses on detecting specific sequence alterations.

Many scanning techniques have been developed. These include single strand conformational polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), conformation sensitive gel electrophoresis (CSGE), denaturing high-performance liquid chromatography (dHPLC), and temperature gradient capillary electrophoresis (TGCE). Most of these methods are based on detecting heteroduplexes produced from the amplification of heterozygous DNA, that is, one chromosome copy is normal and the other is altered. The detection sensitivity ranges from 50% to 100% and depends on the PCR product length and type of mismatch (2). All of these techniques require separation on a gel or matrix and expose PCR products to the environment. Analysis either dilutes or eliminates the PCR product and either requires hours (SSCP, DGGE, CSGE, TGCE) or analyzes only one sample at a time (dHPLC).

Most human mutations are present in one copy and can be detected by scanning techniques. Conventional scanning methods do not detect homozygous changes (SSCP is an exception, but the sensitivity is low). This is a limitation, because important homozygous mutations do occur in humans (e.g., cystic fibrosis F508del), and single-copy organisms (bacteria, viruses) have only one copy of each sequence. Homozygous changes are best identified by genotyping methods or sequencing. Sequencing also identifies heterozygous DNA, although interpretation is not always easy, especially for insertions or deletions.

This proposal aims to adapt a new heteroduplex detection technique to a highly parallel format in order to make screening DNA simple and cost-effective. The method is simple because only PCR reagents and a nucleic acid dye are required. No separation or purification steps are necessary. After PCR, the 96- or 384-well plate is moved from the thermal cycling instrument to a high-resolution melting instrument and a melting curve is obtained in <5 min. The method is “closed-tube” with no reagent additions or risk of PCR contamination. The analysis does not dilute or destroy the PCR product, which can be used for any purpose after analysis.

High-resolution melting is similar to high-definition TV or enhanced satellite imaging. The ability to collect high-density information allows image magnification to reveal greater detail. The “images” of DNA melting take the form of fluorescence vs temperature plots, or “melting curves”. Interpretation of data is greatly aided by software algorithms. Although DNA melting analysis has been known for many years, the conventional technique measures absorbance, requires large amounts of purified DNA, and takes hours to complete. In contrast, fluorescent melting curve analysis requires only a few minutes and can be performed directly on the PCR product mixture.

In addition to identifying the presence of a heterozygous change somewhere in the PCR product, high-resolution melting can often identify the specific mutation. In Preliminary Studies, we show that all four common beta-globin heterozygotes (AS, AC, AE, SC) are distinguishable by the shape of their melting curves (3). In such cases, scanning and genotyping can be combined into one simple melting analysis. Since the number of possible variants is large, some sequence variants will show melting curves that are difficult or impossible to distinguish. For example, 300 different single base heterozygotes are possible within a 100 base region, and this does not consider multiple base changes, insertions, or deletions. However, when the spectrum of mutations is limited, high-resolution melting analysis will often genotype as well as scan. Sequencing can always be performed for confirmation, if the time and expense are indicated.
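The figure of 300 follows from simple counting: at each position, one chromosome copy keeps the reference base while the other carries one of the three alternative bases. A short sketch of that arithmetic (the function name is illustrative):

```python
def count_single_base_heterozygotes(region_length, n_bases=4):
    """Number of distinct single-base heterozygotes in a region.

    Each position can carry any of the (n_bases - 1) alternative bases
    on one copy while the other copy keeps the reference base.
    """
    return region_length * (n_bases - 1)

print(count_single_base_heterozygotes(100))  # 300
```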

Not only can high-resolution melting be used for heteroduplex scanning and genotyping, it can identify homozygous sequence alterations as well. This is also covered in Preliminary Studies, where the 4 common beta-globin homozygotes (AA, SS, CC, and EE) are distinguished. Again, not all homozygous changes will be distinguishable, but many will. The sensitivity and specificity of scanning and genotyping with high-resolution melting analysis depend entirely on the resolution of the technique. High-resolution melting analysis is easiest with a single sample where a capillary can be completely surrounded by a heating element. This proposal suggests that similar results, good enough for most applications, can be obtained in microtiter format. In Phase I, we will show that a 384-well platform is good enough for heteroduplex scanning with an SNP detection sensitivity of at least 90% in PCR products up to 300 bp. In Phase II, we will improve detection sensitivity for heterozygous and homozygous variants, scan two interesting clinical targets, develop multiplexing strategies, and launch a commercial instrument.

Although not a commercial goal, our methods and heteroduplex dyes may also be used on conventional real-time PCR instruments. However, the resolution of these instruments for melting curve analysis is poor and the detection sensitivity will be limited (3). We also have broad protection on the methods through pending patent applications.

High-resolution scanning/genotyping has many applications in both gene discovery research and genetic testing. Although not a focus of the current application, the discrimination power of high-resolution melting analysis can be used as a genetic marker for initial disease correlation to identify responsible genes. The primary advantage of such a marker is that it is easy to obtain by rapid, non-destructive analysis without probes or electrophoresis. For example, loss of heterozygosity (LOH) could be established by melting curve analysis. Furthermore, the heterozygosity of the PCR product can be increased at will by including more than one variable locus, for example, multiple SNPs. The genotype does not need to be known for association studies; segregation with phenotype is the only requirement. PCR products could be equally spaced over a chromosome at areas of heterozygosity. No probes, no electrophoresis.

Once a gene is identified as a likely candidate responsible for a phenotype, all exons and splice junctions are screened for sequence variants that may be disease-causing mutations. Again, scanning techniques (and high-resolution melting) can be very useful to rule out normal exons. When an exon is abnormal, it must be followed up with sequencing to identify the specific polymorphisms and mutations that are present. This establishes the frequency and identity of various mutations and polymorphisms.

When a disease is caused by only a few mutations, direct genotyping tests are reasonable. These vary from conventional restriction digestion of PCR products to homogeneous fluorescent methods. However, in most diseases, many mutations occur and specific genotyping tests are not feasible because too many would be required. The options are sequencing or scanning. We believe high-resolution melting can eliminate 90-99% of sequencing requirements in the analysis of complex genetic disease. So, are you still going to sequence everything? Consider this. Your first step in sequencing is to amplify by PCR. Can you spare 5 min to move your amplified PCR plate onto a melting instrument to scan for sequence variants? After 5 min, you can take your plate back and continue your normal sequencing workflow, including amplicon cleanup, addition of cycle sequencing reagents, thermal cycling for cycle sequencing, cleanup of your extension products, separation on a sequencer, and analysis of the results. Might you consider not going to sequencing for the 90-99% of your PCR products that are normal or can be directly genotyped by melting?

High-resolution melting can easily determine the identity of individuals at highly polymorphic loci. In Preliminary Studies, we show that melting analysis of HLA-A exon 2 segregates members of a family into histocompatible groups. This is a very rapid and cost-effective way to establish HLA identity between related individuals prior to transplantation. Similar HLA matching of unrelated individuals would be more complex but may be feasible. Forensic identity typing is also possible.

Another application of high-resolution melting that deserves brief mention is haplotyping of multiple polymorphisms. Different haplotypes produce hetero- and homoduplexes with different stabilities. Hence, the cis or trans sequence relationship results in distinguishable melting curves (as long as the loci are in the same melting domain). Haplotyping is difficult by other methods and impossible by many (e.g. sequencing).

A summary of high-resolution melting analysis compared to other scanning techniques follows.

                         High-Resolution Melting     Other Scanning Techniques
Method                   Homogeneous                 Separation-based
Amplicon Exposure        Closed-tube                 Open environment
Time to Result           1-5 min                     Hours
Disposition of Sample    Reusable                    Diluted, discarded
Application              Scanning and Genotyping     Scanning
Variants Detected        Hetero- and Homozygotes     Heterozygotes


Preliminary Studies

Instrumentation. We have been working on PCR methods and instruments for homogeneous DNA analysis for over 10 years. In the early 1990s, we developed rapid-cycle PCR, amplifying genomic DNA in as little as 10 min (). In the mid-1990s, STTR funding allowed us to develop the LightCycler (), one of the leading real-time PCR instruments available today. To date, approximately 4,000 LightCyclers have been sold. The R.A.P.I.D. (Ruggedized Advanced Pathogen Identification Device) is a modified LightCycler developed by Idaho Technology for the US military in 1998. It is rugged (it still works after a 1 meter drop onto concrete), lightweight, and small enough to fit into a backpack. Its intended use is as a defense against biological warfare. Approximately 300 have been sold to date. In 2003, the R.A.P.I.D. was chosen as the platform for the Joint Biological Agent Identification and Diagnostic System (JBAIDS) of the US Government. The principal investigator of this grant is the primary inventor of the LightCycler and related technologies, including the use of SYBR Green I, hybridization probes, and melting analysis in real-time PCR.

The LightTyper is a 96- or 384-well low-resolution melting instrument developed for genotyping with fluorescently-labeled probes that is currently manufactured at Idaho Technology and sold by Roche. High-resolution is not necessary because probe melting temperature transitions are broad and different genotypes are separated by several degrees C. The LightTyper competes against many other high-throughput SNP typing systems. Characteristics of the LightTyper will be detailed in the Experimental Plan. We will convert the LightTyper into a high-resolution instrument (the “LightScanner”) in order to achieve the specific aims of this grant.

The HR-1 is a high-resolution melting instrument that analyzes one sample at a time, launched commercially by Idaho Technology in the fall of 2003. The HR-1 was developed as a gold standard to see what might be possible with high-resolution melting. The sample geometry is ideal for temperature homogeneity because the sample capillary is completely surrounded by a heating element. Analysis is fast (1-2 min) with a throughput of 45 samples per hour and the price is right (<$10K), but the manual handling of glass capillaries lessens its commercial appeal. The resolution obtainable on the HR-1 is vastly superior to the LightCycler and all other real-time PCR instruments. Further details on our instruments can be found on the Idaho Technology (www.idahotech.com) and Roche (www.roche-applied-science.com) websites.

Probe melting analysis. Genotyping on the LightCycler or LightTyper monitors the melting of fluorescent probes. Thermal melting of DNA is a simple and elegant way to genotype and can be thought of as a “dynamic dot-blot”. Two strands of DNA fall apart or “melt” as the sample is gradually heated. The melting of hybridization probes was first observed on fluorescence vs. temperature plots acquired continuously during PCR (Fig. 1). In the annealing/extension phase, the probes hybridize to single stranded product and the fluorescence increases. When heating toward denaturation, the probes dissociate and the fluorescence returns to baseline levels. Exactly how they melt depends on the probe stability, and this depends on the genotype. Genotypes that differ in only a single base can easily be discriminated, as first demonstrated in 1997 for factor V Leiden (). The standard LightCycler scheme for genotyping uses 2 adjacent hybridization probes, one labeled 3' with fluorescein and the other labeled 5' with a longer wavelength dye (). In 2000, we modified this method to use only one single-labeled probe (SimpleProbe), simplifying design considerations and reducing cost. Recently, we developed a system that does not require any probes for genotyping. High-resolution melting of the entire amplicon after PCR allows genotyping without probes (,). Subtle differences in DNA sequence down to single base changes can be identified by high-resolution melting of the PCR product. The method can be applied to genotyping known mutations or to scanning for unknown sequence alterations within the PCR product. SNP genotyping (factor V Leiden) with adjacent hybridization probes (2 probes), SimpleProbes (1 probe) and high-resolution amplicon melting (no probes) is shown in Fig. 2.


Amplicon melting analysis. When PCR amplification is continuously monitored in the presence of SYBR Green I, the double or single strand nature of the PCR products can be followed (Fig. 3). The rapid drop in fluorescence at high temperature monitors the denaturation or melting of the PCR product. This melting curve is a characteristic of the PCR product and depends on GC%, length, and sequence (). It is now common practice to add a melting protocol at the end of PCR with SYBR Green I to assess the purity and identity of the amplified products. However, the LightCycler and other real-time PCR instruments were never designed for high-resolution melting analysis.

Despite much effort and high hopes, simple SYBR Green I melting analysis of PCR products usually fails to detect small sequence alterations like heterozygous SNPs. If the PCR products are purified and excess SYBR Green I is added after PCR, heterozygous SNPs have been detected in products up to 167 bp (). The concentration of SYBR Green I necessary inhibits PCR (), so the dye is added after amplification. In another report, SYBR Green I melting analysis was used to detect heterozygous SNPs in products up to 212 bp (). However, a GC clamp was required on one primer and adjustment of the products to 12 M urea was necessary before melting analysis. Simple, closed-tube scanning with SYBR Green I has proved elusive.

As it turns out, SYBR Green I was one of the problems (instrument resolution was the other). This dye inhibits PCR at concentrations needed to saturate amplified products at the end of PCR. When used at less than saturating concentrations, products that melt at low temperatures are not observed, presumably because of dye redistribution to higher melting products during heating (). In Fig. 4, compare the peaks observed when a DNA ladder is melted in the presence of SYBR Green I or LCGreen I. LCGreen I is a novel dye, synthesized and marketed by Idaho Technology specifically for scanning applications. It is an asymmetric cyanine dye that, to the best of our knowledge, is not covered by prior patents. LCGreen I is not as tightly bound to DNA as other common DNA dyes. As a result, PCR is very tolerant to high concentrations of the dye. LCGreen I can be used at concentrations that saturate all PCR products at the end of amplification. Unlike SYBR Green I and other common dyes used in real-time PCR, LCGreen I detects heteroduplexes. This is shown in Fig. 5, where several genotypes near the F508del locus are compared. With the short amplicons of this model system, heteroduplexes appear as distinct low temperature peaks on derivative melting curve plots.

Within the past year, we have found that high-resolution melting of DNA, in combination with LCGreen I, is more powerful than previously imagined. For normal-sized PCR products, heteroduplex peaks are seldom completely separated from homoduplex peaks. Nevertheless, genotype differences can still be identified with high-resolution melting and appropriate software analysis. Heteroduplexes melt at lower temperatures than the homoduplexes and form a shoulder on the low-temperature side of the derivative plots (Fig. 6). For highest resolution, it is best to view melting curve plots (fluorescence [F] vs temperature [T]) rather than derivative plots (-dF/dT vs T) because numerical estimation of derivatives always requires smoothing of the data.
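The trade-off between melting curve plots and derivative plots can be made concrete with a minimal sketch. It assumes a synthetic sigmoidal melting curve, a centered moving average as the smoothing step, and a central-difference derivative; all of these are illustrative choices and do not describe the analysis software actually used.

```python
import math

def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def negative_derivative(temps, fluor):
    """Central-difference estimate of -dF/dT at the interior points."""
    mid_temps, deriv = [], []
    for i in range(1, len(temps) - 1):
        d_f = fluor[i + 1] - fluor[i - 1]
        d_t = temps[i + 1] - temps[i - 1]
        mid_temps.append(temps[i])
        deriv.append(-d_f / d_t)
    return mid_temps, deriv

# Synthetic melting curve: fluorescence falls from 1 to 0 around Tm = 85 deg C
temps = [80 + 0.1 * i for i in range(101)]
fluor = [1 / (1 + math.exp((t - 85.0) / 0.5)) for t in temps]

t_mid, d = negative_derivative(temps, moving_average(fluor))
tm_estimate = t_mid[d.index(max(d))]  # temperature of the derivative peak
print(round(tm_estimate, 1))
```

Widening the smoothing window suppresses noise but also flattens and broadens the derivative peak, which is why fine features such as a heteroduplex shoulder are better judged on the melting curve itself.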

With high-resolution analysis, the different duplexes contain a surprising amount of information that can be used to differentiate genotypes. Heterozygotes can be distinguished from homozygotes by the presence of low melting heteroduplexes. Different heterozygotes have uniquely shaped fluorescent melting curves because each is composed of different homoduplex and heteroduplex products. Even different homoduplexes can often be distinguished by differences in melting temperature.

To demonstrate the power of high-resolution melting analysis, we studied the well-known β-globin gene variants HbS, HbC, and HbE (all single base changes) in a 110 bp product. Fig. 8 shows the melting curves of all common heterozygotes (AS, AC, AE, SC) compared to wild type (AA). Quadruplicate samples from different dried blood spots were amplified by PCR and analyzed on the single sample, high-resolution melting instrument HR-1 (). Not only can the heterozygotes be distinguished from the wild type homozygote, but all heterozygotes can be distinguished from each other. That is, not only is scanning for sequence variants possible, but in many cases the sequence variant can be identified, or genotyped.


All common homozygotes of β-globin gene variants were also studied (). Fig. 9 shows that all homozygotes tested (AA, SS, CC, and EE) appear differentiable, including AA and SS genotypes that differed in Tm by less than 0.2°C. Finally, analysis of larger PCR products is also possible. Fig. 10 shows high-resolution melting analysis with LCGreen I of a 544 bp human genomic PCR product that splits into 2 melting domains (). All three possible genotypes are shown in duplicate. The melting curves differ in the lower melting domain where the SNP is located.

We have just completed a systematic study (submitted) of SNP detection sensitivity using a “DNA mutation toolbox” of engineered plasmids (). Plasmids of 40, 50, and 60% average GC content with either A, C, G, or T at one location () were used alone (for homozygotes) or mixed 1:1 (for heterozygotes). All 3 possible heterozygotes for each of 4 homozygotes (12 combinations) were compared for every PCR product. For 100, 200, 300, 400, 500 and 600 bp products with the mutation in the middle, the sensitivity of heterozygote detection was 96%, 100%, and 100% for the 40, 50 and 60% GC templates, respectively. When the 40% GC template was studied in more detail with both symmetric and asymmetric placement of the mutation, 97% (417/432) of the heterozygotes were detected. Errors were made only with products 400 bp or greater. We have not yet studied products greater than 600 bp or mutations <50 bp from one end of the amplicon.
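The toolbox design can be enumerated in a few lines. The sketch below assumes only what is stated above: four possible bases at one position, each homozygote compared with the three heterozygotes that contain its base, and 417 of 432 heterozygotes detected for the 40% GC template. The variable and function names are illustrative.

```python
from itertools import permutations

BASES = "ACGT"

# Each homozygote (e.g. A/A) is compared with the three heterozygotes
# that contain its base (A/C, A/G, A/T): 4 x 3 = 12 comparisons.
comparisons = [(h + "/" + h, h + "/" + other) for h, other in permutations(BASES, 2)]

def detection_rate(detected, total):
    """Fraction of heterozygotes that were detected."""
    return detected / total

rate_40gc = detection_rate(417, 432)
print(len(comparisons), f"{rate_40gc:.1%}")  # 12 96.5%
```

The detailed 40% GC result is therefore 96.5%, reported above as 97% after rounding.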

Like other scanning techniques, high-resolution melting is dependent on the quality of the PCR amplification. Since no probes are used, spurious amplification can complicate results and increase errors. Surprisingly however, the method appears resilient to poor template DNA. The data shown in Figs. 8 and 9 were obtained from DNA extracted from dried blood spots without purification. Hot start methods (anti-Taq antibodies or heat-activating enzymes) can also be used if the PCR quality is poor.

Another concern is that the heteroduplex peak is never as big as the homoduplex peak (Fig. 6), even though the amounts should theoretically be the same. This is likely due to preferential annealing of homoduplexes during cooling or re-association of heteroduplexes as homoduplexes during melting (,). With short amplicons, rapid cooling and heating favor detection of heteroduplexes ().


Experimental Plan (Phase II)

A detailed Phase I Final Report will be submitted after Phase I research is completed, as indicated in the PHS 2003-2 instructions for Fast-Track applications. We expect to have accomplished the three specific aims of Phase I. They are:

1. Identify a DNA scanning dye compatible with standard fluorescein optics.

2. Demonstrate homogeneous, closed-tube mutation scanning by high-resolution melting analysis in 384-well format.

3. Establish the sensitivity of SNP heterozygote detection according to amplicon size and base mismatch.

Our milestone for proceeding to Phase II is an SNP heterozygote sensitivity and specificity of >90% for PCR products 300 bp or less in length. That is, out of 576 combinations (described in Phase I for products 300 bp or less) we must pick up at least 9 out of every 10 heterozygotes, and miscall fewer than 1 out of 10 homozygotes as a heterozygote.
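The milestone can be written as an explicit check. In the sketch below, heterozygotes are treated as positives and homozygotes as negatives; the counts in the example are hypothetical and are not Phase I results.

```python
def meets_phase2_milestone(true_pos, false_neg, true_neg, false_pos, threshold=0.90):
    """Return (passed, sensitivity, specificity) for heterozygote calling.

    true_pos / false_neg: heterozygotes called / missed.
    true_neg / false_pos: homozygotes called correctly / miscalled as heterozygotes.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    passed = sensitivity > threshold and specificity > threshold
    return passed, sensitivity, specificity

# Hypothetical tally: 95 of 100 heterozygotes found, 3 of 100 homozygotes miscalled
passed, sens, spec = meets_phase2_milestone(95, 5, 97, 3)
print(passed, sens, spec)
```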

Our Phase I work will show that homogeneous mutation scanning by high-resolution melting is feasible in a highly parallel, 384-well format. However, for many applications 90% sensitivity and specificity will not be good enough. The sensitivity of SNP identification by sequencing depends on read length and several technical factors. With the introduction of automatic base calling and quality scores (phred, polyphred), correct calls can be made up to 99.9 or even 99.99% of the time (,,,). However, there is an important difference between sequencing and scanning accuracy. Scanning accuracy is reported per amplicon while sequencing accuracy is reported per base. If a 100 bp amplicon is scanned, 99% scanning accuracy is equivalent to 99.99% sequencing accuracy. In Phase II of our work, we will pursue specific aims to increase our accuracy to the 99% level, equivalent to a sequencing accuracy of >99.99%.
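The per-amplicon to per-base conversion can be checked with a short calculation. This sketch assumes that base-level errors are independent, so that an amplicon is correct only if every base is correct; that independence assumption and the function names are ours, not part of the proposal.

```python
def per_base_accuracy(amplicon_accuracy, amplicon_length_bp):
    """Per-base accuracy that yields the given per-amplicon accuracy,
    assuming independent errors at each base."""
    return amplicon_accuracy ** (1.0 / amplicon_length_bp)


def per_amplicon_accuracy(base_accuracy, amplicon_length_bp):
    """Per-amplicon accuracy implied by a per-base accuracy,
    assuming independent errors at each base."""
    return base_accuracy ** amplicon_length_bp


# 99% scanning accuracy on a 100 bp amplicon, expressed per base.
print(f"{per_base_accuracy(0.99, 100):.6f}")
# 99.99% per-base accuracy over 100 bases, expressed per amplicon.
print(f"{per_amplicon_accuracy(0.9999, 100):.4f}")
```

Under this assumption the equivalent per-base accuracy rises with amplicon length, so 99% scanning accuracy on amplicons longer than 100 bp corresponds to better than 99.99% per base.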

Specific Aim #1. Improve the sensitivity/specificity and convenience of mutation scanning by optimizing dye chemistry, hardware, software, and melting method to prepare the “LightScanner” as a competitive commercial product. Several candidate dyes will be systematically screened for PCR inhibition, heteroduplex detection and stability. Hardware modifications will include improving sample temperature homogeneity and increasing the resolution of both sample temperature and fluorescence measurements. The major software addition will be automatic heterozygote detection by a hierarchical clustering algorithm. The effects of ionic strength, pH, and common PCR additives (DMSO, etc.) on heterozygote detection will be determined.

Dye Selection. During Phase I we will have identified several potential dyes for homogeneous scanning, from commercial sources and/or synthesized on site, with spectral properties similar to fluorescein. These dyes will be rigorously tested through the systematic process exemplified below:

1. Saturation data: The concentration of dye versus fluorescence intensity (in the presence of DNA) will be plotted for each candidate dye. The objective is to establish the lower end of dye concentration required for detection, and the concentration at which the signal saturates. Binding affinity for DNA under PCR conditions is determined by plotting saturation curves. Dilutions of dye (by factors of 2) are added to a standard amount of DNA (100 ng/10 ul) in rapid cycle PCR buffer (50 mM Tris, pH 8.3, 500 ug/ml BSA, 3 mM MgCl2). The fluorescence is measured in the F1 channel of the LightCycler at a constant temperature (e.g. 40°C). The maximum fluorescence obtained is set to 100% and all other values are normalized to this maximum. When normalized fluorescence is plotted against dye concentration, S-shaped saturation curves result. Fig. X shows these saturation curves for LCGreen I and YoYo-1. LCGreen I binds less tightly to DNA than YoYo-1 and requires higher dye concentrations for DNA saturation.

2. Useful range for PCR: For each dye, the upper end of the useful dye concentration will be determined by PCR inhibition studies. Although any target can be used to test for PCR inhibition, it is convenient to use the small amplicon that brackets the F508del locus of the cystic fibrosis gene as it is well characterized (Fig. 4, ref. Y). Amplification from genomic DNA is followed in real-time by acquiring fluorescence at each cycle, and a cycle threshold is determined for each dilution. When the dye dilutions differ by factors of two, the inhibition of PCR is usually observed as a sharp transition. That is, the cycle thresholds remain nearly constant with increasing dye concentration until a sudden severe inhibition occurs. LCGreen I is compatible with PCR even at 100% saturation, whereas YoYo-1 completely inhibits PCR at 50% saturation (Fig. X). It is advantageous to choose a dye with as broad a useful concentration range as possible because, at this point, the stability of candidate dyes in dilute aqueous solution is unknown.

3. Quantify ability to detect heteroduplexes: As a model system, an F508del heterozygote of the cystic fibrosis gene is used as template to compare the effectiveness of each dye for heteroduplex detection. After PCR amplification of a short (44 bp) segment (), the heteroduplexes, which contain 3-bp loops, are destabilized enough that they appear as separate peaks in derivative plots (Fig. 5). These peaks are quantified (non-linear least squares regression of multiple Gaussians) and expressed as a percentage of the total peak area (“Het. Area”). Although the Het. Area is the most important criterion for a good dye, it is also preferable that the melting temperature (Tm) and peak width (standard deviation of the Gaussian) remain low. In addition, a large change in fluorescence with melting and a low background fluorescence are desirable. The melting curves at different concentrations of LCGreen I are shown in Fig. Y and a summary of derived data is presented in the Table. The optimal dye concentration (shaded) is selected as a compromise between several factors.

LCGreen I Characteristics at Different Dilutions

[Dye] (uM)   DNA Saturation (%)   PCR Inhibition (CT)   Het. Area (%)   Tm (°C)   Peak SD (°C)
250          >100                 >36
125          100                  27.8                  22.1            76.5      0.77
63           92                   27.1                  21.3            74.7      0.78
31           64                   26.6                  20.2            73.5      0.78
16           40                   26.4                  18.7            73.0      0.76
8            26                   26.1                  17.7            72.6      0.76
4            15                   26.1                  17.7            72.0      0.77

Fig.X. Dye Saturation Data

4. Establish dye stability in a 10X solution. Shelf life is important for commercial products. With some unsymmetrical cyanines (like SYBR Green I), stability in dilute solution is a concern. Although LCGreen I is stable for weeks at room temperature, months at 4°C, and (probably) years at -20°C, we will confirm that each candidate dye is stable by: 1) peak absorbance measurements, and 2) functional studies with the model heterozygote system described above.

5. Brightness of dye: LightTyper optics are optimized for fluorescein, with excitation between 460-490 nm and emission above 510 nm. Matching the scanning dye to the instrument will allow both scanning and probe genotyping in one instrument. The brighter the dye is, the greater the signal-to-noise ratio and the greater the potential melting curve resolution. Better melting curve resolution means greater scanning accuracy with larger amplicons and a more attractive commercial product. “Brighter” dyes have high extinction coefficients and quantum yields with low backgrounds, and can also be identified by melting curves (Fig. Y) that show a large change in fluorescence with melting and a low final background. The better the dye is, the less work is required on hardware modifications to achieve high-resolution melting. Even with an ideal dye, however, we anticipate that some hardware modifications will improve system performance.

6. Optimization of dye structure: From screening steps 1-5 above, we will gain knowledge of the structure-function relationship of dyes. Based on that knowledge, we will continue to synthesize dyes during Phase II so as to find dyes with better features. Those will then be subjected to the screening cycle, and the iterative process will be repeated. One barrier to establishing a structure-function relationship of dyes may be the proprietary nature of some of the commercial dyes. Therefore, we will determine their structure through NMR and mass spectrometry to aid the process. Also, known inhibitors exist in the synthesis pathway of some of the dyes, and the level of impurities will be checked for each dye candidate (the impurity is detected at 260 nm). If necessary, the dyes will be repurified and subjected to the screening cycle again. The following table shows some examples of modifications that may be made to the dye structure, which is shown as a general schematic in Fig.Z:

Position                 Example modifications
Z                        H, Cl, Br, NO2
A                        Me, NH2
R                        None, Me, Ph
X                        O, S
#methine                 One, Three
B & B’                   H, Me
Q (including isomers)    OMe, Ph, SH, SCH3, NH2, -S-pyrimidine

Fig.Y. LCGreen melting curves (with increasing concentration of dye).

Fig.Z. Generalized structure of unsymmetrical cyanines
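The arithmetic behind screening steps 1-3 (normalizing saturation data, locating the onset of PCR inhibition, and expressing the heteroduplex peaks as a percentage of total area) can be sketched as follows. This is an illustration only: the function names, the 2-cycle delay tolerance, and the representation of ">36" as infinity are assumptions, and the Gaussian parameters are taken as already fitted by the non-linear regression. The cycle thresholds are the LCGreen I values from the table above.

```python
import math


def normalize_to_max(fluorescence):
    """Step 1: scale readings so the maximum fluorescence is 100%."""
    peak = max(fluorescence)
    return [100.0 * f / peak for f in fluorescence]


def highest_tolerated_concentration(ct_by_conc, max_delay_cycles):
    """Step 2: highest dye concentration whose cycle threshold is within
    max_delay_cycles of the CT at the lowest concentration tested."""
    baseline = ct_by_conc[min(ct_by_conc)]
    tolerated = [c for c, ct in ct_by_conc.items()
                 if ct - baseline <= max_delay_cycles]
    return max(tolerated)


def gaussian_area(height, sd):
    """Area under a Gaussian peak of the given height and standard deviation."""
    return height * sd * math.sqrt(2.0 * math.pi)


def het_area_percent(het_peaks, hom_peaks):
    """Step 3: heteroduplex peak area as a percentage of total peak area.
    Each peak is a (height, sd) pair from the fitted Gaussians."""
    het = sum(gaussian_area(h, s) for h, s in het_peaks)
    hom = sum(gaussian_area(h, s) for h, s in hom_peaks)
    return 100.0 * het / (het + hom)


# LCGreen I cycle thresholds by dye concentration (uM); ">36" is stored as inf.
LCGREEN_CT = {250: math.inf, 125: 27.8, 63: 27.1, 31: 26.6,
              16: 26.4, 8: 26.1, 4: 26.1}

print(normalize_to_max([2.0, 5.0, 10.0]))
# Assumed tolerance of 2 cycles of delay relative to the most dilute sample.
print(highest_tolerated_concentration(LCGREEN_CT, 2.0))
# Hypothetical fitted peaks: one heteroduplex peak, one homoduplex peak.
print(het_area_percent([(1.0, 0.5)], [(3.0, 0.5)]))
```

Because inhibition appears as a sharp transition between two-fold dilutions, the result is not very sensitive to the exact tolerance chosen.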

Hardware Modifications. The LightTyper is currently manufactured by Idaho Technology and distributed by Roche. It is a 96- or 384-well fluorescent imager that acquires melting curves for genotyping with fluorescein-labeled probes. Current temperature control and melting curve acquisition are “low resolution” compared to the HR-1. We will improve the resolution of the LightTyper so that detection of heteroduplexes is optimized.

The LightTyper is designed to genotype by melting analysis of probes () that have broad melting transitions and large differences between genotypes. In contrast, the temperature shifts observed with amplicon melting of heterozygotes are often only fractions of a degree compared to homozygotes. Observing these small differences is critical to successful scanning. Hence, the LightTyper needs to be “tuned” for higher resolution to become a LightScanner.

We will systematically optimize the entire hardware system for resolution. The current LightTyper is shown in Fig. 12. Light from two banks of 470 nm LEDs passes through 460-480 nm bandpass interference filters to directly illuminate samples in microtiter wells. Emitted fluorescence returns by a parallel path through another filter (>510 nm) into a CCD camera (Roper Scientific CoolSnap CF). The black and white camera has 1392 x 1040 pixels with 10-bit resolution and a frame rate of up to 10 frames/sec. The CCD is cooled to 10°C below ambient to reduce noise. The microtiter plate sits in a resistive heating block with pulse-width modulated DC heating. The temperature sensor is a PT-100 platinum resistive sensor placed in the center of the heating block with 12-bit A-to-D conversion.

In Phase I, we focused on: 1) increasing the number of data points/°C, 2) decreasing the noise in the temperature measurement, and 3) implementing temperature adjustment in software to compare samples. In Phase II we will focus on additional factors to improve fluorescence and temperature resolution. In each case, the effect of modifications will be judged by: 1) signal-to-noise ratio of the normalized melting curves, and 2) ability to detect difficult heterozygotes identified in Phase I using the DNA Toolbox. Briefly, the signal-to-noise ratio is obtained from software already written in LabView that compares the mean-square-error at the top of the melting curve to the magnitude of the curve. The DNA Toolbox uses regions of plasmids of 40, 50, and 60% GC content where one position is engineered to be either A, C, G, or T in order to construct all possible heterozygotes and homozygotes. Amplicons up to 600 bp were studied in Phase I and some of the heterozygotes are sure to be difficult to call. Running these difficult samples under different instrument conditions is a good functional check on whether the modifications make any real difference in heteroduplex detection. Our final goal is 99% accuracy for amplicons up to 400 bases in size. The modifications are grouped below as improvements in fluorescence resolution, improvements in temperature resolution, effects of heating rate and, finally, sample evaporation considerations.


1. Fluorescence resolution. All cameras have limited sensitivity. The current LightTyper camera is a good compromise between sensitivity and price. The camera sensitivity needed depends on the brightness of the dye, the intensity of illumination, and the integration time of the camera. Brightness is a consideration in selecting the dye and is a function of the dye extinction coefficient and quantum yield in the presence of DNA under PCR conditions. We would prefer not to lengthen the integration time in order to minimize dark noise and keep the data to about 20 points/°C. The last variable, excitation intensity, could be modified. By increasing the number of LED banks from two to four, the excitation intensity could be doubled (Fig. X). Fluorescence quenching does not appear to be an issue at current power levels.

Another method to lower noise and increase frame rate is to bin multiple pixels during readout of the CCD. The camera we are using supports 2x2 binning, effectively decreasing the image resolution by a factor of four. This is not a disadvantage. Instead of summing 500 pixels over each microtiter well, only 125 2x2 bins need be added. We estimate that read-out noise can be reduced by 50%. In addition, faster frame rates would be possible if sufficient fluorescence is present, potentially allowing more data points per °C.

2. Temperature resolution. In Phase I, we detail several options for increasing the temperature resolution of the LightTyper. They are: 1) increasing the resolution of the A-to-D converter, 2) better shielding of the temperature sensor, 3) better stability of the electrical supply to the sensor, and 4) decreasing electrical noise from the heater by converting from pulse-width-modulated to continuous DC control. If any of these have yet to be fully investigated, we will do so in Phase II.

Attainable temperature resolution also depends on the temperature homogeneity of the sample. If the temperature varies within a sample, the resolution will be poor no matter how precisely the average temperature is measured. The smaller the sample is, the easier it is to maintain homogeneity within the sample. Smaller samples in 384-well plates are more homogeneous than larger samples in 96-well plates. We will compare the resolution obtained with 20, 10, 5 and 2 ul samples.

3. Temperature heating rate. For small amplicons, heteroduplexes are best detected at heating rates of 0.3°C/sec or greater (). However, the advantage of rapid heating does not appear to extend to the larger amplicons () more commonly used in scanning. For samples in microtiter plates, slower rates are much better for temperature homogeneity. Our target heating rate in the LightScanner will be 0.1°C/sec. Instead of 1-2 min runs at 0.3°C/sec, a typical run on the LightScanner will require 3-6 min at 0.1°C/sec. The longer run time is not so important when 384 samples are run in parallel. For comparison, we will also determine resolution on the LightScanner with heating rates of 0.05°C/sec and 0.2°C/sec.

4. Sample evaporation concerns. DNA melting transitions are very sensitive to buffer conditions such as ionic strength. With small samples, evaporation is a real concern, both during PCR and during melting curve acquisition. An oil overlay is one solution. A heated lid is another. A transparent glass lid with a thin coating of indium oxide can be used as a heated lid. A current is passed through the indium layer for resistive heating. Light transmittance through the indium layer is about 80%. We will compare the resolution obtained with oil overlays (5, 10, 20 ul) to an indium oxide heated lid.
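The 50% read-out noise estimate for 2x2 binning in item 1 follows from how independent noise sources combine. The sketch below assumes that each CCD readout contributes independent noise of equal magnitude, that these contributions add in quadrature, and that binning is done on-chip so each 2x2 bin incurs a single readout. The function name and the unit noise value are illustrative.

```python
import math


def summed_read_noise(n_reads, noise_per_read):
    """Total read-out noise when n_reads independent readouts are summed
    (independent noise adds in quadrature)."""
    return noise_per_read * math.sqrt(n_reads)


pixels_per_well = 500                 # from the text
bins_per_well = pixels_per_well // 4  # 2x2 binning gives 125 bins

unbinned = summed_read_noise(pixels_per_well, 1.0)
binned = summed_read_noise(bins_per_well, 1.0)
reduction = 1.0 - binned / unbinned

print(f"read-out noise reduced by {100 * reduction:.0f}%")
```

Reading one quarter as many values per well halves the summed read-out noise, since the square root of 1/4 is 1/2. Photon shot noise is not affected by binning and is not modeled here.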

How much resolution is required for scanning? It depends on the accuracy you require and the size of amplicon you want to scan. Better melting curve resolution means better scanning without limit.


The power of mass spectrometry is also directly related to the resolution obtainable – as higher resolution A-to-D converters became available, mass spectrometry became less expensive and more useful. We suggest parallel reasoning applies to DNA melting analysis.

Automatic heterozygote detection. Detecting heterozygotes by visual inspection of the melting curves is surprisingly easy (Fig. X). After normalization and temperature adjustment, homozygotes cluster together and heterozygotes stand out because the upper parts of the curves are shifted to lower temperatures. However, when there are 384 curves to compare, intuitive visual clustering becomes less attractive and automatic clustering becomes more reasonable. We will implement classical hierarchical clustering to identify different genotypes (). Such methods are familiar to most scientists from dendrograms of sequence similarity or gene expression analysis ().

Hierarchical clustering of melting curves will be performed as follows. Suppose you have a set of n curves, all produced by amplification from the same primers and representing an unknown number of genotypes. The first step is to find the two curves that are “closest together”. These two parent curves can be averaged to produce a new child curve. The two parent curves are deleted and replaced by the child curve to produce a new set of n-1 curves. This produces the first branch of the dendrogram. The process is repeated n-1 times resulting in n dendrogram levels. This is unsupervised, agglomerative, hierarchical clustering.

How do we identify the two curves that are “closest together”? What do we mean by the distance between two curves? There is no single correct answer. We will compare three different simple computations for the “distance” between two curves.

1. Sum the absolute value of the difference in fluorescence between the two curves at each temperature.

2. Sum the square of the difference in fluorescence between the two curves at each temperature.

3. Find the maximum difference in fluorescence (vertical distance) between the two curves.
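The three candidate distances listed above can be written compactly. This sketch assumes both curves are normalized and sampled at the same temperatures, so each is simply a list of fluorescence values in matching order; the function names are ours.

```python
def distance_abs(curve_a, curve_b):
    """Distance 1: sum of absolute fluorescence differences."""
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b))


def distance_sq(curve_a, curve_b):
    """Distance 2: sum of squared fluorescence differences."""
    return sum((a - b) ** 2 for a, b in zip(curve_a, curve_b))


def distance_max(curve_a, curve_b):
    """Distance 3: maximum vertical distance between the two curves."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Two made-up normalized melting curves on a shared temperature grid.
curve_1 = [1.0, 0.8, 0.2, 0.0]
curve_2 = [1.0, 0.6, 0.1, 0.0]

for metric in (distance_abs, distance_sq, distance_max):
    print(metric.__name__, round(metric(curve_1, curve_2), 4))
```

The squared and maximum distances weight a single large deviation more heavily than the absolute sum does, which may matter when a heterozygote differs from the homozygotes over only a narrow temperature range.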

The average of two curves is taken as the average fluorescence at each temperature. Since the number of different genotypes in the set is not known, the appropriate dendrogram level (number of clusters) is not known. First we will develop a tool that allows the user to scroll up and down between dendrogram levels. At each level, different clusters will be displayed as different colors, allowing a rapid visual check for appropriate clustering. We will then replace this semi-automated clustering with fully automated clustering by calculating intra-cluster and inter-cluster distances at each dendrogram level. The best level will be selected by maximizing inter-cluster distances while minimizing intra-cluster distances. Programming will be performed using LabView (National Instruments), the programming language used for HR-1 analysis software. Different clustering techniques will be compared for execution time and clustering accuracy using the DNA Toolbox.
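The clustering procedure and the automated choice of dendrogram level can be sketched in Python (the production software is planned in LabView). The agglomeration follows the description above: the two closest curves are replaced by their point-by-point average. The level-selection score, the smallest inter-cluster distance divided by the largest intra-cluster distance (a Dunn-style index), is an assumption of ours; the proposal does not fix a specific criterion. The function names and example curves are illustrative.

```python
from itertools import combinations


def sum_sq(a, b):
    """Sum of squared fluorescence differences between two curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dendrogram_levels(curves, distance=sum_sq):
    """Return the n dendrogram levels; each level is a list of clusters,
    and each cluster is a list of indices into curves."""
    nodes = [(list(curve), [i]) for i, curve in enumerate(curves)]
    levels = [[members[:] for _, members in nodes]]
    while len(nodes) > 1:
        # Find the two current curves that are closest together.
        i, j = min(combinations(range(len(nodes)), 2),
                   key=lambda p: distance(nodes[p[0]][0], nodes[p[1]][0]))
        (curve_a, members_a), (curve_b, members_b) = nodes[i], nodes[j]
        # Replace the two parents with their averaged child curve.
        child = ([(x + y) / 2.0 for x, y in zip(curve_a, curve_b)],
                 members_a + members_b)
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [child]
        levels.append([members[:] for _, members in nodes])
    return levels


def separation_score(curves, clusters, distance=sum_sq):
    """Smallest inter-cluster distance over largest intra-cluster distance."""
    intra = max((distance(curves[a], curves[b])
                 for c in clusters for a, b in combinations(c, 2)),
                default=0.0)
    inter = min(distance(curves[a], curves[b])
                for c1, c2 in combinations(clusters, 2)
                for a in c1 for b in c2)
    return float("inf") if intra == 0.0 else inter / intra


def best_level(curves, distance=sum_sq):
    """Pick the level (2 to n-1 clusters) with the best separation score."""
    levels = dendrogram_levels(curves, distance)
    candidates = levels[1:-1]
    if not candidates:
        return levels[-1]
    return max(candidates,
               key=lambda cl: separation_score(curves, cl, distance))


# Made-up normalized curves: three homozygote-like, two heterozygote-like.
curves = [
    [1.0, 0.90, 0.50, 0.10, 0.0],
    [1.0, 0.91, 0.52, 0.10, 0.0],
    [1.0, 0.89, 0.49, 0.11, 0.0],
    [1.0, 0.70, 0.35, 0.08, 0.0],
    [1.0, 0.71, 0.36, 0.08, 0.0],
]
print(sorted(sorted(c) for c in best_level(curves)))
```

Because the score is undefined for a single cluster, this sketch never reports that all samples share one genotype; a fully automated version would need a separate test for that case, for example a minimum absolute inter-cluster distance.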

Chemistry/heteroduplex optimization. We have prior evidence that heteroduplex detection is better under low ionic strength. Using the F508del heterozygote model (Fig X), the heteroduplex peak was larger with lower [Mg++] compared to higher [Mg++] (). We will confirm this finding for both K+ and Mg++ on the LightScanner. We have not yet studied the effect of pH or common PCR additives, such as DMSO, glycerol, formamide and betaine. The pH will be varied from 7.1 to 9.1 (Tris buffer) and several dilutions of each additive that bracket commonly used concentrations in PCR will be studied. In all cases, concentrations will be adjusted after PCR so that PCR effects are eliminated. To judge the effect of these variations, we will use both the F508del model system (area of the heterozygote peak), and our ability to distinguish heterozygotes using difficult Toolbox amplicons identified in Phase I.


Specific Aim #2. Implement internal temperature controls to increase the sensitivity of homozygous variant detection. Homozygous variant detection depends on the absolute sample temperature. Well-to-well temperature homogeneity better than 0.5°C is difficult to achieve during a temperature ramp. An internal temperature control would monitor the temperature of the solution in each well and could be used to correct for any temperature difference between wells. Possible internal controls include DNA (linear double strand or hairpin) or a passive dye with a steep temperature coefficient that could be monitored at a second wavelength. Sensitivity of SNP homozygote detection, with and without internal temperature controls, will be monitored with the DNA toolbox.
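One simple way an internal control could be used is sketched below. It assumes the control has the same true melting temperature in every well, so any deviation in its observed Tm reflects a temperature error in that well, and it assumes a single uniform offset per well is an adequate correction. Neither assumption has been tested; the function names and values are illustrative.

```python
def temperature_offsets(control_tm_by_well, reference_tm=None):
    """Offset (°C) to add to each well's temperature axis so that the
    internal control melts at the reference Tm. If no reference is
    given, the plate mean of the observed control Tm values is used."""
    if reference_tm is None:
        values = list(control_tm_by_well.values())
        reference_tm = sum(values) / len(values)
    return {well: reference_tm - tm
            for well, tm in control_tm_by_well.items()}


def correct_temperatures(temperatures, offset):
    """Shift one well's temperature axis by its offset."""
    return [t + offset for t in temperatures]


# Made-up observed control Tm values (°C) for two wells.
observed = {"A1": 80.2, "A2": 79.8}
offsets = temperature_offsets(observed, reference_tm=80.0)

# A1 reads 0.2°C high, so its temperature axis is shifted down by 0.2°C.
print(correct_temperatures([78.0, 79.0, 80.0], offsets["A1"]))
```

After correction, homozygous variants could be compared on absolute Tm rather than on curve shape alone.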

Specific Aim #3. Investigate clinical testing by scanning large genes for mutations. Testing for many genetic disorders is difficult because the causative mutations are often scattered all over the gene. We will test our scanning method on samples from two clinical disorders, paroxysmal nocturnal hemoglobinuria (PNH), and cystic fibrosis (CF).

Specific Aim #4. Investigate options for multiplexing, including amplicon Tm and fluorescent color. Although highly parallel analysis decreases the need for multiplexing, repetitive screening applications may justify the initial effort of multiplex design. We will demonstrate the feasibility of multiplexing in mutation scanning by: a) combining amplicons of different melting temperatures, and b) amplifying different targets with labeled primers of unique spectra that can be differentiated by multicolor optics.

Contractual Arrangements. All costs and work performed under this grant will be split equally between Idaho Technology as the applicant organization and the University of Utah as the research institution. The percentage of work performed at each institution is proportional to the requested costs for each party. The principal investigator will oversee the project at both institutions and will verify that the work is being performed and that all expenditures are appropriate.

The University of Utah will be responsible for construction of the prototype multicolor LightCycler (specific aim 1). Idaho Technology will supply the mechanical and electrical sub-assemblies including all mechanical, electronic, and optical enhancements necessary. Idaho Technology will also provide programming expertise necessary to modify the software to store and display multiple channels and will implement the algorithms for software compensation of fluorescence crosstalk between channels (specific aim 2).

The University of Utah will be responsible for the molecular biology research, including probe design and synthesis, software compensation testing, and assay development and validation (specific aims 2 and 3). The University of Utah and Idaho Technology have worked together successfully before in bringing the RapidCycler and LightCycler technologies to market.

Written agreements covering both the cooperative research effort and product licensing and royalty payments are in place between Idaho Technology and the University of Utah.

Duda RO, Hart PE, and Stork DG. Pattern Classification (2nd ed.), Wiley-Interscience, 2000.

Eisen MB, Spellman PT, Brown PO and Botstein D. (1998). Cluster Analysis and Display of Genome-Wide Expression Patterns. Proc Natl Acad Sci U S A 95, 14863-14868.