accurate sampling and deep sequencing of the hiv-1 ...accurate sampling and deep sequencing of the...

6
Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabara a,b,c , Corbin D. Jones a,d , Jeffrey Roach e , Jeffrey A. Anderson b,c,f,1 , and Ronald Swanstrom b,c,g,2 a Department of Biology, b Lineberger Comprehensive Cancer Center, c University of North Carolina Center for AIDS Research, d Carolina Center for Genome Sciences, e Research Computing Center, f Division of Infectious Diseases, and g Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 27599 Edited by John M. Cofn, Tufts University School of Medicine, Boston, MA, and approved November 8, 2011 (received for review June 24, 2011) Viruses can create complex genetic populations within a host, and deep sequencing technologies allow extensive sampling of these populations. Limitations of these technologies, however, potentially bias this sampling, particularly when a PCR step precedes the se- quencing protocol. Typically, an unknown number of templates are used in initiating the PCR amplication, and this can lead to un- recognized sequence resampling creating apparent homogeneity; also, PCR-mediated recombination can disrupt linkage, and differ- ential amplication can skew allele frequency. Finally, misincorpora- tion of nucleotides during PCR and errors during the sequencing protocol can inate diversity. We have solved these problems by including a random sequence tag in the initial primer such that each template receives a unique Primer ID. After sequencing, repeated identication of a Primer ID reveals sequence resampling. These re- sampled sequences are then used to create an accurate consensus sequence for each template, correcting for recombination, allelic skewing, and misincorporation/sequencing errors. The resulting population of consensus sequences directly represents the initial sampled templates. We applied this approach to the HIV-1 protease (pro) gene to view the distribution of sequence variation of a com- plex viral population within a host. We identied major and minor polymorphisms at coding and noncoding positions. In addition, we observed dynamic genetic changes within the population during in- termittent drug exposure, including the emergence of multiple re- sistant alleles. These results provide an unprecedented view of a complex viral population in the absence of PCR resampling. drug resistance | genetic diversity | high throughput sequencing | HIV | population dynamics H igh throughput sequencing allows the acquisition of large amounts of sequence data that can encompass entire genomes (14). With sufcient amounts of starting DNA, PCR is not needed before the library preparation step of the sequencing protocol. Sequencing miscalls inherent in high throughput se- quencing approaches are resolved using multiple reads over a given base. Deep sequencing can also capture the genetic diversity of viral populations (510), including intrahost populations derived from clinical samples. This approach offers the opportunity to view population diversity and dynamics and viral evolution in un- precedented detail. One place where the presence of minor variants is of immediate practical importance is in the detection of drug-resistant variants. Standard bulk sequencing methods typically miss allelic variants below 20% in frequency within a population (11, 12). Alternative assays can detect less abundant variants that confer drug resistance, but require a priori selection of sites and variants (1323). Thus, deep sequencing approaches offer the opportunity to identify minor variants associated with resistance de novo with the goal of understanding their role in therapy failure. Although screening for drug-resistant variants is a practical application of the deep sequencing technology, this technology also addresses broader questions of sequence diversity and structure for a complex population like HIV-1. However, the relatively high sequencing error rates of these technologies articially increase genetic diversity, which confounds the detec- tion of natural genetic variation especially when sequencing a highly heterogeneous viral population. Moreover, the use of PCR to amplify the amount of material before starting the se- quencing protocol adds the potential for several serious artifacts (2427): First, nucleotide misincorporation by the polymerase during many rounds of amplication articially increases se- quence diversity; second, artifactual recombination during am- plication occurs when premature termination products prime a subsequent round of synthesis, which can obscure the linkage of two sequence polymorphisms (28, 29); third, differential ampli- cation can skew allelic frequencies; and fourth, PCR amplication can create a signicant mass of DNA from a small number of starting templates, which obscures the true sampling of the orig- inal population as these few starting templates/genomes get re- sampled in the PCR product, creating sequence resampling rather than the observation of independent genomes (30). Overall, these biases articially decrease true diversity while introducing arti- factual diversity and also skew allelic frequencies, which can lead to incongruence between the real and observed viral populations. Most investigators use statistical tools to attempt to control for the types of sequencing errors that are associated with each se- quencing platform. To make deep sequencing useful for complex populations, it is necessary to overcome PCR resampling, which is mistaken for sampling of the original population, and PCR and sequencing errors, which can be mistaken for diversity. As nucleotide mis- incorporation is largely random across sites and template switching/recombination is more likely to occur in the later cycles of a PCR (31), strategies that create a bulk or consensus se- quence for each sampled template will call the correct base at each position. One approach to sampling highly heterogeneous populations, such as the HIV-1 env gene, is through endpoint dilution titration of the template before nested PCR, such that a single template is present in each PCR amplication (3235). In addition to masking the misincorporations, PCR-mediated recombination produces recombinant templates identical to the parental sequence. Although highly accurate, this technique is labor-intensive and, as population sampling is dependent on the number of templates sequenced, this methodology does not lend itself to the identication of minor variants or to understanding the structure of a complex population, nor is it easily adaptable to a high throughput approach. We have developed a high throughput technique for directly resolving the genetic diversity of a viral population. This tech- nique avoids the recording of PCR and sequencing errors that Author contributions: C.B.J., C.D.J., J.A.A., and R.S. designed research; C.B.J. and J.A.A. performed research; C.B.J., C.D.J., and J.R. contributed new reagents/analytic tools; C.B.J., C.D.J., J.R., J.A.A., and R.S. analyzed data; and C.B.J., C.D.J., and R.S. wrote the paper. The authors declare no conict of interest. This article is a PNAS Direct Submission. 1 Present address: Discovery Medicine-Virology, Discovery Medicine and Clinical Pharma- cology, 311 Pennington-Rocky Hill Rd., 8A-1.14, Bristol-Myers Squibb, Pennington, NJ 08543. 2 To whom correspondence should be addressed. E-mail: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1110064108/-/DCSupplemental. 2016620171 | PNAS | December 13, 2011 | vol. 108 | no. 50 www.pnas.org/cgi/doi/10.1073/pnas.1110064108 Downloaded by guest on May 30, 2020

Upload: others

Post on 28-May-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

Accurate sampling and deep sequencing of the HIV-1protease gene using a Primer IDCassandra B. Jabaraa,b,c, Corbin D. Jonesa,d, Jeffrey Roache, Jeffrey A. Andersonb,c,f,1, and Ronald Swanstromb,c,g,2

aDepartment of Biology, bLineberger Comprehensive Cancer Center, cUniversity of North Carolina Center for AIDS Research, dCarolina Center for GenomeSciences, eResearch Computing Center, fDivision of Infectious Diseases, and gDepartment of Biochemistry and Biophysics, University of North Carolina, ChapelHill, NC 27599

Edited by John M. Coffin, Tufts University School of Medicine, Boston, MA, and approved November 8, 2011 (received for review June 24, 2011)

Viruses can create complex genetic populations within a host, anddeep sequencing technologies allow extensive sampling of thesepopulations. Limitations of these technologies, however, potentiallybias this sampling, particularly when a PCR step precedes the se-quencing protocol. Typically, an unknown number of templates areused in initiating the PCR amplification, and this can lead to un-recognized sequence resampling creating apparent homogeneity;also, PCR-mediated recombination can disrupt linkage, and differ-ential amplification can skew allele frequency. Finally, misincorpora-tion of nucleotides during PCR and errors during the sequencingprotocol can inflate diversity. We have solved these problems byincluding a random sequence tag in the initial primer such that eachtemplate receives a unique Primer ID. After sequencing, repeatedidentification of a Primer ID reveals sequence resampling. These re-sampled sequences are then used to create an accurate consensussequence for each template, correcting for recombination, allelicskewing, and misincorporation/sequencing errors. The resultingpopulation of consensus sequences directly represents the initialsampled templates. We applied this approach to the HIV-1 protease(pro) gene to view the distribution of sequence variation of a com-plex viral population within a host. We identified major and minorpolymorphisms at coding and noncoding positions. In addition, weobserved dynamic genetic changes within the population during in-termittent drug exposure, including the emergence of multiple re-sistant alleles. These results provide an unprecedented view of acomplex viral population in the absence of PCR resampling.

drug resistance | genetic diversity | high throughput sequencing | HIV |population dynamics

High throughput sequencing allows the acquisition of largeamounts of sequence data that can encompass entire

genomes (1–4). With sufficient amounts of starting DNA, PCR isnot needed before the library preparation step of the sequencingprotocol. Sequencing miscalls inherent in high throughput se-quencing approaches are resolved using multiple reads over agiven base.Deep sequencing can also capture the genetic diversity of viral

populations (5–10), including intrahost populations derived fromclinical samples. This approach offers the opportunity to viewpopulation diversity and dynamics and viral evolution in un-precedented detail. One place where the presence of minorvariants is of immediate practical importance is in the detectionof drug-resistant variants. Standard bulk sequencing methodstypically miss allelic variants below 20% in frequency within apopulation (11, 12). Alternative assays can detect less abundantvariants that confer drug resistance, but require a priori selectionof sites and variants (13–23). Thus, deep sequencing approachesoffer the opportunity to identify minor variants associated withresistance de novo with the goal of understanding their role intherapy failure.Although screening for drug-resistant variants is a practical

application of the deep sequencing technology, this technologyalso addresses broader questions of sequence diversity andstructure for a complex population like HIV-1. However, therelatively high sequencing error rates of these technologiesartificially increase genetic diversity, which confounds the detec-

tion of natural genetic variation especially when sequencinga highly heterogeneous viral population. Moreover, the use ofPCR to amplify the amount of material before starting the se-quencing protocol adds the potential for several serious artifacts(24–27): First, nucleotide misincorporation by the polymeraseduring many rounds of amplification artificially increases se-quence diversity; second, artifactual recombination during am-plification occurs when premature termination products primea subsequent round of synthesis, which can obscure the linkage oftwo sequence polymorphisms (28, 29); third, differential amplifi-cation can skew allelic frequencies; and fourth, PCR amplificationcan create a significant mass of DNA from a small number ofstarting templates, which obscures the true sampling of the orig-inal population as these few starting templates/genomes get re-sampled in the PCR product, creating sequence resampling ratherthan the observation of independent genomes (30). Overall, thesebiases artificially decrease true diversity while introducing arti-factual diversity and also skew allelic frequencies, which can leadto incongruence between the real and observed viral populations.Most investigators use statistical tools to attempt to control forthe types of sequencing errors that are associated with each se-quencing platform.To make deep sequencing useful for complex populations, it is

necessary to overcome PCR resampling, which is mistaken forsampling of the original population, and PCR and sequencingerrors, which can be mistaken for diversity. As nucleotide mis-incorporation is largely random across sites and templateswitching/recombination is more likely to occur in the later cyclesof a PCR (31), strategies that create a bulk or consensus se-quence for each sampled template will call the correct base ateach position. One approach to sampling highly heterogeneouspopulations, such as the HIV-1 env gene, is through endpointdilution titration of the template before nested PCR, such thata single template is present in each PCR amplification (32–35).In addition to masking the misincorporations, PCR-mediatedrecombination produces recombinant templates identical to theparental sequence. Although highly accurate, this technique islabor-intensive and, as population sampling is dependent on thenumber of templates sequenced, this methodology does not lenditself to the identification of minor variants or to understandingthe structure of a complex population, nor is it easily adaptableto a high throughput approach.We have developed a high throughput technique for directly

resolving the genetic diversity of a viral population. This tech-nique avoids the recording of PCR and sequencing errors that

Author contributions: C.B.J., C.D.J., J.A.A., and R.S. designed research; C.B.J. and J.A.A.performed research; C.B.J., C.D.J., and J.R. contributed new reagents/analytic tools; C.B.J.,C.D.J., J.R., J.A.A., and R.S. analyzed data; and C.B.J., C.D.J., and R.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.1Present address: Discovery Medicine-Virology, Discovery Medicine and Clinical Pharma-cology, 311 Pennington-Rocky Hill Rd., 8A-1.14, Bristol-Myers Squibb, Pennington,NJ 08543.

2To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1110064108/-/DCSupplemental.

20166–20171 | PNAS | December 13, 2011 | vol. 108 | no. 50 www.pnas.org/cgi/doi/10.1073/pnas.1110064108

Dow

nloa

ded

by g

uest

on

May

30,

202

0

Page 2: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

create artificial diversity, and corrects for artificial allelic skewingand PCR resampling, revealing the original genomes in thepopulation. This is accomplished by embedding a degenerateblock of nucleotides within the primer used in the first round ofcDNA synthesis. This creates a random library of sequenceswithin the primer population. As primers are individually usedout of this library, each viral template is copied such that thecomplement (cDNA) now includes a unique sequence tag, orPrimer ID. This Primer ID is carried through all of the sub-sequent manipulations to mark all sequences that derive fromeach independent templating event, and PCR resampling thenbecomes over-coverage for each template to create a consensussequence of that template. Using this approach, we were able todirectly remove error, correct for PCR resampling, and capturethe fluctuation of minor variants in the viral population within ahost. We also resolved minor drug-resistant variants below 1% infrequency before the initiation of antiretroviral therapy, andwere able to correlate these variants with the emergence of drugresistance. The value of this strategy and its applicability to otherdeep sequencing protocols has been further emphasized througha recent parallel effort by Kinde et al. in short read sequencing ofhuman genomic material (36).

ResultsA cDNA Synthesis Primer Containing a Primer ID Can Be Used to TrackIndividual Viral Templates. A population of cDNA synthesis pri-mers was designed to prime DNA synthesis downstream of theHIV-1 protease (pro) gene, with the primer containing two ad-ditional blocks of identifying information (Fig. 1A). The firstblock was a string of eight degenerate nucleotides that created65,536 distinct sequence combinations (48), or Primer IDs. Thisregion was flanked by an a priori selected three nucleotide bar-code, creating a sample identification block so that multiplesamples could be pooled together in a sequencing run (7). Adesigned sequence at the 5′ end of the cDNA primer was used forsubsequent amplification of the cDNA sequences by nested PCR.Viral RNA was extracted from three longitudinal blood plasma

samples from an individual infected with subtype B HIV-1 whowas participating in a protease inhibitor efficacy trial (M94-247)

(ref. 37; Fig. S1). Approximately 10,000 copies of viral RNA fromeach sample were used in a reverse transcription reaction forcDNA synthesis and tagging using the Primer ID. The cDNAproduct was separated from the unused cDNA primers, and thenthe viral sequences were amplified by nested PCR and sequencedon the 454 GS FLX Titanium. Our data were distilled from totalreads of 20,429, 24,658, and 27,075 for the three time points (T1,T2, and T3, respectively). Raw sequence reads were assessed forthe cDNA tagging primer and a full length pro gene sequence(297 nucleotides long representing 99 codons), and when three ormore sequences within a sample contained an identical PrimerID, a consensus sequence was formed to represent one sequence/genome in the population (Fig. 1 B and C and Fig. S2).With these manipulations we generated 857, 1,609, and 2,213

consensus sequences, respectively, for the three time points (Fig.1C). The median number of reads per Primer ID was 6, rangingfrom 1 to 96 (Fig. S3A). The distribution of identical Primer IDsdid not form a normal distribution as would be expected if alltemplates were amplified equally. We saw a higher than expectednumber of single reads of Primer IDs; although we do not knowthe reason for this, such a result is consistent with different cDNAtemplates entering the PCR at different cycles. Because eachtemplate is individually tagged the different number of reads is anindication of allelic skewing, as noted this can be nearly 100-fold.In an analysis of a number of low abundant variants we saw a 20-fold range of representation through allelic skewing, with half ofthe variants up to 2- to 3-fold more abundant than the mean, andthe other half up to 5- to 10-fold less abundant (Fig. S4).We conservatively estimate the combined in vitro error rate of

the cDNA synthesis step by reverse transcriptase (RT) and thefirst strand synthesis by the Taq polymerase to be on the order of1 mutation in 10,000 bases, or approximately one mutation per33 pro gene sequences, based on an RT error rate of 1 in 22,000nucleotides (38) and a Taq polymerase error rate of 1.1 in 10,000nucleotides (39) but reduced by half because only the first roundof synthesis is relevant and a misincorporation at this step gives amixture. Later rounds of Taq polymerase errors should be largelylost through the creation of the consensus sequence. Thus, wewould expect 139 sequence misincorporations to be present inthe data set of 4,679 total sequences representing T1+T2+T3,and with an excess of transitions. These would be expected tooccur as 113 single copy single-nucleotide polymorphisms(SNPs) and 13 SNPs that appeared twice. We observed 98 singlecopy SNPs in the data set with a threefold excess of transitions,and with three-fourths of them being coding changes, which isconsistent with random mutations. We expect there to be lowfrequency SNPs in the viral population from rare but persistentvariants that are fortuitously sampled, and from the intrinsicerror rate of viral replication (the error rate during one round ofviral replication would represent approximately one mutationper 150 pro gene sequences; ref. 24). However, we cannot dis-tinguish real polymorphisms from the inferred background errorrate associated with the first and second rounds of in vitro DNAsynthesis. Thus, we have limited the analysis of population di-versity to SNPs that appeared at least twice in the data set (i.e.,linked to at least two separate Primer IDs), either at the sametime point or at multiple time points in the overall data set(Table S1). We have not corrected the data set for the presumed13 SNPs that appeared twice that are expected to be present dueto error even though this represents 33% of all of the SNPs thatappeared twice (13 of 39). Overall, 80% of the SNPs (i.e., anysequence change from the consensus that appeared at leastonce) in the total data set of 72,162 sequence reads were re-moved as error. Also, 60–65% of the sequence reads wererevealed as resampling. Finally, allelic skewing of up to nearly100 fold was corrected (Fig. S4).

Longitudinal Sequencing of the HIV-1 Protease (pro) Gene in anUntreated Individual Reveals Dynamic Changes in Genetic Variation.We analyzed the sequences of the pro gene populations to assessallelic frequency at the two sampled time points, separated by 6mo

pro pol

NNNNNNNNNNNNNNN NNNNNNNNNNN NN NN NN NN NNNNNN BARBARBARreverse complement Primer ID Barcode PCR priming site

vRNA 3’

primer 5’A

B CPrimer ID Barcode

CATAATAC TAGCATAATAC TAG

CATAATAC TAGCATAATAC TAG

CATAATAC TAG

CATAATAC TAG

CATAATAC TAG

Consensus sequence

Raw sequence readsSample

Ritonavir

Totalreads

Consensussequences

T1

-

20,429

857

T2

-

24,658

1,609

T3

+

27,075

2,213

Fig. 1. Tagging viral RNA templates with a Primer ID before PCR amplifica-tion and sequencing allows for direct removal of artifactual errors and iden-tifies resampling. (A) A primer was designed to bind downstream of theprotease coding domain. In the 5′ tail of the primer, a degenerate string ofeight nucleotides created a Primer ID, allowing for 65,536 unique combina-tions. An a priori selected three nucleotide barcode was designed for thesample ID. Finally, a heterologous string of nucleotides with low affinity tothe HIV-1 genome was included in the far 5′ end for use as the priming site inthe PCR amplification. (B) PCR biases and sequencing error are introducedduring amplification and sequencing of viral templates. Repetitive identifica-tion of the barcode and Primer ID allow for tracking of each templating eventfrom a single tagged cDNA. As errors are minor components within the PrimerID population, forming a consensus sequence directly removes them, andcorrects for PCR resampling. (C) HIV-1 RNA templates isolated from plasmasamples from two pre- and one postintermittent ritonavir drug therapy weretagged, amplified, and deep sequenced. Tagged sequences containing full-length protease were used to create a population of consensus sequenceswhen at least three sequences contained an identical barcode and Primer ID.

Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20167

MICRO

BIOLO

GY

Dow

nloa

ded

by g

uest

on

May

30,

202

0

Page 3: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

and before ritonavir (37) drug selection (Fig. S1). The combinedsequence population from the two time points (T1 and T2) beforetherapy consisted of 492 unique pro gene sequences with 155SNPs. About 4% (i.e., 21) of these unique gene sequences wereabove 0.5% abundance, and these 21 unique gene sequencesrepresented 67% of all sampled genomes, with the genome rep-resenting the overall consensus sequence comprising 21% of thetotal population (Fig. S5 A and B). The relatively small number ofunique gene sequences above 0.5% frequency in the populationcontained only 7% of the 155 detected SNPs. Thus, a large pro-portion of the viral population’s diversity was associated witha large number of pro gene sequences that were present at lowabundance (Fig. S5 A and C); conversely, the majority of thepopulation consisted of a small number of SNPs. Similarly, Taji-ma’s D statistic for T1 and T2 in this individual were −2.35 and−2.31, respectively (Table S2), indicative of a population structurethat has an excess of low frequency polymorphisms. This pattern isconsistent with but more extreme than that observed in a priorshallow intrahost survey in which a metapopulation model wasproposed to explain the pattern of Tajima’s D statistic (40). Fig. 2shows the encoded amino acid variability and synonymous nucle-otide variability present in two or more individual genomes acrossthe 99 codons in the pro gene for these samples.Synonymous variability. There were 57 codons (with 63 variants/SNPs) that contained synonymous diversity that appeared inboth pretherapy time points, and 30 codons (with 31 variants)that appeared in only one time point. Taken together, 75 of the99 codons contained some level of synonymous diversity (Fig. 2and Table S1). Of the 63 variants that were present in bothuntreated time points, 92% were transitions. Of the 31 variantsthat appeared in only one of the time points, 71% were tran-sitions, representing a significantly smaller fraction of transitionsthan among the synonymous variants that appeared at both timepoints (P = 0.012; Fisher’s exact test). This suggests that syn-onymous transversions are selected against over time.Nonsynonymous variability. There were 26 codons (28 variants) thatcontained coding variability that appeared in both pretherapytime points, and an additional 28 codons (33 variants) withnonsynonymous changes found in only one of the time points.Taken together, 49 of the 99 codons contained some level ofnonsynonymous diversity (Fig. 2 and Table S1). For the 28nonsynonymous variants detected at both time points, 22 weretransitions, and these mostly represented conservative aminoacid changes. In the case of synonymous mutations two-thirds ofthe variants were present at both time points, whereas in the caseof nonsynonymous mutations, less than half were present at bothtime points (P = 0.012; Fisher’s exact test). This observation

suggests that, at this level of sequence sampling, we are able tosee a difference in stability within the population in comparingsynonymous and nonsynonymous substitutions.Genetic fluctuation. We compared the stability of minor SNPspresent at both T1 and T2. A total of 14 of the 91 SNPs (syn-onymous and nonsynonymous that appeared at both time points)had significant changes in abundance between the two timepoints (χ2 test with a false discovery rate of 0.05). Of the 14 SNPswith significant changes in abundance, 11 had a decrease in theabundance, with an average decrease around 7.5-fold. Therewere three SNPs that had a significant increase in abundance, allof which were synonymous, ranging from a 4- to 47-fold increase.Although a majority of SNPs that changed in abundance hada decrease in the frequency between T1 and T2, on a populationlevel, there was not a large change in diversity between the twotime points (T1 π = 0.0080, T2 π = 0.0079; Table S2). However,the trend of increased abundance at the three sites may be drivenby selection of cryptic epitopes in an alternative reading frame(see Discussion).Significance of rare variants. We observed two extremes in terms ofbiological relevance in the untreated population among variantsdetected as at least two independent sequences across the threetime points. At one extreme was the detection of nonviablegenomes in the form of a coding variant at position 25, whichmutates the active site of the protease, and the detection of ter-mination codons at positions 42 and 61 (Table S1). At the otherextreme was the detection of the L90M and V82A variants (attime points 1 and 2, respectively) that became the major resistancepopulations after ritonavir therapy was initiated (see below, Fig.3); in addition, V82I and V82L were detected at T2. We foundtwo more examples of primary resistance mutations at low abun-dance, K20R at all three time points and M46I at two time points,but these did not grow out in the presence of ritonavir (Fig. 3 andTable S1). Similarly, fitness compensatory mutations were alsodetected at low abundance (L10F, M36I, L63P, A71T, and V77I),all below 1%, and only L63P increased (modestly) in abundanceafter exposure to ritonavir. More generally, of the 28 substitutionsmost closely associated with protease inhibitor drug resistance (41,42), we found 10 such variants, half of which were detected at bothpretherapy time points (Table S1).

Assessment of Linkage Disequilibrium (LD) Within the HIV-1 pro GenePopulation. We measured LD for the sequences in the T1 and T2populations. We identified very few examples of LD at these twotime points using the Fisher’s exact test with a Bonferroni cor-rection. Of the 103 polymorphic sites in T1, only three pairs werein significant LD. Similarly, in T2 with 118 polymorphic sites, only

+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33

+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

0.1 1

10

0.1 1

10

+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66

0.1 1

10

0.1 1

10

NS

S

NS

S

S

NS

Fig. 2. Frequency of codon variation across all 99 positions in protease over three time points. Within a codon position, the first two bars represent untreatedtime points 1 and 2, respectively. Bars 3 and 4 are the third time point split based on the presence or absence of the resistance mutations to ritonavir. Bar 3 isthe population of susceptible genotypes (defined as not V82A, I84V, or L90M), and bar 4 is the major resistant variant, V82A, population. Upward facing barsare nonsynonymous changes (scale in regular typeface), and downward facing bars are synonymous changes (scale in bolded typeface). Within a codonposition, different shading represents different SNPs.

20168 | www.pnas.org/cgi/doi/10.1073/pnas.1110064108 Jabara et al.

Dow

nloa

ded

by g

uest

on

May

30,

202

0

Page 4: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

four pairs displayed significant LD. A positive D (i.e., linkage) wasfound for six of the seven pairs in the untreated populations, withone pair associating at a lower than expected frequency. Overall,LD did not appear to play a significant role in defining the pro genepopulation in this late stage individual, with only a single pair ofSNPs showing linkage in both of the time points.

Detection of Multiple Drug-Resistant Alleles After Exposure to Selec-tion by a Protease Inhibitor. The third plasma sample we examinedfrom this subject was from a time point (T3) after the initiationof therapy with the protease inhibitor ritonavir. It is apparentfrom the cyclical pattern of viral load and self-report that thisperson had incomplete adherence to the drug regimen (Fig. S1).Thus, we expected selective pressure from the drug to disrupt theviral population but not to select for the more homogeneouspopulations that are associated with virologic failure solely dueto the appearance of drug resistance. The choice of this sampleallowed us to look at the evolution of resistance and the per-sistence of polymorphisms in both the resistant and nonresistantportions of the population. Over two-thirds of the sequencesfrom T3 carried a resistance mutation, with ∼50% of thesequences carrying the V82A allele, the most common resistancemutation associated with resistance to ritonavir (43).There were two divergent paths for population diversity at the

third time point. For the large V82A-containing populationthere was a general trend of decreased diversity (π = 0.0069),consistent with the expected bottleneck associated with fixing adrug resistance mutation. In contrast, the diversity in the coex-isting drug sensitive population was higher than the drug-re-sistant population and comparable to the earlier time points (π=0.0082; Table S2).Although V82A is the most common resistance mutation asso-

ciatedwith ritonavir resistance, the I84V allele andL90Mallele canalso be selected and in combination with V82A can confer a higherlevel of resistance (44). We detected all three of these distinct drugresistance alleles in the T3 sequence population, collectively rep-resenting 69% of the total T3 population: V82A (50% of the

population), I84V (5%), and L90M (14%). These three resistancemutations appeared on different genomes, with only a single ex-ample of a sequence with two of these resistance mutations (V82A/L90M). In total, there were 136 unique sequences carrying theV82A mutation (all with the GCC Ala codon), 29 uniquesequences carrying the I84V mutation (all with the GTA Val co-don), and 36 unique sequences carrying the L90M mutation.There were also small groups of pro gene sequences in T3 that

appear to be the result of selection by ritonavir. Two othersubstitutions at position 82, V82I and V82L, were detected at alow level at T2 and also seen at T3, but now representing 1.3%and 1.1% of the population. V82F was also detected as 0.14% ofthe population at T3. Finally, the compensatory mutation L63Pwas detected at T1 and modestly expanded at T3, with half of thesequences in the V82A background (Table S2).An important issue is the number of times each of the resistance

mutations evolved in the presence of drug selection. The data areconsistent with the major V82A variant (42% of the V82 sequen-ces) growing out from the preexisting variant detected at T2. Forthe six genomic variants of V82A that each accounted for greaterthan 2.5% of the V82A population, all were on the background ofthe consensus except for the three different polymorphisms atpositions 19 and 70 (Fig. S6B). In total, these represented∼71% oftheV82A population and presumably arose via recombination withthe founding sequence (Fig. S6A). The remaining 29% of theV82A-containing genomes vary in relative abundance from 2.3%to 0.1%, including over 100 unique sequences that each appearedonce but to a large extent represent the variation seen at T1 and T2added on to the predominant V82A genotypes.The composition of the I84V and L90M populations were

similar to the V82A population. In each case there was a pre-dominant population defined by a 5′ polymorphism: the majorL90M lineage (69% of the L90M sequences) was on the G16G/L19V background (Fig. S6 C and D), whereas the major I84Vlineage (35% of the I84V sequences) was on the consensus se-quence background for the 5′ polymorphisms (G16/L19) (Fig. S6E and F). The next three most abundant I84V lineages, repre-senting 28% of the I84V sequences, differed from the mostabundant sequence by other 5′ polymorphisms (Fig. S6F). Sim-ilarly, the next three most abundant L90M lineages, representing14% of the L90M sequences, differed from the most abundantL90M sequence by 5′ polymorphisms (Fig. S6D). With the ex-ception of the 5′ polymorphisms and the resistance mutations, alleight of these lineages were in the consensus sequence back-ground. The remaining sequences are accounted for by the lowlevel variability added onto these major lineages.As noted above, the major V82A lineage was detected at T2 (as

a single genome), and this population was likely clonally amplifiedto form the large proportion of the drug-resistant population seenat T3 (Fig. 3). L90M was also detected on the same pro genebackground in the therapy-naïve environment at T1, and waslikely also clonally amplified to form the large proportion of theL90M sequences (Fig. 3 and Fig. S6D). In contrast, V82I andV82L were detected in the pretherapy time points on backgroundsequences that did not become the predominant sequence whenthese mutation modestly expanded at T3, although these twopopulations have complex mixtures of the 5′ polymorphisms,which may indicate low level persistence and recombinationduring the period of drug exposure. Finally, I84V and V82F werenot detected in either pretherapy population (Table S1).

DiscussionComplex viral populations can form within a host (45–47). Highthroughput sequencing technologies allow for extensive samplingof these populations (1–3, 5, 22, 48). However, these technologiesare severely limited when a PCR amplification precedes the se-quencing protocol, as each sequence read has the potential to bereported as an independent observation without properly con-trolling for PCR resampling, PCR-mediated recombination, al-lelic skewing, PCR-introduced misincorporations, and sequencingerrors. When working with pathogenic agents in clinical samples,

T1T2T3

V82

V82A

L90M

I84V

V82I/L/F

0.0030

TGG

TGTG

A_13_T3_V82A_G

ACC

CC

AT_15_T3_V82A_

AATGCCGC_10_T3_V82A_

ACGTTTAC_39_T2

TCGAAATA_21_T3_L90M_

CTCCCAGA_17

_T3_V82A

_

TGTC

TCTA

_35_

T2

CCCGATAT_29_T2

TGC

TAACA_15_T3_V82A_

TCTC

ACCC

_30_

T1

TCTGAAGC_29_T1

TCC

ATACA_15_T3_V82A_

GG

CTAA

GG

_17_

T1

CGACCGGC_7_T2

GGACATTT_12_T2

CACG

CCGT

_22_

T1

TTGG

TCCG

_42_

T1

CCGGTTCG_44_T1

TGAT

CCTA

_40_

T1

CAGCCCCG_19_T1

AACGTTCG_7_T2

TCGC

AGAT

_51_

T1

CTGT

CAAC

_23_

T2

TGCGTCTC_19_T3_V82A_

GAAATAGT_14_T3_L90M_

GG

ATTCAC_9_T3_V82A_

TCGCGCCG_13_T2

AAGGACTT_11_T3

AAGACCAG

_11_T3_V82A_

ACTATAAC_7_T3

TTCACAGT_

14_T3

_V82A

_

ACACCAAC_9_T3_

V82A_

AGCTTTAG_18_T2

CCGCGACG_51_T1

GAATAGTG_11_T3_V82A_

TCCGGTTC_28_T2

2T_0

1_GA

CCCC

CT

ACACGGCG_13_T3_V82A_

CTAT

CGCT

_3_T

3

TCCCGCGG_14_T2

AAATCAAG_14_T1

_A28V_3T_82_AATTG

GAC

CAAT

CCTC

_16_

T3

AAAC

GGAC_1

1_T3

_V82

A_

ATAGAAAC_15_T1

GACC

GCCG

_25_

T1

GC

ATGTAG

_15_T3_V82A_

TGCGCTTA_7_T3_I84V_

GTATAGCT_8_T3

CTGCCTGC_12_T2GCCCCCAC_16_

T2

CGACAGCC_41_T1

ATTGGGGC_11_T2

ATGCACAA_79_T1

GCA

TCG

GT_

22_T

2

AAAC

AATA

_4_T

3_V8

2A_

CTCCGCGA_15_T2

_A28

V_3T

_42_

TCT

TAAT

A

TTTGTAAG_42_T1

GGTCCCCC_7_T2

CCGAGAAT_24_T2

TCTAAG

AA_26_T3

CTCGCCCC_16_T2

CCGCCCGC_32_T2

CCCG

CTCC

_40_

T1

CCCTCCCA_27_T

1

ACTGCTCC_24_T2

GGAGTTTG_5_T3

CCATACCT_15_T3_I84V_

GAGCGTTA_29_T3

AAATGTAT_7_T3_I84V_

CCTT

CTAT

_45_

T1G

ACTTCCA_8_T3_V82A_

TAAG

ACAT

_12_

T3

GTCTCGCT_26_T2

GAT

GTA

GC_

6_T3

GCACCGTC_19_T2

ATAC

CAAT_

10_T

3_V8

2A_

AGGAT

AAG_6_

T1

GAACATAG_7_T3TTAACTCA_21_T2

ATTCTCCA_20_T3

ACTATGG

C_7_T2

CTTA

TCAT

_12_

T1

AATACCCA_8_T3GCCGACCT_3_T3

CCTG

CCGC

_14_

T2

GCCGCCCG_4_T2

AAATGGCC_3_T3

CC

GG

AAAT_13_T3_V82A_

CTCCCCGT_2

4_T1

ATAAGCGC_9_T3

TACG

AAAC_13_T3_V82A_ _A

28V_

3T_3

2_TT

CTT

CGA

CCCAGTCG_15_T2

CCTC

GCG

A_3_

T3_V

82A_

GGAGTCCT_8_T3

GCGCCCGC_54_T1GAGGTTTC_16_T3_V82A_

TATGGAAA_35_T1

GTAAATCC_3_T3_V

82A_

CATGTCCC_11_T2

TCG

CTTG

C_10

_T3

CAAGATTA_15_T3_V82A_

CTAACAGA_4_T3GCCCGCTA_12_T2

CGAACTAA_26_T1

GCCCCACC_28_T2

CTTG

GAC

C_33

_T2

AGGCCTTT_3_T1

AGCCTTGT_10_T3_V82A_

ACCT

CCCA

_14_

T2

CCCCCCCA_6_T2

CCCGATAA_10_T3_V82A_

GTA

CTG

AG_2

1_T2

CCCTAGCT_1

7_T1

GCGCGGAC_4_T3

TCCCATGT_11_T3_V82A_

GG

ATCTAT_3_T2

ACTGCACT_4_T3_V82A_

GCTTCGTC_38_T1

TTGCGGGT_32_T1

CGCACCTC_8_T2

_A28V_3T_72_CTATAATA

ATCTC

CG

T_13_T3_V82A_

GTAG

TTTG

_20_

T3

GCTG

GTCT_31_T1

ATTG

CTTT

_30_

T1

CCCGCATC_5_T2TCATCTTG_10_T3_V82A_

TTTGGTAA_9_T3_V82L

TTCTCAGC_18_T2

CGTAAGCA_24_T3_V82A_

AAAGTATA_16_T3_V82A_

GAG

AGCTA_9_T3_V82A_

ACTA

CTTT

_20_

T1

CG

ATTC

TC_3

4_T3

_V82

A_

CTAA

TTAA

_29_

T1

GGTCGGCC_8_T2

CTGTTCCT_6_T1

CGATAACC_45_T1

GCACAACG_12_T3

CTCGTAAA_15_T3_I84V_

AGAGTATG_4_T3_V82A_

ACCCTGCA_6_T2

AACG

GAAC

_14_T3_V82I

GATCGCCA_3_T1

CCCT

GCCC

_14_

T2

GTG

TGC

TA_13_T3_V82A_

CCAAAACC_15_T2

GAAAAAGA_8_T3

AGG

CCCT

C_4_

T2

GCGTAGCG_21_T2

GCCCACCG_17_T2

GAGTTAGA_14

_T3_V

82A_

GGTAGGGA_38_T1

ACCACGTG_23_T2

AAGCGCTG_12_T3_L90M_

TTCCGCCT_16_T2

CGACCTAT_23_T1

TGAATGGT_21_T3_V

82A_

GTTCCCTT_

5_T3_V

82A_

AATCG

CC

T_19_T3_V82A_

TTTAATTA_16_T3CGCT

GG

AA_1

4_T2

CAGCCTTC_3_T

2

CTTGTC

TT_5_

T1

TACTGTTA_10_T3_L90M_

AGAACAAT_9_T3_V82A_

GCGC

GCGC

_4_T

2

GACGTTTT_17_T1

TACC

TCAT

_8_T

2

TAATAGTA_5_T3_L90M_

AGCA

GCGG

_9_T

3

CCAACAAG_17_T2

GTGAGAGT_10_T3

GCTTGGCG_24_T1

GTAT

TGGC

_6_T

3_L9

0M_

TCGAGGGA_3_T2

CGAT

GGCC

_13_

T2

GACC

CACG

_5_T2

ACCTAGCT_36_T2

CCAGGGTG_18_T2

TAGCGGGA_5_T3_V82A_

CCCG

ACCA

_3_T

2

TACTTACC_4_T3_V82A_

CCCA

TTTC

_6_T

3_V8

2A_

TGGTCTCC_14_T2

GAGC

CACG

_25_

T2

GTGAAAGT_17_T3_L90M_ GTT

TTC

AA_1

5_T3

CTGGTGTC_4_T1

ATTTTGAC

_14_T3_V82A_

TTCCTCGA_21_T2

AATTGCTG_29_T2

TACC

GG

TT_2

5_T2

ACTCGACA_18_T3_V82L

CCCCATAT_7_T1

AGAC

CAAT

_6_T

3_V8

2A_

CTAGACAA_17_T3_V82A_

GCC

AGAT

A_28

_T1

AAAACACT_11_T3

AAGG

CAGC

_35_

T1

CAAC

CACT

_23_

T2

TACGCTCA_12_T3_V82A_

CCCCTTTC_14_T2

AATG

AAGT_

11_T

3_V8

2A_

GTCCAAGT_15_T3_V82A_

AAAAGAAA_4_T3_L90M_

CG

ATTTAG_23_T3

CGG

CGCA

T_17

_T2

CTTAAC

AA_16_T3_V82A_

CCAT

GAAT

_41_

T1

CGCC

CCTT

_3_T

2

TGC

GAATA_12_T3_V82A_

TGCG

GTCT

_22_

T3

TGAGGTTT_20_T3_V82A_

TCAG

ATAT

_41_

T3_V

82A_

TAG

CAAA

C_8_

T3

_A28V_3T_52_AC

GGATTA

TGG

TCCC

T_21

_T2

CGTTTA

GG_11_

T3_V

82A_

CGGACTTC_26_T2GCCCACCC_23_T2

CGGACAAA_3_T1

CCCAAGAG_5_T2TGTGGCCG_22_T2

GAATATCA_3_T3_L90M_

TCCA

CGGG

_4_T

1

TTAAACTG_18_T3_L90M_

CG

AATA

AC_1

4_T3

_V82

A_

TGTCCCTC_36_T2

AAAAACTA_4_T3_L90M_

AGGA

TACT

_3_T

1

TGTC

AGCT

_23_

T3

CACGTGTA_11_T3_L90M_

TGCCTCG

A_13_T1

CACCGCCA_7_T2

CCATAACC_16_T2GTCCACCC_12_T

2

AACAATGC_13_T3_V82A_

TTCGTTCT_18_T1

TCAAAAGG_20_T3

ACATTTCA_38_T1

CCAACGCC_14_T3

TATCTACT_41_T2

AATC

ACCC

_28_

T1

TATATTTG_10_T2

TGCCTA

TA_9

_T3_

V82A

_

CAGAGTTG_15_T3_V82A_

GTTCCACC_3_T3_V82A_

CGGCCGCC_9_T2

GCCCCGCC_11_T2

GGCC

CGTC

_3_T1

AGGGAAAC_17_T3TTGTGCGG_9_T3_I84V_

AAAGACGG_15_T3

CTACCATA_31_T1

CCAG

CCAG

_24_T

2

GACGGACA_7_T3_V82A_

GGGGTAGT_7_T3_V82A_

TACCACGC_11_T2

GCATCAAG_27

_T1

GGGGCCCC

_19_T1

GCCTCCG

C_17_T2

AGTAG

AGG

_18_T3_V82A_

TAGCGTGA_37_T1

TTGTGCAG_4_T3_V82A_

ACTACGAC_41_T1

TTCCTACG_42

_T1

ACGGCAAT_20_T1CCAC

CGTG

_38_

T1

GCAACACC_14_T1

CCCCCTCG_15_T2

TGTCAATA_33_T1

ACGC

CGAA

_7_T

2

ATAA

CAT

T_14

_T3

CTCCCGCG_3_T2

GG

TTGG

TG_19_T3_V82A_

TGCGGACC_36_T1

TCGTGTTA_25_T1

GG

AATA

GA_

15_T

3_V8

2A_

ATATTGCC_3_T3_V82A_

TAGAG

TCA_14_T3_V82A_

TTAG

CTCA

_4_T2

GAG

AAAT

A_6_

T3

TGTCCGAC_30_T2

GACC

CCAC

_25_

T2

TTGGCCCC_4_T2

GG

GAACCC_7_T2

TCCT

CCG

T_6_

T2

TACGAGTC_20_T2

CCCGTCAC_16_T2

GCCCGGGA_14_T2

GACCGCCC_13_T2

TTATACCA_13_T1

GTAT

AGGT

_30_

T3_L

90M

_

TTATATTA_10_T3_V82A_

ACGCACCG

_17_T3_V82A_

ATAA

TCCC_1

0_T3

_V82

A_

GTGGTCGC_13_T2

GCCCTAGT_6_T3_V82A_

GCTACCAG_40_T1

CCCT

AGCA

_22_

T3

CCCCGTCA_15_T2

CCCATATG_27_T1GAGCGGCG_20_T2TAACTCAC_17_T2GGAGGGAC_4_T

2

CCGCGCAG_11_T1

CCGACTGC_20_T3

CGGTG

CAC_

6_T1

CCCGGCGT_25_T2

ATG

GG

AGA_

7_T2

TTTAACGA_7_T3_V82A_

GGTCTCTC_12_T2

CTCC

CCTG

_16_

T1

CGTCG

TAT_47_T1

TCAGTTTT_7_T2

GGGA

AATT

_11_

T2

CTG

ACCT

A_5_

T2

GCTCGACC_22_T2GCACGATC_25_T2 TTTCTCTC_19_T2

CTAATACT_11_T3

TTTCTTCT_26_T2

CCCCCCCC_20_T2

ACAAATTA_6_T3_V82A_

GCGCGCCT_19_T

1

ACTGAATA_10_T3_V82A_

TGCGGCGC_16_T3_V82A_

CTAAATTC_36_T1

ATTC

TAG

G_1

4_T2

AGAG

AGCT

_36_

T1AG

GTC

CCT_

28_T

2

GGATGCCT_19_T3_L90M_

CCCCTTTT_20_T1

TAGTA

AAA_32_T1

GCGAAACT_10_T3

CATAGCGT_29_T3_L90M_

TAAACCAT_2

2_T1

AAGGTTAA_16_T3_I84V_ATTCCCTG_31_T2

TATCCTTT_25_T1

CACACACC_4_T2CCAGACGC_32_T2

GCGGAAAG_10_T3_I84V_

GC

ATCAC

A_14_T3_V82A_

GAGACGAG_12_T3

TATA

AGAA

_5_T

3_V8

2A_

TCAAGACA_17_T3

GCTCCCAC_12_T2

ACTC

ACTG

_18_T3_V82A_

GACGCCAT_16_T2

GCGA

CCCC

_15_

T3

CAGAATCC_6_T3_V82A_

CCCG

GCGA

_15_

T2

ACCTACCC_3_T3

ACAT

CC

AC_2

0_T3

_V82

A_TA

TATT

AA_7

_T3_

V82A

_

CGCCCCCT_19_T2

AAACTAG

T_15_T3_V82A_

GTTC

AGTG

_20_T3_V82A_

ATTCCCGG_16_T2

ACAGGGCT_13_T3_I84V_

AACG

GAT

G_5_

T3

TCAGGCCA_14_T3_L90M_

TGTC

TGAT_2

1_T3_V

82A_

GAACCG

GG

_19_T1

CTCATAGA_10_T3_V82A_

CGATTT

TG_7

_T3_

V82A

_

CGG

CGAG

A_9_T3_V82A_

GCGCCCCT_17_T

3_V82A

_

CTCCGCGC_13_T2

ACTC

GG

AC_1

4_T2

CCCCACCC_6_T2

GAAACCG

T_10_T3_V82A_

CGGCCCCG_11_T2

CCTCCACC_3_T2

CCGCCCCG_36_T2

AACGCACT_32_T2

TGTA

CCTC

_9_T

2

TGCGCGCG_12_T2

ATGCCTTT_3_T3

CAAGAC

AT_1

2_T3

_V82A

_

TCGTGCCA_36_T2

TTTCCGAT_9_T3_L90M_

GTTAATTA_9_T3_V82A_

CAC

ATACA_17_T3_V82A_

GGAACCCT_8_T2

AACCCCCT_9_T3_V82A_

CACTAGTG_9_T3_L90M_

ACAGCGCA_29_T1

TCCCCTAG_5_T2

GACTTAAG

_9_T3_V82A_

GAGCCCTT_

15_T1

GGTTCCCG_48_T1

CCCGCACT_11_T2

AGTC

GC

GC

_17_T3_V82I

3T_71_ATTAATCT

CTCT

CACC

_12_

T3

GG

AAAATA_5_T3_V82A_

CTGCT

AGC_

17_T1

AACGGGGA_1

7_T1

TACTTAAT_8_T2

TTGAC

TAA_

8_T3

_V82

A_

CCAGAGAC_28_T1

CAAAGTCT_7_T3_V82A_

CTTTCTGC_11_T2

ACCCGTCT_31_T2

CCAC

ACAC

_37_

T2G

CTAG

TGG

_10_

T2

AAGCCATA_22_T1

CATC

CACC

_18_

T1

TCTCTGAA_14_T3

GTG

CCG

TT_1

5_T2

TCTA

CGGG

_28_

T1

CCGA

CCCG

_9_T

3

CCTCTCCT_12_T2

AGCCTCTT_25_T2

CCATTCCC_5_T1

ATCT

CGCC

_21_

T2

CCAA

CGCC

_34_

T1

CC

CC

ATGA_16_T3_V82A_

GTTCGGAG_12_T2GTTTATAA_6_T3

GAC

GTC

TT_7

_T3_

V82A

_

AGCGTCAG_17_T2

GATCG

AGT_

5_T3

_V82

A_

CCCC

GAC

T_16

_T3

GCCCCAGA_16_T2

CGTACGGA_12_T3

_V82A_

TTCGCCGT_18_T1

ACCCCCAG_10_T1TACTAGCC_12_T3_V82A_

CGAC

CTTG

_28_

T2

AGGTA

AAA_

3_T3

_V82

A_

TACCACTC_5_T3

CTCCATGC_5_T3_V82A_

CGTGGCTA_4_T3_V82A_

CGTCGGAA_17_T3

TCG

CGG

GT_

5_T2

GTG

TTTAC_13_T3_V82A_

CTTGCCCA_10_T2

CACAATCA_24_T2

GAATAAC

A_15_T3_V82I

CTCGCTG

G_14_T3_V82A_

GG

CCTG

TC_9

_T2

ACTA

CAAC

_14_

T1

TAGAGCGT_31_T3

ACC

TACC

A_14_T3_V82A_

GTTG

AGAT_5_T3

TCCGGATG

_29_T3

_V82A

_

CGAA

CCAC

_11_

T2

GTT

TTTA

T_3_

T3_L

90M

_

ATTGGCCA_14_T3_V82L

TCGGTGCC_18_T2

TTTGTGAC_9_T2

CGAGTAGA_7_T3_L90M_

GATGGTA

A_21_T1

ACTCCCCG_18_T2TGAG

TTCG_12

_T1TCGCTGCG_3_T2

GGTA

CCTC

_7_T

2

GCCCACCT_11_T2

CCGCACTC_51_T1

CCCA

CTCC

_7_T

2

GCACCCCA_12_T3_V82A_

TTTC

GACC_11

_T3_V

82A_

GGACGTCG_31_T1

TCG

GTA

AC_3

_T3

CCACCCGT_47_T2

AGTTGTCG_4_T2

TCCTGACT_11_T3_V82A_

ACTT

GAT

A_11

_T3_

V82A

_

GTATTG

CG_25_T1

AACCCAGC_3_T1

CAAA

CCCT

_16_

T2

CTTTTGCA_10_T3

TTAC

CGAC

_7_T

2CG

CATC

AT_7

_T2

ACTGTCAA_4_T3_V82A_

TCAATGAT_18_T3

GCCGCCCC_6_T2

CGTTTTCT_4_T3_V82A_

CCTGAATA_25_T3

CCCAGCTG_8_T2

TCCAGGCG_4_T3_V82A_

CCGC

TGCG

_20_

T2

GATGCGTG_7_T2

ATC

TCTT

A_11

_T2

CCCAGTAG_3_T2

AAGCTCCT_15_T3_I84V_

TGTCGGGC_16_T2

GCCG

TGCC_16_T3_V82A_

TGCGCGCT_6_T2

TCG

GAA

CC_2

9_T2

TAGCGATG_3_T3_V82A_

CCTCGGTC_15_T2

CCCC

ATAC

_3_T

3_V8

2A_

TCGCC

GTG_7

_T3_

V82A

_

TAGG

CAAG

_25_

T2

TACAATCT_7_T3_V82A_

CTTT

GCGC

_6_T

1

CTACCCCC_30_T2

CCCCCCAC_26

_T1

CATC

CC

AT_20_T3_V82A_TTTTC

AAA_19_T3_V82A_

GCTAGGGT_23_T2

ACTTCCAG_9_T2

CCAATCCG_5_T2

GGGCCCAT_12

_T3_V

82A_

GG

GATC

AT_15_T3_V82A_

AAGAAGAA_26_T3

CGAGCTTA_7_T3_L90M_

TGCG

AAAG

_10_

T2

CACT

GACA

_3_T

3

CCTG

GTCT

_18_

T1

CCTG

CCAA

_9_T

2

TGCGCCCC_18_T2

TGATTAGC_21_T2

CCCCCGCC_18_T2

TGGGGCGG_17_T2

CTAT

CTCG

_34_

T1

CTCCTGTG_21_T3

AGAT

GCAC_8

_T3_

V82A

_

CACAAGAC_3_T3_V82A_

GGTCACTC_16_T2

GC

CATG

TA_13_T3_V82A_

GGAGACAA_3_T3_I84V_

CATT

AAGC_

6_T3

_V82

A_

ACTA

CATC

_14_

T3

AATGCAG

C_6_T3

GAGCCTGA_6_T3_L90M_

_A28V_3T_7_AG

CAGATA

GAGC

TTTG

_40_

T1

TTGGACCG_13_T3_V82A_

TACT

TACT

_18_

T2

CTCCCATA_35_T1

CCCACCGA_33_T

1

TGAG

AAAA

_9_T

2

CACG

CCCG

_3_T

1

AACAACTG_9_T3_V82A_

AGCC

ACCT

_28_

T2

GCTC

GGTC

_7_T

3

CTGG

ACGT_7_T3

CGG

AACC

C_20

_T2

GTC

CCCT

T_6_

T2

CCTC

GGGT

_11_

T2

GTCC

CCGG

_8_T

2

GCTCACCG_13_T2

CGG

ATAAC_12_T3_V82A_

GCCTCCGT_6_T3_V82A_ACTGAACC_24_T1

TGAATTAC_4_T3_V82A_

TTCTCCAA_4_T3_V82A_

GACG

GCCT

_20_

T2

CATAT

CAA_9

_T3_

V82A

_

TCTTGACC_8_T

2

AATACTCT_5_T3_V82A_

CTCCCCAA_6_T1

TAGTAACC_15_T3

CGGATAGC_8_T2GGGCAGGG_14_T2GATGTTAC_3_T3

ATGTACTG_14_T3

GGGACCCT_15_T2

GCAGCCAC_29_T2

CAGCCGTG_9_T2

AATGTTG

T_7_T3_V82A_

_A28V_3T_71_TAC

CAAC

G

ACCCACTC_21_T3_L90M_

TTGCAGAG_3_T3_V82A_

CATT

CGGC

_9_T

2

AAACTAGT_44_T1

TACCAC

GA_8_

T3_V

82A_

GACCACAC_3_T2

GCCCGGCC_25_T2

CGTCTAGT_55_T1

GCCTGCCA_4_T2

AGCC

TCCG

_18_

T2

CCCGCCTG_51_T1ACACGGCC_18_T3_I84V_CTGTTGCT_7_T1

CTCT

TTAA

_20_

T3

GCAAGCCA_3_T2

CCCATCTT_40_T1

GTATTGTG_17_T1

TCACCCCT_18_T2

GACT

GTAA

_35_

T1

GTTTATTT_17_T3_V82A_

GGAG

CCCC

_17_

T2

ACCAA

AAT_

13_T

1

GCCTTCAG_6_T2

CGCTCCCG_5_T2

TCTC

TTCC

_8_T

2

TTGT

CGTT

_6_T

1

AAAAGAAT_7_T3_V82A_

GTTTACGT_29_T2

TTGAGCCC_11

_T3_V82A

_

CGGGAGCC_17_T3

GCAC

TCGA

_35_

T2

TGAGTCAA_39_T3_V82A_

CGAT

GATT

_27_

T3

TCCGCGGG_19_T2

TATGG

ACC_8_T3_V82A_

GC

GC

TAGA_15_T3_V82A_

CTCC

TCAC

_22_

T2

CCGCACCC_11_T2

ATCTTAAT_23_T3_L90M_

ACGAGGGA_3_T2

CGAAATCA_9_T3_V82A_

GTAAGCGA_22_T3_I84V_

AACA

TTCT

_5_T

3

TAGAGGTA_20_T3

ACTTTTAT_25_T1

TCCC

TCTC

_7_T

3_V8

2L

CGTCCGGT_27_T

1

TTCA

CCGA

_13_

T3_L

90M

_

GACTCCCA_2

8_T1

AAAT

ACC

T_9_

T3

GCTG

GACT_5_T3_V82A_

CCACTGGC_7_T2

ACCCCGCC_15_T3_V82I

GTA

GG

TTC_

17_T

3

CCG

GAC

TC_1

8_T2

TTAA

AAGC_10

_T1

CCCCACAC_12_T3_L90M_

CGATGGCA_5_T1

TCTCGCGC_17_T2

AGACTCTA_41_T1

TTTTCCGT_17_T2

CCGG

TGCA_5_T1

CTTGCGAG_4_T1

ACACCATC_11_T3_L90M_

TCAC

CAG

A_12

_T3

CCTATTCC_17_T3

ACTA

TTGC

_42_

T1

AGCG

GCAC

_11_

T2

GACC

CCCG

_7_T

1

GTAGTAAC_29_T1

AAAG

GCGG

_26_

T2

CCGCCCAA_14_T3_V82A_

TAGACCCC_14_T3_L90M_

TGCTG

AAG_19_T3

_V82A

_

CG

TTCTC

A_13_T3_V82A_

TCTA

TAAT

_18_

T3

TTCC

TTCT

_6_T

1

CAGCACCT_11_T2

GTAAGTGG_16_T3

ATATTTAC_10_T1TC

ATTT

CC_1

9_T2

GAGGAAAA_23_T2

CAT

AGAC

C_1

4_T3

_V82

A_

CCGCGGGA_5_T2

CACTGCAC_26_T1

CTCGCATC_13_T2

CCGTGGCC_18_T2

CTACCCGC_15_T2

AAAA

TTGT

_3_T

3_V8

2A_

ACCAG

AAC_13

_T3_V

82A_

ATCATGCC_9_T3

CCCG

CCCC

_25_

T1

GGGA

CGGC

_12_

T3

GTTTAGGA_22_T2

CATCTCAA_43_T1CTCTAAGA_18_T3_V82A_

GAATATTT_35_T1

TAGTC

GTA_3_T3_V82I

CAGGGCGG_24_T2

CAAATTTG_18_T2

TCCCCGGT_14_T2

GAC

TTCC

T_14

_T3

TCCCGAAA_3_T1

TGTTTATC_9_T3_L90M_

CCCC

CTAG

_3_T

2

CTGGTGTT_38_T1

GCCCTCCC_24_T1

AATT

TTAT

_5_T

3

ACATTTCA_6_T2

TATCCGCC_16_T3_L90M_

CGGGCGTT_22_T2

TTTT

GTC

C_13

_T2

TAATG

TTC_1

2_T3

_V82

A_

GAGCACCC_12_T3_V82A_

TGGC

CTAG

_23_

T1

CACGCCGA_38_T2

ACCAATGC_6_T3_L90M_

AAAGCACG

_8_T3_V82A_

GGTTGGTT_18_T2

GCGTGAC

G_22_T3

_V82A

_

GCCGGGCT_4_T2

AAGTCG

GA_8_T3_V82A_

GATATGCG_19_T1

GATATCCT_5_T1

TCTGTAAC_10_T2

TCTGAGCT_33_T1

ATAAGCAT_3_T3_V82A_

AGTGATA

G_21_T3

_V82A_

CGACAGCC_8_T2

AGTC

AAG

C_7_

T3_V

82A_

_A28V_3T_52_AAACA

GGA

AGCA

CATC

_34_

T3

GCAG

GG

AC_23_T1

AGCATCAA_8_T3_V82A_

CCTGGGGC_7_T2

AGTACATC_3_T3_L90M_

CGAGGTTA_8_T3_V82A_

CCGGCTTC_13_T2

GGTCGGGA_10_T2

GCG

TCG

CC_3

1_T2

TCGGCCAC_5_T2

TAAAGTAT_15_T3_V82A_

CCCCGCCG_19_T2

GCGCCCGC_8_T2

AACTCTCC_39_T1

GGGTAGAA_3_T3_V82A_

GCCCAG

AA_27_T1

TATC

ACCT

_32_

T1

ACCTGCGG_32_T2

CTGCGTGG_10_T3_L90M_

_A28V_3T_92_AATAG

CC

G

GACATCGT_12_T1

CGTATGCC_18_T3_L90M_

AGGATCCG_18_T2

ATAAACGT_20_T3

AACACACG_9_T3_L90M_

CTGACCGC_30_T2

GGCCCCTT_25_T1

CCGG

TAAT_9_T3_V82A_

ACCACCGG_14_T2

TAGCATTC_3_T2

CTTAGTGC_9_T1

CCTACCTC_5_T3_V82A_

CGGCGAAT_6_T2

AGATTGTT_12_T3_I84V_

AATT

CATG

_19_

T3

GGGACAGC_23_T2

CGCC

GACC

_7_T

3

ACTA

GGGA_12_

T3_V

82A_

CTAAATCA_18_T3_L90M_

CC

TGTTAC

_14_T3_V82A_

TGCCCGTA_10_T3

GCC

TTTG

C_15

_T2

ACACTGCC_10_T3_V82A_

CAGTAAAA_16_T3_V82A_

ATTATCCC_3_T3

TGGACAAA_35_T3

GGGT

GAAC

_35_

T1

TATTAAGC

_17_T3_V82A_

CTCACACA_6_T2

GACAACCC_17_T3_L90M_

GC

GTTG

GT_21_T3_V82A_

AACCAGTT_13_T3_V82A_

GTC

ATTG

G_9

_T3_

V82A

_

GACGCGCC_9_T2

TGCC

GGGA

_15_

T1

GACGCAAC_70_T1

AGTCCGAA_12_T3_L90M_

TTAATATT_5_T3

ACCGGAGG_17_T2

ATGATGAG_31_T1

CCCT

TCCC

_11_

T2

CGAG

GAAA_10_T3_V82A_

3T_81_CACAAGAC

TTTCGCCG_3_T2

TATGG

ACG

_21_T3_V82A_

CTAAAGCC_17_T3_V82A_

TCCC

ACTG

_13_

T2

CGTGCGTT_37_T2

AGTTTTG

T_12_T3_V82A_

TCCCGCCC_7_T2

TGAT

CATA

_3_T

3_V8

2A_

GCACCTAA_23_T1

TTAG

TAAC

_22_

T1

GGGATCGT_9_T2

TTATG

ACC_32_T1

GCCCCCTC_9_T2

TGAAGCCT_14_T3_L90M_

ACTG

CGTC

_21_

T2

CCCC

CCGT

_6_T

2

GCCTTTCG_11_T2

AGG

ACAC

C_5_

T3

TCTA

CAAG

_16_

T3_L

90M

_

TTGC

CAC

A_19_T3_V82A_

TCCACTGA_3_T1

3T_4

_TAC

TCCC

T

CGAACCAC_8_T3_V82A_

ATTG

AAAG

_3_T

3

TCCATCGT_8_T3_I84V_

ACCACCCG_11_T2

GCGCGCCC_3_T2

CTCC

CCAG

_24_

T2

GCAGTACT_5_T3

GGTCATTC_19_T2

TCAATAAC_13_T1

CCGGATAC_19_T3_L90M_

CTTT

TAGG_1

0_T1

GGCCAGCT_9_T2

AATTGC

GA_15_T3_V82A_

TAAACTTC

_15_T3_V82A_

ACATTCTT

_19_T

3_V82A

_

CCCGTCCC_13_T2

TCGTATCA_31_T1

AATCCGTT_13_T3_L90M_

GTGGCGCT_13_T2

ATCTATAC_17_T3_L90M_

CGGCGCGT_11_T2

TTCCCATG_23_T2

AAAGAATG_12_T3_L90M_

GGCCAGCC_4_T2

TGG

CCCGT_12_T3_V82A_

TTATGTCA_5_T3_V82A_AATGG

GG

C_8_T3_V82I

CGCG

ACTC_13_T3_V82A_TCCCACTA_5_T3_L90M

_

_A28

V_3T

_32_

TG

CAT

GG

G

CCCTGCTT_3_T2

CGAGACAT_7_T1

TAGGCGGA_18_T2

GATCTACT_34_T1

AATACGAC_7_T3_V82A_

ACGGTG

GC_17_T1

ATTGTGTG_11_T2

ACACGGTG_8_T3_V82A_

CTCCACCA_13_T2

GTCCGTCC_4_T2

GCCACCGG_4_T3_V82A_

CGTATCGG_7_T2

GGAGGGGA_8_T2

2T_3_CACTCCCC

TTTCCTGA_13_T2

CGACCAGC_3_T2

AAGGCGTA_3_T3

ACTCACTT_23_T3

TCGCACCG_5_T3_V82A_

CCTG

CGGC

_3_T

1 TCCC

AAAC

_31_

T1

CGTTGACG_43_T1

GCGCAACG_30_T1

CGTA

CCG

T_15

_T3

AAACACTA_24_T3_V82A_

CGTCACTC_23_T1

ACCC

GCCC

_11_

T2

TCCAAGAT_6_T3_V82A_

GGGT

CCGC

_18_

T2

CCAAGCAC_10_T3_V82A_

GCCT

CCCG

_24_

T2

GTCTGCCC_33_T2

TGGT

ATGA

_17_

T3

GCTATGAA_4_T2

GGCCGCAA_9_T2

TACTTAAT_13_T3_I84V_

GATCTG

AA_3_T3_

V82A_

ATCCTAAA_11_T3

TGAATTCT_22_T3_L90M_

CATGCATT_4_T3

CCCTCTGA_29_T1

GCCGGAGA_8_T3_V82A_

CACA

ATAA

_16_

T3

ATTA

TTGG_1

0_T3

_V82

A_

ACAA

GCA

A_25

_T3

ACCCACAG_3_T1

ATTACTTA_5_T1

TACC

TACT

_22_

T2

CAAT

TCAA

_32_

T1

TGAGGGAC_10_T2

ACGTCCAC_17_T2

CCGTTGGT_3_T3_I84V_

TTACCAGC_35_T1

TTG

ACAA

T_26

_T3_

V82A

_

AAGG

TCCC_10_T3_V82A_

CTCTGCTC_24_T2

CGAG

TCAT_8_T3_V82A_

ACCT

TTAC

_4_T

3

CCAGGTAC_39_T1

AGAATTG

T_8_T3_V82A_

AAGAACG

A_19_T1

TTTATATA_11_T2

CCCCGCCA_4_T2

CCCTGTCA_22_T2

TGCCCGAA_3_T2

TACGGCCC_5_T1

ATTG

GC

AC_1

5_T3

_V82

A_

ATGAACCA_19_T3_I84V_

TACCCGGC_9_T3_

V82A_

TGTCACCG_9_T2

AGGA

AGAC

_8_T

3

CCACCCGT_8_T3_I84V_

ATACTGGT_9_T3

TGTTCGTT_15_T2

GTTTCGGA_16_T3_I84V_

_A28V_3T_14_GTA

CGT

GA

CGTGCACG_17_T2

GAAAACCC_21_T3_V82A_

CTTATCTA_18_T3_L90M_

ACCAT

AGC_11

_T1

CCACCGTC_48_T1

TCCC

TCTG

_5_T1

ACTTATCC_15_T2

TGGTTCCA_8_T2

CGACACGG_15_T2

AACT

TTAC

_5_T

3

CCTTTAGG_13

_T3_V82A

_

CCGCCCCC_9_T1

AAGCGCGT_16_T3_L90M_

CCGATCTA_36_T1

GAATAGCT_5_T3_V82A_

AACCCCCG_26

_T1

TTGATTCA_6_T3_V82A_

GATACAAA_6_T3_I84V_

2T_3

1_GT

GTCC

CC

ATCAAAAT_7_T3_V

82A_

AATCCTTG_11_T3_L90M_

CCCG

GG

TA_5

_T3

TTTTGATC_17_T1

GAA

AGC

TT_1

7_T2

TTCCGATT_3_T2

GCCACAGC_7_T1

GATGCACG_14_T2 GCACCACT_61_T2

GGAAAACC_24_T1

CTTTCCTT_8_T1

ATGTTTAC_16_T1

GAAGTT

TC_9

_T3_

V82A

_

_A28

V_3T

_42_

GAA

GG

CTT

ACCCAAAC_22_T1

CTTAACGC_22_T2

CAGCTATA

_17_T

3_V82A

_

GATCATGC_14_T3TTCAAGGC_15_T2

ACCACGCG_28_T2

ACACGCCT_12_T3_I84V_

TTTGCCGT_35_T1

GACC

TACG

_3_T

3

TTGC

GCCA

_3_T

2

TTGTC

GAG_8

_T3_

V82A

_

TGGCACGA_12_T2

CCCCCAAC_3_T2

CCTC

ACCA

_3_T1

ATAAGG

AG_13_T3_V82A_

TCGTTATA_5_T2

CGCTCCAT

_14_T1

GCCTCTTC_28_T2

GCTCGAAA_6_T3_L90M_

CGAC

ACAA

_7_T

3_V8

2A_

ACCTGTAC_32_T3_V82F

CTTC

AAG

T_17

_T3

TTCG

GGGC

_11_T

2

TCCCGGTT_5_T2

GAAACTAT_11_T3_I84V_

ACCTTCAA_9_T3

CACAGGTG_11_T2

AGCAAATA_31_T1

GCGTGTGC_15_T2

TAGTTCAT_3_T3

TAAA

CCG

A_15

_T3

ATAT

CC

AA_9

_T3_

V82I

CACAAGGT_16_T1

TCTGCAAT_16_T3_L90M_

GCCACGAA_9_T1

CTGCCAAG_24_T2

GGGGGTGC_8_T1

CCCAAAT

T_17_T

1

TCGG

CCCA

_21_

T1

GG

AATTTG_13_T1

CGGGCCGA_8_T2

GCCT

CATT

_14_

T3_V

82L

ATAGTC

AA_3_T3_V82A_

TATACAG

G_13_T3_V82A_

TCTGCATG

_17_T2

GGAATGTT_7_T2

GTAGCCAG_14_T2

GCCCGCTT_27_T1

AAAAAATC_5_T3_V82A_

TTAACTCT_13_T3_I84V_

CACTAAGT_2

3_T1

ATTCTGCA_31_T1

ACGC

AAAT

_18_

T1

CCTAGAAC_9_T3_I84V_

TAACGCAT_16_T3

CGTAAAAA_6_T3_V82A_

GCGTT

AAT_

5_T3

_V82

A_

GTATTACG_10_T3_L90M_

ACCACATA_37_T1

ACCCAGCT_7_T2

TCGATGTT_21_T2CGACCCGC_13_T2TTCCGCGA_21_T2

TAG

AATG

G_1

6_T3

_V82

A_

GCGGGTCT_4_T2

GG

AAAG

GA_

5_T3

_V82

A_CC

ACTC

GA_

6_T2

GG

ATCA

TA_5

_T3

CGACTAAT_19_T3_I84V_

TCCCCATA_19_T3

GTTAG

CC

T_16_T3_V82A_

ACGGGCCC_11_T2

TTAT

ATC

A_18

_T3

GGTGCCCC_17_T2

AAC

CAT

CA_

3_T3

_V82

A_

ATCTTTAT_3_T3_V82A_

GTGTGAGA_24_T1

TGCTGGTC_28_T2

ATG

CAAT

C_20

_T3

ACGG

CGCC

_17_

T2

AGCGTCTC_32_T2

TTCCCTAT_9_T2

CCCC

AACC

_26_

T3

TGGAGATT_16_T3_L90M_GCAAGTCT_6_T3_V82A_

TGCGCCGA_5_T2

AATTATAT_8_T3_V82A_

CCCGGCCC_11_T1

TGATG

CCC_19_T2

ACTC

TGG

T_7_

T3

TCTC

ACAG

_29_

T1

2T_7

_ATT

CCG

GA

AGCAGTAC_40_T1

GCTC

GGAG

_19_

T1

TAATGTAC

_14_T3_V82A_

GCCG

TGTA

_38_

T2

CACAACCG_35_T2

GCGATGTT_6_T3_V82A_

GCAGGAGC_10_T3_L90M_

GCCC

ATCT

_9_T1

GATATGGG_7_T3_I84V_

ATACTTTG_3_T3_V82A_

GCC

ATCC

C_21

_T3

AGCCGCAA_4_T2

TAGCCGGC_17_T2

CCCA

AGAT

_3_T1

CCCG

CGG

A_10

_T3

TACACATG_16_T2

CCGCGTCG_5_T2

CCGCACGC_12_T2

GTCG

CTAC

_38_

T2

CTCCGGAC_5_T2

GTACATTG_4_T3_V82A_

TTAG

TGAG

_22_

T2

CTCCCTCG_5_T2

AACTCGTC_31_T2

CCACACGA_8_T3_V82A_

ACCC

CCCC

_12_

T1

CCACTTAC_20_T2

TTCGGTAG_9_T3_V

82A_

ACCGTGAA_9_T3_L90M_

TGGGTTGT_9_T3_I84V_

GTA

GAC

AA_3

9_T3

_V82

A_

TAAGACAA_26_T1

CGGC

ACCC

_16_T

2

ACGCGTCT_5_T2

GCCCCGCT_3_T2

CTCAATAT_15_T3_L90M_

CG

TAGC

AA_17_T3_V82A_

AGCACTAC_30_T3

CGGATGTA_4_T2

TGAC

TGAT

_17_

T3_V

82A_

GGCCCAAC_9_

T3_V

82A_

TCAAGCTT_27_T3

TCCGGACC_15_T2

AGTCCAAA_20

_T3_V

82A_

TTTC

TCCA

_21_

T1

ACTCCATG_39_T1

AGCTTCTT_27_T1

ATCACCGA_4_T3_I84V_

TCACCCAT_40_T1

CGTATCCA_8_T3_V82A_

GG

GACATG

_9_T3_V82A_

CGGATGTT_22_T3_I84V_

ACTG

ACAT

_41_

T1

TTGCAG

TT_5_

T1

GAGCCCCC_15_T1TCGC

TCAC

_5_T2

AACACAAC_4_T3_V82A_

TGCC

TTCG

_26_

T2

GAGAG

AGT_

15_T3

_V82A

_

ACCCCTTC

_26_T1

CCCCACCC_5_T3

AAGGGGAT_17_T3

GAACTAG

C_10_T3_V82A_

TAG

AAAA

G_7

_T3

GCCATCCC_10_T2

ACTTCCCC_16_T2

GG

GCC

CTC_

19_T

2

CCCCCCAC_4_T2_V82A_

CTG

ATGAC

_17_T3_V82A_

TCAA

CTTT

_29_

T1

GTTCTTTA_42_T1

ATATTG

GT_7_T3

_V82A_

GTCACCCG_17_T2

CGGGTTCC_7_T2

CAATCCAC_40_T2CCTC

CCTA_24_

T1

CTGC

CCCT

_25_

T1

AGCT

ACTA

_7_T

3

AGAC

CCAA

_9_T

3

CTG

GCC

GC_

19_T

2

ATTTATCA_13_T3_V82A_

CCTCGTGA_5_T3

CGTC

ACCG

_17_

T1

TGC

GC

CTC

_21_

T3_V

82A_CC

TCAC

CG_2

0_T2

CAAG

CAG

C_7_T3_V82I

CCCCTCTC_13_T2

T2

TTACCAGC_10_T3_V82A_

TAAA

GGTC

_9_T

2

GGGG

GCCC

_3_T

1

ATCAC

ATG_3_T3_V82I

GCCCTGCA_19_T2

GTATT

TAT_

10_T

1

CGCTACAC_12_T1

ACTG

GCG

G_9

_T2

CGAC

CGCC

_19_

T2

TTCACCTC_11_T3_V82A_

ACGCGGCG_4_T3_V82A_

GCTA

CCCG

_20_

T2

AACCGTT

G_13_T

3_V82A

_

CCCACGTT_21_T3_V82A_

AGTGACCA_11_T3_V82A_

ATAA

CCAA

_5_T

2

AGTA

GTC

C_10

_T3

CCCG

ACAA

_29_

T2

TGG

CAG

CT_8

_T2 CCGCGCCG_17_T2

GG

ACG

GAA

_6_T

3

GCGG

GCTC

_5_T

2

TGGCCGTC_4_T2

ATCAAATA_5_T1

ACTCCGTT_14_T3_V82A_

AGAG

GG

CT_21_T3_V82A_

TACGGGAT_21_T2

GTGCACGA_17_T3

TCACCGCC_31_T1

AATTAGAC

_11_T1

TTTA

TATC

_11_

T2

TGGT

CCGG

_16_

T2

TGTCGCCT_11_T1ACTGTCGA_11_

T2

ACCTATCT_36_T1

CC

AATG

AT_1

7_T3

_V82

A_

CCCATATG_14_T3_V82A_

TTCTA

GAG_3_T3_

V82A_

GATAAAAG_9_T2

ACTCGGCT_27_T2

TTTT

TCTC

_12_

T3_L

90M

_

GTCGCCGA_7_T3_L90M_

TGTCGCGC_13_T2

AGCGACCC_28_T2TGGGGTGC_16_T2TGCCACCC_7_T2

CTCC

CCCC

_7_T

2

ACACTGAT_1

0_T3_V

82A_

CTTAAAAC_3_T1

CCAG

CCCT

_13_

T1

GACCCACC_6_T3

GTCA

GACC

_13_

T3ATTGACGT_3

0_T1

CTGCCGCG_19_T2

TCC

ATGC

A_17_T3_V82A_

CGG

TAG

AA_1

4_T2

GACTCGTC

_5_T3_

V82A_

AGGAGAAC_17_T3

ACAT

GGCG

_37_

T2

GGTCTCAG_27_T1

ACACCAAT_5_T3_L90M_

TCACTTAA_12_T2

GCGCA

CAC_

18_T

3_V8

2A_

AGCTGCAT_38_T

1

CATGATTA_24_T3

CATG

CCCG

_43_

T1

ACTCCCCG_4_T3_V82A_

TTCGGTGG_19_T2

TTAT

AGAT

_14_

T2

TGGGGCCC_3_T2

TAACTGCA_8_T3_L90M_

GCGAGTGG_12_T2ACTGCATG_16_T2

ACTA

TGG

G_1

2_T2

TGCGGGAG_27

_T3_V82A

_

TTGGCTCT_31_T1

TAACCGCT_23_T3_V82A_

ACCGTGTC_24_T1

GCCACGGC_3_T2

TTAC

TGTT

_10_

T3

ACTCGGAC_13_T3

_V82A_

GCCAGACT_26_T1

TTCAGGAT_29_T1

TGAC

AGGC

_9_T

1

TAATTAGA_6_T3

GAGTGGGG_30_T1

TAAGCCCC_8_T3_V82A_

ATGCCAGT_16_T3_L90M_

GCCACCGG_25_T2

TTCTTCGC_5_T2

CCCCCGTG

_17_T2

ATCAT

CCC_8_T

3_V8

2A_

GTCACCGC_8_T3

ACTC

TGAA

_13_

T3

AACCGCCC_17_T2

CC

TGTTG

G_19_T3_V82A_

GG

CCTGTT_8_T1

AACA

TGTA

_17_

T1

CCCC

TCCA

_22_

T1CC

CTCA

CA_1

7_T2

ATTACCCG_14_T3

ACGAGCCA_31_T1

ATTGTGGG_10_T2

TACTTAGT_14_T3_I84V_

CGCC

CACC

_29_

T1

CGGCGGTG_21_T3

CTCTTTCT_22_T1

CCTGTGGC_7_T1

CCCA

CAGC

_17_

T3

AAACGCCG_7_T1

GACGGGGG_7_T2

ACTACCCC_12_T2

TGACCAGG_26_T1

CCACTTCT_9_

T2

TTC

GAA

GT_

10_T

3_V8

2A_

TTGTC

AAT_35_T

1

AACTGCTG

_11_T2

TCAT

TAGG

_7_T

2

TGAGACTT_8_T3CCAAAACT_5_T3

TGTTTACA_3_T3_V82A_

TCTT

ACCT

_28_

T2

AAGCGGCG_23

_T1

CTGC

TCGT

_22_

T2

TCTCCCCC_9_T2

CCAACTTT_17_T3_V82A_

CCGAATAA_25

_T3_V

82A_

ACGGGTTA_6_T3_V82A_

AGAA

ACG

C_15

_T2

GCCCCCTG_20_T2

GGTC

CGTA

_15_

T3_L

90M

_

GGAGAAGC_5_T3_V82A_

CCACCTCC_96_T1

CGCCGCGT_6_T2

TCTG

GAG

T_11

_T3_

V82I

TCGTATAC_14_T3

TACC

ATTG

_5_T

3

GTGGAGTG_7_T3

AGCC

CCGA

_27_

T2

TCC

GC

AAA_14_T3_V82A_

TCTG

AGTG

_29_

T3_L

90M

_

AATAGTGG_16_T3_V82A_

TTCCACTA_6_T2GCTATGCG_11_T2

AAAA

CAGT_

14_T1

GACACTGC_14_T2

TTCATG

GT_5_T3

_V82A_

AGCG

CATT

_15_

T1

CCCCCCGA_14_T3_V82A_

TAGCC

AGT_

11_T

2

CCCACCTA_36_T1

GAC

ACTC

T_18

_T2

CGTAAACA_8_T3_V82A_

_A28

V_3T

_22_

TACA

CG

CA

CTCTGCCC_25_T2

ATCTTGGT_16_T2

GCCGCCGG_24_T2

TCGGCTCT_15_T2

GTAGGCAG_17_T1

GAGTATAA_3_T3_V82A_

TTAAATGC

_12_T3_V82A_

ACCCGCAC_11_T2

TGAGCCAC_10_T3_V82L

AAGCACCG_9_T2

GCAATCAA_16_T3_L90M_

CCTG

CATT

_14_

T1

AAATGTTA_14_T3_V82A_

CCTC

TTG

A_15

_T3

CTCC

AACG

_33_

T1

TTGCCTTA

_3_T2

CCGCTGAC_14_T2

CCTG

ATTC

_12_

T2

CCAGCACC_15_T3_V82A_

AAACCAAA_13_T3_V82A_

CCATCGCC_36_T1

CGAT

TCTT

_47_

T1CA

GCGG

GT_7

_T2

TTCGAGTA_18_T3_V82A_

ATGACTTT_7_T2

CC

ATAATA_18_T3_V82A_

GGCTCCGA_9_T3

TAAT

TAG

G_7

_T3

TCAGCGAA_13_T3_V82A_

TGTC

TTTA

_19_

T2

AGCCCTAA_21_T3

CAAAA

TAT_

5_T3

_V82

A_

ACGT

GTAT

_3_T

3

CGTT

TCGA

_31_

T1

CCGA

TTCC

_21_

T2

ATTGAGCA_12_T3

AGTCTACC_18_T3_V82A_

CCCCATAA_32_T1

GATGCCTG_16_T1

GCACCCCG_8_T2

3T_0

2_CT

ATC

GTA

CCCGGATA_3_T3_V82A_

GGATCCCT_10_T3

TCAC

CCCG

_5_T

1

GGCCTATA_8_T2

CAAACATC_6_T3_V82A_

TACGAAAC_12_T2

CCGCCGCT_17_T2

CTCTTGAG_37

_T1

CATC

AACT

_6_T

2

TTCCACTC_15_T3

CGTGTTGA_11_T3_V82A_

TCCCGTCG_29_T2

CTCTGACG_3_T3_V82A_

TTGTTAGG_13_T3

GATG

GAAA_11_T3_V82A_

CGGCATAA_34_T1TAAAAGAA_16_

T3

ACCCGATC_10_T2

TATTGGAA_23_T2

CTTGGGAC_21_T2

CGCC

CTGA

_8_T

2

CACAGAGC_9_T2

CATTCTAC_19_T3

ACACGAAC_20_T3

_V82A

_

TCCATCCT_41_T

1

GCGTTCCC_17_T2

ATGACGGG_14_T2

ACACCGCG

_11_T2

TTTA

GG

TG_2

0_T3

_V82

A_

CCTCTGCA_29_T2

GAA

CGAG

C_17

_T3

ATATCATC

_18_T3_V82A_

CGCT

TCTA

_34_

T1

TGATCAAA_19_T3_L90M_

ATCG

GGGT_7_T1

ACGTTTGT_41_T1

GAGAGAGT_29_T

1

TTATCGCG

_3_T2

CATTTG

GT_14_

T3_V

82A_

CCGTTCCC_15_T1

TAC

GAG

TA_1

7_T2

GGGC

CCTT

_6_T

2

GTTACCTC

_29_T1

TATGCGCT_17_T2

TCCGTTCG

_11_T3_V82A_

CCGTCCCC_7_T2

ACAA

ATCA_

7_T3

_V82

A_

GGCG

GAAG

_23_

T1

TGTACCGG_7_T2GCCT

CCCC

_17_

T2

GGACTGCG_35_T1

AAGCACCA_12_T3_V82L

TCCCCGTT_18_T2

AGGCT

AAG_

14_T

3_V8

2A_

TTGG

CCAA

_16_

T3

ACGGATCT_15_T

3_V82A

_

CGCT

ACCT

_24_

T1

TCGTAACT_9_T2

AATTACTT_4_T3

ATCGATG

G_14_T1

GGAC

ACCC

_12_T

2

ACCGCCAA_5_T3_V82A_

GGGA

GAAG

_14_

T2

TTAACCCC_5_T1

GGACGCGG_19_T3_V82A_

GTACCGAC_20_T3_L90M_

CAGG

GTGA

_12_

T3_L

90M

_

GGGGACCA_7_T3_V82A_ ACCC

CGCA

_15_

T3_L

90M

_

GAGCTTGG_3_T3

ACACCCGT_3_T1

AGG

TTCCC_10_T1

CTCACTAT_11_T2

CGAGGCGA_4_T1

ACTG

ATGC

_6_T

3

CTGCCCTC_34_T2CCAC

CAGC_17

_T1

CGCA

TACG

_17_

T2

CACCAAGG_11_T2

GGACATTG_31_T3AGCGGTCT_36_T2

TCCCCCCG_16_T1CCGAGCCG_39_T1

TAGCATGT_12_T3

TTCCCCGA_3_T1

AGCTTCCC_10_T2

GAAAC

TTT_11_T3_V82I

ACTGTCCC_8_T2

GG

GG

GTC

A_5_

T2

GTGAGACT_12_T3_V82A_

GG

GAG

AAG

_36_

T1

CCG

CATT

T_4_

T2

TCTCGGTC_32_T2

CTTAAGGA_16_T2

GTCCACGC_7_T3_V82A_

TGCACCTA_3_T3_L90M_

TGCG

TAAG

_21_

T3

GTCCCCGC_23_T1

TCACCCAC_21_T3

AGG

TGTC

T_11

_T2

CAGGTGGA_40_T2

CAACAACG_9_T3_V82A_

TCCGAGAT_28_T2

GCCCG

GG

A_11_T3_V82A_AAGGCGGG_26_T2

ACCCCACG_7_T2

CCCCGCCC_6_T2

ACCA

CAAC

_14_

T2

GATCCGTT_28_T1

CCCC

GCCA_1

3_T1

ATTGACTT_20_T3_V

82A_

GCACGAGT_19_T

3_V82A

_

AAACTTG

T_13_T3_V82A_

ACCCCAAT_12_T2

GTTTACGT_3_T3_L90M_

TTTGGGGT_17_T2

GTAGCTTA_9_T3_V82A_

ATTCCGCC_3_T3_V82A_

CCGTTCTC_11_

T2

TTAC

CGCG

_5_T1

TGGAGTAC_9_T3GAGAGGAG_18_T3

GAC

CG

TCA_18_T3_V82A_GGACCAGG_7_T3_L90M_

CACAACCC_11_T3_L90M_CCTCACCA_5_T3_V82A_

_A28V_3T_8_AG

CTG

GTA

ATGCTACC_7_T3_V82A_

ATTGGGCA_6_T3_V82A_

TACAC

TTG_13_T3_V82A_

GCGTTTGT_11_T2

2T_9

_CCT

CCTC

C

TTCTACGT_4

3_T1

TCCCCCGC_22_T2

CACT

CGCC

_11_

T2

GTAACGAG_24_T1

ACTC

CGTC

_16_

T2

CTAC

GGCA

_37_

T1

TTTAGGTG_13_T1

CGGT

CCGG

_7_T

2

TGCC

GG

GA_

9_T3

AGTA

ATAA

_15_

T3_I8

4V_

CGGCCCAC_27_T2

ACCG

TCTG

_30_

T2

ACACGTTC_20_T3_L90M_

CCGTGCTT_17_T3

GGTCTACA_5_T2

CTCGCGGC_7_T3_V82A_

TAAG

AATT

_10_

T3_L

90M

_

AGTTTGCA_10_T3_I84V_

TCATATTA_23_T3_V82A_

GCGT

AAGC

_24_

T1

CCCTGGAG_47_T2_V82I

ACATACTT_3_T3

ATTT

AAGA_

7_T3

_V82

A_

TTCTCGCG_41_T2

AACGCTAC_9_T3_V82A_

AACC

GACC

_25_

T1

TATGTAAT_25_T3_L90M_

CCTGTGTA_3_T2

CAGCCGAG_9_T3_

V82A_

CGAG

TAGG_33

_T1TC

CGTT

TC_1

1_T3

TACTA

AAA_21_T

3_V82A

_

TCGAGCCA_30_T1

CCAGGGTC_19_T3_V82A_

CC

CG

TCTT

_10_

T3_V

82A_

CATTCTTA_23_T2 TGCGGTGG_17_T2ACCAGATT_14_T

2

CCATTACA_22_T3_V82A_

CCCCCGTG_16

_T3_V

82A_

CCCT

CTAT

_20_

T2

AAAC

TGCT

_6_T

1

TAGGTAGA_40_T1

ATACAAG

A_13_T3_V82A_

CGCAGATT_31_

T1

GGTTATGG_17_T3_V82A_

ACGC

TACG

_11_

T2

CGCGGGCT_13_T2

TCC

AAAT

G_1

4_T3

_V82

A_G

AAC

CAG

T_15

_T3_

V82A

_

AGCGGCTT_37_T2

ACCACCGC_31_T2CGTGTCCC_11_T2

AACAATTT_26_T3_L90M_

TTAGCGTG_19_T3_I84V_

TGAAAGTC_3_T3_L90M_

AACCAT

TG_7

_T3_

V82A

_

CCAA

CTG

C_6_

T3_V

82A_

AAAACGCG

_9_T3_V82A_

TCACCATA_13_T3

TCCGCGTG_8_T3_V82A_

GGGC

ACAG

_9_T

1

GCCA

GGCT

_3_T

1

CTG

CTCT

G_1

5_T3

CCCGCTTC_22_T2

TATT

GG

TC_5

_T2

TCCTGGTC_15_T2

CTTCCGGT_9_T2

ACAGGCTT_8_T3

TTATAGTC_3_T3_V82A_

CTCCTCAT_3_T3 CGGCCGCA_31_T2

CAAC

AGGA_

6_T3

_V82

A_

TCTT

GTGG_15

_T1

AAAATAGT_7_T3_I84V_

CCAGAGGA_26_T2

AAGATAGA_5_T3_I84V_

AGGATCCC_27_T1

GGGACACG_25_T2

AGGAGCCC_4_T2

ATACAAGG_13_T2ACTTACTC_22_

T2

ACTG

CCG

C_8_

T2

CTAC

ACAG

_9_T

2AACGTTCA_4_T3_L90M_

ACTGTGCG_13_T2

CGCG

CCTC_13_T2

TATTGATG_3_T3

_A28

V_3T

_22_

TCT

ACT

AA

GATAGAGT_8_T3_V82A_

ACCCCCAG_11_T2

TAATTTGG_15_T3

AGCGTGTA_20_T3_L90M_

AACCTGAA_29

_T1

CCCCGACC_3_T2

AGAG

AGG

T_24

_T2

GG

CACT

TC_2

4_T2

CGGCCCCC_5_T2

AGG

AATAA_7_T3_V82A_

ACCCCTCC_15_T2

GGCTATCA_7_T2

ACTCTTGG_11_T2AC

GCCCC

C_6_T

1CCTC

GGGG

_3_T

2

AAATAAAA_12_T2

TTTT

AGAT

_17_

T2

CGCCCTTC_17

_T1

TCCCCGGC_49_T1

GTCGCACA_7_T3_V82A_

CTCCTTGT_4_T2

TAAGAC

AC_14_T3_V82A_

CCGAAGGC_3_T3_I84V_

CCACAGAG_22_T2

CAGCTAAC_14_T1

ACAAAAGA_14_T3_V82L

TCCC

ACCC

_4_T1

GTACACGG_15_T2

CCCG

CATG

_26_

T2

AATT

AATA

_30_

T1

CAATTTGA_9_T1

TAAAGCAA_3_T3_V82A_

CCGGTGCA_11_T2

GG

GTT

ATA_

5_T3

GAC

ACTT

A_17

_T3

AGAG

TGCG

_17_

T1

TTTA

TATG

_8_T

3_L9

0M_

TGCA

GCCC

_3_T

2

ACCACTCT_10_T3_V82A_

CCGCTGCT_6_T2

TTG

TCAA

C_4_

T3

ACTG

TCTG

_11_T1

GGCCTCCC_33_T2

GAG

TATGC_42_T1

CTTCGGAC_35

_T1

AGTACACA_6_T3_V82A_

GCCGGATA_15_T2

AGCG

CCCC_27_T1

GAAATCGA_14_T3

TCTGCGCG_5_T2

TCCTAGGA_23_T3_V82A_

AGTG

ATAA

_4_T

3

TTCC

ACAC

_17_

T2

CGGTACAC_4_T3_V82A_

GAGG

TAGC

_35_

T1

CATCTAAA_5_T3_L90M_

AAAACGCT_10_T3

CCCG

CACC

_31_

T2

CAGA

CCCA

_35_

T1

ATAA

CTTA

_15_

T2

ATCACGTC_15_T2

CCCCAGCT_15_T2

ATGC

GCCT

_31_

T2

CCAT

CTG

C_9_

T3

GAAATTTT_5_T3_V82A_

ACGGTTAC_6_T3_L90M_

AAACGAAG_3_T3_V82A_

CCCCGTAC_14_T2

_A28V_3T_03_GA

GATAG

C

CTAGGGTT_5_T3_V82A_

GGTC

CGAC

_19_

T2

CCGA

CCCT

_7_T

2

GCCACCCG_10_T3

CCGC

CCTG

_16_

T2

TGGG

GGAA

_17_

T1

TTCGTTTT_11_T2

ACGTGAGG_21_T1

AAAC

CAT

T_10

_T3_

V82A

_

GTGTATAG_15_T3_I84V_

TGTCTCCT_19_T2

CAATGACG_5_T3_V82A_TTCGACGT_18_T3

TATTCAAC_13_T3_L90M_

ACCT

AGTA

_7_T

3

CAGCTCTC_13_T2

ACCGG

ATC_36_T1

ACC

CG

TGG

_17_

T3_V

82A_

ATTG

TGGC

_14_

T2

AAG

AATG

G_1

2_T3

_V82

A_

ACTGACCT_18_T

3

ACAACCAG_5_T3_

V82A_

TCC

TCAC

C_20_T3_V82A_

CAAGGGAC_8_T3_V82A_ACGT

AGGT

_30_

T1

ACCAATAA_17_T2ATTATTGT_13_T3

AGGTG

CTT_8

_T3_

V82A

_

TATCGACC_6_T3_V82A_

GCTGCTTC_25_T2

3T_6_CCTTGCCC

CTG

GAG

TT_6

_T2

CGGGACGA_2

2_T3_V

82A_

GAGC

CGCC

_20_

T2

TTGC

TCAC

_18_

T2

CAGAACAT_15_T3_L90M_ ATG

ATAC

C_5_

T2

CCGGGGCG_9_T2

ACCGAT

TA_17

_T1

AATTACTA_11_T3

TAAA

CGCT_15_T

1

CGTTTGCT_18_T2

GTTCCCAA_8_T3

_A28V_3T_4_ACTAATA

G

CTTC

CTCC

_13_

T2

TACCCGAC_7_T2

ACGCCTTC_6_T3_V82A_

AGAA

CTA

G_1

3_T3

_V82

A_

TCCCCATG_7_T2

CAGA

ACCC

_11_T

3

CG

ACATTA_19_T2

AGCA

CTTT

_11_

T3

GTT

CCG

CG_1

3_T3

_V82

A_

AATA

ACGC_

6_T3

_V82

A_

CGTC

CGCA

_29_

T1

CCAGGACC_18_T3_V82A_

ACCCGACA_10_T3

_V82A_

CCCCCCCG_39_T2TCACCCAT_18_T2

AGCTAGTT_3_T3_V82A_

GGTCTTAC_29_T1

2

TATA

AATT

_6_T

3_L9

0M_

CGCC

GTCC

_17_T

1

ATCA

AGGA

_25_

T3_L

90M

_

TGAACCCA_3_T2

CCCC

CTGC

_3_T

2 GTGC

CCAC

_21_

T2

ACACGATG_21_T3_I84V_

TCTG

TATG_15_T3_V82A_

ACAA

TGGA_

6_T3

_V82

A_

TCAG

GACT

_21_

T3

AACTATG

C_12_T3_V82A_

ATGAGCCC_12_T1CAAGCTCA_12_T3

TTCCGACA_19_T2

CCACTCAC_18_T2

GGGG

TTGC

_6_T

2

ATGAAATG_8_T3_V82A_

CCCCCTGA_3_T3_V82A_

ACCC

CTAC

_10_

T2

CAAAGATG_7_T3_V82A_

CTGC

ACTG

_18_

T2

ACCCAACG_22_T2

AAAGACCG_5_T3_V82A_

CGCGTACC_5_T1

TCAAGCAC_6_T3_L90M_CCCATCTC_7_T1

CGACTG

GT_8_T3_V82A_

TATG

CTA

A_9_

T3

ACCCGAC

C_21_T3

_V82A

_

ACCA

CCCC

_8_T

2

CAAA

TCG

T_7_

T3

TCGT

ACCC

_3_T

2

GCCCGCCT_13_T2

AGCAGCCT_3_T3_V82A_

CGTTACTG_11_T2

CTAA

ATCA

_19_

T2

3T_01_TCGCA

GGT

CCTTGTTC_15_T2

TAATAAAA_10_T3

2T_9

_TGT

CCCC

GG

AAGTCCA_30_T1

ACCGTCCG_4_T2

GGCACGCA_2

2_T1

TTATGCCG_11_T2

GTGC

TCAA

_35_

T1

CCTC

GTTT

_11_

T2

ATGCTGAA_20_T3_V82A_

AAAGG

GG

C_10_T3_V82A_

CCCC

AGCC

_6_T

2

CCCGGTGC_3_T2

CAGGTT

AG_13

_T3_V

82A_

TCCCATCG_11_T2TCGATTCC_8_T1

CATTGGTG_34_T2

CATCACGT_18_T3_V82A_

ATCCCTAC_48_T1

CCTA

CACC

_31_

T2

CTCTAACT_19_T3_V82A_

CTCCCCCG_17_T1

CTACTATC_7_T3

ATGACCCT_12_T3_L90M_

AGTCTAG

G_9_T3

GATGATTA_50_T1

CGTGTCCA_7_T1

TTGCCGTG_7_T

2

TTTACTAT_11_T3_L90M_

TCC

CG

GAT_16_T3_V82A_

CCTT

GG

TT_2

6_T2

TGCT

AATT

_14_

T2

TCTCTACA_5_T2

TGCC

CGGG

_47_

T2

GGCGGTGC_24

_T1

GCCC

TGAG

_5_T

2CC

AGGC

AA_4

_T1

TTCGCACC_14_T1

CTATTCTC_7_T1

AGCC

ACAG

_4_T

2

CCCTGCAG_51_T3

AACTTACT_6_T1

ACGAC

TCG_21

_T1

ACCCCGCG_30_T1

GAATGGCA_9_

T3_V82A

_

CACCCCAC_18_T2

GAAC

GGGA

_9_T

3_L9

0M_

CCGA

GCCC

_6_T1

ACTGACAC_8_

T3_V82A

_

CCGC

CAAT

_15_

T1

GGATTTCG_33_T1

TATGGGAC_8_T3

ATACATTA_9_T3_I84V_

CCTCCTCC_11_T1

TTCG

GATG

_8_T

1

CCTAGCCA_

18_T3_

V82A

_

ACAA

GAC

A_11

_T3

ACTATATT_14_T3_V82A_

ATACCGTG_3_T3

TTCGCCAC_4_T3_I84V_

CGCATGCG_23_T3_L90M_

CTTTTCCT_15_T2

ATAG

CTG

A_8_

T3_V

82A_

CGTC

CGCA

_38_

T2

ATAGGTG

T_21_

T3_V

82A_

CATG

GCCT

_18_

T2

GTTTAG

CC_12_T3_V82A_

CATTCTTC_7_T2

ATACGTGC_6_T3_V82A_

GGCCGCCG_7_T2

CACACCGC_20_T3

_V82A_

TTCGTT

CT_12_T

3_V82A

_

CTCC

CGGA

_33_

T2

GCCGCACG_18_T2TAGGGATA_9_T3

3T_4

_TTC

TTCC

A

CGGTTGTA_24_T3

GGCCCACG_12_T2CCCCTATC_8_T

2

GTATCCAG_7_T3_V82A_CAACTGTA_25_T2 ACGAAATG_4_T3_V82A_

CCACGGCC_16_T2

TAGCCCAC_14_T2

AAAA

ACCT

_10_

T3_L

90M

_

ACGCGCAT_4_T3_V82A_

CCGGAGCC_3_T2

AGAG

ATAA

_36_

T3_V

82A_

TCATCTTC_31_T3

GTTTAAGA_23_T3_L90M_

ACCCACCT_10_T1

CTGCAC

AA_13

_T1

TCCGCCAC_13_T2

CATG

TCG

C_44

_T2

ACAC

AACC_20

_T1

GCCCCG

GG

_13_T2

ACCC

CCCT

_6_T

2

TATGCCCC_15_T2

GGTC

CGGT

_21_

T3

GAG

CACA

G_1

3_T3

GAAGCAAC_17_T3

AATTACTG_17_T3_L90M_

ACCA

TGAA

_7_T

2

TGCACCCT_11_T3_I84V_

CTCCTTAA_20_T3GTTGTCAG_5_T3_I84V_

AACGTCCT_12_T3

CTCTGCCT_20_T2

TCATGTGC_16_T3_I84V_

CCTCCACC_15_T3

AGCTCATA_16_T2

CTAAGATA_16_T1

AAACTAAA_11_T3_L90M_

AACC

CGTA

_15_

T3_L

90M

_

TCTGGTAT_57_T2

CCTC

TCGC

_13_

T2

GTGTAAAG_11_T3_L90M_

CACT

TCCT

_23_

T1

GGATGACA_28_T2

GG

TTG

TGA_

21_T

2

GAGATTTC_16_T3_V82A_

AACGTATT_10_T3_V82A_

TGTTTATG_6_T3

TCGCCTGA_35_T2

GCGCGGCG_5_T3_V82A_

TCGCGAGC_5_T2

AGAACCAC_12_T3GATGACAT_4

6_T1

CCCCTTGC_13_T2

CCTCACGG_16_T2

TTACCGCG_8_T2

TGACCACA_37_T1

AAATTAAT_5_T3_V82A_

AGG

GAAAA_11_T3

AATCCCCC_8_T1

ACGGCAAA_9_T3

ATCCATCC_12_T1

GCCCGGGG_17_T2

AGAG

TTTG_12_T3_V82A_

CATTTCCC_34_T1

GGCAAAGT_3_T3

CAACACGT_9_T3

TATC

GCGA

_18_

T3_L

90M

_

AGGACGAC_15_T3_V82A_

ACCG

TGG

C_8_

T2

TCCCGCCC_21_T1

ATAA

ATTG

_8_T

3

GGACGGGA_25_T2

CGGCCGGG_11_T1

3T_1

1_CC

GCTT

CT

CTT

TCAC

A_20

_T2

GCTCGGCC_23_T2 CGAGATAT_6_T3_V82A_

TTACCATG_7_T2

ACTTTTGG_5_T

2

CAGC

TGCT

_19_

T3

CGTAGCGA_8_T2

CGGACCCG_3_T2

ACTCACTC_21_T2

ATAGGTCG_13_T3

_V82A_

GATTTCAT_10_T3_L90M_

CCATCGAG

_26_T1

TCCAAACA_17_T2

TCGCTCGG_5_T3_V82A_

GCTGATCC_37_T1

CCTAGTAG_45_T1

GCC

GTA

TG_1

3_T3

TAAGTTAC_20_T3

CGAAGTCG_17_T3_I84V_

AATTTAAA_6_T3

ATCT

GG

CC_4

2_T3

TCAT

CGG

G_1

9_T3

_V82

A_AGC

TAACC

_21_T3_V82A_

ATTAGAAC_11_T3_I84V_

CTGAACAT_5_T3

AGTGCCCG_17_T2

TCTGGTCG_9_T2

CTCTAATT_3_T1

GCG

AACAT_12_T3_V82A_

ACCCGCCT_5_T2

GTGC

CAAT

_39_

T1

ACCC

CACA

_18_

T2

TTTACC

CT_11_T3_V82A_

CCCTCTCC_35_T2

GGCTACAC_12_T3GACCCCGC_6_T2

AGAA

GCGC

_9_T

3

TAAA

GTC

A_29

_T3_

V82A

_

ACACGTAA_59_T1

CGCCCAAA_13_T3

GTAGAAAG_8_T3_L90M_

TTGAGGGA_7_T3_V82A_

TCTCCCCT_26_T1

2T_82_TTCCAGCT

CCGG

TGG

A_11_T1

AAAACCCC_9_T2

CTAAC

TAC_12_T3_V82A_

TCTGTTCT_11_T2

CG

CG

ACAT

_13_

T3_V

82A_

CCCC

CCGC

_3_T1

CGTT

TTGC

_17_

T2

GCAAGACT_23_T2

CTCGCCCA_6_T2

GG

TAAT

TC_5

_T2

GCCTCCTT_8_T3

ATACTATA_18_T3_V82A_

CAGA

GTCT

_45_

T1

CGGGCAAG_42_T1

CTCGTACA_23_T1

CTTA

AGCG

_3_T

3

TGCG

CCTG

_14_T

1

CCTC

ATCA

_6_T

3

CCAACAGT_6_T3_V82A_

ACG

GAA

AA_1

2_T3

CACCTCCC_5_T2

CG

TATTTT_5_T3

GCACGTTG_9_T2

CGGCGCGC_12_T2

AGAC

GCC

T_16

_T2

GAATAGCG_6_T1

TACT

ACG

C_10

_T3

GTCG

GTCT_13_T3_V82A_

AAGTCATG_17_T2

TCATGACC_32_T2

TGTAG

TTG_7_T3_V82A_

CCG

GAA

AC_5

_T3

GCCC

GACG

_13_

T2

TTCTCTAT_18_T2AACAC

CCA_27_T

1

TGTT

GAAC

_5_T

3

AACACCCC_5_T3_I84V_GCC

CACA

C_42_

T1

CCTATTGG_13_T2

TATTTAAA_3_T3_V82A_

ATTAGTAG

_8_T3_V82A_

GCGA

GGGT

_3_T

2

CAAG

AGCT

_29_

T2

GTGCACGT_3_T3_V82A_

ACGGGATC_10_T2

CGCACTTC_3_T2

GATCAAAG

_11_T3_V82A_

TGTC

AAG

C_1

5_T3

_V82

A_

ACGGAG

GG_14_T1

GTCGCCAC_9_T1

TTTT

CGTA_19

_T3_V

82A_

AACCCCAC_5_T3_V82L

TGAT

GCCG_9_T3

_V82A

_

CTGCCATA_54_T1

CGCGAGCA_7_T3_V82L

TTGATCAG_8_T3_L90M_

AACAAAAT_12

_T3_

V82A_

ACCCCTCC_15_T3_V82A_

AACATGTA_10_T3

ATCAATAC_4_T3_L90M_

ATTG

TTTT

_9_T

2

GCGGTATC_31_T1

CAGTCGAC_12_T2

AGTCGCGG_23_T3_V82A_

CTATAGTC_14_T2

GACTGCCG_25_T2

CTC

GTAG

A_14_T3_V82A_

CGATGTTG_24_T2

CCCG

GGCG

_22_

T2

CTTGTTCG_15_T2

CTCA

CCG

C_3_

T3_V

82I

TATTCACC_11_T3_L90M_

CACCCGAG_24_T1

CGAAACCA_11_T3

_V82A_

AGCAGGAG_22_T3ACCTAAGT_7_T3

CGAAATGT_15_T3_L90M_

TTACTGGA_6_T2

ACCCTG

CA_21_T

1

TATTAGTC_23_T3_V

82A_

AAGAAATG_6_T3

GCGTAAGT_9_T3_L90M_

GACCAACC_15_T3_V82A_

TCCGGCCA_6_T2

TATTGACG_14_T3_L90M_

TAAG

AGGC

_23_

T2

TAAC

CACA

_11_

T3

CCGCCATT_2

0_T1

GACCCCTA_9_T3_L90M_GGATGAAT_18_T3_L90M_

CCGCCCCA_18_T2

CCATTAAT_2

3_T1

GCGCTTAA_15_T2

GGTAACGT_3_T3_L90M_

GATCCTCT_32_T1

AGCGGGAC_38_T1

GGCACC

CC_5_

T1

AGATATAT_15_T3_V82A_

GAAAACAG_9_T3_V82A_

CGCTCTAG_10_T2

GAC

GC

GC

T_17_T3_V82A_

ATG

AAAA

A_3_

T3

CTTAGCCC_25

_T1

TGAA

GCC

A_17

_T2

CGTAATTA_7_T3

CTAAGATG_14_T3_L90M_

TTCGCACC_9_T2

CGCAACAA_8_T3

TCCCGTTT_15_T1

TCCGCGGT_7_T3

TAGTTAAA_8_T3_V82A_

TTACTCAC_26_T2

CTCT

CTG

C_4_

T3

GAC

TATCT_14_T3_V82A_

GATCTCCA_8_T3

TCTT

TCG

A_48

_T1 CCGCAGCA_13_T2

GTG

TCTA

G_1

7_T2

CGCCCCAC_3_T2

AGGC

TTCC

_8_T

2

GGGTGTGC_12_T2

TATG

CACT_15_

T3_V

82A_

CAGGAAGT_5_T3_L90M_

CCGGACCG_16_T2

CCAT

TCGC

_7_T

3_V8

2A_

TGTA

TTAA_1

6_T3

_V82

A_

TCGCACAA_3_T1

ATTC

GTT

C_14

_T3

ACCGCATG_31_T1

AAACTCGC_32_T1

TAGCCGGA_30_T1

CCGAT

TTT_

32_T

1

CCAGGTTG_3_T3_V82A_

CTACAACC_47_T1

GG

GTTAAT_3_T3_V82A_

ACCCACCC_12_T2

GG

ACAT

GA_

12_T

3_V8

2A_

GGTGTATA_3_T3ACTT

GG

CC_8

_T2

ATACCCGC_19_T1

CTC

CAC

GA_18_T3_V82A_

TCCCGATC_6_T3

TGCAGCGG_16_T3_L90M_

GCAGGACC_23_T2

TCGGGGGA_9_T2

ACCT

AGCC

_20_

T3_L

90M

_

ATCCCTAG_7_T3

CC

GATTAC

_17_T3_V82A_

AACACCCA_12_T3_V82A_

GTGGCGCC_18_T2

TCG

TGAA

G_8

_T2

AACCCCAG_5_T1

CCCC

CGCG

_12_T

1

TCACTACG_14_T2

GAC

CGCT

T_7_

T2

GCCCGCGC_24_T2

AGAAAACA_7_T3

CATAACCC_9_T3_V82A_

ATCC

ACGT

_31_

T1

TTGCACAC_12_T3

CAATAGAC_7_T3_V82A_

AGCT

GGGC

_26_T

2

AGTCCCCT_4_T3

AAAAGTCC_5_T3_V82A_

TCCCGAC

T_25_T

1

TCGCACG

G_15_T3_V82A_

ATGCG

ACC_37_T1

ATTTATGC_8_T3_V82A_

TTCACTGA_19_T1

CTAG

TTGG_18

_T1

TGCACCCG_17_T2

GACTTAAT_15_T3_L90M_

GTG

ACTG

T_19_T3_V82A_

GGAGCTGC_13_T2

GTCA

CTCC

_22_

T3

GCCGGACC_15_T2

CGCC

CGCT

_19_

T2

ATTA

CCAA_9_

T1

ATTTTACT_4_T3_V

82A_

AGTTGTCG_5_T3_L90M_

CCTCTCAA_32_T2

AATCAAGA_7_T3_L90M_

TCCCGGTC_21_T2

CCCAAAAT_5_T3_V82A_

AATAAAGG_9_T3_I84V_

ATAGATAA_10_T3_V82A_

ATCT

CCTT

_7_T

2

ACGGCCTC_15_T2

TGTCTGTT_3_T3_V82A_

GCT

CTG

TA_3

_T3

ACTG

AAGC

_18_T3_V82A_

CGTACACA_12_T3_V82A_

TTGTAAAT_24_T3_V

82A_

TGCTTGGT_15_T2

GTTAC

AAG_3

_T3_

V82A

_

CTG

ATCAG

_16_T3_V82A_

TACC

CGGT

_16_

T2

GTTTCCGA_24_T2

CC

ATTG

GC

_16_

T3_V

82I

TAGGGACG_23_T3_L90M_GCCATCCT_11_T3_L90M_

CAACACAC_36_T1

CTCCCGTA_18_T2

TACTACAC_20_T1

CGGTGCTA_4_T2

TTTCGTCA_13_T2

CCTTCATA_7_T3_I84V_

CCCACCCG_21_T2

CCTCATCC_5_T2

AATACG

AT_17_T3_V82A_

GAACCCAT_19_T3_V82A_

GCCTAAGA_13

_T3_V82A

_

GCCGGCCC_3_T3_L90M_

GGTCCTCT_22_T1

CTCCATTG_42

_T1

AACC

CCCA

_13_

T1

CTG

TATTT_16_T3_V82L

TCTTCAAT_5_T3_V82A_

TGCC

CAGC_

22_T1

AAGGCTAG_6_T1

CGACTCCC_28_T1

GTCCCTGC_13_T2

GTCCCCGC_8_T2

CGAGGCCC_24_T2

TATCCGTA_38_T1

TCCCCCGT_13_T1

TGTGCTTT_4_T3_L90M_

GCTTCCGT_16_T2

GCT

GTC

GC_

14_T

2

GTCCGAT

G_16_T3

_V82A

_

CTG

TTCC

G_2

5_T2

CGGAGGCC_24_T2

CCCTCCGC_5_T2

ACTTATTT_12_T3

TGCG

GCTC_12_T3_V82A_

CCACCAAC_23_T3_L90M_

TTGGGGCC_7_T

2

TACTAGTG_10_T3_V

82A_

CTGA

CCCG

_3_T

3

TGTGGTCT_6_T3_V82A_

ACCCCCAT_17_T1

TTGTAAC

T_13_T3_V82A_

CTCAGCAC_22_T1

CTTCTGCA_7_T1

CCCC

TACT

_12_

T2

CTTACTG

A_14_T

3_V82A

_

TCCCCCCG_10_T2

GTCACCCC_5_T3_L90M_

CCATAGAA_3_T1

CAATATCT_3_T2

CAAGTAAC_10_T3_L90M_

CTCC

GCGC_

11_T1

AAAGTTAA_11_T3_L90M_

GCGCACCC_16_T2

CTTCGATC_21_T3

CAGA

CGAG

_8_T

2

GGGGTCGA_5_T2

GC

AGAT

TT_1

5_T3

_V82

A_

AGGGCCGC_22_T2

TTTT

TGCG

_7_T

2

CACA

AAG

T_5_

T3

GAGTTCTC_44_T1

TCCC

GTC

T_6_

T2

ATTG

TCGT_

16_T

2

TTGTT

CAG_23_T3

_V82A

_

AAGACCGC_22_T3

CTCCTC

GC_10_T

3_V82A

_

ACCCAGTG_11_T3_V82A_

ATACCTGC_36_T1

TCGGAG

TC_1

2_T3

_V82

A_

AGG

GC

AAG_19_T3_V82A_

ACAGCCCA_36_T1

GAATGTTC_16_T3_L90M_

TCTCCGCC_3_T2

TCGGAGAA_6_T3

AACCCTGC_38_T1

TCTCGTTC_14_T2

CGCG

TTGA_

9_T2

CTTTGTCG_3_T1

GCCTGCCC_10_T2

TAGTCCCC_9_T2

CGCA

CAAT

_6_T

2

_A28

V_3T

_22_

TATA

AGA

G

CCCT

CCCC

_29_

T2

CAACTAGT_18_T3_L90M_

AAATAAAA_15_T3_V82A_

CCGG

TGCG

_24_

T2

CGCCCGCA_12_T2

CCAA

GAGG

_42_

T1

ACTGAG

GT_24_T3_V82A_

GAACGGCC_8_T2 AAAGAAAC_10_T3

CAAG

CGGA_

7_T3

_V82

A_

GCCTTGCG_9_T2

CGGGGATT_21_T2

CTAAG

GTG

_21_T3_V82A_

AGGC

TGCG

_3_T

3_L9

0M_

GAAAG

ATT_16_T3_V82A_

CTG

CACC

G_7

_T3_

V82A

_

GAATCTA

A_9_T3_

V82A_

TCGACCCA_19_T3

_V82A_

AGATAC

CA_18_T3_V82A_

CGGATAAA_18_T3

CTCTCCAT_54_T1

ATAAAAAG_9_T3_V82A_

CCGATCTA_10_T3_L90M_

GGACGGGA_9_T1

CAGATGCT_4_T2GTTACCCC_22_T2

TTGT

CTAG

_36_

T1

GGAG

CGAG

_3_T

3

GCCACTCG_41_T1

ACGG

GATT_12_T3_V82A_

CGACGTCG_21_T1

GCGTGATC_36_T1

CTACCCAA_9_T3_L90M_

CAGC

TCCC

_21_

T2

GGTCCGGT_22_T1

TAC

TAAT

A_13

_T3

TCCGGTTA_43

_T1

CGGAGGCC_10_T3

TCCGCACG_12_T2

CACCCACA_19_T2

_A28V_3T_92_CA

CTTCA

G

GCCTCCTG_18_T3

GGACTGGG_6_T3

3T_51_ACT

GGAAA

TGCATG

CC_13_T3_V82A_

ACATTTGC_5_T1

GCCT

ATTG

_31_

T1

ATCACAAT_22_T3_V82A_

TTAAACCC_4_T3

CGGC

CCGC

_26_T

2

GGTCTACC_8_T3

TGTG

CTTG

_20_

T2

TAAC

CGCG

_28_T

1

CCCCGTAT_15_T2

GACAAG

CT_9_T3_V82A_

CTCTCCCG_20_T2

ATCCTT

CA_19_T

3_V82A

_

CTCCCCCG_24_T2

AGCGAATT_11_T3_L90M_

GACCGTCC_4_T2

CGTCACGC_14_T2

CACT

TTG

A_19

_T1

GAATAATA_12_T3_V82A_

AACGCCCT_13_T2

GATCATAC_14_T1

TTTTGAAC_38_T1

GATTTCTC_31_T2

TCGAT

TTA_

5_T3

_V82

A_

GCCCGGCG_36_T2

CAAGAACC_10_T3_I84V_

GTCAAACA_19_T3

ATTG

GCAG

_37_

T1

_A28

V_3T

_22_

TAG

CC

CAC

CTTCGTGT_19_T3

AGCG

AAAG_18_T3_V82A_

GCCG

AAAA

_12_

T3_V

82L

CGTTCTAA_32

_T1

GG

ATAG

AG_5

_T3_

V82A

_

2T_4

_GCG

GGCG

G

GTCGGAAT_4_T3_L90M_

GTGCAACG_4_T2

AAGA

CGGG

_6_T

1

AGCT

TAGT_

4_T3

_V82

A_

ACTA

AAC

A_21

_T3_

V82A

_

GCTGCGAC_23_T2

CTGTTACG

_11_T2

ATG

ACTC

T_3_

T3_I

84V_

ATTG

TAG

T_16

_T3

AAGTG

GG

G_6_T2

GCACCTTG_52_T1

TGCC

CTTT

_38_

T1

CTGTCTG

T_11_T3_V82A_

CGAC

TCGC_

13_T1

AAACATAA_16_T3_L90M_

CGCCCTCC_21_T1

CTCCGCCG_35_T1

AATATACA_33_T2

GGTTGTGG_8_T

2 CACGCGAC_14_T2

GTC

TTAAG_21_T3_V82A_

GTCCCCCT_11_T2ACCCTCTC_3_T1

ATCAAGGG_9_T3_L90M_

TGAACCCA_32

_T1

GC

ACAAAG

_17_T3_V82A_

TCTGCGAC_22_T2

CCCC

CCAG

_4_T

2

TAAGTGTC_3_T3_V82A_

ATAA

GCC

T_12

_T2

AGGGGTTT_10_T3

GGACAAAA_31_T3

_V82A

_

ATTC

TGTC

_19_

T2

_A28

V_3T

_32_

AAAT

CGT

G

ACTGTTGA_39_T1

CGGA

AACC

_9_T3

CCACCGCC_35_T2

GCCCGGCC_31_T1

CCCTTAGT_8_T2

CAGGCGTG_4_T3_V82A_

CGGTCCTA_16_T2

A28V_3T_13_AC

CC

GTAT_

ACAACAGG

_33_T1

TATGG

ACA_17_T3_V82A_

2T_3

_ACT

CGA

CC

CTGCCTAT_3_T3_V82A_

GCGATCCG_17_T2CGTACCCC_9_T

2

ACCTCCCG_9_T3_V82A_

TATAATTT_15_T3_V82A_

TCAA

ATTA

_5_T

3

CCAGCAAT_30_T2

TACCCCTA_27_T3

ATCACTTA_19_T3_L90M_CAAATGCC_8_T3_L90M_

ACGCAACA_20_T3_L90M_

AGGG

AGCG

_7_T

3_L9

0M_

GTAACG

TA_11_T3_V82A_

AGCCTACG_21_T2

GG

GTC

TAA_

6_T2

AGGCG

CCC_1

7_T1

TAAACTGG_9_T3_L90M_

AACACTGT_3_T3_V82A_

CATTTTAA_6_T3_L90M_

GTCCGCGG_4_T2

ATGGTGGT_13_T3

ACGCA

CCC_

13_T1

CATATTTA_13_T2

GGGGAAAT_5_T3_I84V_TTG

GCGCC_6_T2

TCCCTCCC_6_T2

ATGTTGTT_3_T3_V82A_

TGG

ACCTG_43_T1

CGG

CGCCT_12_T2

CCCTGAAC_8_T3_L90M_

CCCC

AGCG

_5_T

2

GAT

TAAG

A_10

_T3

ACATGATT_16_T3_V82A_

CTAAAACG_3_T3_V82A_

CGCACGTG_16_T2

AATA

CTTC

_27_

T1

CCTCCGCG_17_T2

ACCCGCGC_7_T2

GG

GAC

CCT_

3_T3

AAACCCGA_11_T1TT

TGCAT

T_36

_T1

GATTGCAT_19_T3_L90M_

CCGC

ACCG

_23_T1

CTATTATG_4_T3

TGTA

GGTA_2

3_T1

CATAGGTG_10_T3

GACAAGGG_13_T3_L90M_

GGTGGATG_29_T1

GATA

CCCC

_40_

T1

CACTTGG

A_30_T2

TTAGTGCG_33_T1

ACCT

GCTA

_8_T

1

TCCCTGAC_27_T1

ACTACACC_9_T3_V82A_

GTAATG

CC_10_T3_V82A_

CCACGGTG_3_T3_L90M_

GTCCTCAC_28_T1

CCACTCTC_18_T2

AGGGCCCC_4_T2

ATCTGGGG_25_T2

TCTC

GATC

_32_

T1

CTGTCTTT_13_T2CGCGGGGT_3_T2

AAGGCGTG_29_T1

GAA

CTCG

C_10

_T2

GAGCAGAA_22_T3_I84V_

GCG

ATCCT_21_T3_V82A_

TAAG

TGGA_

8_T3

_V82

A_

ACG

ACTG

C_16_T3_V82A_

AAAG

GAC

A_9_

T3_V

82A_

ATTACAGC_14_T3_I84V_

TCCG

TTCC

_3_T

1

CACACGAG_17_T2

AGCAGCCC_21_T3

TGTA

GCGC

_5_T

1

CCCCCGAC_22_T2

GCCGGTAT_33_T

1

TCCTGTGG_3_T2

GTTTAATG_13_T3_L90M_

AGCCCCAC_3_T2

TACG

CAAG

_9_T

2GGATCCTG_25_T2

CTCT

TACC

_27_

T1

CCGA

GTGT

_42_

T2

ATCT

CCAT

_5_T

3

TATAGATT_24_T3_L90M_

AAGA

CCTC

_22_

T2

CCACCCGC_29_T2

GAAGTCAG_9_T3_L90M_

GACG

ATGC

_18_

T2

CCGTCGGG_42_T1

_A28V_3T_72_AG

GCATAT AG

CCCC

AA_4

_T3

GCAGTTCG_34_T1

GCG

TTCC

G_1

3_T2

CGAGACGG_16_T1

3T_4

1_AA

GAA

GGT

CCTGACCC_5_T3_V82A_

GGAATTAC_7_T3_L90M_

GTCTCGCC_65_T1

CTTCTCCT_22_T2

GGATGCAG_30_T1TCCTATCC_11_T3

ACCGATGT_5_T3

AGGGCCAT_18_T1

GG

AGCCG

C_14_T2C

TGTC

TAC_7_T3_V82L

TCCT

CGCG

_9_T

2

GCCCCCCC_5_T2

ACAACCAT_13_T3

GCGTCTAG_5_T3_V82A_

CAACCTAC_29_T2

CGAG

GATA

_22_

T1 GTCCAGGA_3_T3CCTG

GCT

T_6_

T2

TAAG

GAT

G_3

3_T2

CATCCCGG_17_T2TGTTC

CGC_15_T1

ATACACCG_13_T3_V82F

GAGCATCC_18_T2

GCG

GG

GAT

_4_T

2

TCAC

TTGA_18_T3_V82A_

GCCACCGC_14_T2

GGCGCCTG_8_T2

AAGCTGAA_18_T2

AGAAAGCG_9_T3_

V82A_

GTCCCCAC_26_T2AGTCGACC_20_T2

GGCCCATT_14_T3_L90M_

AGCCGTTA_17_T3_V82A_

ACGTG

GAT_10_T3_V82A_

AGTGCCCT_49_T

1

CCCA

GCC

T_12

_T2

TCAT

GCGC

_28_

T2

ACATACCC_19_T2

GTA

TGCC

T_31

_T2

TTTCGAGC_10_T3_V82A_

GCCGCCAC_5_T1

CTT

CC

CG

A_5_

T3_V

82A_

CCAAAAGC_18_T3_V82A_

ACGCATAT_10_T1

TCCTGATA_5_T3

CGCCTCCC_7_T2

CAACACCG_35_T1

CCACGTTC_3_T3_I84V_

CGATCCCC_10_T2

CAGGACAG_8_T2

CCGCGGCG_3_T1

AAGTCCAT_12_T2

CCCCCGCC_8_T1

GCAT

TTGA

_27_

T2

ATG

TATG

G_9

_T3_

V82A

_2T_22_TTC

GTGCT

TACGTCCC_3_T3

CTCACCAG_13_T3

_V82A_TTG

CCTGC_14_T2

GCCTAAGT_22_T3_V82A_

GAAGAAAT_10_T3

AGACCCAC_12_T3_V82A_

CCCGATTG_3_T3_I84V_

ACGTCCAG_4_T3_V82A_

CCCCCGAG_9_T3

CCCCTGCG_17_T2

CCCACACC_14_T2

ACCCCCCG_13_T2

AGAC

CCAC

_31_

T2

CGCC

CTCG

_9_T

2

CGCG

TCTC

_9_T2

TGTAATG

T_16_T3_V82A_

ATAAACAA_3_T3_V82A_

AGCTCGCG_21_T2

CATCCGCA_4_T1

CCACATCC_5_T1

CTGTACGC_17_T2

GTACCCAA_3_T3_

V82A_

TGTCGCCC_3_T3_I84V_

TGATTCCT_19_T3_L90M_

CCGTAATA_10_T3_V82A_

ACTC

AACC

_35_

T1

CCCTAGCC_41_T1ATTAGACA_8_T3

GGGC

AGCG

_12_

T2

ATACGCGG_28_T2

CCGCAG

GA_4_T2

CCCGACCC_31_T1

TCCTCCCT_30_T2

GTTGATGC_22_T3

ACCTTCCT_11_T3

CAACCTCG_11_T2

CTCGACGG_8_T3_V82A_

GATACACC_22_T2

CACTTAGC_12_T2CTATCGCC_13_T1

_A28V_3T_82_AG

CAAC

GT

ACTTTGTA_19_T2

CAGG

CGAC

_23_

T2

CTGCAGCT_19_T2

CTTTAATT_25_T1

CTTC

CCCT

_13_

T2

TATGGGTT_9_T3_L90M_

CCTC

CGGG

_11_

T2

CCACCCCA_13_T3_V82A_

TCTT

GCGC

_11_

T2

GCTCCCTA_10_T2

AAACCTAC_21_T1

GAACCCAC_5_T3_L90M_

CAAA

GTG

A_16

_T2 GCGTACTG_16_T2

GAGTTGAC_12_T3_V82A_

TGAA

CTCA_

11_T

3_V8

2A_

GCAACCTC

_25_T1

TCC

TATT

A_10

_T3_

V82A

_

GG

TATTGG

_15_T3_V82I

CGCC

CTTC

_25_

T2

ACGCCGCA_11_T3_V82A_

CGGCGCCC_21_T2CCCCCAGC_21_T2

TTTCAC

AT_16_T3_V82A_

ATAC

ACCC

_16_

T2

TAAA

CACG

_21_

T2

CCGCCCAC_11_T2

GCT

TCCG

G_2

3_T2

AGTTATA

A_16_

T3_V

82A_

TACAAGAG_19_T1

TTGCCTTC_14_T2

AGCCGACC_9_T3_V82A_

CCAGACTC_17_T2

TGTTGCGG_33_T2

CCCGTGGC_20_T2

_A28

V_3T

_42_

AG

CTTT

CA

TATCGCTT_8_T3_V82A_

CCTA

GTT

C_3_

T3

GTAAAATG

_6_T3

GGTGTACG_15_T2

GTCCTTCT_7_T3_V82A_

CCTCGCCG_33_T1ATTTGCTT_19_T2

CCCC

CCGG

_7_T

2

CTTTTGCG_9_T1

GGTCCATT_15_T1

GTCCCAGT_39_T1

GCTCT

CTC_

12_T2

CCCCCGGC_39_T2

TCTATATA_10_T3

GTACTCG

G_16_T2GT

ACGG

GA_1

2_T3

CC

TGAC

AA_7

_T3_

V82A

_

CGGCCGGT_21_T2AATA

GGGC_8_T1

GAGCCGGA_12_T2

ACCGCACT_33_T1

CAACGTCG_7_T2

CAAT

CCAT

_17_

T3

GCGCGCGG_27_T2

GCTTCGAG_8_T2

AGCG

CTCA

_3_T

3AG

CCAA

CC_7

_T1

TCCTCACC_3_T2

GTAG

GGAA

_9_T

2

TGAA

CGCA

_18_

T2

TGGAAAGA_7_T3_L90M_

CCCC

GAGC

_5_T1

CAGCCCTC

_15_T1

TCATGTCA_36_T1 TACATAGT_11_T3

AAAGAGAA_12_T3

ACCCTTTA_5_T3_V82A_

GCGCCGCT_29_T1

CGATTTCT_4_T2

GCCATG

GG

_13_T3

TCGCGGCC_3_T2

ATCC

GCGC

_10_

T3_V

82L

AACGAGTT_18_T3_L90M_

TTCCATTG_6_T3_V82A_

CAATGAAC_19_T3_L90M_

GAGTAAAG_8_T3

TTTGTAAG

_15_T3_V82A_

TATG

GAGC_10

_T1

TTGTAAGC_11_T3_L90M_

GCGG

TGGC

_17_

T1

CCCA

CAAC

_11_

T2TC

CGAA

GT_

17_T

2

TAAA

GTC

C_13

_T3

TAAATTAA_6_T3

ATCGTG

TG_7_T3_V82A_

ATGCCGCC_9_T2

GACCATGC_5_T2

CGTTTTTT_9_T2

CCGCCCTA_34_T1

GAGG

CCCA

_29_T

1

GAG

TTAGA_36_T1

CGGC

TGCG

_11_

T2

AGAG

AACA_19_T3_V82A_

CCCC

GCAC

_7_T

2

AGG

AGG

AC_17_T3_V82A_

TGAT

TCAT

_16_

T3_L

90M

_

TCTACATC_14_T3_V82A_

2

GTAATAAT_20_T3_V82A_ACGTCTGT_9_T2

CCGTCCTA_26_T2

CAATCGTA_11_T3_I84V_

GCGG

TCAG

_7_T

2

GTTGGCCG_21_

T2

CCTG

AAAC

_10_

T3

ACGTTATG

_32_T1

ACGTCGCA_28_T3

_V82A_

CATAAC

TA_14_T3_V82A_

GCA

GCT

CC_5

_T2

CATGATAT_18_T3_L90M_

TATG

GAC

T_12

_T3_

V82A

_

AACTATGT_6_T3_V82A_

GCCTCTCG_7_T3

TCAA

TACG

_6_T

3

ATTG

GAC

A_17

_T2

ATG

GTG

TA_9

_T3_

V82A

_

TAGTA

AGC_35_T3

_V82A

_

TGAG

CCGC

_24_

T2

CCCCTCCC_38_T2

AGGC

GACT

_12_

T3

AGCCGATC_18_T3_V82A_

CG

GG

TCTT_7_T3_V82I

ATTCGACC_17_T3

CAAAGATC_11_T3_V82A_

CTAG

TCG

G_19_T3_V82A_

ACTG

TTTG

_6_T

1

AAGCTCGC_3_T1

TGTGTATC_14_T3_I84V_

ACGC

TGTC

_44_

T1

TAGGCCCC_4_T2

TTGG

AGTT_14_T3_V82A_

TCAGCTTG_5_T3_V82A_TAAACAG

C_11_T3_V82A_

GTA

AGCG

G_1

4_T3

CACC

AACA

_3_T

2

AAAA

TGAG

_9_T

3

GACCCCCC_3_T3_V82A_

ACAGAGTT_10_T3_L90M_

GACTAGTA_8_T3

CGCGAACT_5_T1

GTCCCATG_31_T2

CCCGGCGG_4_T2

TGCGCGCC_18_T2

TGTGGTAG_12_T3

CGAT

GCA

G_1

7_T2

ACGTCATC_13_T3

ACCCTATA_8_T3_V82A_

CCCGCCAG_12_T2

CGGTTGCA_13_T3_L90M_

CCCC

GCA

A_6_

T2

CGCACAAC_15_T2

GGCCCACC_28_T1

ACTTAG

TA_19_T3_V82A_

ATTG

TGGT

_3_T

1

GGAGGCAT_25_T1

TTAC

GG

GT_

8_T3

CAAA

CGAG

_5_T

3

ATCCAGCC_30_T1

TCAAC

CAA_13_T3_V82A_

CCCCCCCA_7_T3

AAACACAT_38_T3_V82A_

GTCGGACT_7

_T3_V82A

_

TTCCTCCG_10_T3_I84V_CTAGCACC_6_T

2

GTCTACTG

_15_T1

CCGC

AGGT

_7_T

3_L9

0M_

ACACGTGC_66_T1

CCGACGAT_24_T2

GCCCCTAT_10_T2

CCGGGGCC_25_T1

TCCA

AAGC

_42_

T1

TAAGAGGG_13_T2

CCTGCTCC_30_T2

TAGG

TGGC

_9_T

3_L9

0M_

CCGGGACC_12_T2

ACGCCTCC_7_T3_V82A_

CCCA

GCGG

_7_T

2

TTCGCCCC_29_T2

CGGGGCGC_19_T2

CAAGATAC_19_T3

CCGTGAGA_4_T3_L90M_

CGTAGCAG_9_T3_L90M_

TCTC

GTC

C_22

_T2

CCAGACCA_18_T3_I84V_

TAACTTAT_6_T3_L90M_

ACTA

AATG

_10_

T3_V

82A_

AGCATATG_17_T3_I84V_

CGGT

CCCT

_4_T1

AACTTG

TC_15_T3_V82A_

AGGCGCTC_15_T1

TATC

CCCA

_38_

T2

CCTGTCGT_6_T2

GTAAAATT_15_T3

AACGCGAA_27_T3_V82A_

GG

GCTATG

_9_T3_V82A_

ACCGGCGA_29_T2

TCACTTCT_12_T2

GGTGCGCC_13_T2

ATTATTTG_12_T3_V82A_

GCCGGTGC_19_T2

CCGGGGGT_10_

T3_V

82A_

CTTGACAC_11_T2

CGATAA

AC_1

1_T3

_V82

A_

TGCATAAA_39_T2

AGTGGTTG_23_T3_I84V_

GTG

CTTCG_17_T2

CCCC

CAAA

_9_T1

ATTCAAAC_3_T3_V82A_

GAT

TGTC

A_20

_T3_

V82A

_

ATCCTACT_13_T1

CACA

TGTC

_11_

T2

CATG

GCT

T_10

_T3

AGTACGTT_19_T2

ATGACACC_32_T1

AATT

AAG

A_3_

T3

GCTATGGC_11_T3_V82A_

TAAACTAG_8_T3_V82A_

AGCACGAG_31_T1ATCTGGGA_47_T1

GCTG

CCAA

_9_T

3_L9

0M_

CTTTGTCC_18_T3

ACCCAAAT_3_T1

CAAAAG

CA_15_T3_V82A_

CTGACCCC_5_T2

TTTT

TATG

_9_T

3_V8

2A_GG

TCTA

TC_1

7_T3

AAC

TCAC

T_3_

T3_V

82A_

ACAC

CATA

_3_T

3

GCGCCCCC_7_T2

GCGCCAAT_15_T2

CCACCGTG

_4_T3_V82A_

GTCATCTG_9_T2

TGCCCG

AC_16_T2

ATACTCAA_5_T3_L90M_

CACTAGTA_4_T3

CACGTGGA_14

_T3_V82A

_

CCCCCGAT_3_T2

CCACCTCT_3_T2

ATCA

GG

TA_4

_T3

CC

TGATG

G_12_T2_V82L

GGTACAAC_11_T2

CACACCGA_24_T1

TTGACGGG_8_T3_L90M_

GTA

GTC

CC_1

7_T2

GCCCAACC_10_T3_V82A_

GTAG

CCCG

_5_T

2

GTAATCGC_12_T3

_V82A_

ACCGCCCC_3_T1

AGAT

AGGC_8

_T3_

V82A

_

GCGGCCCC_4_T1

TATG

CGGT_27_T

3_V82A

_

CGCTTATG_11_T3_L90M_

CATCGCCG_18

_T1

CCCATTTT_3_T2

TCCTGCCG_28

_T1AACT

GCCC_

10_T2

GCCGGCGT_26_T2

CTCTTCAT_19_T2

GTAGATAC_3_T3_V82A_

CGGACGAG_20_T2

CGAAGCTG_21_T1

CGCA

CGGG

_3_T

2

CCTGCGAG_10_T2

TTCAAACC_3_T3_V82A_

TCACTTCA_27_T2

TCCTGTAC_10_T3

AATA

GGGA

_26_

T1

CCTAGTAT_7_T3_V82A_

TAAGATAA_8_T3

GTC

TGTG

T_36

_T1

TCTC

TCG

T_23

_T2

ATGAAAAC_4_T3_I84V_

TGCCCGCC_28

_T1

CTCAAGCC_10_T2

TGCC

CCTG

_22_

T2

TTGCTCCG

_10_T3_V82A_

AACACAAT_18_T3_L90M_

GGAAGACC_9_T2

TTGAAATA_13_T3

TTGGCGTA_4_T2

TCGCGACT_23_T2

GGCCACGC_5_T3

GTGTGAAC_11_T3

AGTATATT_15_T3_L90M_

GCCGGCAC_34_T1

ACCCCGCG_14_T2

CTCACCTT_11_T2

CTTTCTTT_7_T1

GCTG

CCGC_21_T2

CTGTAG

GG

_10_T3_V82A_

CCGA

TTGA

_21_

T1

AAGAGATG_19_T1

TCACTCAC_10_T3_V82A_

ATGC

GGAC

_16_

T3_L

90M

_

AGCGGAAA_15_T3_I84V_

CCAGACTT_7_T3_L90M_

TCTCTACC_23_T2

ATCCCTTC_20_T1

TGAAAGTG_3_T2

AAACGTCA_6_T3_L90M_

CATCCTCG_20_T2

CACTTAAT_5_T2

CCATGCGC_21_T2

GCGTCCGC_9_T2 GACTAACC_11_T3CACGTGGC_3_T3_V82A_

CATTATGC_19_T3

CCGAAATG_20_T3

CAACACAA_11_T3_V82A_

CGTC

GCT

G_1

3_T2

GACC

CTCC

_5_T

3

TGAG

CGAA_11_T3_V82A_

CCGTCAAG_9_T3_V82A_

ATCTCAGA_27_T3GCAAGATG_15_T3

CAC

AAAAA_21_T3_V82A_

ACTA

ACCC

_4_T

3_V8

2A_

TTACACCC_14_T2

GCACTTCC_17_T3_V82A_

ATTTAACA_17_T3_V82A_

ATATGAGG_4_T3_V82A_TAGTAGCT_14_T3_L90M_ CT

ATCG

TC_7

_T2

GCACCCTG_4_T2

ATCT

GGAG

_10_

T2

CCAGTCGC_38_T1

ATAG

AATC

_9_T

3_V8

2A_

CTTCGGGG_26_T2

GTCT

CTCT

_17_

T2

TCCTTATA_4_T3_L90M_

ATGACTGT_8_T3_I84V_

GTGGGCCG_16_T2TCGTTCTT_15_T2

ATGTCG

CG_3_T2

AAGACGGT_5_T3

GGCAGGCG_8_T2

GTG

TTG

AC_2

7_T2

CTAGACCC_27_T2CCCGCCAC_16_T2TATGCCTG_6_T2

TTGACACG_51_T1

AACAG

ATG_22_T3_V82A_

GTCG

CTGC_10_T3_V82A_

GAG

CGAT

C_15

_T2

TCAACTAA_45_T1

GTG

GG

TGT_16_T2

ATCATCTT_5_T3

CCTT

ATG

G_2

0_T1

CAACGAA

T_9_T1

CACCGTAT_7_T2

TCTGGGCC_31_T2

AAAACTGA_5_T3_V82A_

AACCACGG_34_T1

AAAT

GGTA_12

_T1

CTGAG

ATT_30_T1

ACCATGCA_16_T3_V82A_

CCTTAAAC_4_T3

AAGG

GTAG

_9_T

2

TTTATCTT_34_T1

ACACACCA_9_T3_V82A_

AGG

TCTTA_18_T3_V82A_

CTCC

TAGT_

25_T1

TGGCCGCA_13_T2

TAACGGAG_5_T3

TCCAACAA_9_T3_L90M_

TATCCAAA_15_T3_V82A_CTCCCGGG_9_T2

CCCT

CGCC

_15_

T2

CCGCCGCA_30_T2

CCTTATCT_7_T2

CTTAAATC_10_T3_I84V_

TCTT

CGCC

_41_

T1

ACAACAAT_11_T

3_V82A

_

CATGGGGT_23_T3_V82A_

CCAG

GCCG

_30_

T2

TGTAAGCA_19_T2

GCG

CCATT_29_T2

TTGCTATT_7_T3

GTT

ACCC

C_5_

T3

CATCGATA_7_T1

CCGTAAAT_3_T3_V82A_

ACCC

AGAT

_19_

T3

CACGCGCT_13_T3_L90M_ ATAT

TTC

A_4_

T3

GAGAAAAC_33_T1

GG

AACACC_24_T3_V82A_

GCAGCAAC_19_T3_L90M_

AAACATC

C_13_T3_V82A_

GGATTT

AT_6

_T3

GCTTCACT_7_

T2

TGAG

TTGA

_25_

T2

TGTTTACC_3_T3

GCAA

ATGG

_18_

T1

GCC

TCTT

T_24

_T2

TCAACATA_11_T3_V82A_

CGATCTG

T_46_T1CCG

CTG

TA_1

6_T2

AGACTAAG_5_T3

AGTAGCGG_14_T3

TATTGTGG_9_T2

AAAA

CG

AC_3

_T3_

V82A

_

CAGG

GCTT_48_T1

GGTATGAG_15_T3_V82A_ _A28

V_3T

_32_

TC

CTT

GGA

TCGCTCCC_35_T2

AATT

CTTC

_10_

T3

ACCGACCG_19_T3

_V82A_

GCG

TCAT

A_19

_T2

GCGGCGAA_9_T3_V82A_

TCAGGTCG_3_T3_V82A_GGTTGGCA_33_T2

CGTG

CATG

_17_

T3_L

90M

_TAC

ATAT

A_12

_T3_

V82A

_

CCGG

CTGT_18_T2

TGATTAAA_14_T3

CCTCATAA_8_T3_L90M_

CAACCAAT_38_T1

TGAACAGT_24_T3

GGGTGGTA_24_T2

GGTTAAGC_3_T3_L90M_

CCTC

CACC

_6_T1

CTATTATT_38_T2

CGGTGTAG_21_T1

CCACCCTC_11_T2

TCTG

GTG

T_11

_T2

CTCC

GGCG_6_

T1

TGCA

TCCT

_17_

T2

CATAAGGG_9_T2

CCTG

TAGC

_3_T

3

AGCCG

GG

C_12_T2

TTGG

GGGA

_5_T

2

TAAC

GCAC

_3_T

3

ACGCACCA_66_T1

TTCCAATT_10_T2

AGGG

TGTA

_17_

T2

CACC

GGGA

_33_

T1

CACTTCCC_5_T3_L90M_ ACAT

CTGC

_9_T

1

GC

ATTTAA_7_T3_V82A_

TGTG

AGAT_40_T1

GTTCCATA_4_T3_I84V_

ACTCCATC

_22_T3

_V82A

_

TCAC

AACA

_23_

T2

GACACCAG_3_T3

GTGTAGAG_8_T3_V82A_

CATCCTT

C_8_T

3_V8

2A_

ATG

CCAG

G_1

1_T3

GCCCGGAC_8_T2

CCCG

CTGC

_11_

T2

ACGC

ACGT

_11_

T3

CCGTGGGT_11_T3

CGTGAAAC_16_T3_V82A_TAAAAGCT_6_T3_I84V_

TCAAACCT_9_T3_V82A_

GTTCTATA_23_T3

GCCC

CGCG

_9_T

2

CATGTTCC_3_T3_V82A_

TACG

CCAT

_4_T

3

AGAGAGAC_18_T3_L90M_

CCCCCAAA_4_T3_V82A_

ATCCCCGC_8_T2

AAAC

TTCG

_3_T

3

TCAAAGCG

_30_T1

TCCGACTC_27_T2

GCAGGTCC_48_T1

CTGGTTGC_13_

T2

GCTGCTAC_13_T2

TCCATCGT_13_T2

GG

GTA

CAA_

17_T

2

CATACGTT_13_T3_L90M_

3T_7_GA

GGA

GTA

GTCATCAG_5_T2

CCCG

CACG

_13_

T2

ATTTGTG

A_17_T3_V82A_

AAGCGCCG_9_T3_V82A_

TGGG

TGGG

_7_T

3_L9

0M_

CTCC

TAAC

_16_

T3

CTCGTCTC_17_T3_I84V_

CGACTGAT_11_T3_V82A_

ACTA

ATGA

_29_

T1

GAGTACAA_5_T3_V82A_

GG

AAATGA_9_T3_V82A_

TGTGATTT_10_T3_V

82A_

CCAG

AAG

C_15

_T3

TCCCTTTC_28_T2

CCTCTGCT_21_T3

CCCTCCCT_1

7_T1

CCCCTTGG_13_

T2

AGAA

ACCA

_9_T

2

CTCTTTCG_4_T3

CGGC

TTCC

_41_

T2

CCCCTCGA_17_T2

ACAG

CCTC

_14_

T3

GTG

AGAT

C_8_

T3

CTG

TATC

A_8_

T3_V

82A_

GATATACC_9_T3_V82A_

CTGTAAAA_11_T3_V82A_

CAAACACC_34_T1

AATCACTA_15_T3

TGCATTTG_11_T2

CTAAAGTA_9_T3_L90M_

CCCTTTCA_9_T3_V82A_

ACAC

CG

GA_17_T3_V82A_

CACC

CCCG

_12_

T2

CTGCCCGC_24_T2

GCCTGAAC_10

_T3_V82A

_

CGCCTCCA_15_T3_L90M_

GTATCATT_33_T2

GAT

ATG

AA_1

0_T1

TGAGTGTT_13_T3

AACTTGCT_9_T3_V82A_

TCTGCTGC_20_T2

TCTG

TACT

_15_

T2

GGCATATT_3_T3

_A28

V_3T

_42_

AC

CCA

AAC

CGCTGCAC_8_T2

TGG

GATAT_8_T3_V82A_

CTTT

CATA

_17_

T1

ATACTTAC_14_T3_V82A_

TACT

TCCT

_19_

T2

GCTCTCCT_9_T2

TAAAGTTC_6_T3_L90M_

CAG

AGG

AA_1

0_T3

GTAGCTAA_16_T3

TCTCCCGG_6_T2

CCAG

AGGG

_8_T

3TC

TTAG

CT_1

5_T2

AAGCACTT_3_T

2

TTC

CTA

AC_2

0_T3

_V82

A_

CATGCTTT_9_T1

2T_2

2_G

GCCT

GTG

GCCCTGCG_3_T1

CACATTCT_22_T2

CATATTTT_4_T3

CCAAAAGT_41_T3_V

82A_CTTTCTTA_5_

T2

AGC

CTA

TT_1

1_T3

GACCGAT

A_9_

T3_V

82A_

TCGG

ATGG

_4_T

2

TTCCTTTC_5_T1

GCAC

GCAA

_3_T

1

_A28V_3T_03_G

CGA

GTG

C

TAAC

TTTG

_7_T

3_L9

0M_

AAAT

CTTA

_10_

T3

CCTATCCC_6_T2

AGCA

CTG

T_17

_T2

GTCAATGT_3_T3_V82A_

GTTCTAAC_23_T1

TTGATAGA_11_T2

TACC

CCCT

_4_T

2

TCCAACAC_20_T3_L90M_

AGAC

CATT

_44_

T1

ACGATCTC_33_T1

CTGTCCTC_33_T2

AAAT

TATG

_16_

T3_L

90M

_

GTCTTG

AT_26_T1

TTAT

GAC

T_10

_T2

GTCATCAA_9_T3_V

82A_

GC

TACC

AT_16_T3_V82A_

GTCTTTCC_10_T3_L90M_

AGTTGGCC_15_T3_I84V_

CACGTATG_19_T3_V82A_

GTCT

GCCC

_39_

T1

GGCCGTGT_12_T2

CACCCAGC_27_T2

AACG

ATTG_19_T3_V82A_

TCCGCCCA_16_T2ACGCGATC_21_T3

AAATCCGT_8_T1

AAGAGAAC_11_T3

_V82A_

CTTATCTC_38_T1

GG

CTCGAC_12_T3_V82A_

CCTCTCCA_3_T1

CCCA

CCTC

_23_

T2

CTTGTCTT_9_T3

ATCGCACA_15

_T3_V82A

_

ACCAAGAG_49_T1

AACCGCTG_25_T3_L90M_

CGG

ACAC

C_15

_T3

ACCCATCC_15

_T3_V82A

_

CCACACAC_10_T3_V82A_

GCTCCTAT_9_T2

_A28V_3T_03_TATG

GAGT

CCAGGCGT_14_T

3_V82A

_

GC

CC

ATCT_18_T3_V82A_

TCAA

TACC

_17_

T3

AACTC

TCG

_13_T3_V82A_

TGAC

TAG

T_18

_T3_

V82A

_

ACAACATT_24_T3_I84V_

CTTG

AAAC_17_T3_V82A_

AATCC

CTG

_15_T3_V82A_

AAGA

CACC

_6_T

2

ACCC

GCTC

_29_

T2

ACAAAGAG_24_T3

TGAGGCTT

_18_T

3_V82A

_

AGCGCCCC_12_T2

CACGACCT_13_T

1

AACC

AACC

_20_

T1

AACTTGCG_10_T3_V82A_

ACAGGTGC_9_T2

TCAAGG

CA_3_T3

CCCC

TCCG

_10_

T2

GGGCGTAA_21_T2

CAAG

AGCA

_22_

T1

ATTT

GCGG

_15_

T2

GCG

CTCGG

_18_T2

GTCTGGTC_17_T3

TGACCCCG_13_T2

TACTCCCT_38_T1

GTTTTTTC_5_T2

CCCCTTGC_22_T1

ATATATTA_33_T1

TATACCCC_3_T1

ATTATAAG_11_T3_V82A_

GTCGTAAG_4_T2

ATGCCACC_5_T3_V82A_

ACTCGCTA_16_T2GGTCCCCC_17

_T1

TAAAGCTG_7_T3CCCCTCGC_11_T2

CCCA

CATG

_7_T

2

TAGATACA_7_T3_V82L

GCTACTCC_23_T2

GTTTA

ATA_

6_T3

_V82

A_

CGGGCATA_11_T2

TGTCCTTC_17_T3_V82A_

ACGCGCCG_4_T3

ACAA

GCTT

_12_

T2

GCGGACCC_31_T2 CCTACTTA_13_T2

CAG

GCG

AC_3

_T3

ATCC

TTCC

_11_

T2

ACGACATC_20

_T3_V82A

_

CGGTTCAC_14_T2

GATGATA

T_14

_T3_

V82A_

TGGTCGCA_9_T3

GTT

GCG

CA_1

4_T2

TGATAGCC_6_T2

GGGTTCTG_22_T2

TCCC

GTTA

_4_T

1

TACTCCAT_19_T2

ACAA

AGCC

_48_

T1

CCGTGATA_16_T2

ATTTTTGA_3_T3

CCGGACAT_15_T3_L90M_

GCTCCGTG_8_T3_L90M_

TATGG

TAT_12_T3_V82A_

GAGCGCAA_5_T3_V82A_

TCTC

CATA

_40_

T1CT

GCGG

TC_2

8_T2

GGTCTTCA_10_

T2

ACCCACGC_38_T2

TCCT

GACT_

3_T1

AAGAATGG_5_T1

GAGCCAAT_9_T2

ACACCCTC_3_T3_I84V_

GCGCACCT_43_T

1

TCCTATGC_35

_T1

TAGG

CACA

_37_

T1

CGACGTGA_13_T1

CAC

CTT

GC

_14_

T3_V

82A_

ATCCGTTT_35_T2AAGCCGCG_7_T

2

GTCTT

ACT_

3_T1

CGCGTACC_3_T3_V82A_

ACCATACA_5_T3_V82A_

ATACCCGA_6_T2 AGTTCCAA_37_T3_V82A_

CACGTGCC_22_T2

AAAA

GTA

T_5_

T3_L

90M

_

ACTTTTGA_12_T2

CACCCGTG_7_T3_V82A_

GGTGCATG_5_T3_L90M_ AACA

GGGA

_5_T

3

TGTTTTCT_33_T1

TCCGCATC_14_T2

GCCCGCGG_19_T2

_A28V_3T_62_CTTT

GAGT

TCTCCCGT_30_T1

CTCA

CTCT

_17_

T2

CCGG

GTCC

_18_

T2

CGCCCCCC_13_T2CCATTTTG

_4_T1

TGTCTACC_14_T2

TGTG

CGTT

_15_

T2

TTGCCCGC_12_T2

CCCCTCAC_4_T1

GTAAAG

CC_12_T3_V82A_

CATCCCCA_13_T2

TCATC

CAT_20_T3_V82A_GGGCTCAA_5_T3_L90M_

CAGCAAG

C_22_T2ACCA

GAG

A_21

_T2

ACTA

ACAG

_9_T

3

ATCCGTGG_13_T2

CCGTAAAG_15_T3

TTCTGGGC_6_T2

CAGT

TACT

_4_T

2

TGTGCCGC_9_T2

CACTGAGA_15_T3_V82A_

CAAAACTG_13_T3_I84V_

CCGCTCCC_4_T2

AGACAAAT_8_T3_V82A_

GTAC

GAAT_19_T3_V82A_

AGGGAGTC_34_T1

CACTGTGT_9_T3_V82A_

CCG

AATA

A_14

_T1

GCCGCGCT_45_T1

GTTC

ACAT_15_T3_V82A_

CCCGCTTA_4_T2

TGCCGTTT_18_T2

CGGAACTG_19_T3_L90M_

TCC

GTTTA_12_T3_V82A_

GCCCAACC_24_T1

GTGTTACG_6_T3_L90M_

GAGT

TCCT

_11_

T2

ACCCACAC_9_T2

CTAGTACT_20_T2

CCAC

TGGT_3

_T1

TTAATCTG_20_T2

CAATGTAC_4_T2

GTCCCTTC_5_T

2

AGCCCCTT_26_T2

TTAA

ACTC

_25_

T1

CATATG

AC_25_T1

AGGTATTT_4_T3_V82A_

CCTGGACG_13_T3

TAAAAGAT_4_T3_V

82A_

GTAACACC_12_T3

ACATTCGC_6_T3_V82A_ACGAGTCA_27_T1

ATCACCAA_3_T3_V82A_

GAATGTAA_20_T3

CTTCTCGG_20_T1

TCGTCGAC_8_T2

CCGCCCAG

_14_T3_V82A_

CGAACTGA_17

_T3_V

82A_

CTG

CCAC

T_16

_T2

ACGAACTG_9_T3_I84V_

CATTCGGA_8_T2

GCG

GG

AGA_16_T3_V82A_AACATTAA_5_T3_L90M_

GAC

ACC

CA_18_T3_V82A_

TTCCCCCT_22_T2

GACTTTAA_14_T3_L90M_

CGTAGCCT_7_T3_L90M_

TGCGAGCT_3_T3

CAGCTCCG

_23_T1

AGG

GG

CAC_

16_T

2

GCGCGGGA_6_T2

ATGTCTAG_8_T3_L90M_

CGCG

GTTC_9_T3_V82A_

ACTA

AATT

_7_T

1

CGCT

GATA

_30_

T1 TCTTGG

CG_22_T1

CCGCGCCA_13_T2

ATAT

AGTC

_17_

T3

AGTTCTGA_3_T3_I84V_

AGTA

ACG

C_13

_T2

ACTA

TACA

_5_T

3_L9

0M_

CCCTACAA

_12_T

3_V82A

_

CATCCCGC_23

_T3_V8

2A_

TGGAATAC_12_T3AA

GGTCAG

_6_T1

GCCCCCTG_3_T1

CTTA

TACT

_17_

T3

ATTT

AGG

G_1

3_T2

ACCCCACT_21_T2

GCTAGACT_30_T1

CCCTCCCG_20_T2

CCCTGACC_10_T2

TATTTCC

T_19_T3_V82A_

GCCG

ACCC

_21_T

1

CCCC

ACAA

_12_

T2

GCTCCCTC_27_T2

GTAAAACG_14_T3

2T_3

_CAC

TGT

TT

GATCTT

CT_12_T

3_V82A

_

ACCCCCTG_3_T2

ATTT

ACCC_1

0_T3

_V82

A_

GGCA

CCCG

_38_T1

CCCA

CCCC

_23_

T1

TCTCTCCC_19_T1

AGGCCGCA_9_T2

GCCCCAAC_29

_T1

CCAGGCGC_27_T2

TCCGCTCG_29_T2

TGCG

GGCG

_24_T

2

CGAC

GGGT

_7_T

3

TGGACCAA_22_T3

ATCAATA

G_19_T

3_V82A

_

CATTCTCC_8_T3

TCTTAATC_12_T3_I84V_

TCCTATCA_17_T1

CGCTGAAA_8_T3_I84V_

GTAC

GTCT

_24_

T2

AACAACTG_17_T2

CTG

AAAC

G_8

_T3

ACCC

CTG

C_4_

T2GTGACAGA_11_T3_L90M_

GCCACACA_15_T2

CCAATACT_21_T3_L90M_AGCGCAAC_10_T3_L90M_

GGAGTAAC_35_T1

TGGGTGCT_26_T2

CGTG

CGCT_43_T1

GACACTCA_14_T3_V82A_

ACCCCGTC_9_T2

GAAC

TTAC_17_T3_V82A_

CCCCCACC_44_T2

AGGTGTTT_13_T2

CCTC

AACT

_35_T

1

TTGCTGAA_15_T1

TTTCGGTG_3_T2

TGCC

AATT

_24_

T3

CCTA

GCAT

_3_T

2

CTTCACCT_6_

T2

CTAG

ATAA

_52_

T1

GGATGCGT_37_T1

CGAAAGAC_5_T3_V82A_

CTTGG

AAA_11_T3_V82A_

3T_11_GCA

GCTTC

GCCCATCC_30_T1

TTAA

AATG

_13_

T3

AAGGTAAG_7_T3_V82A_

CCCT

CCCC

_24_T

1

GCG

TATAA_41_T1

ACTTAGAA_17_T2

TTTCTCCC_12_T2

TTGCAGTC_13_T3

CAAG

AACA_16_T3_V82A_

CAC

CG

TGC

_22_T3_V82I

TTTAGCAC_12_T2

CACCTCCG_16_T3_L90M_

CACCACAG_13_T3

TCTGGCCC_28_T2

CTAGGCCC_16_T2

CG

GTT

CC

T_22

_T3_

V82A

_

GATTGTAC_6_T1

GCGT

CGGG

_15_

T2

CCCCTTCC_46_T1

ACTTAAC

A_14_T3_V82A_

CCTG

AGAT

_7_T

3_V8

2A_

TGTGTGCC_6_T2

GCCC

TGCC

_22_

T2

GCCTGGGA_6_T2

TCACCTCC_26_T1

CATCACCA_20_T2

GAAAC

ATT_

7_T3

_V82

A_

TTCATCCA_27_T2

CCCA

TTG

C_6_

T3

CTAT

CCTG

_3_T

2

CGGTAAG

T_24_T

1

ACTTGAAA_9_T3

CCCC

GGGG

_5_T1

CCTCCCCG_40_T1

AACT

CACA

_9_T

3_L9

0M_

CCCCGCAT_21_T2

TCACTTTT_8_T1

ACAC

ACCG_13

_T3_V

82A_

AGG

CAACG_12_T3_V82A_

CGCAAATT_10_T2

ACGTAACA_15_T3_V

82A_

AGCCAATC_10_T3_V82A_

TGCC

GCCA

_51_

T1

ATCCTT

TC_9

_T3_

V82A

_

GCGTGACA_35_T2

TTGC

AACT_15_T3_V82A_

GTGCGTCG_26_T2

GAC

GTT

AG_9

_T2

CCCCCCCT_29_T2

GCGTCCCA_15_T2

AAAA

TGC

A_7_

T3_V

82A_

CCCA

TCGC

_5_T

2

GCTCCCAG_3_T1

TATT

TAG

C_10

_T3

TCCCCATC_15_T2

GTG

CAAC

T_12

_T3

TCCGGCGG_26_T1

CCTA

CAAC

_15_

T3

TGG

TTGG

C_22_T2

CTA

GTA

CA_

10_T

3_V8

2A_

TCCCCGCC_27_T2

CTCCTTCA_43_T1

CTAGAACG

_12_T3_V82A_

ATGGCCCC_16_T2

GTCG

GTGC

_21_

T2

CCACCCAC_3_T1

GCAC

TGGT

_20_

T1

AGTAATAG_7_T3

GTTGTATG_20_T3

CCCCACCG_8_T2

GCCTATCG_15_T2TC

CAACGT_1

9_T1

CCCC

CTTG

_4_T

2

TTACGTAC_23

_T2

TTAG

AAGC_17

_T1

TACA

TGAC

_12_

T3

TAAA

ACAC

_17_

T2 CTTT

GG

CA_9

_T2 GCCCTCCC_8_T2

GCAGCGTC_10_T2

ACTATCTT_20_T1

GTA

CACT

T_13

_T2

GC

ATCC

TT_21_T3_V82A_

GAGGGCAC_23_T2

TTGCACCG_12_T3_L90M_

CCCTTACC_3_T1AACCTCCC_5_T3_

V82A_

TTCCCTAT_9_T3_V82A_

CTAC

CCGG

_7_T

3_L9

0M_V

82I

CCGAGGCG_3_

T3_V82

A_

CTCATCGG_35_T1

CTCGTTCC_25_T2

GGAATTCC_9_T3_L90M_

TTATGACA_7_T3

CTG

TTG

AC_1

1_T2

AACGGACA_7_T3_I84V_

GCGCGCGA_14_T3_L90M_

TAGCTCTT_9_T2

CTTCACTT_11_T2GTTTCTAG_7_T

2

ACTCCCCT_16_T2

CACCTATT_17_T3

TTGTG

CAC_25_T1

TAACGTTT_33

_T1

AGAGTGGT_17_T3_V82A_

TCCA

CGCC

_23_T2

GCCTCGCG_14_T2

ACCCCCCA_15_T2

CGCGGACT_17_T3_V82L

TGATGCGG_41_T1

GTCGGGCC_4_T2

CACGTCCT_6_T3

CG

AGTTAC

_7_T3_V82I

GAACG

GAC_12_T3_V82A_

ACAACTCT_3_T1

TGGTGTGT_8_T2

CCCT

AATT

_11_

T3_L

90M

_

GCCGTCGA_10_T2

TAAAGACG_6_T3

GATCGAAC_12_T3

CTTAGG

TA_11_T3_V82A_

CCCCGTTA_21_T2

AAATTTAC_8_T3_I84V_

GCTGACCC_25_T1

TAGC

CTTC

_15_T3_V82A_

CGTCATTC_7_T3_V82A_

CCAAGTCC_21_T2

CCAT

GTCA

_26_

T2

AACGAACA_7_T3

TCCC

CGTT

_21_

T1

ACGATGTC_6_T3_V82A_

ATCCTT

TC_17

_T1

CACCACCC_16_T2

GGCCCCGT_6_T2

CCGA

CCGC

_17_

T1

CATA

CCTG

_5_T

2

GCGCTTTG_11_T2

ACGCATG

T_9_T3_V82A_

AGCCCCCA_1

5_T1

CACC

CGTG

_19_

T2

AATCTTCA_9_T2

CGG

ATCC

G_1

3_T2

ACTACACC_13_T1

CCACTGAT_23_T1AAACTTAC_33_T1

GAGAGGGT_14_T3_I84V_

GTAGGGAG

_11_T

3_V82A

_

TTTATTAA_22_T3

TCAAACTT_8_T3_V82A_

ACCA

ACTC

_6_T

3

AAGCTCG

A_10_T3_V82A_

CTAGTCCT_28_T1

CTACCATT_5_T3_V82A_

ATG

TTAA

G_1

2_T3

_V82

A_

CCCC

AACC

_7_T

2

TTTCGTCC_19_T2

TGAG

AAAG_12_T3_V82A_

CTCC

CCTA

_10_

T2

CTCACGTA_14_T3_V82A_

TGTCCCAA_19_T3_V82A_

CC

GTG

TCT_15_T3_V82A_

GTAT

CGGT

_9_T

1

CGCAGGAG_15_T3

AACCTGAA_19_T3

TAGGACAT_

14_T3

_V82A

_

CCCGACGC_26_T2

TACAGATT_5_T3_V

82A_

AGTACTTT_10_T3_V82A_

CCTC

GGTA

_25_

T2

AGTG

TCAG

_9_T

2

ACGCTGAT_23_T3_L90M_

CACACGGA_7_T3_V82I

GG

GTG

AGA_11_T3_V82A_

GTTAGCGT_28_T1

TGATCCAG_5_T3_V82A_

TTTAGTCC_7_T2

TTCGCCTT_

14_T1

ATACCGCA_15_T3

_V82A_

ATTC

GTGG

_9_T

2

GATG

GTG

T_11_T3_V82A_

TTAG

GAC

A_16

_T3_

V82A

_

GCTCCG

AA_10_T3_V82A_

TGGA

TCCG

_34_

T2

CTGCGATC_28_T2

CCGTTCTT_20_T1

GTAGTGCT_6_T3_V82A_

AACGTGAG_52_T1

TCAG

GAG

T_8_T3_L90M_V82I

GGAG

GGTC

_18_

T2

TCAG

GTCC_

7_T1

TCTA

TCTA

_33_

T2

GCGC

CTGC

_15_

T2

ATCCTGTT_11_T2

CGCTTCGG_4_T2

TCCCACAT_9_T3_V82A_

TTG

ATCT

C_5_

T2

CCCTCGTA_6_T2

GCCCCCCT_9_T3

ACCACCCT_11_T3_L90M_

GAAAC

TCT_18_T3_V82A_

CGTGCCGT_24_T1TAATGGGA_15_T3TCAACACC_12_T2

GGTTAAAT_5_T3_L90M_

TCGCTTTG_33_T2

GCCTTTTA_8_T3_V82A_

TCAGTTCC_17_T2

AAGCTGTC

_30_T1

TGTCCCAG_8_T3_V82A_

CGGT

GGAG

_29_

T2

TGCTCCCG_14_T2

AGCC

TATC

_29_T

2

CCCGGAGG_8_T2

CATGAAAC_13_T3_V82A_

TCGAGCCA_7_T3_V82A_

GAACGGTA

_8_T

3_V8

2A_

GTGTCCTA_23_T2

CCTCAGCG_7_T3_I84V_

TTACTTG

T_19_T3_V82A_

CCCTTCAC_18_T2

TAG

GCG

CT_2

8_T2 ATTCCCTC_12_T2

CGTACCTG_35_T1TGAC

ACCC

_8_T

2

AATC

CACC

_7_T

3_V8

2A_

CTGCGCGC_32_T2

AGAGCACA_17_T3_V82A_

ATCC

ACCC

_21_

T3

CACG

AGTC

_11_

T2

ATAC

CCCC

_9_T1

AGTAAG

CC

_17_T3_V82A_

CAGATACT_14_T2

TGG

CATC

T_7_

T3AT

CTCA

AT_1

9_T3

AGTAAAGA_19_T3

CAG

AGC

TC_8

_T3_

V82A

_

TCGA

AGAG

_8_T

3_L9

0M_

GAGG

CGGC

_18_

T2

ACTC

TAAG

_11_

T1

AAG

CAAT

G_2

0_T3

GCCGCGCC_26_T2

CCTCCTCT_24_T2

CTGTAACC_9_T3_L90M_

GACTA

CGT_

40_T

1

TTAC

GAGG

_5_T

1

TTTAGGCC_7_T1

ATCGCGTC_12_T2

GCCGGGCT_40_T1

TCAG

GG

AA_1

0_T2

ATTAAAGT_4_T3_I84V_ ATCCCCGT_2

2_T1

TATA

TAG

A_13

_T3_

V82A

_

CACAAATT_4_T3_L90M_

TAGT

GGTT

_3_T

3

GCTTGTAT_5_T3

2T_3

_CTC

CGT

TT

CTGGTTAC_16_T2

GG

ACTACT_10_T3_V82A_

ATCAAGAC_12_T3

CCCCCCGC_15_T2

TTTCCTTG_3_T3_V82A_

GGAAAATC_16_T3_V82A_

GAACCCGG_53_T1

AAGAAAAG_12_T3_V82A_

GACCACAG_31_T2

CAAGATCG_11_T3_V82F

GCGC

CCAC

_12_

T2

CTTG

ACG

G_1

4_T3

ACTCTTTA_10_T3_L90M_

ACCCAGCG_21_T3

CTGACCCT_20_T

3_V82A

_

CTCT

GTAC_

11_T2

CATT

ACCG

_14_

T2

CAATTTAC_20_T3

GTA

AGG

TA_5

_T3_

V82A

_

GCCCGCCC_29_T2CGTCTCCC_24_T2CG

CCCC

CC_7_

T1

TCAT

CAGT

_7_T

3_L9

0M_

GTCATCTT_12_T3

ACC

AATAT_21_T3_V82A_

TCTG

CATT

_3_T1

CCCTCCCT_24_T2

GCCAAAAG

_8_T3_V82A_

GACGCGCC_42_T1

CTTATGG

G_39_T1

AGGT

CTCT

_17_

T2

TCCCACCA_46_T1

TTCCAGAC_6_

T1

GTACTCCC_13_T3_L90M_

TAAG

TACG_1

1_T3

_V82

A_

CTGT

GGGT

_3_T

2

ACCC

CCAC

_12_

T2

CTAAACTA

_29_T3

_V82A_

CAGATTGC_3_T3_V82A_

GTATCGCC_9_T3

CCCCGTAA_4_T3

GTGGAT

CC_5_

T1

AGGG

AGAC

_9_T3

GATAC

TTA_

7_T1

TCGC

TGTT

_30_

T3

CACCGTCG_14_T2

TATTCGGG_17_T2

TTAG

ACGC

_6_T

2

TCAC

GCAC

_28_

T1

TCTC

CCAC

_7_T

3_V8

2A_ CCG

TTGTG

_17_T3

GAC

CGAG

T_12

_T2

CAAC

CGGG

_8_T

3_L9

0M_

AAGTGCAA_11_T2

CATGCACG_4_T2

GCGGAGGC_12_T2

CGTC

CCCA

_13_

T2

CACGGTAT_12_T2

ATGAAACA_35

_T1

GG

ACG

ATC_16_T3_V82A_

GTTATAAG_4_T3_V82A_

TCAT

TACT

_8_T

1

GTCGTGTC_15_T3

ATAGTCAT_19_T3_L90M_

CTCC

CCGG

_4_T

2

CTAAAC

TC_17_T3_V82A_

CTGC

CCCC

_14_

T1

CGGGCCCT_18_T2

CTGATGAC_13_T2 TAACCCTC_4_T3

CATCTGCC_43_T1

GACATTTC_6_T3

TCCT

GCGG

_12_

T2

CGCAGCAT_4_T2

TCTCCCCT_31_T2

ACATACCA_28_T1

CTAGCGGA_15_T3_V82A_

CTATC

ATT_7_

T3_V82A

_

ACCCACGT_28_T2

ACTAGCGA_9_T3_V82A_

AGGTACAT_30_T3_L90M_

CCTACAGA_11_T3_V82A_

CCAT

TCG

A_13

_T2

TGATTA

AC_3_T3_

V82A_

TCTGAATG

_6_T3_V82A_

TCCATGAT_17_T3_V

82A_ TC

TGAATA_34_T3_V82L

CCAACTCT_9_T3_V82A_

GCCC

GCCC

_10_

T3

CAGAG

ACG_23_T1

CCCC

GCAG_13

_T1

CCAT

ATG

C_7_

T3

CTCGTGGT_44_T

1

CCCA

TTCA

_3_T

2

GTATCTCG_9_T1

GCCC

GCAC

_9_T

2

TATA

TGTT

_23_

T1

GGTGGAAT_27_T

2

AGTCCTAA_38_T1

CGCC

GTCT

_13_

T2

GGGCTCCG_5_T2

CGCG

GTCC

_17_

T2

ACTTCCAT_5_T3_I84V_CATATACG

_8_T3_V82A_

TTTT

CCGA

_17_

T3_L

90M

_

CCAC

CCAC

_12_

T2

GTCTCCAC_3_T2CTCCCCTC_12_T2

TACC

ATAT_12_T3_V82A_

CCCGGCGC_15_T2

TTTT

TGCC_1

0_T3

_V82

A_

GCGTTGGG_5_T

2

ATAGGACT_11_T3_V

82A_

TAAC

GAA

T_6_

T3

GTCTTGGA_3_T1

GAGCGTGC_37_T1

GGAAACCG_8_T2

GTTTCTTC

_46_T1

GCCGCCTC_33_T1

GCAG

GAAG

_13_T3_V82A_

TTTGTACT_12_T2

TTCT

CAG

C_3_

T3

GGTTCCAA_5_T3_V82A_

GTCGTCTC_11_T3_I84V_

CGTC

CGAG

_25_

T1

TCTTGCCT_7_

T2

ACCCCCTT_5_T2

CCCCTCGA_19_T1

AATAGACA_41_T1

3T_01_G

GGCACAA

GTGAATCT_24_T3

CTCAACAG_16_T2

ATCTAGCC_15_T3

CACCAGCC_4_T1

CGTAGCGT_19_T2

TCCCTTCC_9_T2

CAAGACGT_27_T1

TCCCCACC_33_T2

CGGAATGT_11_T1

GAATCGGA_28_T1

CTCGCCG

G_25_T1

ATAG

CGGC_12_T

3_V82A

_

AGGACCGG_7_T1

TAAGTG

AT_17_T3_V82A_

CGGCG

GAC_14

_T1

CCGCCCCT_13_T2

AAAATATT_15_T3_L90M_

ATTGTG

GA_16_T3_V82A_

CGTCGCAA_8_T2

CCGAAATA_9_T3_I84V_

TCAAAGTA_7_T3_V82A_

GCAG

TGCT

_33_

T1

CGTTACGA_4_

T3_V82A

_

CACATACG_37_T1

ACGTCTG

C_3_T1

CCAT

AGCA

_3_T

2

CC

AGAAC

A_20_T3_V82A_G

GC

AGTC

T_30_T3_V82I

GGGGTTCG_18_T2

TCGG

TCTG_20_T2

CGGGATCC_29_T2

CCTC

CCTT

_15_

T1

AACGCCGC_8_T2

GGGGCCCC_10_T2

GG

TTCA

AA_9

_T3

CCGCG

TAA_10_T3_V82A_

CCAC

CCTA

_34_

T1

CGCACGCC_12_T2

CAAGGCAT_6_T3_V82A_

CTG

TAG

CA_1

5_T2

ATACTAAC_4_T3_V82A_

ACTCATTA_4_T3_V82A_TCGGCGAT_7_T3_V82A_

CTCCGATT_21_T3_V82A_

GATTGTAA_15_T3_L90M_

TGCTATTC_23_T1

AGCCGCAC_24_T3

_V82A_

TACCATAA_12_T1

CGGCGGGC_10_T2

ACGC

GCTG

_25_

T2

TGCC

GTTC

_4_T

1

TTTC

TATT

_30_

T2

AAACTACT_6_T3_I84V_CGCT

AGAG

_6_T

3_V8

2A_

CTCACCCG_13_T3_V82A_

GG

TCG

CAG

_9_T

2

TGCTTTTG_7_T2

CACGACTG_21_T3_L90M_

TTACAATG_14_T3_V

82A_

CTCT

CGAT

_3_T

1

TCCATTCT_3_T3

ATTACCGA_3_T1

TACC

CCTT

_22_

T2

TCCT

CCTC

_23_

T2

AAAGGCTG_7_T3_V82I

CGAGGGCG_9_T1

TACTATAT_15_T3_I84V_

AACTACAT_15_T3_V82A_

GCG

GG

TTT_8_T3_V82A_

_A28

V_3T

_32_

ATC

GAG

CA

GAGCTAGC_9_T2

GC

GTTC

AT_16_T3_V82A_

3T_4

_AA

GACC

GT

CCCC

TGTC

_15_

T2

TTTCGGAC_9_T3_V82A_

GCTGTGTT_10_T3_L90M_

TACAT

GGG_13_T

3_V82A

_

AATCCATC_11_T2

GCTACCGG_5_T2

GGTA

GCTC

_15_

T1

TAAT

TGG

C_17

_T3

CTAG

ACAC

_23_

T3

TGTAATCT_11_T3

ACCTGGGA_3_T2

ATCCGGGC_13_T2

ATGC

GACA

_9_T

2

ACCCAACA_8_T2 GCCCTCCG_10_T2

AAGCCCTT_11_T3

ACCCCCCC_17_T2TCGGCGCG_17_T2

_A28

V_3T

_52_

AC

GGA

GCA

AGAG

TCTG

_11_

T3_V

82A_

TCCCGTGC_11_T2

AGGAC

AGA_

11_T1

GCAC

CAGA

_17_

T3

TTCGAAGA_7_T3_V82A_

CATG

AGCG

_18_

T2

TGCTT

ATA_21

_T3_V

82A_

AATG

AGAT

_4_T

2

TTGCCCCT_17_T2TCAGCTGC_10_T2

CTATAAAA_15_T3_L90M_

GCTTCTCG_46_T1

CCCC

TCTT

_16_

T2

TCTC

AGCT

_21_

T1_A28

V_3T

_22_

ATT

GCT

AG

GGCC

CCCC

_18_

T2

AGGCATCA_20_T3

CGCC

CCTT

_15_

T1

GAGA

CCGC

_8_T

1

ACGCCGAC_7_T3_V82A_

GACCCGAC_3_T3_I84V_

TTTCC

GC

A_13_T3_V82A_

CCCTACGC_22_T1CCGGTGGC_5_T2

GTG

CCAG

A_8_

T2

CAATCACG_29_T2

CCTATCGT_19_T2

ACATTGTA_3_T3_V82A_

CGGGATTA

_9_T

3_V8

2A_

CTCTACAA_9_T3_V82A_

TGCCTAGA_9_T2

CCCGCCGA_42_T2

AAGCCTGA_19_T3_V82A__L90M_

GACCGGCC_3_T1

CCCACCCC_28_T2CCTG

CGTC

_5_T

2

GTT

CGAG

C_36

_T2

TTATCACC_10_T2

CAGTTCAG_12

_T3_V82A

_

_A28

V_3T

_12_

CC

CTGT

CT

GTAAC

AAA_

8_T3

_V82

A_

TCTT

CTA

G_1

3_T3

CTAT

TTAT

_21_

T3

CTCCTATC_30_T1

GGAC

GAGT

_9_T3

CCCAAATA_44_T1

ATTC

AACC

_3_T

3

CC

CTATAA_13_T3_V82A_

TCTTTC

CG

_17_T3_V82A_

CCCT

AGCA

_12_

T2

GACAC

GAC_7_

T1

TCCG

TCG

C_14

_T2

TCCC

TGGC

_21_

T2

GAATAACT_14_T3_V82L

GC

AGTA

AG_3

_T3_

V82I

CAGCAACC_10_T3_L90M_

GCGCACAC_18_T2

GCCGTCGC_16_T3_L90M_

CTAGGGCT_2

0_T1

TCCTC

CAA_17_T

3_V82A

_

CGTTCAAG_9_T3_I84V_

CCCC

TGTG

_9_T

3_I84

V_CA

CCG

TAG

_7_T

2

CCAACACC_5_T3_V82A_

CCTCTTCC_7_T2

CG

ACG

CAT_17_T3_V82A_

CCGGGTGG_3_T1

CCGC

CCCC

_5_T

2

CATG

TAG

G_1

5_T3

_V82

A_

TCCA

TCTC

_50_

T1

CCGTAGCC_3_T1

TTATTATC_17_T3_L90M_TATGGTTG_9_T3_L90M_

CGCC

CCAT

_6_T

2

GCTAGCCT_3_T2

ATGTCCCC_3_T2

TCTT

TATC

_26_T3

_V82A

_

CATAAGCA_3_T3_I84V_

CAATTCCT_46_T1GTGCTCGG_22_T2

CTTTCCTC_12_T3_V82LAGCCAAAA_9_T3 ATCCCCTC_10_T

2

TCGCCTCC_3_T1

CAAA

TTAG

_7_T

3

GG

GCG

CGC_

25_T

2

TTTCCCTT_15_T2

CCCCCGGA_18_T2

TTTA

AATG

_7_T

3_V8

2A_

ATCTCGCG_27_T2

AGCC

TAGG

_24_

T2

GGGCGTCG_33_T1

CGACACTA_9_T3_

V82A_

AAGGGATC_33_T3_L90M_

TTCTGTCG_3_T3_V82A_

ATACTAAA_60_T1

ACCTAATC_4_T2

AGGGTGGG_15_T2

AATAGAC

T_13_T3_V82A_

GAC

GG

TTC_

19_T

2

CACCCGCT_28_T2

GAAACACA_19_T3

_V82A

_

CATG

GTCA

_24_

T2

ATCAC

CCT_15_T3_V82A_

CACTAGAT_14_T3

GTGCATTG_8_T3

ACCGTCGA_8_T3_V82A_

TCCAAAAA_9_T3_V82A_

ATGAG

CTA_10_T1

AAATCCGC_27_T1

ACCCGGGC_35_T2CCGCGACT_11_

T2

CCCACGGT_17_T1

GTTTTGAG_12_T2

CTGC

CCCC

_4_T

2

GCCGAGAT_51_T

1

ATTATTAA_11_T2

TAATCAAC_37_T1

TCAAAATT_14_T3_V82A_

CTAAACTT_19_T3_V82A_

GAGTGCGG_11_T2

GTG

TTAG

T_9_

T2

CCAGGGAG_22_T1

TAGCGTTA_7_T3GTCGTGTG_21_T1

CCGG

TGGC

_3_T1

CCCG

CGCA

_20_T

1

GGAT

TGCG

_10_

T1

CTCCCTGT_3_T1_L90M_

AGTCCACC_7_T3_V82A_

TCTGAAGG_29_T1

GCCTCAA

C_9_T3

_V82A

_

TTTGTCCG_9_T3_L90M_CTCCGAAC_18_T3_L90M_

ACAC

ATCG

_11_

T3

AGCAG

TCA_11_T3_V82A_

GTAGCATC_5_T2

CCCAACTT_12_T2

TAATGTCG_8_T2CTCCCCCA_25_T1

GC

AAAGG

G_14_T3_V82A_

GGATGCCG_14_T2

ACGCATTA_8_T3_V82A_

ATCCAAGC_9_T3_V82A_

GCCCG

CTA_12_T3_V82A_

TGAT

CTAT

_14_

T2

TCCTCTGA_7_T3_V82A_

TCAA

CAA

T_9_

T3_V

82A_

CCGCCGCC_25_T2

CCTTGTCG_5_T1

GCCCCTTA_22_T2AGAACAGA_15_T2

TTAATCCG_10_T3_V82A_

AACT

TTG

A_5_

T3_L

90M

_

GAAAG

TAT_19_T3_V82I

AGCACTAA_15_T1

ACCC

ATCA

_5_T

2

TATA

ACAA

_9_T

3_L9

0M_

AGGGTCCC_16_T1

TACC

GCCC

_6_T

1

TAGATATA_13_T3_V82A_

TAGA

GTAA

_29_

T3

ATAGGAAC_13_T3

AATAATCA_14_T3_L90M_

TTGATC

TA_14_T3_V82A_

CCAA

ATGC

_14_

T3_V

82A_

ATAACAAA_7_T3_V82A_ACAGAACC_19

_T3_V

82A_

_A28V_3T_52_GT

GCT

GCA

TTCCTCCC_36_

T1

ACGC

GGGT

_11_

T2

GG

TAAACG_12_T3

TACGTACC_11_T3_V82A_

TGCG

TCGC

_19_

T1

CTACCTAG_7_T2

ACATTTCC_5_T3_V82A_

GCCACGGA_6_T3

CCCGCAAA_15_T1

GAAAT

AGA_

5_T3

_V82

A_

GCGAGCGC_33_T2

ACTTTCCT_9_T3CTATGCGA_18_T3

TCAG

AAGG

_11_

T3

TTCG

TTAA

_5_T

3_L9

0M_

3T_4_AACAT

GAT

ACCTGGGA_22_T3_L90M_

GTGCCCGT_41_T2

CGCC

GACG

_24_

T2

CACG

CGCG

_9_T

2

CATGGAGC_21_T3_V82A_

TAAGGCTC_8_T3_V82A_

CTTCCTAT_6_T3

AGAATG

GC

_15_T3_V82A_

CGGATAAG_3_T1

ATTGGGTA

_20_T1

CAC

AGC

AT_19_T3_V82A_

TAGATCGG_12_T2

ACCTCAAA_26

_T1

CTGCTATT_5_T1

CTTAAAGT_14_T3

TTTAACTC_25_T3_L90M_

CCGC

CATC

_9_T

2

CTAGTTCC_11_T3_V82A_

ACTG

CTCT

_17_

T3

TCCAAACT_5_T3_V82A_

AAGTTACA_4_T3

GCCTCTTG_15_T2

CCAACCCC_4_T2

CTTATGTG

_10_T3_V82A_

GCCT

GTGC

_18_

T2

GG

GTA

GG

A_22

_T1

CGTGTCTC_19_T2

CCCC

GTAG

_13_

T2

TCTTTAG

A_20_T3_V82A_

AAGTCAAG_11_T1

AATCTGGA_8_T3_V82A_

CCG

GG

CAC_

18_T

2

ACGAATAA_15_T3_V82A_

TATT

CAAT_

8_T3

_V82

A_

CACGGACA_6_T3_

V82A_

TACCGGCG_3_T1

AAATAGAA_22_T3_I84V_

ACATTTAA_13_T3_I84V_

CAAGGCCC_17_T2

CCTCACTA_5_T3_V82A_

CGAATATT_35_T1

TTTA

CAAC

_5_T

3

GAG

GG

GCG

_5_T3_V82A_

TGAACGCT_6_T3_V82A_

GTTCTT

AA_21_T3

_V82A

_

ACACCCAC_17_T3_V82A_

ACAAG

TCG

_18_T3_V82A_

CGGTGCAC_13_T2

ACTA

GTC

A_11

_T3

TAAACTAA_3_T3_L90M_

CTGCTCCC_9_T2

TACAG

TCA_14_T3_V82A_

ACTTC

CTT_17_T3_V82A_

CTCATGCC_5_T2

GCGCACAT_11_T2

CGTACGAT_16_T2

AAGTATAT_12_T3_L90M_

CAACGAT

C_10_

T3_V

82A_

AGACCGAA_11_T3_I84V_

GTCAACCA_21_T2

GTGGCGAC_6_

T3_V82A

_

CCGGCGCC_3_T2

CTTCCCCA_25_T2

CCTT

AAAG

_15_

T2

CCAGATAT_21_T3_L90M_CTG

CGTAA_9_T3_V82A_

TGACG

GG

T_24_T2

CTTC

TCTG

_4_T1

AAACTAAG_6_T3

GTTAC

CCT_13_

T3_V

82A_

TCGA

ACCC

_24_

T1

ATAAGTGA_9_T3_V82A_CCCCCTCC_14_T1

ATCTGG

GA_12_T3_V82A_

AGTTCAAA_22_T3

GTCCGCAT_7_T3_V82A_

TACT

ACAT

_13_

T3_L

90M

_

GCGCG

GGG_19_T1

GTTCTCTC_4_T2

TGACCCCA_5_T2

TTAGTTAC_10_T3_L90M_

TTGGAACT_7_T3_I84V_

AAGA

TGTT

_7_T

1

Fig. 3. Phylogenetic representation of protease population derived from deepsequencing with a Primer ID. A Neighbor-Joining tree was constructed fromsequences derived from all three time points and colored based on susceptibilityto ritonavir. Blue colored taxa represent susceptible variants (defined as notV82A/I/L/F, I84V, or L90M). Red colored taxa represent variants containing themajor ritonavir resistant variant, V82A. Pink colored taxa represent the minorresistant variants V82I/L/F. Green and orange colored taxa represent the minorresistant alleles L90M and I84V, respectively. Within a color, color brightness iscorrelated with sample time. Dark green and red arrows point to pre-RTV low-abundance sequences that clonally amplified to their respective clades.

Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20169

MICRO

BIOLO

GY

Dow

nloa

ded

by g

uest

on

May

30,

202

0

Page 5: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

the number of pathogen genomes in the sample is limited, and theuse of PCR can obscure the quality of the sampling by creating alarge amount of DNA from a relatively small number of startingtemplates. This can create artificial homogeneity, inflate estimatesof segregating genetic variation, skew the distribution of alleles inthe population, and introduce artificial diversity.We have developed a strategy that allows each sampled tem-

plate to be tagged with a unique ID by a primer that has a de-generate sequence tag incorporated during the primer oligonu-cleotide synthesis (Fig. S7). This tag can then be followed throughthe PCR and the deep sequencing protocol to identify sequenc-ing over-coverage (resampling) of the individual viral templates.Because the Primer ID allows for the identification of over-cov-erage, this can then be used to create a consensus sequence foreach template, avoiding both PCR-related errors and sequencingerrors (Fig. S8). In addition, the number of different Primer IDsreflects the number of templates that were actually sampled. Thisallows a realistic assessment of the depth of population samplingand makes it possible to apply a more rigorous analysis of minorvariants by correcting the allelic skewing during the PCR.We tested the Primer ID approach by sequencing the HIV-1

protease coding domain at three time points in a subject who wasintermittently exposed to a protease inhibitor between the secondand third time points. A key feature of our approach is the re-moval of fortuitous errors and accounting for resampling, whichresults in a dramatic reshaping of the original data set of 72,162reads. Other approaches that rely on statistical modeling havebeen developed to deal with the problem of high sequencing errorrates associated with deep sequencing technologies (49–51). Theuse of the Primer ID to create consensus sequences resulted inthe removal of 80% of the unique sequence polymorphisms(defined as a change in the consensus without regard to frequencyof appearance) in the data set. Similarly, allelic skewing wasdramatic among the sampled sequences, in most cases rangingfrom 2- to 15-fold but going up to nearly 100-fold. Although thePrimer ID reveals such skewing and helps correct it, this is clearlya poorly controlled feature of PCR amplifications that can dra-matically affect the observed abundance of complex populations,especially the minor variants. Allelic skewing may still persist ifthe cDNA primer or the upstream PCR primer binds differen-tially among the templates, or if cDNAs enter the PCR amplifi-cation in later rounds and are discarded because they do notresult in at least three reads to allow a consensus sequence to beformed. Also, residual misincorporation errors by RT and in thefirst round of PCR synthesis still limit the interpretation ofmutations that occur in the range of 0.01–0.1%. This problem isnot overcome with larger numbers of sequences. Given the lowdiversity in these samples, we removed all substitutions thatappeared once because their number approximated the expectednumber of residual sequence errors, and this resulted in a sensi-tivity of detection in the range of 0.1% for SNPs that appearedabove the frequency of the residual sequence error rate.Using the Primer ID approach, we were able to describe a

number of features of the protease sequence population, howeverour results are from a single individual and therefore cannot begeneralized. First, a pooled analysis of two time points six monthsapart showed that the variants present at greater than 0.5% inabundance made up two-thirds of the total population but rep-resented only 4% of unique genome sequences and containedonly 7% of the total unique sequence polymorphisms. About 60%of the diversity was stable over both time points, with synonymousSNPs maintained at a significantly higher proportion in the twotime points than nonsynonymous SNPs. Only 18% of the totaldiversity represented nonsynonymous SNPs that were present atboth time points. However, our ability to assess persistence ofthese sequences is limited by the depth of sampling, although wefeel we are approaching the practical limit of sampling with thistechnology. We observed nonviable substitutions and estimatethat most of the SNPs that appeared once were the result of re-maining method error. We found no pattern of conserved linkage

among these SNPs, consistent with high levels of recombinationacross the population.Although the overall measurement of diversity (π) was similar

between the first two time points, we noted that the biggestchanges in SNP abundance between the two time points were inthree synonymous codon positions (L24L, K70K, and G73G).These dynamic increases made these SNPs part of a larger groupof SNPs that accounted for 51% of the total sequences that wereotherwise identical to the consensus sequence (Q18Q, L19I, L24L,K70K, G73G, and Q18Q/L19I/L24L’). These SNPs also over-lapped the major SNPs that defined subgroups of the resistantvariants (L19I; L19V; G16G/L19V). We considered the possibilitythat there was a unifying feature of these SNPs. We found such afeature in that all of these SNPs, both coding and noncoding, resultin changes in two relatively large alternative ORFs that lie at the5′ and 3′ ends of the pro gene. Alternative reading frames havebeen suggested to generate cryptic CTL epitopes (52–54). In thisscenario, these abundant SNPs would represent various escapemutants. Such selective pressures could explain the dynamic be-havior of several of these SNPs between the first two time points.After intermittent exposure to the protease inhibitor ritonavir,

we were able to identify six independent lineages of drug re-sistance mutations. With the intermittent exposure in this par-ticular subject, it was possible to see the major V82A lineagemost often seen with ritonavir resistance, but also significantpopulations of I84V and L90M. We also saw minor populationsof V82I, V82L, and V82F. This mixed population of resistantlineages likely represents the early stages of the evolution ofresistance, a conclusion supported by the minor appearance ofthe L63P compensatory mutation and the complete absence ofI54V, which is an often seen compensatory mutation for V82A.We saw few examples of genomes with multiple resistancemutations, although these would be expected after more exten-sive selection (55, 56). We and others have previously examinedviral sequences that have been collected in large databases.Typically, these sequences represent the single predominant se-quence within an individual, and the use of these sequencesallows for assessment of interperson diversity. In the future, itwill be an interesting exercise to compare the conclusionsreached by examining viral diversity within a person to viral di-versity between people; however more intraperson diversityneeds to be measured at this level of detail to allow comparisonof inter- versus intraperson diversity.The presence of preexisting drug-resistant variants and their

role in therapy failure is of great interest, and accurate, deepsampling of a viral population can add significantly to our un-derstanding of this question. We were able to detect severalexamples of drug-resistance mutations but only at a very lowlevel. Our ability to reliably detect these mutations is limited tothose that appear at a frequency of 0.1–0.2%, limited in part bythe low overall diversity in the population. We were able to seeexamples of mutations that are typically seen only in the pres-ence of drug selection. However, the detection was usually as onegenome at two time points or two genomes at one time point.This was also the level of detection of active site mutations in theprotease and of termination codons, which must represent eithertransient viral genomes or residual misincorporation errors. Intwo cases, we were able to observe the resistance mutation(V82A and L90M) at pretherapy time points linked to the samepolymorphisms that were present on the variant that grew outduring drug exposure. Thus, although it is likely that we aredetecting relevant preexisting drug-resistant variants, these are atthe limit of detection and, if they are maintained at a steady-statelevel, it is well under 0.5% abundance.Most protocols of high throughput sequencing technologies

still require an initial quantity of DNA that necessitates anupfront PCR step for many applications. The use of a Primer IDwill help clarify the sequencing products in any strategy that usesan initial PCR step with its attendant error rate, recombination,and resampling. In an independent effort Kinde et al. have de-scribed an analogous approach in another deep sequencing

20170 | www.pnas.org/cgi/doi/10.1073/pnas.1110064108 Jabara et al.

Dow

nloa

ded

by g

uest

on

May

30,

202

0

Page 6: Accurate sampling and deep sequencing of the HIV-1 ...Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID Cassandra B. Jabaraa,b,c, Corbin D. Jonesa,d,

setting (36). We believe a strategy that allows an initial taggingof individual templates before PCR and subsequent sequenceanalysis will be essential for understanding the true complexityand diversity of genetically dynamic populations.

Materials and MethodsViral RNA was isolated from blood plasma using the QIAmp Viral RNA kit(Qiagen). cDNA was generated using SuperScript III Reverse Transcriptase (Invi-trogen) using the primer (with Primer ID) as described. Following the reaction,RNA in hybrid was removed by RNaseH treatment (Invitrogen). Unincorporated

cDNAprimerwas removed, and the cDNAproduct amplifiedby PCR. Sequencingwas done using the 454 platform (Roche). Detailed methods for cDNA tagging,amplification, and analysis are presented in the SI Materials and Methods.

ACKNOWLEDGMENTS. C.B.J. thanks Jesse Walsh for training on the GS FLXplatform. We also thank Dr. Dale Kempf of Abbott Laboratories for makingclinical samples available. This work was supported by National Institutes ofHealth (NIH) Awards GM P01 GM066524 (with subcontract to R.S.) and R37AI44667 (to R.S.). In addition, we received support from University of NorthCarolina (UNC) Center For AIDS Research NIH Award P30 AI50410 and UNCLineberger Comprehensive Cancer Center NIH Award P30 CA16086.

1. Margulies M, et al. (2005) Genome sequencing in microfabricated high-density pico-litre reactors. Nature 437:376–380.

2. Eid J, et al. (2009) Real-time DNA sequencing from single polymerase molecules.Science 323:133–138.

3. Bentley DR, et al. (2008) Accurate whole human genome sequencing using reversibleterminator chemistry. Nature 456:53–59.

4. Metzker ML (2010) Sequencing technologies - the next generation. Nat Rev Genet 11:31–46.

5. Fischer W, et al. (2010) Transmission of single HIV-1 genomes and dynamics of earlyimmune escape revealed by ultra-deep sequencing. PLoS ONE 5:e12303.

6. Wang C, Mitsuya Y, Gharizadeh B, Ronaghi M, Shafer RW (2007) Characterization ofmutation spectra with ultra-deep pyrosequencing: Application to HIV-1 drug re-sistance. Genome Res 17:1195–1201.

7. Hoffmann C, et al. (2007) DNA bar coding and pyrosequencing to identify rare HIVdrug resistance mutations. Nucleic Acids Res 35:e91.

8. Bushman FD, et al. (2008) Massively parallel pyrosequencing in HIV research. AIDS 22:1411–1415.

9. Varghese V, et al. (2009) Minority variants associated with transmitted and acquiredHIV-1 nonnucleoside reverse transcriptase inhibitor resistance: Implications for theuse of second-generation nonnucleoside reverse transcriptase inhibitors. J AcquirImmune Defic Syndr 52:309–315.

10. Mitsuya Y, et al. (2008) Minority human immunodeficiency virus type 1 variants inantiretroviral-naive persons with reverse transcriptase codon 215 revertant muta-tions. J Virol 82:10747–10755.

11. Gunthard HF, Wong JK, Ignacio CC, Havlir DV, Richman DD (1998) Comparativeperformance of high-density oligonucleotide sequencing and dideoxynucleotide se-quencing of HIV type 1 pol from clinical samples. AIDS Res Hum Retroviruses 14:869–876.

12. Van Laethem K, et al. (1999) Phenotypic assays and sequencing are less sensitive thanpoint mutation assays for detection of resistance in mixed HIV-1 genotypic pop-ulations. J Acquir Immune Defic Syndr 22:107–118.

13. Palmer S, et al. (2006) Persistence of nevirapine-resistant HIV-1 in women after single-dose nevirapine therapy for prevention of maternal-to-fetal HIV-1 transmission. ProcNatl Acad Sci USA 103:7094–7099.

14. Flys TS, et al. (2006) Quantitative analysis of HIV-1 variants with the K103N resistancemutation after single-dose nevirapine in women with HIV-1 subtypes A, C, and D.J Acquir Immune Defic Syndr 42:610–613.

15. Cai F, et al. (2007) Detection of minor drug-resistant populations by parallel allele-specific sequencing. Nat Methods 4:123–125.

16. Beck IA, et al. (2008) Optimization of the oligonucleotide ligation assay, a rapid andinexpensive test for detection of HIV-1 drug resistance mutations, for non-NorthAmerican variants. J Acquir Immune Defic Syndr 48:418–427.

17. Johnson JA, et al. (2008) Minority HIV-1 drug resistance mutations are present inantiretroviral treatment-naive populations and associate with reduced treatmentefficacy. PLoS Med 5:e158.

18. Johnson JA, et al. (2007) Simple PCR assays improve the sensitivity of HIV-1 subtype Bdrug resistance testing and allow linking of resistance mutations. PLoS ONE 2:e638.

19. Metzner KJ, et al. (2003) Emergence of minor populations of human immunodefi-ciency virus type 1 carrying the M184V and L90M mutations in subjects undergoingstructured treatment interruptions. J Infect Dis 188:1433–1443.

20. Paredes R, Marconi VC, Campbell TB, Kuritzkes DR (2007) Systematic evaluation ofallele-specific real-time PCR for the detection of minor HIV-1 variants with pol andenv resistance mutations. J Virol Methods 146:136–146.

21. Li JZ, et al. (2011) Low-frequency HIV-1 drug resistance mutations and risk of NNRTI-based antiretroviral treatment failure: A systematic review and pooled analysis. JAMA305:1327–1335.

22. Metzner KJ, et al. (2009) Minority quasispecies of drug-resistant HIV-1 that lead toearly therapy failure in treatment-naive and -adherent patients. Clin Infect Dis 48:239–247.

23. Halvas EK, et al. (2006) Blinded, multicenter comparison of methods to detect a drug-resistant mutant of human immunodeficiency virus type 1 at low frequency. J ClinMicrobiol 44:2612–2614.

24. Mansky LM, Temin HM (1995) Lower in vivo mutation rate of human immunodefi-ciency virus type 1 than that predicted from the fidelity of purified reverse tran-scriptase. J Virol 69:5087–5094.

25. Perelson AS, Neumann AU, Markowitz M, Leonard JM, Ho DD (1996) HIV-1 dynamicsin vivo: Virion clearance rate, infected cell life-span, and viral generation time. Science271:1582–1586.

26. Hughes JP, Totten P (2003) Estimating the accuracy of polymerase chain reaction-based tests using endpoint dilution. Biometrics 59:505–511.

27. Kanagawa T (2003) Bias and artifacts in multitemplate polymerase chain reactions(PCR). J Biosci Bioeng 96:317–323.

28. Meyerhans A, Vartanian JP, Wain-Hobson S (1990) DNA recombination during PCR.Nucleic Acids Res 18:1687–1691.

29. Yang YL, Wang G, Dorman K, Kaplan AH (1996) Long polymerase chain reactionamplification of heterogeneous HIV type 1 templates produces recombination ata relatively high frequency. AIDS Res Hum Retroviruses 12:303–306.

30. Liu SL, et al. (1996) HIV quasispecies and resampling. Science 273:415–416.31. Judo MS, Wedel AB, Wilson C (1998) Stimulation and suppression of PCR-mediated

recombination. Nucleic Acids Res 26:1819–1825.32. Salazar-Gonzalez JF, et al. (2008) Deciphering human immunodeficiency virus type 1

transmission and early envelope diversification by single-genome amplification andsequencing. J Virol 82:3952–3970.

33. Palmer S, et al. (2005) Multiple, linked human immunodeficiency virus type 1 drugresistance mutations in treatment-experienced patients are missed by standard ge-notype analysis. J Clin Microbiol 43:406–413.

34. Simmonds P, Balfe P, Ludlam CA, Bishop JO, Brown AJ (1990) Analysis of sequencediversity in hypervariable regions of the external glycoprotein of human immuno-deficiency virus type 1. J Virol 64:5840–5850.

35. Edmonson PF, Mullins JI (1992) Efficient amplification of HIV half-genomes fromtissue DNA. Nucleic Acids Res 20:4933.

36. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B (2011) Detection andquantification of rare mutations with massively parallel sequencing. Proc Natl AcadSci USA 108:9530–9535.

37. Cameron DW, et al. (1998) Randomised placebo-controlled trial of ritonavir in ad-vanced HIV-1 disease. The Advanced HIV Disease Ritonavir Study Group. Lancet 351:543–549.

38. Potter J, Zheng W, Lee J (2003) Thermal stability and cDNA synthesis capability ofSuperScript III reverse transcriptase. Focus (Invitrogen, Carlsbad, CA), pp 19–24.

39. Barnes WM (1992) The fidelity of Taq polymerase catalyzing PCR is improved by anN-terminal deletion. Gene 112:29–35.

40. Shriner D, Liu Y, Nickle DC, Mullins JI (2006) Evolution of intrahost HIV-1 genetic di-versity during chronic infection. Evolution 60:1165–1176.

41. Johnson VA, et al. (2005) Update of the drug resistance mutations in HIV-1: Fall 2005.Top HIV Med 13:125–131.

42. Shafer RW, Jung DR, Betts BJ (2000) Human immunodeficiency virus type 1 reversetranscriptase and protease mutation search engine for queries. Nat Med 6:1290–1292.

43. Carrillo A, et al. (1998) In vitro selection and characterization of human immunode-ficiency virus type 1 variants with increased resistance to ABT-378, a novel proteaseinhibitor. J Virol 72:7532–7541.

44. Eastman PS, et al. (1998) Genotypic changes in human immunodeficiency virus type 1associated with loss of suppression of plasma viral RNA levels in subjects treated withritonavir (Norvir) monotherapy. J Virol 72:5154–5164.

45. Drake JW, Holland JJ (1999) Mutation rates among RNA viruses. Proc Natl Acad SciUSA 96:13910–13913.

46. Duffy S, Shackelton LA, Holmes EC (2008) Rates of evolutionary change in viruses:Patterns and determinants. Nat Rev Genet 9:267–276.

47. Onafuwa-Nuga A, Telesnitsky A (2009) The remarkable frequency of human immu-nodeficiency virus type 1 genetic recombination. Microbiol Mol Biol Rev 73:451–480.

48. Shafer RW (2009) Low-abundance drug-resistant HIV-1 variants: Finding significancein an era of abundant diagnostic and therapeutic options. J Infect Dis 199:610–612.

49. Eriksson N, et al. (2008) Viral population estimation using pyrosequencing. PLOSComput Biol 4:e1000074.

50. Zagordi O, Klein R, Daumer M, Beerenwinkel N (2010) Error correction of next-generationsequencingdataandreliableestimationofHIVquasispecies.NucleicAcidsRes38:7400–7409.

51. Zagordi O, Geyrhofer L, Roth V, Beerenwinkel N (2010) Deep sequencing of a ge-netically heterogeneous sample: Local haplotype reconstruction and read error cor-rection. J Comput Biol 17:417–428.

52. Cardinaud S, et al. (2004) Identification of cryptic MHC I-restricted epitopes encodedby HIV-1 alternative reading frames. J Exp Med 199:1053–1063.

53. Bansal A, et al. (2010) CD8 T cell response and evolutionary pressure to HIV-1 crypticepitopes derived from antisense transcription. J Exp Med 207:51–59.

54. Berger CT, et al. (2010) Viral adaptation to immune selection pressure by HLA class I-restricted CTL responses targeting epitopes in HIV frameshift sequences. J Exp Med207:61–75.

55. Resch W, Parkin N, Watkins T, Harris J, Swanstrom R (2005) Evolution of human im-munodeficiency virus type 1 protease genotypes and phenotypes in vivo under se-lective pressure of the protease inhibitor ritonavir. J Virol 79:10638–10649.

56. HanceAJ, et al. (2001) Changes in human immunodeficiency virus type 1populations aftertreatment interruption in patients failing antiretroviral therapy. J Virol 75:6410–6417.

Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20171

MICRO

BIOLO

GY

Dow

nloa

ded

by g

uest

on

May

30,

202

0