newlyarisen dna repeats in primatephylogeny - pnas dnarepeats in primatephylogeny ... single copy...

5
Proc. Natl. Acad. Sci. USA Vol. 86, pp. 9360-9364, December 1989 Evolution Newly arisen DNA repeats in primate phylogeny (a-fetoprotein gene/DNA transposition/evolutionary novelties/human-gorilla divergence) SUSAN C. RYAN AND ACHILLES DUGAICZYK Department of Biochemistry, University of California, Riverside, CA 92521 Communicated by Frank W. Putnam, September 5, 1989 (received for review July 3, 1989) ABSTRACT We discovered the presence of an Alu and an Xba repetitive DNA element within introns 4 and 7, respec- tively, of the human a-fetoprotein (AFP) gene; these elements are absent from the same gene in the gorilla. The Alu element is flanked by 12-base-pair direct repeats, AGGATGTTGTGG ... (Alu) ... AGGATGTTGTGG, which presumably arose by way of duplication of the intronic target site AGGATGT- TGTGG at the time of the Alu insertion. In the gorilla, only a single copy of the unoccupied target site is present, which is identical to the terminal repeat flanking the human Alu ele- ment. There are two copies of an Xba repeat in the human AFP gene, apparently the only two in the genome. Xbal and Xba2, located within introns 8 and 7, respectively, differ from each other at 3 of 303 positions. Xbal is referred to as the old (ancestral) repeat because it lacks direct repeats. The new (derived) Xba2 is flanked by direct repeats, TTTC lTT ... (Xba) ... TTTCTTCTT, and is thought to have arisen as a result of transposition of Xbal. The ancestral Xbal and a single copy of the Xba2 target site are present at orthologous positions in the gorilla, but the new Xba2 is absent. We conclude that the Alu and Xba DNA repeats emerged in the human genome at a time postdating the human-gorilla divergence and became established as genetic novelties in the human lineage. We submit that the chronology of divergence of primate lines of evolution can be correlated with the timing of insertion of new DNA repeats into the genomes of those primates. We have previously reported that the albumin gene family has been invaded by numerous elements of repetitive DNA (1). These elements have been inserted at random sites within the gene family and, by inference, at different times during their evolutionary history. It would be of interest to find out if a correlation can be established between the emergence of new DNA repeats and the divergence of species, particularly in the time frame of primate evolution. The most prominent among the various DNA repeats are members of the Alu family. These repeats are considered to be pseudogenes that have arisen by retroposition of 7SL RNA (2, 3) or 4.5S RNA (4), and they are presently found at an estimated 500,000 copies per genome in the primates and other vertebrates (5, 6). Although there is some evidence of interspecies differences in the number of Alu sequences among closely related primates (7), studies in the a-, and ,3-globin gene regions revealed no differences in the location of specific Alu repeats between humans and two other primates. In the a-globin locus, seven members of the Alu family were found in identical positions in human and chim- panzee (8). Furthermore, in the f3-globin locus, seven addi- tional Alu repeats were also found in identical positions in human and orangutan (9). All of the 14 repeats were thus uninformative with regard to phylogenies of the above pri- mates. However, Trabuchet et al. (10) reported the presence of a new member of the Alu family in the gorilla ,,S-globin region. This same Alu repeat is absent in human, chimpan- zee, and macaque, thus suggesting a recent insertion of this element in the gorilla DNA. It is not clear from the report (10) whether this Alu repeat is fixed or polymorphic in the gorilla lineage. If more examples are found of species-specific differences in repetitive DNA elements at orthologous sites among higher primates, this would provide the best evidence for the chronology of their insertion and would help in discerning the phylogeny of the species involved. In determining the complete structure of the human a- fetoprotein (AFP) gene (11), we identified the positions of two Xba repetitive elements. We concluded from this work that Xbal and Xba2 were entirely novel repetitive elements. Due to the high degree of sequence similarity between Xbal and Xba2 (99%), we further concluded that their existence in the human AFP gene was of relatively recent evolutionary origin (1, 11). Presently, we have extended our work on the AFP gene to other primates and wish to report a specific difference in an Xba and Alu repetitive element between the human and gorilla lineages. In addition, we submit that such DNA elements fulfill the criterion for molecular markers of phy- logeny. MATERIALS AND METHODS Cloning. A gorilla genomic library cloned in a Charon 40 phage vector was graciously provided to us by Jerry Slightom (Division of Molecular Biology, The Upjohn Company, Kal- amazoo, MI). Using human AFP gene fragments (11) as probes, gorilla phage clones AGAFP9 and AGAFP22 were isolated and characterized by restriction endonuclease diges- tion. EcoRI fragments of the two phage clones were sub- cloned into the plasmid vector pBR322. DNA sequence was determined by the method of Maxam and Gilbert (12), frequently strand-separating labeled DNA fragments prior to the chemical degradation. Southern Hybridization Analysis of Human and Gorilla AFP Genes. Cloned human 9.9-kilobase (kb) and gorilla 9.6-kb EcoRI DNA fragments were separated electrophoretically in a 1% agarose gel followed by Southern (13) hybridization to an Alu probe, BLUR8 (14). Similarly, 10 pug of human and gorilla genomic DNA was digested with restriction endonu- cleases, separated electrophoretically in a 1% agarose gel, and hybridized to either a human 6.0-kb EcoRI-HindIII probe (11) or a gorilla 5.4-kb HindIII-EcoRI probe. RESULTS BLUR8 Hybridization. Preliminary sequencing results in- dicated that a gorilla 9.6-kb fragment aligned with the same EcoRI sites as the human 9.9-kb fragment, even though it was shorter by 300 base pairs (bp). Both fragments contain a unique BamHI site. BamHI digestion of human 9.9-kb DNA yielded 5.5-kb and 4.4-kb fragments, whereas BamHI diges- tion of gorilla 9.6-kb DNA yielded 5.5-kb and 4.1-kb frag- Abbreviation: AFP, a-fetoprotein. 9360 The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Upload: lamhanh

Post on 27-Jun-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

Proc. Natl. Acad. Sci. USAVol. 86, pp. 9360-9364, December 1989Evolution

Newly arisen DNA repeats in primate phylogeny(a-fetoprotein gene/DNA transposition/evolutionary novelties/human-gorilla divergence)

SUSAN C. RYAN AND ACHILLES DUGAICZYKDepartment of Biochemistry, University of California, Riverside, CA 92521

Communicated by Frank W. Putnam, September 5, 1989 (received for review July 3, 1989)

ABSTRACT We discovered the presence of an Alu and anXba repetitive DNA element within introns 4 and 7, respec-tively, of the human a-fetoprotein (AFP) gene; these elementsare absent from the same gene in the gorilla. The Alu elementis flanked by 12-base-pair direct repeats, AGGATGTTGTGG... (Alu) ... AGGATGTTGTGG, which presumably aroseby way of duplication of the intronic target site AGGATGT-TGTGG at the time of the Alu insertion. In the gorilla, only asingle copy of the unoccupied target site is present, which isidentical to the terminal repeat flanking the human Alu ele-ment. There are two copies of an Xba repeat in the human AFPgene, apparently the only two in the genome. Xbal and Xba2,located within introns 8 and 7, respectively, differ from eachother at 3 of 303 positions. Xbal is referred to as the old(ancestral) repeat because it lacks direct repeats. The new(derived) Xba2 is flanked by direct repeats, TTTClTT ...(Xba) ... TTTCTTCTT, and is thought to have arisen as aresult of transposition ofXbal. The ancestral Xbal and a singlecopy of the Xba2 target site are present at orthologous positionsin the gorilla, but the new Xba2 is absent. We conclude that theAlu and Xba DNA repeats emerged in the human genome at atime postdating the human-gorilla divergence and becameestablished as genetic novelties in the human lineage. Wesubmit that the chronology of divergence of primate lines ofevolution can be correlated with the timing of insertion of newDNA repeats into the genomes of those primates.

We have previously reported that the albumin gene familyhas been invaded by numerous elements of repetitive DNA(1). These elements have been inserted at random sites withinthe gene family and, by inference, at different times duringtheir evolutionary history. It would be of interest to find outif a correlation can be established between the emergence ofnew DNA repeats and the divergence of species, particularlyin the time frame of primate evolution.The most prominent among the various DNA repeats are

members of the Alu family. These repeats are considered tobe pseudogenes that have arisen by retroposition of 7SLRNA (2, 3) or 4.5S RNA (4), and they are presently found atan estimated 500,000 copies per genome in the primates andother vertebrates (5, 6). Although there is some evidence ofinterspecies differences in the number of Alu sequencesamong closely related primates (7), studies in the a-, and,3-globin gene regions revealed no differences in the locationof specific Alu repeats between humans and two otherprimates. In the a-globin locus, seven members of the Alufamily were found in identical positions in human and chim-panzee (8). Furthermore, in the f3-globin locus, seven addi-tional Alu repeats were also found in identical positions inhuman and orangutan (9). All of the 14 repeats were thusuninformative with regard to phylogenies of the above pri-mates. However, Trabuchet et al. (10) reported the presenceof a new member of the Alu family in the gorilla ,,S-globin

region. This same Alu repeat is absent in human, chimpan-zee, and macaque, thus suggesting a recent insertion of thiselement in the gorilla DNA. It is not clear from the report (10)whether this Alu repeat is fixed or polymorphic in the gorillalineage. If more examples are found of species-specificdifferences in repetitive DNA elements at orthologous sitesamong higher primates, this would provide the best evidencefor the chronology of their insertion and would help indiscerning the phylogeny of the species involved.

In determining the complete structure of the human a-fetoprotein (AFP) gene (11), we identified the positions oftwoXba repetitive elements. We concluded from this work thatXbal and Xba2 were entirely novel repetitive elements. Dueto the high degree of sequence similarity between Xbal andXba2 (99%), we further concluded that their existence in thehuman AFP gene was of relatively recent evolutionary origin(1, 11). Presently, we have extended our work on the AFPgene to other primates and wish to report a specific differencein an Xba and Alu repetitive element between the human andgorilla lineages. In addition, we submit that such DNAelements fulfill the criterion for molecular markers of phy-logeny.

MATERIALS AND METHODSCloning. A gorilla genomic library cloned in a Charon 40

phage vector was graciously provided to us by Jerry Slightom(Division of Molecular Biology, The Upjohn Company, Kal-amazoo, MI). Using human AFP gene fragments (11) asprobes, gorilla phage clones AGAFP9 and AGAFP22 wereisolated and characterized by restriction endonuclease diges-tion. EcoRI fragments of the two phage clones were sub-cloned into the plasmid vector pBR322. DNA sequence wasdetermined by the method of Maxam and Gilbert (12),frequently strand-separating labeled DNA fragments prior tothe chemical degradation.

Southern Hybridization Analysis ofHuman and Gorilla AFPGenes. Cloned human 9.9-kilobase (kb) and gorilla 9.6-kbEcoRI DNA fragments were separated electrophoretically ina 1% agarose gel followed by Southern (13) hybridization toan Alu probe, BLUR8 (14). Similarly, 10 pug of human andgorilla genomic DNA was digested with restriction endonu-cleases, separated electrophoretically in a 1% agarose gel,and hybridized to either a human 6.0-kb EcoRI-HindIIIprobe (11) or a gorilla 5.4-kb HindIII-EcoRI probe.

RESULTSBLUR8 Hybridization. Preliminary sequencing results in-

dicated that a gorilla 9.6-kb fragment aligned with the sameEcoRI sites as the human 9.9-kb fragment, even though it wasshorter by 300 base pairs (bp). Both fragments contain aunique BamHI site. BamHI digestion of human 9.9-kb DNAyielded 5.5-kb and 4.4-kb fragments, whereas BamHI diges-tion of gorilla 9.6-kb DNA yielded 5.5-kb and 4.1-kb frag-

Abbreviation: AFP, a-fetoprotein.

9360

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement"in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Proc. Natl. Acad. Sci. USA 86 (1989) 9361

k b2

7.4

5.85.6

4.8

3.5

2 3 4 15 2 3 4 15

9.9Kb<9.6Kb

- 5.5 Kb

fl- 4.4Kb-W-_ 4.1 Kb

FIG. 1. Hybridization analysis of Alu repetitive DNA in thehuman and gorilla AFP genes. Cloned DNA fragments were sepa-rated electrophoretically in a 1% agarose gel followed by Southern(13) hybridization to a BLUR8 (14) probe. (Left) Lane 1, DNA sizestandards; lane 2, human 9.9-kb EcoRI DNA fragment; lane 3, gorilla9.6-kb EcoRI DNA fragment; lane 4, partial BamHI digest of human9.9-kb DNA, giving rise to 5.5-kb and 4.4-kb fragments; lane 5,partial BamHI digest of gorilla 9.6-kb DNA, giving rise to 5.5-kb and4.1-kb fragments. (Right) Autoradiogram of the same gel, showinghybridization to a nick-translated human BLUR8 probe.

ments (Fig. 1). Since the Alu repeat within intron 4 of thehuman AFP gene is located in the 4.4-kb fragment, the 0.3-kbsize difference in the small BamHI fragments (4.4 vs. 4.1)between the two species could have been due to an Alurepeat. To test this hypothesis, BamHI-digested as well asundigested human 9.9-kb DNA and gorilla 9.6-kb DNA werehybridized to the Alu probe BLUR8 (14) (Fig. 1). As ex-pected, the 9.9-kb EcoRI fragment and the 4.4-kb BamHI-EcoRI fragment from the human AFP gene hybridized to theBLUR8 probe, thus confirming the presence of Alu repeats.

The BLUR8 probe did not hybridize to either the 9.6-kbEcoRI or the 4.1-kb BamHI-EcoRI fragment from the gorillaAFP gene, indicating the absence of Alu repeats (Fig. 1).

Sequence Analysis in the Alu Region. The sequence of a5.4-kb HindIII-EcoRI fragment from the gorilla AFP genewas determined, which overlaps the expected site ofinsertionof the missing Alu repeat. The DNA of this region was foundto contain a single copy of a 12-bp target site, AGGATGT-TGTGG, which is identical to the direct repeats flanking theAlu residing within intron 4 of the human AFP gene (Fig. 2).This 12-bp repeat is thought to have arisen during the eventof the Alu insertion into the human genome and hence thehuman gene has two copies of this sequence. Full-size Aluelements are known to be flanked by direct repeats that arebelieved to have arisen by means of repair of staggeredbreaks in genomic DNA at the target site ofAlu insertion (15,16). There are also numerous examples ofAlu elements beingdeleted from the human genome, but these events produceconcomitant deletions of adjacent DNA and are recognizedas genetic defects (17-22) when coding DNA is deleted. TheAlu element in the human AFP gene is flanked by perfectrepeats of 12 bp, AGGATGTTGTGG ... (Alu) ... AG-GATGTTGTGG, whereas there is no evidence for DNAdeletion in the entire intron 4 in the gorilla AFP gene, ascompared to the same intron in the human genome. Wesubmit that this Alu element was inserted into the humangenome at a time postdating the divergence of the human andgorilla lineages. It is interesting to note that in the neighboringintron 3 of these AFP genes, a 226-bp Kpn repeat is found inthe orthologous position in the human and the gorilla (Fig. 2).Such an identity suggests that the Kpn repeat was insertedinto the genome before the divergence of these two species.The gorilla Kpn repeat differs in 5/226 nucleotide positions(2.2%) from the human Kpn repeat.Sequence Analysis in the Xba Region. We reported earlier

(11) that the two Xba elements differ in only 3 of303 positions(1%) and that only Xba2 is flanked by direct repeats, TT-TClll-- ... (Xba2) ... TTTCTTCTT. This suggests thatXbal, located in intron 8, was duplicated and reinserted intointron 7 of the human AFP gene, giving rise to Xba2 (Fig. 3).To our knowledge, an ancestral source DNA has not been

AluI ~>

0/* 1 86/ * 918 1548 A

I 129 52 1133 212 133 98 1V3

EcoRI Hind ]Iu Kpn EcoRI

D80 9 82 2290 a2( ® 01Gori I962wGrla129 62 133 212 133 98 130

Target Site

/4|' Alu BoundariesHuman Sequence: Term Repeat Tenn RepeatCAAAGGMACTGAACACTTAGGATGTTGTGGGGCCGGGCG ......C(Al) ...A... GGATGTTGTGGAAACATGTCTGCTTGCACAG

Gorilla Sequence: Target SiteCAAAGGTTTACTGAACACTTAGGATGTTGTGG (contiguous) AAACATGTCTGCTTGCACAG

FIG. 2. Comparison of restrictionmaps for the human and gorilla AFPgenes (exons 1-7). The map of the gorillagene shows the absence ofthe Alu repeat.The human AFP map is taken from Gibbset al. (11). Exons are shown as boxes andintrons are shown as heavy lines. Thesizes (bp) of exons and introns are alsoindicated. The gorilla Kpn repeat is ofidentical size and orientation as the hu-man Kpn repeat. Also shown is the Alutarget site in the gorilla gene. In the lowerhalf, the DNA sequence surrounding theAlu element is shown for both genes. Theterminal repeats (*) flanking the Alu ele-ment in one species (human) can be rec-

ognized as the unoccupied target site (*)in the other species (gorilla); a 12-bptarget site in the gorilla is identical to theterminal repeats surrounding the humanAlu sequence.

ATG

Human m 2 962

Kpn

.- - - I

Evolution: Ryan and Dugaiczyk

1- 9 m ~ ~ ~ l m 0 0

9362 Evolution: Ryan and Dugaiczyk

identified together with its derived copy previously. Furtherevidence was provided by a search of GenBank (March 1989)and genomic Southern blot analysis. These results suggestthat Xbal and Xba2 are the only two members of this familyin the human genome. The direct repeats surrounding Xba2are thought to have arisen during the event ofXba2 insertioninto its new position in the genome. Since then, the 5' directrepeat has undergone a single base mutation so that presentlythe two repeats are identical at 8 of 9 positions.

In the human AFP gene, both Xba repeats can be isolatedon a 5.5-kb EcoRI-HindIII fragment spanning from an EcoRIsite 5' of exon 8 to a HindIII site 3' of exon 11. Thecorresponding EcoRI-HindIII fragment in the gorilla AFPgene is only 5.2 kb in size (Fig. 3). This 300-bp size differencecould be explained by the absence of a single Xba repeat inthe gorilla AFP gene. Upon sequencing this 5.2-kb DNA, wediscovered that the gorilla AFP gene lacked the Xba2 repeatbut contained a single copy of the unoccupied target siteTTTC7TCTT, which is identical to the 3' direct repeatflanking Xba2 in the human AFP gene (Fig. 3). Sequenceanalysis has further shown an additional seven nucleotides,TGTCTTC, located 5' to the unoccupied target site in thegorilla AFP gene that are not present in the human AFP gene.A possible explanation is that the 7-bp sequence was orig-inally present in the human AFP gene but was deleted duringthe insertion ofXba2 into the new site. An alternative is that

Humn Sequence:

.4 Xb& Boundaries -* .

(303 bp)

the TGTCTTC arose by tandem duplication (in the gorillalineage) of part of the target sequence followed by a pointmutation (T -- G). These Xba repeats provide yet anotherexample ofa molecular marker that is present at a specific sitein the genome of one primate line but absent from another.

Population Studies. It is reasonable to assume that a newlyarisen genetic marker is originally polymorphic in a popula-tion, and it becomes established or eliminated only afterpassage of time and generations. Two examples of polymor-phic Alu elements have been reported in humans. One is inthe tissue plasminogen activator gene (23); the other is in theMlvi-2 locus in a B-cell lymphoma cell line (24). Although thelast two Alu repeats have been concluded to be of recentevolutionary origin (25), based on their sequence similarity tothe youngest Alu subfamily, it is not clear whether the Mlvi-2Alu repeat represents true polymorphism in human DNA ora somatic rearrangement related to tumor induction.

If the present Alu and Xba repeats are to be used asphylogenetic time markers, their established (or polymor-phic) state in the species must be known. To this end, weanalyzed restriction digests of genomic DNA from individualhumans and gorillas. The presence or absence of repetitiveDNA would be reflected in the sizes of such restrictionfragments. EcoRV digests of genomic DNA from 25 unre-lated Caucasians yielded a 3.6-kb fragment containing the Alurepeat located within intron 4 of the AFP gene; a 3.3-kb

A1TIGCATACAGTAr.TTTGTmAATACACT.ATAG ...... (XbT 1) ..... ATTATGTAAGCTAGAATAAAGTTCAGATTTAGGAGACA

Gorilla Sequence:AMGCAGTACAGTAGTTTGTTTTMAATACMCTGATAG ...... (Xba 1) ...... ATTATGTA.GCTAGAATAAAGTTCAGATTTAGGAGACA

/4 Xba Boundaries

Terminal (303 bp) Terinal

Hiwan Sequence: Repeat RepeatAAACGGGAGACAAGTTAAAAMTTCTTTTAATTGATAG ...... (Xba 2) ..... ATTATGTAAMCTTCTTCMCCTTCCT(:CTTCCCCC

TargetGorilla Sequence: Site

AAACGGGAGACAAGTT ¶CTTCTT (Contiguous) CMCCTTCCTCCTTCCCCC

TGTCTTC

FIG. 3. Comparison of restrictionmaps for the human and gorilla AFPgenes (exons 8-13). The map of the go-rilla gene shows the absence of Xba2.The human AFP map is taken from Gibbset al. (11). Exons are shown as boxes andintrons are shown as heavy lines. Thesizes (bp) of exons and introns are alsoindicated. The upper half shows the DNAsequence surrounding Xbal in bothgenes. The lower half shows the DNAsequence surrounding Xba2 in the humanlineage, including the flanking terminalrepeats (*), which can be recognized asthe unoccupied target site in the gorillalineage. The transposition of Xbal to anew location (Xba2) is indicated. Thenumbering of the human Xba repeats isreversed relative to our original publica-tion (11).

Proc. Natl. Acad. Sci. USA 86 (1989)

Proc. Natl. Acad. Sci. USA 86 (1989) 9363

1 2 3 4 5 6 7 8 9 10 11 12 STD Kb

*a-6.76.4- * t - - L6.4

* -4.4

3.6- W - - - _ .e *- * f -3.6-'-3.3

peats was used as the nick-translated probe. As can be seenfrom the resulting autoradiogram (Fig. 4), all 6 Caucasiansanalyzed possess both Xba repeats in their AFP genes. Also,all four gorillas analyzed (including the gorilla from whichsequence data were obtained) possess only Xbal in their AFPgenes. From these results, it appears that the Alu repeat andthe two Xba repeats are established (fixed) markers in thehuman lineage and so is Xbal in gorilla. A larger samplingmay be required to rigorously prove that they are trulypanspecific characters.

- -2.3

2 3 4 5 6 7 8 9 STD K b

-5.8-5.65.5-

5.2-d -4.8

_-3.5

FIG. 4. Southern hybridization of restriction digests of genomicDNA from unrelated individuals. (Upper) Autoradiogram ofEcoRVdigests of human genomic DNA hybridized to a nick-translatedgorilla (HindIII-EcoRI) 5.4-kb probe. All 12 individuals show hy-bridization in the 3.6-kb rather than the 3.3-kb region, indicative ofthe presence of the Alu repeat. The probe also recognizes an adjacentregion on the AFP map, giving rise to the 6.4-kb band seen in allindividuals. (Lower) Autoradiogram ofEcoRI plus HindlIl digests ofgenomic DNA hybridized to a nick-translated human 6.0-kb probe.Lanes 1-6, 10 jig of digested human genomic DNA; all six individualsshow hybridization in the 5.5-kb region, indicative of two Xbarepeats. Lanes 7-9, 10 tug of digested gorilla genomic DNA; the threeindividuals show hybridization in the 5.2-kb region, indicative of oneXba repeat. The upper bands in lanes 6-9 are due to incompleteHindI11 digests. STD, DNA size standards.

fragment would be expected if the Alu repeat was absent. Weused the gorilla 5.4-kb HindIII-EcoRI DNA fragment, whichencompasses the unoccupied Alu target site as the nick-translated probe. The resulting autoradiogram (Fig. 4) shows12 of the 25 individuals analyzed in this experiment. Withregard to the Xba repetitive elements, we performed a similarexperiment using EcoRI plus HindIII genomic digests of bothgorilla and human DNA. This double digest should yield a5.5-kb fragment if both Xbal and Xba2 repeats are present ora 5.2-kb fragment if only the ancestral sequence Xbal ispresent. A 6.0-kb human fragment encompassing both re-

DISCUSSIONTime of Insertion of Repeats. Our conclusion about a recent

origin ofthe Alu repeat in the human AFP gene draws supportfrom work by Willard et al. (26), Britten et al. (27), and Jurkaand Smith (28). These authors recognized that Alu repeatscan be subdivided into subfamilies, each being inserted intothe host genome at different times during evolution. Based ondiagnostic substitutions that are shared among Alu subfam-ilies, Britten et al. (27) published a consensus sequence ofwhat they consider the youngest (class IV) subfamily of Alurepeats. The Alu repeat within the human AFP gene differsfrom this class IV consensus by only 4 of 283 positions(1.4%), excluding 5 of 283 mutations in noninformative CpGhot spots (29), and it has retained 12 of the 13 mutationsconsidered diagnostic for this youngest Alu subfamily (Fig. 5)From this high degree of sequence similarity between Aluclass IV consensus and the present Alu-1, it can be inferredthat both belong to the same subfamily, whereas the conclu-sion about their young age has been reached from twoindependent considerations.The Xbal and Xba2 repeats in the human AFP gene appear

to be the only copies found in the human genome (Fig. 4).Since Xbal lacks direct repeats, it is reasonable to assume thatXbal was duplicated and reinserted into the human genome,resulting in Xba2. To our knowledge, both the source (ances-tor) DNA and the derived repeat have not been identifiedpreviously. The target site for Xba2 is at a sharp boundarybetween a purine- and a pyrimidine-rich region, but we do notknow what triggered the movement of the Xba element orwhether it was an RNA-mediated or a DNA-mediated trans-position. The repeats represent a single duplication event ofunknown and possibly very novel nature. If it was not for thepresence of Xba2, Xbal would be difficult to discern fromnonmobile parts of chromosomal DNA. Unlike the randomretroposition of Alu sequences, amplification of the Xbaelement is restricted to the same gene as the parental se-quence, although this might be the first event in the rise of anovel repeat element. The human Xba elements differ in 3 of303 positions, whereas human Xbal differs from gorilla Xbalin 2 of 303 positions, indicating a recent separation ofthe threesequences. The degree of homology (99%o) between the twohuman Xba elements suggests that Xbal gave rise to Xba2perhaps less than 6 million years ago, and postdating the timeof human-gorilla divergence. It is, a priori, possible that Xba2also existed in the gorilla lineage and was subsequently de-leted, but the young age ofXba2 argues against this possibility.

Alu-1 20 40 60 80 100GGCCGGGCGCGGTGGCTCACGCTTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACATGG

GGCCGGGCGCGGTGGCTCACGCCTGTMATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGCon-IV * * * * *

120 140 160 180 200TGMACCCCGTCTCTACTAAAAATACAAAAAA--TTAGCCGGGCGTGATGGTGGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATGGC

TGAAACCCCGTCTCTACTMAMATACMAMAA- -TTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATGGC* ** * **

220 240 260 280GTGAACCCTGGAGGCGGAGCTTGCAGTGAGCCGAGATTGCGCCACTGCACTCCCGCCTGGGCCACAGAGCGAGACTCCGTCTC

GTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCCGTCTC

FIG. 5. Comparison of the AluDNA sequence from intron 4 ofthe human AFP gene (Alu-1) withthe youngest Alu subfamily con-sensus sequence (Con-IV) as de-scribed by Britten et al. (27). As-terisks (*) show positions of the 13diagnostic mutations used to dis-tinguish a class IV repeat.

Evolution: Ryan and Dugaiczyk

9364 Evolution: Ryan and Dugaiczyk

However, it is quite possible that the Xba duplication tookplace prior to the human-chimpanzee separation, rather thanbeing unique to the human lineage.

Reconstructing Phylogenies from Differences in DNA. Thereis a need for identifying nonrecurrent, irreversible events inthe genomes of species that would serve as informativemarkers of evolution. After years of effort in reconstructingthe phylogeny of higher primates from DNA mutations, anumber of conflicting phylogenies have been published,leading to an impasse (for review, see refs. 30 and 31). Whycan't we decide? is the question of the time (32). The answerappears to be: Because of the mutations' reversibility. Givenenough time, consecutive mutations occur at the same site inone species, and parallel mutations occur independently atthe same (orthologous) site in two species. This multiplicityof changes makes it difficult to decide whether in a contem-porary DNA sequence a given nucleotide represents theancestral or the derived state of the character. Phylogenetictrees reconstructed from such reversible events are hencestatistical trees, and so the uncertainty remains as to whetherstatistics can reveal the true phylogeny of species.

Mobility of Repetitive DNA Elements. There are transpos-able elements-e.g., THE-1 (33)-that transfer themselvesfrom one chromosomal site to another. This is accomplishedin a controlled mode of movement, in that an enzyme(transposase), encoded by the element, and the element'slong (350 bp) terminal repeats participate in the reversibleintegration/excision mechanism. Alu elements belong to adifferent category. They are pseudogenes that appear to haveoverrun the genome by means of retroposition from a con-served source gene (27). Over the last 60 million years or so,an impressive number of 0.5 million copies have accumulatedin the genome of the human lineage. Multiple examples ofdeletions due to unequal crossover at two Alu repeats havebeen reported, but we have yet to find a deletion of a singleAlu element. Although an excision of a circularized Alusequence by means of recombination at its terminal repeatsis plausible, it is probably a rare event compared to the vastnumber of interspersed Alu elements in chromosomal DNA.The terminal repeats are short (-5-15 bp) and diverge in timeand apparently are not very efficient recombination sites. Theinfrequency or lack of single Alu deletions can be alsoinferred from studies in the globin gene locus of numerousprimates (34): when a specific Alu repeat could be identifiedas an ancient insertion in an ancestor of apes and humans, itwas found to be present at the same chromosomal site incontemporary primates, such as human, chimpanzee, gorilla,and orangutan. Thus, the globin Alu element resides at thesame position for some 40 million years in at least four linesof evolution. Based on the above arguments, and untilevidence to the contrary is found, we submit that inserted Alurepeats become rather frozen in the recipient genome. Theymay be subsequently deleted in unequal crossover events,but it appears they are not excised in a reversible manner.Another intriguing feature of the Alu spread is an apparent

lack of control in the process. Viral infections are under thesurveillance of the immune system; other transposable ele-ments appear to be under a feedback control that limits theirnumber. The spread of Alu repeats (and probably otherpseudogenes) seems to be driven by the propensity of DNAto interact with itself. Indications exist (26) that the 0.5million copies were transposed not at a constant rate in timebut rather in sporadic transposition bursts. As we pointed outpreviously (1), such bursts could have been cataclysmic inthat the burden of their mutagenic load could become a wayof death in the extinction of species, or perhaps a way ofbranching off new lines of evolution.

Spread of Repeats as Time Markers in Evolution. In thepresent work we have demonstrated that two repeated DNAelements are present in the human genome but absent from

orthologous sites in the gorilla. We postulate that spreadingof most, if not all, repetitive Alu elements through thegenomes of species is a unidirectional process, stemmingfrom an irreversible mechanism of their integration into newsites. These rather infrequent (as compared to mutations) andirreversible events may provide reliable records of thebranching order in primate evolution.

We thank Rita Zielinski for help in DNA sequence analysis. Wealso thank Dr. Richard Gatti and Patrick Charmley for providinghuman genomic DNAs and for performing some of the Alu hybrid-izations with a probe provided by us. In addition, we appreciate thecritique of this manuscript and the assistance in the GenBank searchby Drs. Peter Gibbs and Phillip Minghetti. This work was supportedby Grant iRO1 HD19281 from the National Institutes of Health.

1. Ruffner, D. E., Sprung, C. N., Minghetti, P. P., Gibbs, P. E. M. &Dugaiczyk, A. (1987) Mol. Biol. Evol. 4, 1-9.

2. Jelinek, W. R., Toomey, T. P., Leinwand, L., Duncan, C. H., Biro,P. A., Choudary, P. V., Weissman, S. M., Rubin, C. M., Dein-inger, P. L. & Schmid, C. W. (1980) Proc. NatI. Acad. Sci. USA 77,1398-1402.

3. Ullu, E. & Tschudi, C. (1984) Nature (London) 312, 171-172.4. Schoeninger, L. 0. & Jelinek, W. R. (1986) Mol. Cell. Biol. 6,

1508-1519.5. Schmid, C. W. & Shen, C.-K. (1986) in Molecular Evolutionary

Genetics, ed. MacIntyre, R. J. (Plenum, New York), pp. 323-358.6. Weiner, A. M., Deininger, P. L. & Efstratiadis, A. (1986) Annu.

Rev. Biochem. 55, 631-661.7. Hwu, H. R., Roberts, J. W., Davidson, E. H. & Britten, R. J.

(1986) Proc. Natl. Acad. Sci. USA 83, 3875-3879.8. Sawada, I., Willard, C., Shen, C.-K. J., Chapman, B., Wilson,

A. C. & Schmid, C. W. (1985) J. Mol. Evol. 22, 316-322.9. Koop, B. F., Miyamoto, M. M., Embury, J. E., Goodman, M.,

Czelusniak, J. & Slightom, J. L. (1986) J. Mol. Evol. 24, 94-102.10. Trabuchet, G., Chebloune, Y., Savatier, P., Lachuer, J., Faure, C.,

Verdier, G. & Nigon, V. M. (1987) J. Mol. Evol. 25, 288-291.11. Gibbs, P. E. M., Zielinski, R., Boyd, C. & Dugaiczyk, A. (1987)

Biochemistry 26, 1332-1343.12. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65,499-560.13. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517.14. Deininger, P. L., Jolly, D. J., Rubin, C. M., Friedmann, T. &

Schmid, C. W. (1981) J. Mol. Biol. 151, 17-33.15. Van Ardsell, S. W., Denison, R. A., Bernstein, L. B. & Weiner,

A. M. (1981) Cell 26, 11-17.16. Jagadeeswaran, P., Forget, B. G. & Weissman, S. M. (1981) Cell

26, 141-142.17. Jagadeeswaran, P., Tuan, D., Forget, B. G. & Weissman, S. M.

(1982) Nature (London) 296, 469-470.18. Lehrman, M. A., Russel, D. W., Goldstein, J. L. & Brown, M. S.

(1987) J. Biol. Chem. 262, 3354-3361.19. Myerowitz, R. & Hogikayan, N. D. (1987) J. Biol. Chem. 262,

153%-19399.20. Markert, M. L., Hutton, J. J., Wiginton, D. A., States, J. C. &

Kaufman, R. E. (1988) J. Clin. Invest. 81, 1323-1327.21. Rouyer, F., Simmler, M.-C., Page, D. C. & Weissenbach, J. (1987)

Cell 51, 417-425.22. Huang, L.-S., Ripps, M. E., Korman, S. H., Deckelbaum, R. J. &

Breslow, J. L. (1989) J. Biol. Chem. 264, 11394-11400.23. Friezner Degen, S. J., Rajput, B. & Reich, E. J. (1986) J. Biol.

Chem. 261, 6972-6985.24. Economou-Pachnis, A. & Tsichlis, P. N. (1985) Nucleic Acids Res.

13, 8379-8387.25. Deininger, P. L. & Slagel, V. K. (1988) Mol. Cell. Biol. 8, 4566-

4569.26. Willard, C., Nguyen, H. T. & Schmid, C. W. (1987) J. Mol. Evol.

26, 180-186.27. Britten, R. J., Baron, W. F., Stout, D. B. & Davidson, E. H. (1988)

Proc. Natl. Acad. Sci. USA 85, 4770-4774.28. Jurka, J. & Smith, T. (1988) Proc. Natl. Acad. Sci. USA 85,

475-478.29. Bains, W. (1986) J. Mol. Evol. 23, 189-199.30. Lewin, R. (1988) Science 241, 1598-1600.31. Lewin, R. (1988) Science 241, 1756-1759.32. Holmquist, R., Miyamoto, M. M. & Goodman, M. (1988) Mol. Biol.

Evol. 5, 201-216.33. Paulson, K. E., Deka, N., Schmid, C. W., Misra, R., Schindler,

W., Rush, M. G., Kadyk, L. & Leinwand, L. (1985) Nature(London) 316, 359-361.

34. Fitch, D. H. A., Mainone, C., Slightom, J. L. & Goodman, M.(1988) Genomics 3, 237-255.

Proc. Natl. Acad. Sci. USA 86 (1989)